# DRL_PROJ — DeepFake Detection
Deep learning project for binary deepfake detection on the DeepFakeFace dataset.
## Project structure
```
DRL_PROJ/
  classifier/          ← discriminative model (real vs. fake classifier)
    src/               ← model definitions, training, evaluation, preprocessing
    configs/           ← experiment configs organised by phase
      phase1/          ← baseline models (SimpleCNN, ResNet18)
      phase2/          ← architecture sweep (ResNet variants, face-crop)
      phase3/          ← EfficientNet, ViT, frequency-aware training
      phase4/          ← ensemble strategies
    tools/             ← analyse.py, ensemble.py, inference.py, facecrop.py
    notebooks/         ← EDA, preprocessing, evaluation, GradCAM
    outputs/           ← models, logs, figures (gitignored except .pt/.json)
    run.py             ← main training entry point
  generator/           ← generative model (GAN / VAE / diffusion) — in progress
  pipeline/            ← Vast.ai ephemeral GPU orchestration
  data/                ← dataset root (gitignored)
    cropped/           ← MTCNN pre-cropped faces (gitignored)
      classifier/      ← bbox crops for the classifier
      generator/       ← landmark-aligned crops for the generator
```
## Setup
To run the code directly on a machine you control, create a local environment:
```bash
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip setuptools wheel
python -m pip install -r requirements.txt
```
## Local Training
```bash
python3 classifier/run.py classifier/configs/phase2/p2_resnet18_facecrop.json
python3 classifier/run.py classifier/configs/phase3/p3_efficientnet_b0.json
```
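Each run takes a single JSON config. As an illustration only (these field names are hypothetical, not the project's actual schema — see the files under `classifier/configs/` for the real one), a phase-2 config might look like:

```json
{
  "model": "resnet18",
  "pretrained": true,
  "face_crop": true,
  "epochs": 20,
  "batch_size": 64,
  "lr": 0.0001,
  "data_root": "data/cropped/classifier",
  "output_dir": "classifier/outputs/p2_resnet18_facecrop"
}
```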
## Ephemeral Vast.ai Pipeline
The deployment/orchestration path lives under [`pipeline/`](pipeline/README.md).
One-time setup:
```bash
cat > pipeline/.env <<'EOF'
VAST_API_KEY=<your-api-key>
VAST_SSH_PRIVATE_KEY=/home/your-user/.ssh/id_ed25519
EOF
```
End-to-end ephemeral run:
```bash
python3 -m pipeline run classifier/configs/phase2/p2_resnet18_facecrop.json --upload-data
```
Interactive offer selection:
```bash
python3 -m pipeline offers --select-offer
```
You can override the ranking mode per run:
```bash
python3 -m pipeline offers --sort price
python3 -m pipeline offers --sort performance
python3 -m pipeline offers --sort performance --price 0.14
```
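A minimal sketch of how this ranking might behave (`dph_total` and `dlperf` follow Vast.ai's offer-field naming; the function itself is illustrative, not the pipeline's actual code):

```python
def rank_offers(offers, sort="price", price_cap=None):
    """Filter offers by an optional $/hr cap, then rank them.

    sort="price"       -> cheapest first
    sort="performance" -> highest dlperf first
    """
    if price_cap is not None:
        offers = [o for o in offers if o["dph_total"] <= price_cap]
    if sort == "price":
        return sorted(offers, key=lambda o: o["dph_total"])
    return sorted(offers, key=lambda o: o["dlperf"], reverse=True)

# Toy offers to show the two modes and the price cap
offers = [
    {"id": 1, "dph_total": 0.18, "dlperf": 52.0},
    {"id": 2, "dph_total": 0.12, "dlperf": 48.5},
    {"id": 3, "dph_total": 0.16, "dlperf": 61.2},
]
print([o["id"] for o in rank_offers(offers, sort="price")])         # → [2, 3, 1]
print([o["id"] for o in rank_offers(offers, "performance", 0.14)])  # → [2]
```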
You can also filter by region:
```bash
python3 -m pipeline offers --select-offer --region europe
python3 -m pipeline offers --select-offer --region Portugal
python3 -m pipeline offers --select-offer --region US
python3 -m pipeline offers --select-offer --region europe --price 0.14
```
To inspect which region strings are currently available from the search results:
```bash
python3 -m pipeline offers --list-regions
```
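Region matching can be thought of as a case-insensitive match against each offer's geolocation string, with group names like `europe` expanding to a set of countries. This is a sketch under those assumptions (the group table and matching rules are illustrative, not the pipeline's actual logic):

```python
# Illustrative subset only — the real pipeline's europe list will differ
REGION_GROUPS = {"europe": {"PT", "SE", "DE", "FR", "ES", "NL"}}

def filter_by_region(offers, region):
    """Keep offers matching a country code, substring, or region group."""
    codes = REGION_GROUPS.get(region.lower())
    if codes:  # group name like "europe": match the trailing country code
        return [o for o in offers
                if o.get("geolocation", "").rsplit(", ", 1)[-1] in codes]
    needle = region.lower()  # otherwise: plain substring match
    return [o for o in offers if needle in o.get("geolocation", "").lower()]

def list_regions(offers):
    """Unique geolocation strings seen in the current search results."""
    return sorted({o["geolocation"] for o in offers if o.get("geolocation")})

offers = [
    {"id": 1, "geolocation": "Portugal, PT"},
    {"id": 2, "geolocation": "Texas, US"},
    {"id": 3, "geolocation": "Sweden, SE"},
]
print([o["id"] for o in filter_by_region(offers, "europe")])    # → [1, 3]
print([o["id"] for o in filter_by_region(offers, "Portugal")])  # → [1]
```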
The `run` command:
- ensures your SSH public key is registered with Vast.ai
- searches offers using the filters in `pipeline/defaults/vast.json`
- creates an instance
- waits for SSH readiness
- syncs the repo
- uploads `data/` when `--upload-data` is set
- runs `python3 classifier/run.py ...`
- downloads `classifier/outputs/`
- for generator runs, rsyncs `generator/outputs/` back every 50 epochs and again at completion
- destroys the instance automatically unless `--keep-on-failure` is set
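The "waits for SSH readiness" step is typically a poll-with-timeout loop. A generic sketch (the `check` callable and the timings are illustrative, not the pipeline's actual implementation):

```python
import time

def wait_until_ready(check, timeout=300.0, interval=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or `timeout` seconds elapse.

    In the pipeline's case, check() would run something like
    `ssh -o BatchMode=yes root@<host> true` and report whether it
    exited cleanly. Returns True on success, False on timeout.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if check():
            return True
        sleep(interval)
    return False
```

Injecting `clock` and `sleep` keeps the loop testable without real waiting.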
Useful commands:
```bash
python3 -m pipeline up
python3 -m pipeline status <instance_id>
python3 -m pipeline down <instance_id>
```
To override the default Vast search/runtime settings, copy `pipeline/defaults/vast.json`, edit it, and pass:
```bash
python3 -m pipeline run classifier/configs/phase3/p3_efficientnet_b0.json --pipeline-config /path/to/vast.override.json
```
The default policy in `pipeline/defaults/vast.json` now targets:
- `1x` GPU
- `RTX 3090` or `RTX 3090 Ti`
- `<= $0.20/hour`
- sorted by `dlperf` descending
- uses `vastai/pytorch:latest` as the default image
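Assuming the defaults file mirrors Vast.ai's search-offer fields, the policy above could be encoded roughly like this (key names are an assumption for illustration — check the actual `pipeline/defaults/vast.json`):

```json
{
  "num_gpus": 1,
  "gpu_names": ["RTX 3090", "RTX 3090 Ti"],
  "max_dph": 0.20,
  "sort": "dlperf-",
  "image": "vastai/pytorch:latest"
}
```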