DRL_PROJ — DeepFake Detection

Deep learning project for binary deepfake detection on the DeepFakeFace dataset.

Project structure

DRL_PROJ/
  classifier/       ← discriminative model (real vs. fake classifier)
    src/            ← model definitions, training, evaluation, preprocessing
    configs/        ← experiment configs organised by phase
      phase1/       ← baseline models (SimpleCNN, ResNet18)
      phase2/       ← architecture sweep (ResNet variants, face-crop)
      phase3/       ← EfficientNet, ViT, frequency-aware training
      phase4/       ← ensemble strategies
    tools/          ← analyse.py, ensemble.py, inference.py, facecrop.py
    notebooks/      ← EDA, preprocessing, evaluation, GradCAM
    outputs/        ← models, logs, figures (gitignored except .pt/.json)
    run.py          ← main training entry point
  generator/        ← generative model (GAN / VAE / diffusion) — in progress
  pipeline/         ← Vast.ai ephemeral GPU orchestration
  data/             ← dataset root (gitignored)
  cropped/          ← MTCNN pre-cropped faces (gitignored)
    classifier/     ← bbox crops for the classifier
    generator/      ← landmark-aligned crops for the generator

Setup

Create a local environment when you want to run the code directly on a machine you control:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip setuptools wheel
python -m pip install -r requirements.txt

Local Training

python3 classifier/run.py classifier/configs/phase2/p2_resnet18_facecrop.json
python3 classifier/run.py classifier/configs/phase3/p3_efficientnet_b0.json
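
To queue several experiments back to back, the training command above can be wrapped in a small script. The following is a minimal sketch, assuming every .json file in a phase directory is a config accepted by classifier/run.py. The helper names build_commands and run_phase are hypothetical and not part of the repository.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(config_dir, entry_point="classifier/run.py"):
    """Return one argv list per JSON config in config_dir, sorted by file name."""
    configs = sorted(Path(config_dir).glob("*.json"))
    return [[sys.executable, entry_point, str(cfg)] for cfg in configs]


def run_phase(config_dir):
    """Run each config in turn and stop at the first failing run."""
    for argv in build_commands(config_dir):
        print("running:", " ".join(argv))
        # check=True raises CalledProcessError when a run exits non-zero.
        subprocess.run(argv, check=True)
```

Calling run_phase("classifier/configs/phase2") from the repository root would then train every phase 2 config in alphabetical order.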

Ephemeral Vast.ai Pipeline

The deployment/orchestration path now lives under pipeline/.

One-time setup:

cat > pipeline/.env <<'EOF'
VAST_API_KEY=<your-api-key>
VAST_SSH_PRIVATE_KEY=/home/your-user/.ssh/id_ed25519
EOF
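
Before starting a paid instance it can help to confirm that the file is complete. The sketch below is an assumption about what such a check might look like, not the pipeline's own loader: parse_env and check_env are hypothetical helpers that read plain KEY=VALUE lines and report a missing key or a key file that does not exist.

```python
from pathlib import Path

REQUIRED_KEYS = ("VAST_API_KEY", "VAST_SSH_PRIVATE_KEY")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_env(path="pipeline/.env"):
    """Return a list of problems found in the .env file (empty if none)."""
    values = parse_env(Path(path).read_text())
    problems = [f"missing {key}" for key in REQUIRED_KEYS if not values.get(key)]
    key_path = values.get("VAST_SSH_PRIVATE_KEY")
    if key_path and not Path(key_path).expanduser().is_file():
        problems.append(f"SSH key not found: {key_path}")
    return problems
```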

End-to-end ephemeral run:

python3 -m pipeline run classifier/configs/phase2/p2_resnet18_facecrop.json --upload-data

Interactive offer selection:

python3 -m pipeline offers --select-offer

You can override the ranking mode per run:

python3 -m pipeline offers --sort price
python3 -m pipeline offers --sort performance
python3 -m pipeline offers --sort performance --price 0.14

You can also filter by region:

python3 -m pipeline offers --select-offer --region europe
python3 -m pipeline offers --select-offer --region Portugal
python3 -m pipeline offers --select-offer --region US
python3 -m pipeline offers --select-offer --region europe --price 0.14

To inspect which region strings are currently available from the search results:

python3 -m pipeline offers --list-regions

The end-to-end run command (python3 -m pipeline run) does the following:

ensures your SSH public key is registered with Vast.ai
searches offers using the filters in pipeline/defaults/vast.json
creates an instance
waits for SSH readiness
syncs the repo
uploads data/ when --upload-data is set
runs python3 classifier/run.py ...
downloads classifier/outputs/
for generator runs, rsyncs generator/outputs/ back every 50 epochs and again at completion
destroys the instance automatically unless --keep-on-failure is set
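
The cleanup behaviour in the steps above can be sketched as a try/finally wrapper. This is an illustration under stated assumptions, not the pipeline's actual code: the four callables are hypothetical stand-ins for the real steps, and the sketch assumes that --keep-on-failure only preserves the instance when the run failed.

```python
def run_ephemeral(create, train, download, destroy, keep_on_failure=False):
    """Create an instance, run a job on it, and clean up afterwards.

    All four callables are hypothetical stand-ins for the pipeline's real steps.
    Returns True if training and download both succeeded.
    """
    instance_id = create()
    succeeded = False
    try:
        train(instance_id)
        download(instance_id)
        succeeded = True
    except Exception as exc:
        print(f"run failed on instance {instance_id}: {exc}")
    finally:
        # Assumption: the flag only preserves the instance when the run failed,
        # so a successful run is always torn down.
        if succeeded or not keep_on_failure:
            destroy(instance_id)
    return succeeded
```

Putting the teardown in the finally clause means a crash in training or download cannot leave a billed instance running by accident.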

Useful commands:

python3 -m pipeline up
python3 -m pipeline status <instance_id>
python3 -m pipeline down <instance_id>

To override the default Vast search/runtime settings, copy pipeline/defaults/vast.json, edit it, and pass:

python3 -m pipeline run classifier/configs/phase3/p3_efficientnet_b0.json --pipeline-config /path/to/vast.override.json
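
The copy-and-edit step can also be scripted. The sketch below assumes the defaults file is a JSON object whose settings are top-level keys. The helper write_override is hypothetical, and the key names to change must be taken from the actual pipeline/defaults/vast.json.

```python
import json
from pathlib import Path


def write_override(defaults_path, override_path, changes):
    """Copy the default settings, apply top-level changes, and save the result."""
    settings = json.loads(Path(defaults_path).read_text())
    unknown = [key for key in changes if key not in settings]
    if unknown:
        # Refuse keys that the defaults file does not define, to catch typos early.
        raise KeyError(f"keys not present in defaults: {unknown}")
    settings.update(changes)
    Path(override_path).write_text(json.dumps(settings, indent=2) + "\n")
    return settings
```

The resulting file is what gets passed to --pipeline-config.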

The default policy in pipeline/defaults/vast.json is:

GPU count: 1x GPU
GPU model: RTX 3090 or RTX 3090 Ti
price: <= $0.20/hour
ranking: sorted by dlperf, descending
image: vastai/pytorch:latest
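
To make the policy concrete, the sketch below applies the same filter and ranking to a list of offers held as plain dictionaries. The function select_offers and the dictionary keys (num_gpus, gpu_name, dph_total, dlperf) are assumptions made for illustration. The real search is driven by pipeline/defaults/vast.json and may use different field names.

```python
def select_offers(offers, gpu_names=("RTX 3090", "RTX 3090 Ti"), num_gpus=1, max_price=0.20):
    """Keep offers that match the default policy, best dlperf first.

    Each offer is a dict with the assumed keys num_gpus, gpu_name,
    dph_total (price in dollars per hour), and dlperf.
    """
    matching = [
        offer for offer in offers
        if offer["num_gpus"] == num_gpus
        and offer["gpu_name"] in gpu_names
        and offer["dph_total"] <= max_price
    ]
    return sorted(matching, key=lambda offer: offer["dlperf"], reverse=True)
```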
