# DRL_PROJ — DeepFake Detection
Deep learning project for binary deepfake detection on the DeepFakeFace dataset.
## Project structure
```
DRL_PROJ/
  classifier/          ← discriminative model (real vs. fake classifier)
    src/               ← model definitions, training, evaluation, preprocessing
    configs/           ← experiment configs organised by phase
      phase1/          ← baseline models (SimpleCNN, ResNet18)
      phase2/          ← architecture sweep (ResNet variants, face-crop)
      phase3/          ← EfficientNet, ViT, frequency-aware training
      phase4/          ← ensemble strategies
    tools/             ← analyse.py, ensemble.py, inference.py, facecrop.py
    notebooks/         ← EDA, preprocessing, evaluation, GradCAM
    outputs/           ← models, logs, figures (gitignored except .pt/.json)
    run.py             ← main training entry point
  generator/           ← generative model (GAN / VAE / diffusion) — in progress
  pipeline/            ← Vast.ai ephemeral GPU orchestration
  data/                ← dataset root (gitignored)
    cropped/           ← MTCNN pre-cropped faces (gitignored)
      classifier/      ← bbox crops for the classifier
      generator/       ← landmark-aligned crops for the generator
```
## Setup
To run the code directly on a machine you control, create a local environment:
```bash
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip setuptools wheel
python -m pip install -r requirements.txt
```
## Local Training
```bash
python3 classifier/run.py classifier/configs/phase2/p2_resnet18_facecrop.json
python3 classifier/run.py classifier/configs/phase3/p3_efficientnet_b0.json
```
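Each run takes a single JSON config. As an illustration only (these field names are hypothetical, not the project's actual schema — see the files under `classifier/configs/` for the real one), a phase-2 config might look like:

```json
{
  "model": "resnet18",
  "pretrained": true,
  "face_crop": true,
  "epochs": 20,
  "batch_size": 64,
  "lr": 0.0001,
  "data_root": "data/cropped/classifier",
  "output_dir": "classifier/outputs/p2_resnet18_facecrop"
}
```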
## Ephemeral Vast.ai Pipeline
The deployment/orchestration path lives under [`pipeline/`](pipeline/README.md).
One-time setup:
```bash
cat > pipeline/.env <<'EOF'
VAST_API_KEY=<your-api-key>
VAST_SSH_PRIVATE_KEY=/home/your-user/.ssh/id_ed25519
EOF
```
End-to-end ephemeral run:
```bash
python3 -m pipeline run classifier/configs/phase2/p2_resnet18_facecrop.json --upload-data
```
Interactive offer selection:
```bash
python3 -m pipeline offers --select-offer
```
You can override the ranking mode per run:
```bash
python3 -m pipeline offers --sort price
python3 -m pipeline offers --sort performance
python3 -m pipeline offers --sort performance --price 0.14
```
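A minimal sketch of how this ranking might behave (`dph_total` and `dlperf` follow Vast.ai's offer-field naming; the function itself is illustrative, not the pipeline's actual code):

```python
def rank_offers(offers, sort="price", price_cap=None):
    """Filter offers by an optional $/hr cap, then rank them.

    sort="price"       -> cheapest first
    sort="performance" -> highest dlperf first
    """
    if price_cap is not None:
        offers = [o for o in offers if o["dph_total"] <= price_cap]
    if sort == "price":
        return sorted(offers, key=lambda o: o["dph_total"])
    return sorted(offers, key=lambda o: o["dlperf"], reverse=True)

# Toy offers to show the two modes and the price cap
offers = [
    {"id": 1, "dph_total": 0.18, "dlperf": 52.0},
    {"id": 2, "dph_total": 0.12, "dlperf": 48.5},
    {"id": 3, "dph_total": 0.16, "dlperf": 61.2},
]
print([o["id"] for o in rank_offers(offers, sort="price")])         # → [2, 3, 1]
print([o["id"] for o in rank_offers(offers, "performance", 0.14)])  # → [2]
```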
You can also filter by region:
```bash
python3 -m pipeline offers --select-offer --region europe
python3 -m pipeline offers --select-offer --region Portugal
python3 -m pipeline offers --select-offer --region US
python3 -m pipeline offers --select-offer --region europe --price 0.14
```
To inspect which region strings are currently available from the search results:
```bash
python3 -m pipeline offers --list-regions
```
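Region matching can be thought of as a case-insensitive match against each offer's geolocation string, with group names like `europe` expanding to a set of countries. This is a sketch under those assumptions (the group table and matching rules are illustrative, not the pipeline's actual logic):

```python
# Illustrative subset only — the real pipeline's europe list will differ
REGION_GROUPS = {"europe": {"PT", "SE", "DE", "FR", "ES", "NL"}}

def filter_by_region(offers, region):
    """Keep offers matching a country code, substring, or region group."""
    codes = REGION_GROUPS.get(region.lower())
    if codes:  # group name like "europe": match the trailing country code
        return [o for o in offers
                if o.get("geolocation", "").rsplit(", ", 1)[-1] in codes]
    needle = region.lower()  # otherwise: plain substring match
    return [o for o in offers if needle in o.get("geolocation", "").lower()]

def list_regions(offers):
    """Unique geolocation strings seen in the current search results."""
    return sorted({o["geolocation"] for o in offers if o.get("geolocation")})

offers = [
    {"id": 1, "geolocation": "Portugal, PT"},
    {"id": 2, "geolocation": "Texas, US"},
    {"id": 3, "geolocation": "Sweden, SE"},
]
print([o["id"] for o in filter_by_region(offers, "europe")])    # → [1, 3]
print([o["id"] for o in filter_by_region(offers, "Portugal")])  # → [1]
```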
The `run` command:
- ensures your SSH public key is registered with Vast.ai
- searches offers using the filters in `pipeline/defaults/vast.json`
- creates an instance
- waits for SSH readiness
- syncs the repo
- uploads `data/` when `--upload-data` is set
- runs `python3 classifier/run.py ...`
- downloads `classifier/outputs/`
- for generator runs, rsyncs `generator/outputs/` back every 50 epochs and again at completion
- destroys the instance automatically unless `--keep-on-failure` is set
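The "waits for SSH readiness" step is typically a poll-with-timeout loop. A generic sketch (the `check` callable and the timings are illustrative, not the pipeline's actual implementation):

```python
import time

def wait_until_ready(check, timeout=300.0, interval=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or `timeout` seconds elapse.

    In the pipeline's case, check() would run something like
    `ssh -o BatchMode=yes root@<host> true` and report whether it
    exited cleanly. Returns True on success, False on timeout.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if check():
            return True
        sleep(interval)
    return False
```

Injecting `clock` and `sleep` keeps the loop testable without real waiting.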
Useful commands:
```bash
python3 -m pipeline up
python3 -m pipeline status <instance_id>
python3 -m pipeline down <instance_id>
```
To override the default Vast search/runtime settings, copy `pipeline/defaults/vast.json`, edit it, and pass:
```bash
python3 -m pipeline run classifier/configs/phase3/p3_efficientnet_b0.json --pipeline-config /path/to/vast.override.json
```
The default policy in `pipeline/defaults/vast.json` now targets:
- `1x` GPU
- `RTX 3090` or `RTX 3090 Ti`
- `<= $0.20/hour`
- sorted by `dlperf` descending
- uses `vastai/pytorch:latest` as the default image
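Assuming the defaults file mirrors Vast.ai's search-offer fields, the policy above could be encoded roughly like this (key names are an assumption for illustration — check the actual `pipeline/defaults/vast.json`):

```json
{
  "num_gpus": 1,
  "gpu_names": ["RTX 3090", "RTX 3090 Ti"],
  "max_dph": 0.20,
  "sort": "dlperf-",
  "image": "vastai/pytorch:latest"
}
```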