Final polish

2026-05-14 21:16:03 +01:00
parent 3bff7eefb0
commit afd26f47d2
732 changed files with 4149 additions and 79134 deletions
@@ -1,4 +1,4 @@
-# Deep Learning Face Project
+# Deep learning face project

 This repository contains a two-part deep learning project on the
 DeepFakeFace (DFF) dataset:
@@ -10,7 +10,7 @@ The project is written as an experimental report. The notebooks are the main
 deliverable: they show the pipeline, the intermediate failures, the ablations,
 the decisions, and the final models. Read them in order.

-## Project Story
+## Project story

 The work follows the same principle in both parts: start with a simple
 baseline, inspect what fails, change one important factor at a time, and keep
@@ -19,7 +19,7 @@ the evidence tied to saved logs and saved artifacts.
 For the **classifier**, the story moves from dataset understanding to
 preprocessing, baseline models, controlled ablations, Grad-CAM inspection,
 stronger model families, and data scaling. The final practical classifier is a
-ResNet50-style pipeline using face crops, 224x224 inputs, ImageNet/default
+ResNet50-style pipeline using face crops, 224×224 inputs, ImageNet/default
 normalization, and no stochastic augmentation at validation/test time.

 For the **generator**, the story starts with raw baseline failures, then locks
@@ -28,14 +28,14 @@ GAN, VAE, and DDPM. The final comparison keeps quality versus speed central:
 DDPM gives the best saved FID and visual quality, GAN is the best
 quality-speed compromise, and VAE is the fastest but smoothest option.

-## How To Read The Project
+## How to read the project

 Start with the classifier notebooks, then read the generator notebooks. The
 generator has one linear setup stage followed by three parallel branches:
 GAN, VAE, and DDPM. Those branches are numbered in reading order, but they are
 conceptually parallel experiments after the pipeline is selected.

-### Classifier Notebooks
+### Classifier notebooks

 Read these first:

@@ -57,7 +57,7 @@ Read these first:
 7. `classifier/notebooks/07_phase4_data_scaling_analysis.ipynb`  
   Data scaling for strong backbones and the final classifier decision.

-### Generator Notebooks
+### Generator notebooks

 Read these after the classifier:

@@ -67,12 +67,12 @@ Read these after the classifier:
   Controlled pipeline ablations: resolution, alignment, augmentation, and
   raw/aligned mixing.
 3. `generator/notebooks/03_gan_stability_progression.ipynb`  
-   GAN branch: DCGAN -> WGAN-GP -> spectral normalization + GroupNorm +
-   self-attention -> 128x128 check.
+   GAN branch: DCGAN → WGAN-GP → spectral normalization + GroupNorm +
+   self-attention → 128×128 check.
 4. `generator/notebooks/04_vae_loss_progression.ipynb`  
-   VAE branch: MSE + KL -> perceptual loss -> PatchGAN adversarial loss.
+   VAE branch: MSE + KL → perceptual loss → PatchGAN adversarial loss.
 5. `generator/notebooks/05_ddpm_recipe_progression.ipynb`  
-   DDPM branch: linear schedule -> cosine schedule -> v-prediction -> wider
+   DDPM branch: linear schedule → cosine schedule → v-prediction → wider
   backbone.
 6. `generator/notebooks/06_final_family_comparison.ipynb`  
   Final comparison of the selected GAN, VAE, and DDPM recipes under saved
@@ -81,7 +81,7 @@ Read these after the classifier:
   Curated final sample examples from saved outputs. This is qualitative
   showcase material, not a replacement for FID.

-## What The Notebooks Do
+## What the notebooks do

 The notebooks are analysis/report chapters. They load existing configs, logs,
 figures, saved sample grids, checkpoints, and prediction summaries. They are
@@ -97,7 +97,7 @@ When a notebook shows a plot or image grid, the surrounding markdown explains:
 This is important because the project is evaluated not only by final
 performance, but by the documented evolution of the solution.

-## Repository Layout
+## Repository layout

 ```text
 DRL_PROJ/
@@ -106,6 +106,7 @@ DRL_PROJ/
    notebooks/     classifier report notebooks
    outputs/       saved logs, figures, Grad-CAM panels, checkpoints
    src/           classifier data, models, training, evaluation
+    tests/         unit and smoke tests
    tools/         facecrop, Grad-CAM, inference, reevaluation helpers

  generator/
@@ -113,6 +114,7 @@ DRL_PROJ/
    notebooks/     generator report notebooks and notebook builder
    outputs/       saved logs, sample grids, final showcase artifacts
    src/           generator data, models, training, metrics
+    tests/         unit and smoke tests
    tools/         sampling and utility scripts

  data/            original DFF dataset root, not committed
@@ -121,11 +123,11 @@ DRL_PROJ/
  pipeline/        optional remote/GPU orchestration helpers
 ```

-## Rebuilding The Generator Notebooks
+## Rebuilding the generator notebooks

 The generator notebooks are generated from a single source file:

-```powershell
+```bash
 cd generator/notebooks
 python _build.py
 ```
@@ -133,24 +135,120 @@ python _build.py
 That builder writes the numbered generator notebooks listed above. It uses
 existing saved logs and artifacts; it does not train models.

-## Running The Code
+## Setup

-Create an environment and install the project requirements:
+Create a conda environment and install the project requirements:

-```powershell
-python -m venv .venv
-.\.venv\Scripts\python.exe -m pip install --upgrade pip setuptools wheel
-.\.venv\Scripts\python.exe -m pip install -r requirements.txt
+```bash
+conda create -n drl python=3.12
+conda activate drl
+python -m pip install --upgrade pip setuptools wheel
+python -m pip install -r requirements.txt
 ```

+Use **Python 3.12**; some dependencies (for example `facenet-pytorch`) are
+unreliable on 3.13+.
+
 The raw dataset should be placed under `data/`. Preprocessed crops are stored
-under `cropped/`. These folders are intentionally not committed.
+under `cropped/`. These folders are intentionally not committed. To download
+and extract the dataset:

-Execution entry points exist in `classifier/run.py` and `generator/run.py` for
-reproducibility, but the report notebooks should be read as analysis over
-already saved results.
+```bash
+python classifier/tools/fetch_ds.py                          # default data root
+python classifier/tools/fetch_ds.py --data-dir /path/to/DFF  # custom data root
+```

-## Final Takeaway
+Expected layout under the data root: `wiki/<identity>/*.jpg`,
+`inpainting/...`, `text2img/...`, `insight/...`.
+
+## Classifier — training
+
+From the repository root:
+
+```bash
+# CPU (slow but valid)
+python classifier/run.py classifier/configs/phase4/p4_convnext_tiny_100pct.json
+
+# GPU when CUDA is available
+python classifier/run.py classifier/configs/phase4/p4_convnext_tiny_100pct.json --use-gpu
+```
+
+Training uses 5-fold stratified group cross-validation. Each fold saves
+`classifier/outputs/models/{run_name}_fold{k}_best.pt` and a matching
+`{run_name}_fold{k}_final.pt`. Override data or output locations with
+`--data-dir` and `--output-root`.
+
+**Primary delivery model** (best Phase 4 detector): config
+`classifier/configs/phase4/p4_convnext_tiny_100pct.json` with per-fold
+weights `classifier/outputs/models/p4_convnext_tiny_100pct_fold*_best.pt`.
+
+## Classifier — inference
+
+Classify a single image as real or fake:
+
+```bash
+python classifier/tools/inference.py image.jpg classifier/configs/phase4/p4_convnext_tiny_100pct.json
+```
+
+This loads the config and the matching checkpoint, runs the image through the
+model, and prints a result like:
+
+```text
+Image : image.jpg
+Model : p4_convnext_tiny_100pct (convnext_tiny)
+Device: cuda
+Result: FAKE  (confidence: 74.7%)
+P(fake): 0.7466   P(real): 0.2534
+```
+
+If you omit `--checkpoint`, the tool searches `classifier/outputs/models/`
+in this order: the single-run `{run_name}_best.pt`, then the CV fold files
+`{run_name}_fold{k}_best.pt`, then `{run_name}_fold{k}_final.pt`. To use a
+specific fold:
+
+```bash
+python classifier/tools/inference.py image.jpg classifier/configs/phase4/p4_convnext_tiny_100pct.json \
+  --checkpoint classifier/outputs/models/p4_convnext_tiny_100pct_fold0_best.pt
+```
+
+## Generator — training
+
+From the repository root:
+
+```bash
+python generator/run.py generator/configs/phase0/p0_vae.json
+python generator/run.py generator/configs/phase0/p0_ddpm.json
+```
+
+Generator training expects real-face images (default source is `wiki`); use
+`--data-dir` to point at your dataset tree. Checkpoints are saved in
+`generator/outputs/models/` as `{run_name}_final_ema.pt` (EMA shadow) and
+`{run_name}_best_ema.pt` (lowest-FID snapshot).
+
+## Generator — inference (sampling)
+
+Generate 4×4 sample grids from Phase 5 EMA checkpoints:
+
+```bash
+python generator/tools/sampling.py --models p5_gan p5_vae p5_ddpm --samples 10
+```
+
+Options:
+
+- `--models` — which models to sample from (`p5_gan`, `p5_vae`, `p5_ddpm`;
+  defaults to all three).
+- `--samples` — number of grids per model (default 10).
+- `--output-dir` — where to write the PNGs (default
+  `generator/outputs/samples/final_comparison/`).
+- `--truncation` — optional latent truncation for the GAN (lower = less
+  diversity but sharper).
+- `--device` — `cuda` or `cpu` (default: auto-detect).
+
+Each grid is a 4×4 PNG of 16 images sampled from the model's EMA weights.
+GAN samples are drawn from random latent vectors, VAE samples decode from the
+learned prior, and DDPM samples use 50-step DDIM.
+
+## Final takeaway

 The project is best understood as a sequence of controlled decisions: