Deep Learning Face Project
This repository contains a two-part deep learning project on the DeepFakeFace (DFF) dataset:
- Classifier: detect whether a face image is real or fake.
- Generator: train generative models that produce new fake face images.
The project is written as an experimental report. The notebooks are the main deliverable: they show the pipeline, the intermediate failures, the ablations, the decisions, and the final models. Read them in order.
Project Story
The work follows the same principle in both parts: start with a simple baseline, inspect what fails, change one important factor at a time, and keep the evidence tied to saved logs and saved artifacts.
For the classifier, the story moves from dataset understanding to preprocessing, baseline models, controlled ablations, Grad-CAM inspection, stronger model families, and data scaling. The final practical classifier is a ResNet50-style pipeline using face crops, 224x224 inputs, ImageNet/default normalization, and no stochastic augmentation at validation/test time.
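As a rough illustration of the evaluation-time contract described above, the following sketch keeps normalization deterministic and applies a random horizontal flip only when `train=True`. It is not the repository's code: `preprocess`, `normalize_pixel`, the nested-list image layout, and the choice of a horizontal flip as the augmentation are assumptions made for illustration, and resizing to 224x224 is omitted. The mean and standard deviation are the commonly used ImageNet channel statistics.

```python
import random

# Commonly used ImageNet channel statistics (RGB order).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Scale one 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


def preprocess(image, train=False, rng=None):
    """Normalize an image given as rows of (R, G, B) tuples.

    Hypothetical helper: the random flip stands in for whatever stochastic
    augmentation the training pipeline uses; it never runs when train=False.
    """
    rows = [list(row) for row in image]
    if train:
        rng = rng or random.Random()
        if rng.random() < 0.5:
            rows = [row[::-1] for row in rows]  # horizontal flip
    return [[normalize_pixel(pixel) for pixel in row] for row in rows]
```

Calling `preprocess(image)` twice on the same input returns identical values, which is the property validation and test evaluation rely on.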
For the generator, the story starts with raw baseline failures, then locks the data pipeline before comparing three parallel model-family branches: GAN, VAE, and DDPM. The final comparison keeps quality versus speed central: DDPM gives the best saved FID and visual quality, GAN is the best quality-speed compromise, and VAE is the fastest option but yields the smoothest samples.
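For reference, FID is conventionally defined as the Fréchet distance between two Gaussians fitted to Inception features of real and generated images, and lower is better. The symbols below are notation introduced here: $\mu_r, \Sigma_r$ are the feature mean and covariance for real images, and $\mu_g, \Sigma_g$ are the same statistics for generated images. The feature extractor and sample counts behind the saved FID values are not restated here.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```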
How To Read The Project
Start with the classifier notebooks, then read the generator notebooks. The generator has one linear setup stage followed by three parallel branches: GAN, VAE, and DDPM. Those branches are numbered in reading order, but they are conceptually parallel experiments after the pipeline is selected.
Classifier Notebooks
Read these first:
- classifier/notebooks/01_eda.ipynb
  Dataset composition, real/fake source mapping, image statistics, and shortcut risks.
- classifier/notebooks/02_preprocessing.ipynb
  Deterministic preprocessing, train-only augmentation, face crops, and normalization.
- classifier/notebooks/03_phase1_analysis.ipynb
  SimpleCNN and ResNet18 controlled baselines.
- classifier/notebooks/04_phase2_analysis.ipynb
  Resolution, normalization, source holdouts, facecrop, and augmentation ablations.
- classifier/notebooks/05_gradcam_analysis.ipynb
  Qualitative localization analysis across the classifier pipeline.
- classifier/notebooks/06_phase3_model_family_analysis.ipynb
  Stronger pretrained model families and the ResNet50 practical choice.
- classifier/notebooks/07_phase4_data_scaling_analysis.ipynb
  Data scaling for strong backbones and the final classifier decision.
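The source-holdout ablation listed for 04_phase2_analysis.ipynb can be pictured as follows: fake images from one source are excluded from training and used only for evaluation, so the score reflects generalization to an unseen source. This is a minimal sketch under assumptions. The record layout, the `source` field, and the source names are hypothetical and do not come from the repository.

```python
def source_holdout_split(records, holdout_source):
    """Split records so one fake source never appears in training.

    Each record is a dict with hypothetical keys:
      "path"   - image path
      "label"  - "real" or "fake"
      "source" - name of the dataset or generator the image came from
    Real images stay in training here; a full pipeline would pair the
    held-out fakes with real images from a separate test split.
    """
    train, heldout = [], []
    for record in records:
        is_heldout = (
            record["label"] == "fake" and record["source"] == holdout_source
        )
        (heldout if is_heldout else train).append(record)
    return train, heldout


# Toy records with made-up source names.
records = [
    {"path": "r1.jpg", "label": "real", "source": "real_set"},
    {"path": "a1.jpg", "label": "fake", "source": "gen_a"},
    {"path": "b1.jpg", "label": "fake", "source": "gen_b"},
    {"path": "b2.jpg", "label": "fake", "source": "gen_b"},
]
train, heldout = source_holdout_split(records, holdout_source="gen_b")
```

Splitting by source instead of by random image guards against a classifier that learns source-specific shortcuts and then appears to generalize.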
Generator Notebooks
Read these after the classifier:
- generator/notebooks/01_baseline_sanity_check.ipynb
  Raw baseline failures and why the data pipeline must be fixed first.
- generator/notebooks/02_pipeline_selection.ipynb
  Controlled pipeline ablations: resolution, alignment, augmentation, and raw/aligned mixing.
- generator/notebooks/03_gan_stability_progression.ipynb
  GAN branch: DCGAN -> WGAN-GP -> spectral normalization + GroupNorm + self-attention -> 128x128 check.
- generator/notebooks/04_vae_loss_progression.ipynb
  VAE branch: MSE + KL -> perceptual loss -> PatchGAN adversarial loss.
- generator/notebooks/05_ddpm_recipe_progression.ipynb
  DDPM branch: linear schedule -> cosine schedule -> v-prediction -> wider backbone.
- generator/notebooks/06_final_family_comparison.ipynb
  Final comparison of the selected GAN, VAE, and DDPM recipes under saved Phase 5 conditions.
- generator/notebooks/07_final_sample_showcase.ipynb
  Curated final sample examples from saved outputs. This is qualitative showcase material, not a replacement for FID.
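The first steps of the DDPM branch change the noise schedule and the prediction target. The sketch below computes the cumulative signal level (often written alpha_bar) for a linear beta schedule and for the cosine schedule as commonly formulated, plus the v-prediction target for a single scalar. The beta range `1e-4` to `0.02`, the offset `s = 0.008`, and the clip at `0.999` are widely used defaults assumed here, not values read from this project's configs, and the function names are hypothetical.

```python
import math


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for linearly spaced betas."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def cosine_alpha_bar(num_steps, s=0.008, max_beta=0.999):
    """alpha_bar from the cosine schedule, with per-step betas clipped."""

    def f(t):
        return math.cos((t / num_steps + s) / (1 + s) * math.pi / 2) ** 2

    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = min(1.0 - f(t + 1) / f(t), max_beta)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def v_target(x0, eps, alpha_bar_t):
    """v-prediction target for one scalar: sqrt(a) * eps - sqrt(1 - a) * x0."""
    return math.sqrt(alpha_bar_t) * eps - math.sqrt(1.0 - alpha_bar_t) * x0


lin = linear_alpha_bar(1000)
cos = cosine_alpha_bar(1000)
```

With 1000 steps, the cosine schedule retains more signal at the midpoint than the linear one (`cos[499] > lin[499]`), which is the usual motivation for the switch.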
What The Notebooks Do
The notebooks are analysis/report chapters. They load existing configs, logs, figures, saved sample grids, checkpoints, and prediction summaries. They are not intended to launch new training runs.
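Because the notebooks only read saved artifacts, a typical cell reduces to parsing a log and summarizing it. The sketch below assumes a hypothetical JSON Lines log with `epoch` and `val_acc` fields. The real log format and field names under `outputs/` may differ, and `best_epoch` is an illustrative name.

```python
import json
from pathlib import Path


def best_epoch(log_path, metric="val_acc"):
    """Return the logged row with the highest value of `metric`.

    Assumes one JSON object per line (hypothetical format). Blank lines and
    rows without the metric are skipped. Returns None if nothing matches.
    """
    rows = []
    for line in Path(log_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        row = json.loads(line)
        if metric in row:
            rows.append(row)
    return max(rows, key=lambda row: row[metric]) if rows else None
```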
When a notebook shows a plot or image grid, the surrounding markdown explains:
- what the artifact shows;
- why it is needed;
- how it supports the phase decision;
- what limitation remains.
This matters because the project is evaluated not only on final performance but also on the documented evolution of the solution.
Repository Layout
DRL_PROJ/
  classifier/
    configs/      experiment configs by phase
    notebooks/    classifier report notebooks
    outputs/      saved logs, figures, Grad-CAM panels, checkpoints
    src/          classifier data, models, training, evaluation
    tools/        facecrop, Grad-CAM, inference, reevaluation helpers
  generator/
    configs/      generator configs by phase/family
    notebooks/    generator report notebooks and notebook builder
    outputs/      saved logs, sample grids, final showcase artifacts
    src/          generator data, models, training, metrics
    tools/        sampling and utility scripts
  data/           original DFF dataset root, not committed
  cropped/        preprocessed face crops, not committed
  docs/           project statement and supporting documents
  pipeline/       optional remote/GPU orchestration helpers
Rebuilding The Generator Notebooks
The generator notebooks are generated from a single source file:
cd generator/notebooks
python _build.py
That builder writes the numbered generator notebooks listed above. It uses existing saved logs and artifacts; it does not train models.
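For readers unfamiliar with generated notebooks, the idea is that a script assembles cell contents and serializes them as notebook JSON. The following is an illustrative sketch only, not the contents of `_build.py`: `make_notebook` and `write_notebook` are hypothetical names, and the output uses a minimal nbformat 4 structure.

```python
import json
from pathlib import Path


def make_notebook(cells):
    """Build a minimal nbformat-4 notebook dict from (cell_type, text) pairs."""
    nb_cells = []
    for cell_type, text in cells:
        cell = {
            "cell_type": cell_type,
            "metadata": {},
            "source": text.splitlines(keepends=True),
        }
        if cell_type == "code":
            # Code cells carry these fields even when never executed.
            cell["execution_count"] = None
            cell["outputs"] = []
        nb_cells.append(cell)
    return {"cells": nb_cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


def write_notebook(path, cells):
    """Serialize the notebook to `path` as JSON."""
    text = json.dumps(make_notebook(cells), indent=1)
    Path(path).write_text(text, encoding="utf-8")
```

Keeping the cell text in one source file means the numbered notebooks can be regenerated consistently after an edit, without touching any training code.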
Running The Code
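Create a virtual environment and install the project requirements. The commands below use Windows paths; on Linux or macOS, replace .\.venv\Scripts\python.exe with .venv/bin/python: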
python -m venv .venv
.\.venv\Scripts\python.exe -m pip install --upgrade pip setuptools wheel
.\.venv\Scripts\python.exe -m pip install -r requirements.txt
The raw dataset should be placed under data/. Preprocessed crops are stored
under cropped/. These folders are intentionally not committed.
Execution entry points exist in classifier/run.py and generator/run.py for
reproducibility, but the report notebooks should be read as analysis over
already saved results.
Final Takeaway
The project is best understood as a sequence of controlled decisions:
- cleanly define the data and preprocessing;
- establish simple baselines;
- improve one factor at a time;
- compare model families using saved evidence;
- report both performance and limitations.
The classifier becomes reliable through source-aware preprocessing, stronger pretrained backbones, and scaling. The generator improves by first locking the face-aligned pipeline and then selecting the best recipe inside each model family before the final GAN/VAE/DDPM comparison.