{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Phase 1 — Pipeline Selection (DCGAN ablations)\n", "\n", "Goal: with a cheap proxy (vanilla DCGAN at 64×64, 50 epochs), isolate which **data-pipeline\n", "choices** matter so phases 2–4 can train on the best preprocessing without burning compute\n", "on dead-end variants.\n", "\n", "Four ablations, one factor each:\n", "\n", "- **1A** — Resolution: 64×64 vs 128×128\n", "- **1B** — Face crop + alignment: full image vs MTCNN-aligned\n", "- **1C** — Augmentation: H-flip only vs H-flip + rotation + colour jitter\n", "- **1D** — Combined dataset: aligned only vs aligned + raw mixed\n", "\n", "**Headline result:** `p1c_dcgan_full_aug` — **FID@50 = 33.4**. The pipeline carried\n", "forward into all later phases is the one this experiment selected: MTCNN-aligned crops,\n", "64×64, full augmentation for GANs (H-flip-only kept as a safer default for VAE/DDPM),\n", "aligned-only (no mixing).\n" ] }, { "cell_type": "markdown", "id": "d385d01c", "metadata": {}, "source": [ "### Reference: phase 0 baseline (same family)\n", "\n", "The phase 0 WGAN-GP (`p0_wgan`) trained on raw un-aligned images for 200 epochs\n", "without any pipeline tuning — also collapsed. 
Phase 1 below uses the same model class\n", "with the data pipeline systematically varied; the architecture limitation is constant.\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "27b83467", "metadata": {}, "outputs": [], "source": [ "import json\n", "from pathlib import Path\n", "\n", "import matplotlib.pyplot as plt\n", "import matplotlib.image as mpimg\n", "import numpy as np\n", "import pandas as pd\n", "\n", "plt.rcParams.update({\"figure.dpi\": 120, \"font.size\": 10})\n", "\n", "OUTPUTS = Path(\"../outputs\")\n", "LOGS = OUTPUTS / \"logs\"\n", "SAMPLES = OUTPUTS / \"samples\"\n", "\n", "\n", "def load_log(name):\n", "    # Return the parsed JSON log for a run, or None if the file is missing.\n", "    p = LOGS / f\"{name}.json\"\n", "    if not p.exists():\n", "        return None\n", "    with open(p) as f:  # close the file handle once parsing is done\n", "        return json.load(f)\n", "\n", "def get_fid(log, epoch):\n", "    # FID values are keyed by epoch as strings; returns None if absent.\n", "    fid = log.get(\"history\", {}).get(\"fid\", {})\n", "    return fid.get(str(epoch))\n", "\n", "def fid_series(log):\n", "    # Return (epochs, fids) sorted by numeric epoch.\n", "    fid = log.get(\"history\", {}).get(\"fid\", {})\n", "    items = sorted((int(k), v) for k, v in fid.items())\n", "    return [e for e, _ in items], [v for _, v in items]\n" ] }, { "cell_type": "markdown", "id": "2d78c763", "metadata": {}, "source": [ "## 1. 
Load all experiment logs" ] }, { "cell_type": "code", "execution_count": 2, "id": "34f4810d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loaded 7 experiments:\n", " p1a_dcgan_128\n", " p1a_dcgan_64\n", " p1b_dcgan_aligned\n", " p1b_dcgan_full\n", " p1c_dcgan_full_aug\n", " p1c_dcgan_hflip\n", " p1d_dcgan_combined\n" ] } ], "source": [ "run_names = sorted(p.stem for p in LOGS.glob(\"p1*.json\"))\n", "runs = {name: load_log(name) for name in run_names}\n", "runs = {k: v for k, v in runs.items() if v}\n", "\n", "print(f\"Loaded {len(runs)} experiments:\")\n", "# Iterate over the loaded runs so the printed names match the count above.\n", "for name in runs:\n", "    print(f\" {name}\")\n" ] }, { "cell_type": "code", "execution_count": 3, "id": "2568b652", "metadata": {}, "outputs": [], "source": [ "experiment_groups = {\n", " \"1A — Resolution\": {\"p1a_dcgan_64\": \"64×64 (raw)\",\n", " \"p1a_dcgan_128\": \"128×128 (raw)\"},\n", " \"1B — Alignment\": {\"p1b_dcgan_full\": \"Full image (raw)\",\n", " \"p1b_dcgan_aligned\": \"MTCNN-aligned\"},\n", " \"1C — Augmentation\": {\"p1c_dcgan_hflip\": \"H-flip only\",\n", " \"p1c_dcgan_full_aug\": \"H-flip + rot + colour\"},\n", " \"1D — Dataset mixing\": {\"p1b_dcgan_aligned\": \"Aligned only\",\n", " \"p1d_dcgan_combined\": \"Aligned + raw mixed\"},\n", "}\n" ] }, { "cell_type": "markdown", "id": "76ca9d9f", "metadata": {}, "source": [ "## 2. FID comparison table" ] }, { "cell_type": "code", "execution_count": 4, "id": "9a3a3153", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
| | Experiment | Size | Augment | FID@25 | FID@50 | G loss (ep50) | D loss (ep50) |
|---|---|---|---|---|---|---|---|
| 4 | p1c_dcgan_full_aug | 64×64 | True | 48.0 | 33.4 | 3.480 | 0.412 |
| 5 | p1c_dcgan_hflip | 64×64 | False | 48.9 | 37.9 | 3.739 | 0.392 |
| 2 | p1b_dcgan_aligned | 64×64 | False | 47.5 | 42.0 | 3.965 | 0.312 |
| 1 | p1a_dcgan_64 | 64×64 | False | 120.9 | 86.7 | 4.019 | 0.283 |
| 6 | p1d_dcgan_combined | 64×64 | False | 95.5 | 87.4 | 5.265 | 0.198 |
| 3 | p1b_dcgan_full | 64×64 | False | 109.6 | 89.0 | 3.960 | 0.370 |
| 0 | p1a_dcgan_128 | 128×128 | False | 143.1 | 115.0 | 5.013 | 0.185 |
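
The FID columns and the row order of this table can be reproduced from the logs alone. Below is a minimal sketch, assuming each log stores FID under `history.fid` keyed by the epoch as a string, which is how the notebook's `get_fid` helper reads it. The names `rank_by_fid` and `example_runs` are hypothetical and introduced only for illustration; the Size, Augment and loss columns are left out because their log keys are not shown in this section. The sketch uses plain Python rather than pandas so it runs without the notebook's dependencies.

```python
def get_fid(log, epoch):
    """Return the FID logged at `epoch`, or None if it was not evaluated."""
    return log.get("history", {}).get("fid", {}).get(str(epoch))


def rank_by_fid(runs, epochs=(25, 50)):
    """Build one row per run and sort by FID at the last epoch (lower is better)."""
    final = f"FID@{epochs[-1]}"
    rows = [
        {"Experiment": name, **{f"FID@{e}": get_fid(log, e) for e in epochs}}
        for name, log in runs.items()
    ]
    # Runs with no FID at the final epoch sort last instead of raising on None.
    return sorted(rows, key=lambda r: (r[final] is None, r[final] or 0.0))


# Toy logs (hypothetical structure): FID values copied from three table rows.
example_runs = {
    "p1a_dcgan_128": {"history": {"fid": {"25": 143.1, "50": 115.0}}},
    "p1c_dcgan_full_aug": {"history": {"fid": {"25": 48.0, "50": 33.4}}},
    "p1b_dcgan_aligned": {"history": {"fid": {"25": 47.5, "50": 42.0}}},
}

ranking = rank_by_fid(example_runs)
for row in ranking:
    print(row)
```

Sorting on the final-epoch FID rather than FID@25 matters here: `p1b_dcgan_aligned` has the lowest FID@25 of the three (47.5), but `p1c_dcgan_full_aug` overtakes it by epoch 50.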