{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Phase 2 — GAN Evolution\n", "\n", "With the data pipeline locked (phase 1), iterate on the **GAN itself**: objective,\n", "normalisation, attention, resolution. All four runs use the aligned-64 pipeline; only\n", "the model and training recipe change.\n", "\n", "| Run | Step |\n", "|-------------------------|------------------------------------------------|\n", "| `p2_1_dcgan` | Phase 1 best, retrained 100 epochs |\n", "| `p2_2_wgan` | BCE → Wasserstein-GP (n_critic=2, β=(0,0.9)) |\n", "| `p2_3_wgan_sn_attn` | + spectral norm + GroupNorm + self-attention |\n", "| `p2_4_wgan_sn_attn_128` | same as 2.3 but at 128×128 |\n", "\n", "**Headline result:** `p2_3_wgan_sn_attn` — **best FID = 110.1** at 100 epochs.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> ### ⚠ FID is not comparable across phases\n", ">\n", "> Phase 1's \"best\" was FID 33 (`p1c_dcgan_full_aug`). Phase 2's \"best\" is FID 110.\n", "> **This is not a regression.** The two numbers were computed under different\n", "> protocols:\n", ">\n", "> - Phase 1 used a quick proxy FID for fast pipeline ablation, with a smaller\n", "> real-image reference set, on the un-augmented validation split.\n", "> - Phase 2 uses the project's standard FID protocol — 5000 aligned 64×64 real\n", "> images from the matched augmentation pipeline (`fid_n_real: 5000`).\n", ">\n", "> Within phase 2 the deltas are meaningful (2.2 → 2.3 = **−311 FID** is a real\n", "> architecture jump). Don't compare phase 1 vs phase 2 numbers absolutely —\n", "> only compare within a phase, or against phase 5 which uses the same protocol.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reference: phase 0 baseline (same family)\n", "\n", "`p0_wgan` was the un-aligned, no-augmentation, basic-architecture WGAN-GP — face blobs\n", "with no recognisable features (no FID logged). 
Phase 2 below shows what happens once\n", "the pipeline is fixed and the model is allowed to evolve.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import json\n", "from pathlib import Path\n", "\n", "import matplotlib.pyplot as plt\n", "import matplotlib.image as mpimg\n", "import numpy as np\n", "import pandas as pd\n", "\n", "plt.rcParams.update({\"figure.dpi\": 120, \"font.size\": 10})\n", "\n", "OUTPUTS = Path(\"../outputs\")\n", "LOGS = OUTPUTS / \"logs\"\n", "SAMPLES = OUTPUTS / \"samples\"\n", "\n", "\n", "# Return the parsed JSON log for a run, or None if the log file is missing.\n", "def load_log(name):\n", "    p = LOGS / f\"{name}.json\"\n", "    if not p.exists():\n", "        return None\n", "    with open(p) as f:  # context manager closes the file handle\n", "        return json.load(f)\n", "\n", "# Return the FID logged at `epoch` (JSON keys are strings), or None if absent.\n", "def get_fid(log, epoch):\n", "    fid = log.get(\"history\", {}).get(\"fid\", {})\n", "    return fid.get(str(epoch))\n", "\n", "# Return (epochs, fid_values) as two lists sorted by epoch.\n", "def fid_series(log):\n", "    fid = log.get(\"history\", {}).get(\"fid\", {})\n", "    items = sorted((int(k), v) for k, v in fid.items())\n", "    return [e for e, _ in items], [v for _, v in items]\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. 
Load experiment logs" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " p2_1_dcgan: 100 epochs\n", " p2_2_wgan: 100 epochs\n", " p2_3_wgan_sn_attn: 100 epochs\n", " p2_4_wgan_sn_attn_128: 100 epochs\n" ] } ], "source": [ "run_names = [\"p2_1_dcgan\", \"p2_2_wgan\", \"p2_3_wgan_sn_attn\", \"p2_4_wgan_sn_attn_128\"]\n", "run_labels = {\n", " \"p2_1_dcgan\": \"2.1 DCGAN (BCE)\",\n", " \"p2_2_wgan\": \"2.2 WGAN-GP\",\n", " \"p2_3_wgan_sn_attn\": \"2.3 + SN + Attn\",\n", " \"p2_4_wgan_sn_attn_128\": \"2.4 + 128×128\",\n", "}\n", "runs = {name: load_log(name) for name in run_names}\n", "runs = {k: v for k, v in runs.items() if v}\n", "for n in run_names:\n", " if n in runs: print(f\" {n}: {len(runs[n]['history']['g_loss'])} epochs\")\n", " else: print(f\" {n}: MISSING\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. FID comparison table" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
| | Run | FID@25 | FID@50 | FID@100 | Best FID | Train (min) |
|---|---|---|---|---|---|---|
| 2 | 2.3 + SN + Attn | 274.4 | 223.2 | 110.1 | 110.1 | 39.0 |
| 3 | 2.4 + 128×128 | 428.6 | 264.3 | 186.0 | 186.0 | 97.7 |
| 1 | 2.2 WGAN-GP | 489.6 | 474.6 | 421.3 | 421.3 | 27.1 |
| 0 | 2.1 DCGAN (BCE) | 444.3 | 438.9 | 429.3 | 429.3 | 17.8 |
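
As a quick sanity check on the within-phase deltas, the FID@100 column can be differenced step by step. The following is a minimal sketch, assuming the values are hand-copied from the table rather than read from the run logs; `step_deltas` is a hypothetical helper and not part of the notebook. Lower FID is better, so a negative delta is an improvement.

```python
# FID@100 per run, hand-copied from the comparison table (not read from the logs).
fid_at_100 = [
    ("2.1 DCGAN (BCE)", 429.3),
    ("2.2 WGAN-GP", 421.3),
    ("2.3 + SN + Attn", 110.1),
    ("2.4 + 128×128", 186.0),
]


def step_deltas(scores):
    """Hypothetical helper: FID change between each consecutive pair of runs."""
    return [
        (prev_name, name, round(fid - prev_fid, 1))
        for (prev_name, prev_fid), (name, fid) in zip(scores, scores[1:])
    ]


deltas = step_deltas(fid_at_100)
for prev_name, name, delta in deltas:
    # Explicit sign makes improvements (negative) and regressions (positive) obvious.
    print(f"{prev_name} -> {name}: {delta:+.1f} FID")
```

The 2.2 to 2.3 step accounts for nearly all of the improvement, which matches the roughly −311 FID jump quoted in the protocol note.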