{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Phase 5 — Cross-Family Comparison\n", "\n", "Take the best recipe from each family (phases 2/3/4) and train each on identical data\n", "to the same epoch budget. Per-family iteration analyses live in their own notebooks\n", "(phase 2 for GAN, phase 3 for VAE, phase 4 for DDPM); this notebook is **only** about\n", "comparing the three families head-to-head.\n", "\n", "**Headline FIDs (best across training):** p5_ddpm=19.0, p5_gan=22.0, p5_vae=46.2.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import json\n", "from pathlib import Path\n", "\n", "import matplotlib.pyplot as plt\n", "import matplotlib.image as mpimg\n", "import numpy as np\n", "import pandas as pd\n", "\n", "plt.rcParams.update({\"figure.dpi\": 120, \"font.size\": 10})\n", "\n", "OUTPUTS = Path(\"../outputs\")\n", "LOGS = OUTPUTS / \"logs\"\n", "SAMPLES = OUTPUTS / \"samples\"\n", "\n", "\n", "def load_log(name):\n", " p = LOGS / f\"{name}.json\"\n", " return json.load(open(p)) if p.exists() else None\n", "\n", "def get_fid(log, epoch):\n", " fid = log.get(\"history\", {}).get(\"fid\", {})\n", " return fid.get(str(epoch))\n", "\n", "def fid_series(log):\n", " fid = log.get(\"history\", {}).get(\"fid\", {})\n", " items = sorted((int(k), v) for k, v in fid.items())\n", " return [e for e, _ in items], [v for _, v in items]\n", "\n", "\n", "FAMILIES = {\n", " \"GAN\": {\"p5\": \"p5_gan\", \"color\": \"#5B8DB8\", \"label\": \"WGAN-GP + SN + Attn\"},\n", " \"VAE\": {\"p5\": \"p5_vae\", \"color\": \"#E8B85A\", \"label\": \"VAE + Perceptual + PatchGAN\"},\n", " \"DDPM\": {\"p5\": \"p5_ddpm\", \"color\": \"#E8705A\", \"label\": \"DDPM cosine v-pred wider\"},\n", "}\n", "\n", "logs_p5 = {fam: load_log(info[\"p5\"]) for fam, info in FAMILIES.items()}\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. 
Quantitative summary" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
| | Family | Architecture | Resolution | Epochs | Best FID | Last FID | Train (min) |
|---|---|---|---|---|---|---|---|
| 2 | DDPM | DDPM cosine v-pred wider | 64×64 | 175 | 19.0 | 20.1 | 667.1 |
| 0 | GAN | WGAN-GP + SN + Attn | 64×64 | 850 | 22.0 | 22.0 | 316.9 |
| 1 | VAE | VAE + Perceptual + PatchGAN | 64×64 | 150 | 46.2 | 48.6 | 50.2 |