Corrections to 5 notebooks

2026-05-06 17:45:55 +01:00
parent 580808d9ad
commit b5313e3320
20 changed files with 785 additions and 837 deletions
@@ -380,7 +380,7 @@
   "metadata": {},
   "source": [
    "<!-- phase1-protocol -->\n",
-    "**Readout:** both Phase 1 runs share seed `42`, `5` folds, batch size `32`, learning rate `1e-4`, weight decay `1e-4`, cosine `T_max=15`, and early-stopping patience `5`. The only intended comparison is model capacity/pretraining: SimpleCNN from scratch versus pretrained ResNet18.\n"
+    "Both Phase 1 runs use the same protocol: seed `42`, `5` folds, batch size `32`, learning rate `1e-4`, weight decay `1e-4`, cosine `T_max=15`, and early-stopping patience `5`. The intended comparison is therefore model capacity and pretraining: SimpleCNN from scratch versus pretrained ResNet18.\n"
   ]
  },
  {
@@ -757,7 +757,7 @@
   "id": "97cfc057",
   "metadata": {},
   "source": [
-    "**Readout:** SimpleCNN improves slowly and its train/validation AUC curves stay close together, which suggests limited capacity rather than severe overfitting. ResNet18 learns much faster and reaches very high training AUC, but validation AUC plateaus around the low 0.93 range after the first few epochs. The gap between train and validation AUC means the pretrained model has enough capacity to fit the training folds more strongly than it generalizes to validation. This is not a failure, because ResNet18 still has much better validation and test performance than SimpleCNN, but it is a warning that later improvements must be checked on held-out folds and source-wise metrics rather than training curves alone.\n"
+    "SimpleCNN improves slowly and its train/validation AUC curves stay close together, which points more to limited capacity than severe overfitting. ResNet18 learns much faster and reaches very high training AUC, while validation AUC plateaus around the low `0.93` range after the first few epochs. That gap means the pretrained model can fit the training folds more strongly than it generalizes, so later improvements need to be checked on held-out folds and source-wise metrics, not training curves alone.\n"
   ]
  },
  {
@@ -815,7 +815,7 @@
   "id": "2ddecd94",
   "metadata": {},
   "source": [
-    "**Readout:** The confusion matrices show the same pattern as the AUC results, but in error-count form. SimpleCNN correctly classifies about `71%` of real images and `70%` of fake images, so it misses many examples in both directions. ResNet18 improves both sides: about `81%` of real images are kept real, and about `88%` of fake images are detected as fake. The biggest practical gain is fewer fake images predicted as real (`30%` -> `12%`), which matters because false negatives are the dangerous failure mode for a deepfake detector. ResNet18 still has some false alarms on real images (`19%`), so it is stronger but not perfect.\n"
+    "The confusion matrices show the AUC story in error-count form. SimpleCNN correctly classifies about `71%` of real images and `70%` of fake images, so it misses many examples in both directions. ResNet18 improves both sides: about `81%` of real images are kept real, and about `88%` of fake images are detected as fake. The most important practical gain is fewer fake images predicted as real (`30%` -> `12%`), although the model still produces some false alarms on real images (`19%`).\n"
   ]
  },
  {