All notebooks, without Phase 4 results

DiogoCosta18
2026-05-06 20:28:29 +01:00
parent b5313e3320
commit 69666d6aa0
16 changed files with 2312 additions and 533 deletions
+10 -6
@@ -7,17 +7,19 @@
 "source": [
 "# 05 - Grad-CAM Interpretability Analysis\n",
 "\n",
-"This final classifier notebook adds qualitative evidence. It does not train, tune, or reevaluate models. It loads existing configs, logs, and checkpoints, selects deterministic fold-0 examples, and renders fake-logit Grad-CAM overlays. Metrics reported in the report remain the canonical log values; checkpoint-derived candidate scores in this notebook are only used to choose visual examples.\n",
+"This interpretability notebook adds qualitative evidence after the Phase 2 ablations. It does not train, tune, or reevaluate models. It loads existing configs, logs, and checkpoints, selects deterministic fold-0 examples, and renders fake-logit Grad-CAM overlays. Metrics reported in the report remain the canonical log values; checkpoint-derived candidate scores in this notebook are only used to choose visual examples.\n",
 "\n",
 "Grad-CAM answers a limited question: which spatial regions most support the model's fake-class logit for a selected image? It is useful for sanity checking localization, but it is not proof of causality and it should not override held-out metrics.\n",
 "\n",
 "A note on resolution: the visible Grad-CAM grid comes from the target convolutional feature map, not from the original image. ResNet18's final convolution is very coarse at 224x224 input, so its last-layer CAM is upsampled from a small grid and appears blockier than some SimpleCNN maps. That block size is architectural granularity, not model confidence. The notebook keeps the canonical last-conv CAM and also adds a finer ResNet18 diagnostic view from an earlier layer for readability.\n",
 "\n",
 "Story questions:\n",
-"- Does the selected final run focus on facial evidence rather than background shortcuts?\n",
+"- Does the selected Phase 2 run focus on facial evidence rather than background shortcuts?\n",
 "- Does facecrop change what the model can attend to?\n",
 "- Do augmentation and source-holdout runs reveal instability in attention?\n",
-"- Are errors visually plausible, or do they suggest shortcut behavior?\n"
+"- Are errors visually plausible, or do they suggest shortcut behavior?\n",
+"\n",
+"Roadmap link: after this qualitative check, `06_phase3_model_family_analysis.ipynb` compares stronger pretrained backbones and `07_phase4_data_scaling_analysis.ipynb` records the planned data-scaling analysis.\n"
 ]
 },
 {
@@ -1693,11 +1695,13 @@
 "id": "7a682e64",
 "metadata": {},
 "source": [
-"## Report-ready conclusion\n",
+"## Conclusion\n",
 "\n",
-"Grad-CAM provides a qualitative final check on the classifier story. The selected metric setting remains `p2c_resnet18_facecrop`: 224x224 input, facecrop enabled, no augmentation, and ImageNet/default normalization. The overlays are most reassuring when they concentrate on facial regions, and most cautionary when errors or source-holdout examples show diffuse, background, or artifact-specific attention.\n",
+"Grad-CAM provides a qualitative check on the Phase 2 classifier story. The selected metric setting remains `p2c_resnet18_facecrop`: 224x224 input, facecrop enabled, no augmentation, and ImageNet/default normalization. The overlays are most reassuring when they concentrate on facial regions, and most cautionary when errors or source-holdout examples show diffuse, background, or artifact-specific attention.\n",
 "\n",
-"The key limitation from Phase 2 still stands: high in-distribution AUC does not guarantee source-agnostic generalization. The Grad-CAM panels help make that limitation visible, but the source-holdout pairwise AUC values are the primary quantitative evidence.\n"
+"The key limitation from Phase 2 still stands: high in-distribution AUC does not guarantee source-agnostic generalization. The Grad-CAM panels help make that limitation visible, but the source-holdout pairwise AUC values are the primary quantitative evidence.\n",
+"\n",
+"Next: `06_phase3_model_family_analysis.ipynb` asks whether stronger pretrained model families improve on the selected Phase 2 pipeline.\n"
 ]
 }
 ],