All notebooks, without Phase 4 results
@@ -7,17 +7,19 @@
 "source": [
 "# 05 - Grad-CAM Interpretability Analysis\n",
 "\n",
-"This final classifier notebook adds qualitative evidence. It does not train, tune, or reevaluate models. It loads existing configs, logs, and checkpoints, selects deterministic fold-0 examples, and renders fake-logit Grad-CAM overlays. Metrics reported in the report remain the canonical log values; checkpoint-derived candidate scores in this notebook are only used to choose visual examples.\n",
+"This interpretability notebook adds qualitative evidence after the Phase 2 ablations. It does not train, tune, or reevaluate models. It loads existing configs, logs, and checkpoints, selects deterministic fold-0 examples, and renders fake-logit Grad-CAM overlays. Metrics reported in the report remain the canonical log values; checkpoint-derived candidate scores in this notebook are only used to choose visual examples.\n",
 "\n",
 "Grad-CAM answers a limited question: which spatial regions most support the model's fake-class logit for a selected image? It is useful for sanity checking localization, but it is not proof of causality and it should not override held-out metrics.\n",
 "\n",
 "A note on resolution: the visible Grad-CAM grid comes from the target convolutional feature map, not from the original image. ResNet18's final convolution is very coarse at 224x224 input, so its last-layer CAM is upsampled from a small grid and appears blockier than some SimpleCNN maps. That block size is architectural granularity, not model confidence. The notebook keeps the canonical last-conv CAM and also adds a finer ResNet18 diagnostic view from an earlier layer for readability.\n",
 "\n",
 "Story questions:\n",
-"- Does the selected final run focus on facial evidence rather than background shortcuts?\n",
+"- Does the selected Phase 2 run focus on facial evidence rather than background shortcuts?\n",
 "- Does facecrop change what the model can attend to?\n",
 "- Do augmentation and source-holdout runs reveal instability in attention?\n",
-"- Are errors visually plausible, or do they suggest shortcut behavior?\n"
+"- Are errors visually plausible, or do they suggest shortcut behavior?\n",
+"\n",
+"Roadmap link: after this qualitative check, `06_phase3_model_family_analysis.ipynb` compares stronger pretrained backbones and `07_phase4_data_scaling_analysis.ipynb` records the planned data-scaling analysis.\n"
 ]
 },
 {
@@ -1693,11 +1695,13 @@
 "id": "7a682e64",
 "metadata": {},
 "source": [
-"## Report-ready conclusion\n",
+"## Conclusion\n",
 "\n",
-"Grad-CAM provides a qualitative final check on the classifier story. The selected metric setting remains `p2c_resnet18_facecrop`: 224x224 input, facecrop enabled, no augmentation, and ImageNet/default normalization. The overlays are most reassuring when they concentrate on facial regions, and most cautionary when errors or source-holdout examples show diffuse, background, or artifact-specific attention.\n",
+"Grad-CAM provides a qualitative check on the Phase 2 classifier story. The selected metric setting remains `p2c_resnet18_facecrop`: 224x224 input, facecrop enabled, no augmentation, and ImageNet/default normalization. The overlays are most reassuring when they concentrate on facial regions, and most cautionary when errors or source-holdout examples show diffuse, background, or artifact-specific attention.\n",
 "\n",
-"The key limitation from Phase 2 still stands: high in-distribution AUC does not guarantee source-agnostic generalization. The Grad-CAM panels help make that limitation visible, but the source-holdout pairwise AUC values are the primary quantitative evidence.\n"
+"The key limitation from Phase 2 still stands: high in-distribution AUC does not guarantee source-agnostic generalization. The Grad-CAM panels help make that limitation visible, but the source-holdout pairwise AUC values are the primary quantitative evidence.\n",
+"\n",
+"Next: `06_phase3_model_family_analysis.ipynb` asks whether stronger pretrained model families improve on the selected Phase 2 pipeline.\n"
 ]
 }
 ],
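Note on the resolution paragraph in the first hunk: the blockiness it describes follows from how a Grad-CAM map is assembled. The sketch below is not the notebook's code. It is a minimal pure-Python illustration, with hypothetical names `grad_cam` and `upsample_nearest`, that assumes the target-layer activations and the gradients of the fake-class logit with respect to them have already been extracted as nested `[C][H][W]` lists. It also assumes the standard torchvision ResNet18 layout (total stride 32 at the last convolutional stage), which the diff does not confirm.

```python
def grad_cam(activations, gradients):
    """Combine target-layer activations and fake-logit gradients into a CAM.

    Both arguments are nested lists shaped [C][H][W] (hypothetical inputs,
    assumed to be extracted from the model beforehand).
    """
    channels = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])

    # One weight per channel: the spatial mean of that channel's gradient.
    weights = [
        sum(sum(row) for row in grad) / (height * width) for grad in gradients
    ]

    # Weighted sum over channels, then ReLU to keep only positive evidence.
    cam = [
        [
            max(0.0, sum(weights[k] * activations[k][i][j] for k in range(channels)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] for display; an all-zero map is left unchanged.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


def upsample_nearest(cam, factor):
    """Enlarge a CAM by an integer factor using nearest-neighbour repetition."""
    height, width = len(cam), len(cam[0])
    return [
        [cam[i // factor][j // factor] for j in range(width * factor)]
        for i in range(height * factor)
    ]


# Assumed stride of 32 at the last conv stage: a 224x224 input gives a 7x7 grid,
# so each CAM cell covers a 32x32 pixel block of the original image.
GRID_SIZE = 224 // 32
```

Nearest-neighbour upsampling is used here only because it makes the grid cells visible; real overlays often interpolate bilinearly, which smooths the edges but adds no spatial detail. Under the same stride assumption, a target layer with stride 16 would give a 14x14 grid, which is the kind of finer diagnostic view the notebook text mentions.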