Full retrain pipeline + hybrid policy set
Ran end-to-end clean retrain + gym eval + 24-cell Webots sweep
(tools/full_pipeline.sh). Results:
Differential: all 16 cells pen N/N. Updated policies committed.
Mecanum: the new training run regressed due to training
stochasticity (only 2/8 cells vs the v2 baseline's 4/8). The v2
baseline mec policies are RESTORED in this commit
(training/runs/{bc,rl}_mecanum_*) and remain the deliverable.
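The keep-or-restore decision above can be sketched as a simple cell-count comparison. This is only an illustration of the rule applied here, not code from the repository: the function name keep_new_policies and the "at least as many cells as the baseline" threshold are assumptions.

```python
def keep_new_policies(new_passed: int, baseline_passed: int) -> bool:
    """Hypothetical gate: keep retrained policies only if they pass
    at least as many sweep cells as the baseline did."""
    return new_passed >= baseline_passed


# Mecanum numbers from this commit: 2/8 cells (new) vs 4/8 (v2 baseline).
decision = "keep new" if keep_new_policies(2, 4) else "restore baseline"
print(decision)
```

With the mecanum numbers from this commit the gate selects the baseline, which matches the restore described above.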
The retrain pipeline itself is committed for reproducibility
(tools/full_pipeline.sh: clean → train_all → eval_all → 24-cell
Webots sweep). The v2 mec policies are also backed up locally to
_backup_pretrain/mec_v2_baseline/ (gitignored).
Verified after restore:
bc mec field_round n=10 → 10/10 in 147 s sim
rl diff field n=5 → 5/5 in 137 s sim