TIR_PROJ

Author	SHA1	Message	Date
Johnny Fernandes	7ab69ab0f3	Rename multi-segment functions to two-concept names; polish docstrings Naming pass: rename functions whose third+ segment is redundant or implementation-detail, sticking to the codebase's preferred ``noun_verb`` / ``verb_noun`` two-concept idiom. Renames are atomic across definitions, callers, and tests. is_penned_position → is_penned modulate_speed_near_sheep → modulate_speed mecanum_kinematics_step → mecanum_step policy_forward_mean → forward_mean Two-concept patterns like ``velocity_to_wheels`` / ``detections_from_scan`` / ``make_strombom_predictor`` are left alone — they're idiomatic converters / factories that read as a single concept, and the longer form aids grep-ability. Docstring polish: * ``herding/config.py`` header drops the "previously lived as a module-level literal" historical framing — we ship as a single thing, so the refactor anecdote no longer earns its keep. The usage examples now mention both ``HERDING_WEBOTS`` and ``HERDING_MEC_WEBOTS`` presets. 126 pytest cases still pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 01:58:15 +00:00
Johnny Fernandes	a584a034e9	Project-wide cleanup: gitignore, dead code, stale artifacts, README Repo hygiene pass after a long working session. Files removed: * stage1_train.log — runtime training log (~125 KB), shouldn't have been tracked. * training/bc/demos.npz — orphan default-name demos file from before the world+drive-suffixed naming convention took over; no script references it. * training/runs/bc_dagger{1,2}_differential_field/policy.zip — failed DAgger experiment artifacts. Per `memory/dagger_results.md` the whole DAgger experiment hit 0/5 on Webots transfer; these checkpoints have no consumers. Untracked-but-deleted (no git change) — also cleaned from disk: * Root-level runtime logs (43 .log files, all unused — gitignored now). training/bc/{combined,dagger}.npz (5 huge demo blobs, 2.6 GB reclaimed; not committed). training/bc/v1/ (2.6 GB backup of pre-DAgger demos; reclaimed). * training/runs/at_20260426_/ (orphan timestamped runs; reclaimed). All __pycache__/. Dead code removed: * `herding/control/strombom.py::compute_action_debug` — no callers anywhere in the repo. * `herding/control/sequential.py::compute_action_debug` — same. * `herding/control/universal.py::compute_action_diff` — same. .gitignore extended to cover: * All .log files (training/eval/webots logs are runtime artifacts). training/bc/.npz (re-collectable on demand by `make bc_demos`). training/bc/v1/. * .pytest_cache, .pyc, .claude/. README refreshed: Mecanum + round-world coverage in the headline. * Quick-start updated for DRIVE/WORLD-suffixed Makefile targets, GT-bypass example, and the mecanum-retrain caveat. * Layout reflects the actual current tree (config.py, both protos, both worlds, all tools). * Results table replaced with the Webots end-to-end numbers from the 2026-05-16 sweep (8/8 diff combos + LiDAR/GT comparison). Verification: 126 pytest cases still pass (was 126 going in — no test-coverage regression from the dead-code removal). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 01:38:19 +00:00
Johnny Fernandes	1c197e0ff7	Enable consensus tracker by default + round-world Strömbom fix Two changes that together raise diff/round gym success ~52%→88% (BC) and ~68%→88% (RL) without retraining; diff/field stays at 100%. * TrackerConfig.consensus_k default 1 → 3 (radius 0.5 m, max_age 15 frames). The same candidate-promotion mechanism that closed the Webots LiDAR gap also filters gym tracker phantoms — they show up on the round field where sheep run further between detection cycles than GATE_M, so each new position spawns a fresh track while the stale one persists in memory. SheepTracker() called with no tracker_cfg keeps the legacy pass-through behaviour for backwards compatibility. * Strömbom + universal teachers now detect when the natural "behind the flock" drive target leaves the curved boundary and fall back to pushing the flock radially inward toward the centre. Breaks the wall-circling pattern that previously trapped both the analytical baselines and the trained policies. A/B numbers (n_sheep ∈ {1,2,3,5,10}, 5 seeds each, max_steps=15000): diff/field bc: baseline 100% consensus 100% diff/field rl: baseline 100% consensus 100% diff/round bc: baseline 52% consensus 88% diff/round rl: baseline 68% consensus 88% Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-16 21:09:25 +00:00
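Note on commit 1c197e0ff7: the candidate-promotion idea described there can be sketched roughly as follows. This is a minimal illustration, not the repository's SheepTracker. The class `ConsensusGate`, the `Candidate` record, and the exact matching and ageing rules are assumptions; only the parameter values (consensus_k 3, radius 0.5 m, max_age 15 frames) come from the commit message.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A detection that has not yet earned a confirmed track (hypothetical)."""
    x: float
    y: float
    hits: int = 0   # number of supporting detections so far
    age: int = 0    # frames since the last supporting detection


@dataclass
class ConsensusGate:
    """Hypothetical gate: promote a detection only after consensus_k nearby hits."""
    consensus_k: int = 3
    radius: float = 0.5
    max_age: int = 15
    candidates: list = field(default_factory=list)

    def update(self, detections):
        """Feed one frame of (x, y) detections; return positions promoted this frame."""
        promoted = []
        for c in self.candidates:
            c.age += 1
        for x, y in detections:
            # Find an existing candidate within `radius` of this detection.
            match = next(
                (c for c in self.candidates
                 if math.hypot(c.x - x, c.y - y) <= self.radius),
                None,
            )
            if match is None:
                match = Candidate(x, y)
                self.candidates.append(match)
            match.x, match.y, match.age = x, y, 0
            match.hits += 1
            if match.hits >= self.consensus_k:
                promoted.append((x, y))
                self.candidates.remove(match)
        # Drop candidates that went unsupported for more than max_age frames.
        self.candidates = [c for c in self.candidates if c.age <= self.max_age]
        return promoted
```

Under these assumptions, consensus_k = 1 promotes every detection on first sight, which mirrors the pass-through behaviour the commit mentions, while a one-off phantom never reaches three hits and is dropped once it ages out.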
Johnny Fernandes	dd5ac669e5	Webots sim-to-real fixes, DAgger pipeline, 360° proto variant Today's session worked across the full Webots delivery stack — found and fixed a cluster of bugs blocking the BC/RL transfer, then explored training-side mitigations for the residual perception gap. Bug fixes: - Makefile FP_RATE default 2.0 → 0.0: BC demos used fp_rate=0 but RL fine-tune defaulted to fp_rate=2, poisoning the BC obs distribution and stalling PPO at 0% success across 1.46M+ steps. - controllers/{shepherd_dog,sheep}/runtime.ini: Webots was launching controllers under system python3 (no numpy) and they were crashing silently. Pinned to the conda tir env. - herding/config.py HERDING_WEBOTS preset: pen_latch_depth 0.5 → 2.0, max_new_tracks_per_step 3 → 1, static_reject 0.8 → 1.2. Stops phantom FPs near the gate from latching as permanently-penned tracks. - herding/perception/sheep_tracker.py: penned tracks now decay at forget_steps × 8 instead of living forever. Adds get_positions min_freshness filter for deploy-time use. Training/eval matches deployment: - training/bc/collect.py: --dagger-policy flag for DAgger rollouts (policy drives, teacher labels) + --use-webots-preset for matched 140° tracker + DR config. - controllers/shepherd_dog/shepherd_dog.py: scan-fallback (0, 0.6) when BC/RL sees empty sheep_positions — recovers from FOV gaps. Tooling: - tools/dagger_round.sh: one-shot DAgger round (collect + concat + bc). - tools/webots_sweep_gt.sh: full sweep with HERDING_USE_GT=1 for the perception-gap diagnosis matrix. - protos/ShepherdDog360.proto: 360° FOV variant for the FOV-ablation comparison. Canonical proto stays at 140° per project spec. Artifacts: v1 BC/RL policies for all 4 (drive × world) combos trained in clean gym (success: diff/field 90-100%, diff/round 58%, mec/field 60-100%, mec/round 50-100%). DAgger r1/r2 BCs for diff/field show 12%→38% progression on gym HERDING_WEBOTS proxy but did not close to actual Webots LiDAR (0/5 throughout). Next: LSTM policy or learned tracker per the project-state memory. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-16 17:21:02 +00:00
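Note on commit dd5ac669e5: the "penned tracks now decay at forget_steps × 8" rule could look something like the sketch below. The `Track` record, the `missed` counter, and the function `is_expired` are hypothetical names for illustration; only the factor of 8 applied to forget_steps comes from the commit message, and the real sheep_tracker.py may count staleness differently.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical track record; not the repository's actual data structure."""
    x: float
    y: float
    penned: bool = False
    missed: int = 0  # consecutive steps without a matching detection


def is_expired(track, forget_steps, penned_factor=8):
    """Return True once a track has gone unmatched for longer than its limit.

    Penned tracks get a limit of forget_steps * penned_factor instead of
    living forever; ordinary tracks use forget_steps directly.
    """
    limit = forget_steps * penned_factor if track.penned else forget_steps
    return track.missed > limit
```

With forget_steps = 20, for example, an ordinary track would expire after 21 missed steps under this sketch, while a penned one would survive until 161.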
Johnny Fernandes	683de740af	Checkpoint 9	2026-05-13 13:46:50 +01:00
Johnny Fernandes	5c2ee4bba5	Checkpoint 8	2026-05-12 22:41:03 +01:00
Johnny Fernandes	a01a5c9cef	Checkpoint 7	2026-05-11 12:21:51 +01:00
Johnny Fernandes	fce0e0c786	Checkpoint 6	2026-05-11 10:35:48 +01:00

8 Commits