TIR_PROJ

Author	SHA1	Message	Date
Johnny Fernandes	27c0f65722	Mecanum Webots via Supervisor kinematic injection. Replace the failing ODE-rolled mecanum chassis dynamics with a Supervisor.setVelocity call that uses the gym mecanum forward kinematics formula directly. Wheel motors still spin (visual only); chassis motion comes from the gym model, so training and deployment match by construction. Results (seed=42, n=10 sheep): BC + RL mecanum pen 10/10 in both field and field_round. The n=5 mecanum cells are still 0/5 due to tracker phantoms anchored to wall corners under the 360° LiDAR; this is documented in docs/status.md as the remaining gap. Cleanup: drop deploy-time hacks (HERDING_HEADING_, HERDING_OMEGA_CLAMP, HERDING_TRACKER_) that were workarounds for the old ODE chaos; revert the proto inertiaMatrix, roller dampingConstant, and reduced motor torque since they no longer carry load; refresh comments around the mecanum config presets.	2026-05-18 22:46:37 +00:00
Johnny Fernandes	7ab69ab0f3	Rename multi-segment functions to two-concept names; polish docstrings. Naming pass: rename functions whose third+ segment is redundant or implementation-detail, sticking to the codebase's preferred ``noun_verb`` / ``verb_noun`` two-concept idiom. Renames are atomic across definitions, callers, and tests: is_penned_position → is_penned; modulate_speed_near_sheep → modulate_speed; mecanum_kinematics_step → mecanum_step; policy_forward_mean → forward_mean. Two-concept patterns like ``velocity_to_wheels`` / ``detections_from_scan`` / ``make_strombom_predictor`` are left alone: they're idiomatic converters / factories that read as a single concept, and the longer form aids grep-ability. Docstring polish: * ``herding/config.py`` header drops the "previously lived as a module-level literal" historical framing; we ship as a single thing, so the refactor anecdote no longer earns its keep. The usage examples now mention both ``HERDING_WEBOTS`` and ``HERDING_MEC_WEBOTS`` presets. 126 pytest cases still pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 01:58:15 +00:00
Johnny Fernandes	10c01a938e	Drop versioning vocabulary, polish docstrings, fix world-aware policy resolution User-facing pass after the project was decided to be a single submission with no inner iterations. * Remove every "v1"/"v2"/"versioning" reference from the docs: - README mecanum section trims the "v1 predates the rewrite" prose in favour of a self-contained retrain recipe. - The 3.2 GB `training/runs/v1_clean/` backup directory is deleted. * Refresh control-layer docstrings: - `sheep_tracker.py` header now describes the three actual pipeline stages (consensus, prediction, pen latching) instead of layering the consensus stage on top of a stale "predictive mode" preamble. - `controllers/shepherd_dog/shepherd_dog.py` mode list is up-to-date — adds `universal`, removes outdated single-policy default paths, mentions `HERDING_USE_GT=1` as the perception ablation. * Refresh training command examples: - `training/bc/collect.py` and `training/bc/pretrain.py` usage snippets show the world-suffixed paths the Makefile actually uses; the `--out` arg is now required so old "demos.npz" invocations error loudly instead of silently overwriting. - `training/README.md` rewritten — drops the legacy `runs/bc` diagram, documents the per-(drive, world) pipeline, and adds the mecanum retraining caveat. * Fix policy-directory resolution end-to-end: - `tools/run_webots.sh` now tries `training/runs/{bc,rl}_<drive>_<world>` first, then the drive- only path, then the bare-mode legacy path — matching the actual on-disk layout. Previously it looked for `bc_<drive>` (no world) and silently fell back to `bc`, masking the world selection. - `controllers/shepherd_dog/shepherd_dog.py:_resolve_policy_dir` has the same fix plus a latent NameError unmasked: it referenced `DRIVE_MODE` before that variable was set at module load. The block is restructured so MODE/DRIVE_MODE/WORLD are resolved first, then the function uses them as explicit arguments. 126 pytest cases still pass. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 01:50:54 +00:00
Johnny Fernandes	a584a034e9	Project-wide cleanup: gitignore, dead code, stale artifacts, README Repo hygiene pass after a long working session. Files removed: * stage1_train.log — runtime training log (~125 KB), shouldn't have been tracked. * training/bc/demos.npz — orphan default-name demos file from before the world+drive-suffixed naming convention took over; no script references it. * training/runs/bc_dagger{1,2}_differential_field/policy.zip — failed DAgger experiment artifacts. Per `memory/dagger_results.md` the whole DAgger experiment hit 0/5 on Webots transfer; these checkpoints have no consumers. Untracked-but-deleted (no git change) — also cleaned from disk: * Root-level runtime logs (43 .log files, all unused — gitignored now). training/bc/{combined,dagger}.npz (5 huge demo blobs, 2.6 GB reclaimed; not committed). training/bc/v1/ (2.6 GB backup of pre-DAgger demos; reclaimed). * training/runs/at_20260426_/ (orphan timestamped runs; reclaimed). All __pycache__/. Dead code removed: * `herding/control/strombom.py::compute_action_debug` — no callers anywhere in the repo. * `herding/control/sequential.py::compute_action_debug` — same. * `herding/control/universal.py::compute_action_diff` — same. .gitignore extended to cover: * All .log files (training/eval/webots logs are runtime artifacts). training/bc/.npz (re-collectable on demand by `make bc_demos`). training/bc/v1/. * .pytest_cache, .pyc, .claude/. README refreshed: Mecanum + round-world coverage in the headline. * Quick-start updated for DRIVE/WORLD-suffixed Makefile targets, GT-bypass example, and the mecanum-retrain caveat. * Layout reflects the actual current tree (config.py, both protos, both worlds, all tools). * Results table replaced with the Webots end-to-end numbers from the 2026-05-16 sweep (8/8 diff combos + LiDAR/GT comparison). Verification: 126 pytest cases still pass (was 126 going in — no test-coverage regression from the dead-code removal). 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 01:38:19 +00:00
Johnny Fernandes	3b4c99a6c4	Training pipelines auto-select mecanum-Webots preset * training/bc/collect.py: --use-webots-preset now picks the drive-matched variant. Mecanum drives get HERDING_MEC_WEBOTS (with the Webots-calibrated strafe efficiency and bleed) so the collected demos reflect the imperfect physical mecanum the deployed policy will see. Differential drives still use HERDING_WEBOTS (no behaviour change there). * training/rl/train.py: mecanum fine-tune now unconditionally applies the HERDING_MEC_WEBOTS robot config to the PPO env (the policy must update against the same imperfect kinematics it deploys on). Diff fine-tune unchanged. To retrain a mecanum policy end-to-end against the new proto: python -m training.bc.collect --drive-mode mecanum --world field \ --use-webots-preset \ --out training/bc/demos_mecanum_field_v2.npz python -m training.bc.pretrain --demos training/bc/demos_mecanum_field_v2.npz \ --out training/runs/bc_mecanum_field_v2 ... python -m training.rl.train --bc training/runs/bc_mecanum_field_v2 \ --out training/runs/rl_mecanum_field_v2 \ --drive-mode mecanum --world field --use-webots-preset The same flow for field_round / mecanum/round. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 01:12:06 +00:00
Johnny Fernandes	dd5ac669e5	Webots sim-to-real fixes, DAgger pipeline, 360° proto variant Today's session worked across the full Webots delivery stack — found and fixed a cluster of bugs blocking the BC/RL transfer, then explored training-side mitigations for the residual perception gap. Bug fixes: - Makefile FP_RATE default 2.0 → 0.0: BC demos used fp_rate=0 but RL fine-tune defaulted to fp_rate=2, poisoning the BC obs distribution and stalling PPO at 0% success across 1.46M+ steps. - controllers/{shepherd_dog,sheep}/runtime.ini: Webots was launching controllers under system python3 (no numpy) and they were crashing silently. Pinned to the conda tir env. - herding/config.py HERDING_WEBOTS preset: pen_latch_depth 0.5 → 2.0, max_new_tracks_per_step 3 → 1, static_reject 0.8 → 1.2. Stops phantom FPs near the gate from latching as permanently-penned tracks. - herding/perception/sheep_tracker.py: penned tracks now decay at forget_steps × 8 instead of living forever. Adds get_positions min_freshness filter for deploy-time use. Training/eval matches deployment: - training/bc/collect.py: --dagger-policy flag for DAgger rollouts (policy drives, teacher labels) + --use-webots-preset for matched 140° tracker + DR config. - controllers/shepherd_dog/shepherd_dog.py: scan-fallback (0, 0.6) when BC/RL sees empty sheep_positions — recovers from FOV gaps. Tooling: - tools/dagger_round.sh: one-shot DAgger round (collect + concat + bc). - tools/webots_sweep_gt.sh: full sweep with HERDING_USE_GT=1 for the perception-gap diagnosis matrix. - protos/ShepherdDog360.proto: 360° FOV variant for the FOV-ablation comparison. Canonical proto stays at 140° per project spec. Artifacts: v1 BC/RL policies for all 4 (drive × world) combos trained in clean gym (success: diff/field 90-100%, diff/round 58%, mec/field 60-100%, mec/round 50-100%). DAgger r1/r2 BCs for diff/field show 12%→38% progression on gym HERDING_WEBOTS proxy but did not close to actual Webots LiDAR (0/5 throughout). 
Next: LSTM policy or learned tracker per the project-state memory. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-16 17:21:02 +00:00
Johnny Fernandes	0f807003a5	Results from last checkpoint	2026-05-13 20:26:18 +00:00
Johnny Fernandes	be58ad2054	Results from last checkpoint	2026-05-13 07:49:17 +00:00
Johnny Fernandes	5c2ee4bba5	Checkpoint 8	2026-05-12 22:41:03 +01:00
Johnny Fernandes	a01a5c9cef	Checkpoint 7	2026-05-11 12:21:51 +01:00
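
Note on 27c0f65722: the message says chassis motion is computed from the gym mecanum forward kinematics and then injected as a velocity. The repository's own formula is not shown in this log, so the following is only a minimal sketch of the textbook forward kinematics for an X-configured mecanum base. The names MecanumGeometry, mecanum_forward and world_velocity6, the wheel ordering (front-left, front-right, rear-left, rear-right), the axis convention (x forward, y left, z up) and all numeric values are assumptions for illustration. The 6-element list is the shape a Webots node velocity setter expects (linear x, y, z followed by angular x, y, z); no Webots call is made here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MecanumGeometry:
    """Hypothetical chassis geometry, all lengths in metres."""
    wheel_radius: float
    half_length: float  # centre to axle along the forward (x) axis
    half_width: float   # centre to wheel along the left (y) axis


def mecanum_forward(geom, w_fl, w_fr, w_rl, w_rr):
    """Map wheel speeds (rad/s) to a body-frame twist (vx, vy, omega).

    Textbook X-configuration formula; the project's mecanum_step may use
    a different sign convention or extra efficiency terms.
    """
    r = geom.wheel_radius
    k = geom.half_length + geom.half_width
    vx = r / 4.0 * (w_fl + w_fr + w_rl + w_rr)
    vy = r / 4.0 * (-w_fl + w_fr + w_rl - w_rr)
    omega = r / (4.0 * k) * (-w_fl + w_fr - w_rl + w_rr)
    return vx, vy, omega


def world_velocity6(vx, vy, omega, heading):
    """Rotate a planar body twist into the world frame.

    Returns [vx, vy, vz, wx, wy, wz] assuming a z-up world and a heading
    measured counter-clockwise from the world x axis.
    """
    c, s = math.cos(heading), math.sin(heading)
    return [c * vx - s * vy, s * vx + c * vy, 0.0, 0.0, 0.0, omega]


if __name__ == "__main__":
    geom = MecanumGeometry(wheel_radius=0.05, half_length=0.20, half_width=0.15)
    twist = mecanum_forward(geom, 10.0, 10.0, 10.0, 10.0)
    print(world_velocity6(*twist, heading=0.0))
```

Because the chassis velocity comes straight from the same function the training environment would use, the simulated base cannot drift from the training model through roller or contact dynamics; the trade-off is that wheel slip is no longer simulated at all.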

10 Commits
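
Note on 10c01a938e: the message describes a three-step fallback when locating a policy directory: the world-specific `<mode>_<drive>_<world>` path first, then the drive-only path, then the bare-mode legacy path. The sketch below illustrates that lookup order only. The function names and the choice to raise when nothing matches are assumptions, not the actual `_resolve_policy_dir` implementation.

```python
from pathlib import Path


def candidate_policy_dirs(runs_root, mode, drive, world):
    """List candidate directories, most specific first (hypothetical helper)."""
    root = Path(runs_root)
    return [
        root / f"{mode}_{drive}_{world}",  # e.g. bc_mecanum_field
        root / f"{mode}_{drive}",          # drive-only path
        root / mode,                       # bare-mode legacy path
    ]


def resolve_policy_dir(runs_root, mode, drive, world):
    """Return the first candidate that exists on disk.

    mode, drive and world are explicit arguments, so the lookup does not
    depend on module-level globals being set beforehand.
    """
    candidates = candidate_policy_dirs(runs_root, mode, drive, world)
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    tried = ", ".join(str(c) for c in candidates)
    raise FileNotFoundError(f"no policy directory found; tried: {tried}")
```

Checking the most specific path first is what keeps the world selection from being masked: a drive-only or bare directory is used only when no world-suffixed one exists.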