Commit Graph

7 Commits

Author SHA1 Message Date
Johnny Fernandes e86fee5ae8 Per-sheep pen-time metrics, seed support, make webots → menu
* `controllers/shepherd_dog/shepherd_dog.py`
  - Tracks the first step at which each sheep crosses the gate; on
    auto-finish (all sheep penned) prints a `[results]` summary
    block: mode/drive/world/lidar/dogs/seed line, total simulated
    time, per-sheep penning order with absolute step + seconds since
    sim start, and the gate spread between the first and last
    penning.
  - Reads `HERDING_SEED` (env / runtime cfg) and seeds the
    controller's RNG when set. Empty = time-based default = old
    non-deterministic behaviour.
* `controllers/sheep/sheep.py` reads `HERDING_SEED` the same way
  (loading `herding_runtime.cfg` itself so it works even when
  Webots strips env vars) and seeds Python's RNG XOR'd with the
  sheep's name hash, so a fixed seed gives a reproducible flock
  trajectory without all sheep starting from identical wander state.
* `tools/run_webots.sh` writes `HERDING_SEED` into the runtime cfg
  (empty when unset so existing scripts keep their stochastic
  behaviour).
* `tools/webots_menu.sh` gains a Seed prompt (random / fixed
  integer); the launch summary box shows the choice next to the
  perception row.
* `Makefile`
  - `make webots`  now fires the interactive picker (replacing the
    old positional invocation).
  - `make webots_quick MODE=… DRIVE=… WORLD=… N=…` is the old
    positional path, kept for batch / scripted use.

Smoke-tested: menu renders Mode → Drive → World → LiDAR → Dogs
→ Sheep → Perception → Seed → Headless prompts and shows the
selected Seed value in the launch summary. 126 pytest cases still
pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 10:33:34 +00:00
Johnny Fernandes dd5ac669e5 Webots sim-to-real fixes, DAgger pipeline, 360° proto variant
Today's session worked across the full Webots delivery stack — found and
fixed a cluster of bugs blocking the BC/RL transfer, then explored
training-side mitigations for the residual perception gap.

Bug fixes:
- Makefile FP_RATE default 2.0 → 0.0: BC demos used fp_rate=0 but RL
  fine-tune defaulted to fp_rate=2, poisoning the BC obs distribution
  and stalling PPO at 0% success across 1.46M+ steps.
- controllers/{shepherd_dog,sheep}/runtime.ini: Webots was launching
  controllers under system python3 (no numpy) and they were crashing
  silently. Pinned to the conda tir env.
- herding/config.py HERDING_WEBOTS preset: pen_latch_depth 0.5 → 2.0,
  max_new_tracks_per_step 3 → 1, static_reject 0.8 → 1.2. Stops phantom
  FPs near the gate from latching as permanently-penned tracks.
- herding/perception/sheep_tracker.py: penned tracks now decay at
  forget_steps × 8 instead of living forever. Adds get_positions
  min_freshness filter for deploy-time use.

Training/eval matches deployment:
- training/bc/collect.py: --dagger-policy flag for DAgger rollouts
  (policy drives, teacher labels) + --use-webots-preset for matched
  140° tracker + DR config.
- controllers/shepherd_dog/shepherd_dog.py: scan-fallback (0, 0.6) when
  BC/RL sees empty sheep_positions — recovers from FOV gaps.

Tooling:
- tools/dagger_round.sh: one-shot DAgger round (collect + concat + bc).
- tools/webots_sweep_gt.sh: full sweep with HERDING_USE_GT=1 for the
  perception-gap diagnosis matrix.
- protos/ShepherdDog360.proto: 360° FOV variant for the FOV-ablation
  comparison. Canonical proto stays at 140° per project spec.

Artifacts: v1 BC/RL policies for all 4 (drive × world) combos trained
in clean gym (success: diff/field 90-100%, diff/round 58%, mec/field
60-100%, mec/round 50-100%). DAgger r1/r2 BCs for diff/field show
12%→38% progression on gym HERDING_WEBOTS proxy but did not close
to actual Webots LiDAR (0/5 throughout). Next: LSTM policy or
learned tracker per the project-state memory.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 17:21:02 +00:00
Johnny Fernandes c61df91950 Checkpoint 10 2026-05-13 23:22:17 +01:00
Johnny Fernandes aa598fcb83 Checkpoint 10 2026-05-13 23:14:16 +01:00
Johnny Fernandes 683de740af Checkpoint 9 2026-05-13 13:46:50 +01:00
Johnny Fernandes 5c2ee4bba5 Checkpoint 8 2026-05-12 22:41:03 +01:00
Johnny Fernandes a01a5c9cef Checkpoint 7 2026-05-11 12:21:51 +01:00