Two small features.
(1) Portable interpreter
* `tools/setup_env.sh` exports HERDING_PYTHON (default points to the
project's conda env; override in your shell to retarget).
* Both `controllers/*/runtime.ini` files now use Webots' env-var
expansion: `COMMAND = $(HERDING_PYTHON)` so the Webots-launched
controllers pick up the same interpreter as the bash scripts.
* `tools/run_webots.sh`, `tools/webots_sweep{,_gt}.sh` and
`tools/calibrate_mecanum.sh` all source `setup_env.sh` at the top
instead of hard-coding `/home/jalf/miniconda3/envs/tir/bin`.
The hard-coded conda path is now exactly one line in `setup_env.sh`'s
fallback default — a single place to edit on a new machine, or
override-once via `export HERDING_PYTHON=...`.
(2) 360° LiDAR FOV ablation
* New `LIDAR_WEBOTS_360` preset matches the existing
`protos/ShepherdDog360.proto` (360 rays / 2π FOV / 15 m range).
* `tools/run_webots.sh` reads `HERDING_LIDAR=140|360` and swaps the
diff-drive proto accordingly (mecanum keeps 140° — the
ShepherdDogMecanum proto has its own LiDAR section). The variant
is written into `herding_runtime.cfg` so the controller can read
it even when Webots strips env vars.
* `controllers/shepherd_dog/shepherd_dog.py` picks the matching
`lidar_cfg` (`HERDING_WEBOTS.lidar` for 140°, `LIDAR_WEBOTS_360`
otherwise) and feeds it to `detections_from_scan` so the
perception pipeline interprets ray angles + max range correctly.
Smoke test: `HERDING_LIDAR=360 tools/run_webots.sh 5 strombom
differential field` launches with `ShepherdDog360.proto`, the
controller logs the new mode/drive/world line, and the dog is
penning sheep through 360° perception (4/5 at step 19200 before I
killed the test). No retraining required because the gym already
trains under `LIDAR_FULL` (360° preset).
126 pytest cases still pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
User-facing pass after the project was decided to be a single
submission with no inner iterations.
* Remove every "v1"/"v2"/"versioning" reference from the docs:
- README mecanum section trims the "v1 predates the rewrite" prose
in favour of a self-contained retrain recipe.
- The 3.2 GB `training/runs/v1_clean/` backup directory is deleted.
* Refresh control-layer docstrings:
- `sheep_tracker.py` header now describes the three actual pipeline
stages (consensus, prediction, pen latching) instead of layering
the consensus stage on top of a stale "predictive mode" preamble.
- `controllers/shepherd_dog/shepherd_dog.py` mode list is
up-to-date — adds `universal`, removes outdated single-policy
default paths, mentions `HERDING_USE_GT=1` as the perception
ablation.
* Refresh training command examples:
- `training/bc/collect.py` and `training/bc/pretrain.py` usage
snippets show the world-suffixed paths the Makefile actually
uses; the `--out` arg is now required so old "demos.npz"
invocations error loudly instead of silently overwriting.
- `training/README.md` rewritten — drops the legacy `runs/bc`
diagram, documents the per-(drive, world) pipeline, and adds
the mecanum retraining caveat.
* Fix policy-directory resolution end-to-end:
- `tools/run_webots.sh` now tries
`training/runs/{bc,rl}_<drive>_<world>` first, then the drive-
only path, then the bare-mode legacy path — matching the actual
on-disk layout. Previously it looked for `bc_<drive>` (no
world) and silently fell back to `bc`, masking the world
selection.
- `controllers/shepherd_dog/shepherd_dog.py:_resolve_policy_dir`
has the same fix plus a latent NameError unmasked: it referenced
`DRIVE_MODE` before that variable was set at module load. The
block is restructured so MODE/DRIVE_MODE/WORLD are resolved
first, then the function uses them as explicit arguments.
126 pytest cases still pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Each wheel is now a hub solid + 8 passive HingeJoint rollers (capsules
tilted 45° in body xy plane at the bottom contact point) instead of
a single plain Cylinder. The rollers free-spin around their tilt axes
so the wheel exhibits mecanum X-pattern behaviour: gym-frame strafe
commands now produce body strafe in Webots, where before they
produced wrong-direction motion (the plain cylinders behaved as 4-
wheel skid-steer).
Calibration on flat field, 200 steps each:
gym predict webots out err
vx=0.5 vy=0 1.33 m/s +x 1.19 m/s +x 10.9% +x
0 m/s +y -0.10 m/s +y ~clean
vx=0 vy=0.5 1.33 m/s +y 0.50 m/s +y 62.1% +y
0 m/s +x -0.37 m/s +x noticeable
mecanum
coupling
Strafe is imperfect (-x bleed-through, magnitude under-shoot) but
direction is correct and the platform is now omnidirectional. Forward
motion is high-fidelity. Tilt signs assigned so diagonal pairs FL+RR
and FR+RL share the same body-frame roller orientation (the standard
X pattern). Two contact-material names "MecanumWheelA/B" are kept for
diagnostic separation; both use the same isotropic Coulomb friction
of 2.0 with forceDependentSlip 0.005.
tools/run_webots.sh ships the matching contactProperties block on
every mecanum launch (re-emitted into the temporary world copy).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`active=$(grep -c '^Sheep' "$DST")` returns 0 with exit code 1 when
no sheep are left in the world, which fires set -e and kills the
script before it can launch Webots. Wrap with `|| true` so the
calibration mode (N=0) can actually run.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Today's session worked across the full Webots delivery stack — found and
fixed a cluster of bugs blocking the BC/RL transfer, then explored
training-side mitigations for the residual perception gap.
Bug fixes:
- Makefile FP_RATE default 2.0 → 0.0: BC demos used fp_rate=0 but RL
fine-tune defaulted to fp_rate=2, poisoning the BC obs distribution
and stalling PPO at 0% success across 1.46M+ steps.
- controllers/{shepherd_dog,sheep}/runtime.ini: Webots was launching
controllers under system python3 (no numpy) and they were crashing
silently. Pinned to the conda tir env.
- herding/config.py HERDING_WEBOTS preset: pen_latch_depth 0.5 → 2.0,
max_new_tracks_per_step 3 → 1, static_reject 0.8 → 1.2. Stops phantom
FPs near the gate from latching as permanently-penned tracks.
- herding/perception/sheep_tracker.py: penned tracks now decay at
forget_steps × 8 instead of living forever. Adds get_positions
min_freshness filter for deploy-time use.
Training/eval matches deployment:
- training/bc/collect.py: --dagger-policy flag for DAgger rollouts
(policy drives, teacher labels) + --use-webots-preset for matched
140° tracker + DR config.
- controllers/shepherd_dog/shepherd_dog.py: scan-fallback (0, 0.6) when
BC/RL sees empty sheep_positions — recovers from FOV gaps.
Tooling:
- tools/dagger_round.sh: one-shot DAgger round (collect + concat + bc).
- tools/webots_sweep_gt.sh: full sweep with HERDING_USE_GT=1 for the
perception-gap diagnosis matrix.
- protos/ShepherdDog360.proto: 360° FOV variant for the FOV-ablation
comparison. Canonical proto stays at 140° per project spec.
Artifacts: v1 BC/RL policies for all 4 (drive × world) combos trained
in clean gym (success: diff/field 90-100%, diff/round 58%, mec/field
60-100%, mec/round 50-100%). DAgger r1/r2 BCs for diff/field show
12%→38% progression on gym HERDING_WEBOTS proxy but did not close
to actual Webots LiDAR (0/5 throughout). Next: LSTM policy or
learned tracker per the project-state memory.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>