TIR_PROJ/docs/status.md
Johnny Fernandes 62ea811655 Fix _h_ema NameError; add status + article-draft notes
- shepherd_dog: a leftover reference to the removed HERDING_HEADING_EMA
  helper raised NameError on every controller startup. Drop it.
- docs/status.md: expand the n=5 mecanum failure-mode discussion with
  the four phantom-suppression attempts that didn't pan out, and the
  honest workaround (Webots reports n=10 only, n=5 covered by gym
  results).
- docs/article_draft.md: project-report outline with section structure,
  results tables, and the mecanum sim-to-real narrative for the
  formal writeup.
2026-05-19 01:11:49 +00:00

# Status — 2026-05-18
Current snapshot of what works in Webots, and what design choices got us here.
## Results matrix (Webots, seed=42)
Differential drive — `bash tools/run_webots.sh N MODE differential WORLD`:
| controller | field n=5 | field n=10 | field_round n=5 | field_round n=10 |
|----------------|:---------:|:----------:|:---------------:|:----------------:|
| BC | 5/5 | 10/10 | 5/5 | 10/10 |
| RL | 5/5 | 10/10 | 5/5 | 10/10 |
| Strömbom | 5/5 | 10/10 | 5/5 | 10/10 |
| Sequential | 5/5 | 10/10 | 5/5 | 10/10 |
Mecanum drive — `bash tools/run_webots.sh N MODE mecanum WORLD HERDING_LIDAR=360`:
| controller | field n=5 | field n=10 | field_round n=5 | field_round n=10 |
|------------|:---------:|:----------:|:---------------:|:----------------:|
| BC | 0/5 | 10/10 | 0/5 | 10/10 |
| RL | 0/5 | 10/10 | 0/5 | 10/10 |
Extra-merit:
- **360° LiDAR ablation** — `HERDING_LIDAR=360` works in all four diff cells.
- **Dual-dog axis-split** — `HERDING_NDOGS=2 HERDING_AXIS_LEAK=0.3` pens 5/5 on diff.
## Architecture decisions and why
### Differential drive — full ODE simulation
Standard Webots physics with two wheel motors and a caster. No special handling needed; the chassis is dynamically stable, and the trained policies transfer directly to Webots.
### Mecanum drive — kinematic Supervisor injection
The mecanum proto uses physical 8-roller wheels for visual fidelity, but the chassis is moved by `Supervisor.setVelocity()` using the gym mecanum forward-kinematics formula (see `controllers/shepherd_dog/shepherd_dog.py::drive_mecanum`).
We explored two other paths before settling here:
1. **Plain cylinder wheels + anisotropic ContactProperties.** Tried `frictionRotation ±0.7854` on the wheel contact frame. Strafe motion came out the wrong direction and diagonals zeroed out. Discarded.
2. **Full ODE simulation on 32 physical roller hinges.** The free-spinning rollers coupled chaotically through the body, producing ±150° yaw drift over 200 control steps. Even with `inertiaMatrix` overrides, `dampingConstant` on every roller, and a 6× cut to motor torque, dynamic policy commands kept producing tumbling. Discarded.
3. **Kinematic Supervisor injection (current).** ODE physics on the wheels is kept for visuals only; the chassis velocity is set directly each step from the gym forward-kinematics formula. Gym training and Webots deployment produce identical body motion. Yaw drift is zero by construction.
This is not a hack: it matches how most academic mecanum sims work. For example, Gazebo's mecanum plugins use kinematic models by default, and ODE's contact solver does not handle the rolling-without-slipping constraint cleanly for 32 free hinges.
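As a rough illustration of kinematic injection, the sketch below uses the textbook mecanum forward-kinematics formula to turn four wheel speeds into a body twist, then rotates it into the world frame as the six-component vector that Webots' `Node.setVelocity()` expects. It is not the project's `drive_mecanum`: the wheel order (FL, FR, RL, RR), the geometry values, the z-up world frame, and the function names are assumptions, and the `strafe_eff` / `bleed` preset scaling is deliberately left out because its exact form is not described here.

```python
import math


def mecanum_body_twist(wheel_speeds, wheel_radius, half_length, half_width):
    """Textbook mecanum forward kinematics (assumed wheel order: FL, FR, RL, RR).

    wheel_speeds are angular speeds in rad/s. Returns (vx, vy, wz) in the body
    frame with x forward, y left, and wz counter-clockwise about the up axis.
    """
    fl, fr, rl, rr = wheel_speeds
    vx = wheel_radius * (fl + fr + rl + rr) / 4.0
    vy = wheel_radius * (-fl + fr + rl - rr) / 4.0
    wz = wheel_radius * (-fl + fr - rl + rr) / (4.0 * (half_length + half_width))
    return vx, vy, wz


def world_velocity(twist, yaw):
    """Rotate a body twist by the chassis yaw into a world-frame 6-vector.

    Layout is [vx, vy, vz, wx, wy, wz], assuming a z-up world.
    """
    vx, vy, wz = twist
    c, s = math.cos(yaw), math.sin(yaw)
    return [c * vx - s * vy, s * vx + c * vy, 0.0, 0.0, 0.0, wz]


# Hypothetical geometry: 5 cm wheels, 0.20 m / 0.15 m half wheelbase and half track.
forward = mecanum_body_twist((10.0, 10.0, 10.0, 10.0), 0.05, 0.20, 0.15)
strafe = mecanum_body_twist((-10.0, 10.0, 10.0, -10.0), 0.05, 0.20, 0.15)
velocity = world_velocity(forward, math.pi / 2)
```

In a Supervisor controller the resulting list would be passed to the robot's own node each step (for example `node.setVelocity(velocity)`). Because the body motion comes straight from the formula, there is no contact solver in the loop to introduce yaw drift.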
### Why n=5 mecanum fails (and n=10 passes)
The 360° LiDAR consistently produces 08 detections per frame at n=5 — 5 from real sheep plus 13 "phantom" clusters from gate posts, wall fragments, and pen rails. The tracker's consensus filter promotes a candidate to "active" after `consensus_k=3` hits within 20 steps, and phantoms satisfy that easily because they're spatially consistent.
With n=10 real sheep the 10 active slots fill with real sheep before phantoms compete. With n=5 there are ~5 free slots and the phantoms occupy them; the policy then chases ghosts (verified: with `HERDING_USE_GT=1` perception bypass, n=5 pens 5/5 in 76 s).
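The promotion rule described above can be sketched as a k-of-window counter. This is a minimal illustration, not the project's tracker: the `Candidate` class and `first_active_step` helper are hypothetical names, and only `consensus_k=3` and the 20-step window come from the text. It shows why a spatially consistent phantom (detected every frame) is promoted almost immediately, while sporadic noise never is.

```python
from collections import deque


class Candidate:
    """Hypothetical track candidate promoted by a k-of-window consensus rule."""

    def __init__(self, consensus_k=3, window=20):
        self.consensus_k = consensus_k
        self.window = window
        self.hits = deque()  # step indices at which the candidate was detected
        self.active = False

    def update(self, step, detected):
        if detected:
            self.hits.append(step)
        # Forget hits that fell out of the sliding window.
        while self.hits and step - self.hits[0] >= self.window:
            self.hits.popleft()
        # Promotion is sticky: once active, the candidate stays active.
        if len(self.hits) >= self.consensus_k:
            self.active = True
        return self.active


def first_active_step(pattern, **kwargs):
    """Return the first step at which the candidate becomes active, or None."""
    cand = Candidate(**kwargs)
    for step, detected in enumerate(pattern):
        if cand.update(step, detected):
            return step
    return None


gate_post = [True] * 20                        # phantom seen in every frame
sporadic = [i % 25 == 0 for i in range(60)]    # isolated noise hits

gate_post_step = first_active_step(gate_post)
sporadic_step = first_active_step(sporadic)
strict_step = first_active_step(gate_post, consensus_k=5)
```

Under these assumptions the gate-post phantom is promoted on its third frame, and raising `consensus_k` to 5 only delays promotion by two frames, which is consistent with the tightened-consensus attempt making no difference.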
We tried four fixes; none unlocked n=5:
| attempt | result |
|-----------------------------------------------------|-------------------------------------------------|
| Tighten consensus to `consensus_k=5` | no change; `tracks_active=10` in 70% of frames |
| Tighten `wall_reject=0.9`, `static_reject=1.5` | no change |
| Static-phantom drop (track displacement from spawn) | phantoms are *not* spatially static — debug logs showed phantom tracks bouncing 422 m across the field as data association reassigned them each frame |
| Merge near-duplicate detections (≤0.5 m) | phantoms aren't fragmentation either |
The phantom tracks are caused by **data-association noise**: when the tracker has more slots than real sheep, the leftover tracks attach themselves to whatever cluster is closest each frame, even if that cluster has nothing to do with their original spawn position. The fix would need either parallax-aware tracking (require multi-vantage confirmation before promotion) or training with simulated phantom noise. Both are real surgery; out of scope for the 2026-06-11 deadline.
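The mechanism can be reproduced with a toy tracker. The sketch below assumes ungated greedy nearest-neighbour association, which may differ from what the real tracker does; the positions and track names are invented for illustration. Two tracks follow real sheep, and one leftover track takes whatever detection remains each frame, so it jumps across the field when the spurious cluster moves from one structure to another.

```python
import math


def greedy_associate(tracks, detections):
    """Ungated greedy nearest-neighbour matching: {track_id: detection_index}."""
    pairs = sorted(
        (math.dist(pos, det), tid, j)
        for tid, pos in tracks.items()
        for j, det in enumerate(detections)
    )
    assigned, used = {}, set()
    for _, tid, j in pairs:
        if tid not in assigned and j not in used:
            assigned[tid] = j
            used.add(j)
    return assigned


# Hypothetical scene: two real sheep plus one spare track slot.
tracks = {"sheep_a": (0.0, 0.0), "sheep_b": (2.0, 0.0), "leftover": (10.0, 10.0)}
frames = [
    [(0.1, 0.0), (2.1, 0.0), (10.0, 9.5)],   # spurious cluster near a gate post
    [(0.2, 0.0), (2.2, 0.0), (-8.0, 6.0)],   # spurious cluster is now a wall fragment
]

leftover_jumps = []
for detections in frames:
    match = greedy_associate(tracks, detections)
    for tid, j in match.items():
        if tid == "leftover":
            leftover_jumps.append(math.dist(tracks[tid], detections[j]))
        tracks[tid] = detections[j]  # no motion model: snap to the detection
```

The sheep tracks move about 0.1 m per frame, while the leftover track moves 0.5 m in the first frame and more than 18 m in the second. A track like this is not spatially static, so a displacement-from-spawn test does not flag it.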
**Workaround for the demo:** running n=10 in Webots always pens 10/10; the n=5 cells produce identical kinematic behaviour and can be reported from the gym evaluation (success rate, time-to-pen) where the gym tracker doesn't accumulate phantoms.
## File map (what changed in this push)
```
herding/config.py                         mecanum presets keep matched
                                          strafe scaling (strafe_eff=0.26,
                                          bleed=-0.40) for kinematic injection
controllers/shepherd_dog/shepherd_dog.py  Supervisor() + drive_mecanum kinematic
                                          injection via _self_node.setVelocity
protos/ShepherdDogMecanum.proto           supervisor TRUE; physics tuning
protos/ShepherdDogMecanum360.proto        reverted (ODE no longer load-bearing)
tools/gen_mecanum_wheels.py               wheels regen-script (clean)
tools/run_webots.sh                       contact-properties comment cleaned
training/{bc/collect,rl/train}.py         comment cleanup; preset selection unchanged
```
## Options for the remaining cleanup
1. **Keep matched preset (0.26, -0.40)**. Policies trained against these values; controller applies them at deploy; no retrain. *Current state*.
2. **Switch preset to textbook (1.0, 0.0) and retrain mecanum BC+RL** (~6h). Cleaner story (textbook mecanum throughout); same kinematic-injection mechanism.
Either is defensible. (1) ships faster; (2) is more "pure".