TIR_PROJ/docs/status.md

# Status — 2026-05-18

Current snapshot of what works in Webots, and what design choices got us here.

## Results matrix (Webots, seed=42)

Differential drive — `bash tools/run_webots.sh N MODE differential WORLD`:

| controller     | field n=5 | field n=10 | field_round n=5 | field_round n=10 |
|----------------|:---------:|:----------:|:---------------:|:----------------:|
| BC             | 5/5       | 10/10      | 5/5             | 10/10            |
| RL             | 5/5       | 10/10      | 5/5             | 10/10            |
| Strömbom       | 5/5       | 10/10      | 5/5             | 10/10            |
| Sequential     | 5/5       | 10/10      | 5/5             | 10/10            |

Mecanum drive — `bash tools/run_webots.sh N MODE mecanum WORLD HERDING_LIDAR=360`:

| controller | field n=5 | field n=10 | field_round n=5 | field_round n=10 |
|------------|:---------:|:----------:|:---------------:|:----------------:|
| BC         | 0/5       | 10/10      | 0/5             | 10/10            |
| RL         | 0/5       | 10/10      | 0/5             | 10/10            |

Extra-merit:

- **360° LiDAR ablation**: `HERDING_LIDAR=360` works in all four differential-drive cells.
- **Dual-dog axis-split**: `HERDING_NDOGS=2 HERDING_AXIS_LEAK=0.3` pens 5/5 on differential drive.

## Architecture decisions and why

### Differential drive — full ODE simulation

Standard Webots physics with two wheel motors and a caster. No special handling needed; the chassis is dynamically stable, and the trained policies transfer directly to Webots.

### Mecanum drive — kinematic Supervisor injection

The mecanum proto uses physical 8-roller wheels for visual fidelity, but the chassis is moved by calling `setVelocity()` on the robot's own Supervisor node, with the velocity computed from the gym mecanum forward-kinematics formula (see `controllers/shepherd_dog/shepherd_dog.py::drive_mecanum`).

We explored two other paths before settling on the third:

1. **Plain cylinder wheels + anisotropic ContactProperties.** Tried `frictionRotation ±0.7854` on the wheel contact frame. Strafe motion came out the wrong direction and diagonals zeroed out. Discarded.
2. **Full ODE simulation on 32 physical roller hinges.** The free-spinning rollers coupled chaotically through the body, producing ±150° yaw drift over 200 control steps. Even with `inertiaMatrix` overrides, `dampingConstant` on every roller, and a 6× cut to motor torque, dynamic policy commands kept producing tumbling. Discarded.
3. **Kinematic Supervisor injection (current).** ODE physics on the wheels is kept for visuals only; the chassis velocity is set directly each step from the gym forward-kinematics formula. Gym training and Webots deployment produce identical body motion. Yaw drift is zero by construction.
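
A minimal sketch of what kinematic injection involves, assuming the textbook mecanum forward-kinematics formula and a z-up world frame. The function names, wheel ordering (front-left, front-right, rear-left, rear-right), and geometry parameters are illustrative and not taken from `drive_mecanum`; the project's preset scaling (`strafe_eff`, `bleed`) is deliberately left out.

```python
import math


def mecanum_body_twist(w_fl, w_fr, w_rl, w_rr, wheel_radius, half_length, half_width):
    """Textbook mecanum forward kinematics: wheel speeds (rad/s) -> body twist.

    Returns (vx, vy, wz) in the body frame: forward speed, leftward strafe
    speed, and counter-clockwise yaw rate.
    """
    r = wheel_radius
    k = half_length + half_width
    vx = r / 4.0 * (w_fl + w_fr + w_rl + w_rr)
    vy = r / 4.0 * (-w_fl + w_fr + w_rl - w_rr)
    wz = r / (4.0 * k) * (-w_fl + w_fr - w_rl + w_rr)
    return vx, vy, wz


def world_velocity(vx, vy, wz, yaw):
    """Rotate a planar body twist into the world frame (assumes z is up).

    The result is a 6-element list [vx, vy, vz, wx, wy, wz], the layout that
    Webots' Node.setVelocity() expects.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    return [c * vx - s * vy, s * vx + c * vy, 0.0, 0.0, 0.0, wz]


# All four wheels at the same speed: pure forward motion, no strafe, no yaw.
twist = mecanum_body_twist(1.0, 1.0, 1.0, 1.0,
                           wheel_radius=0.05, half_length=0.1, half_width=0.1)
velocity = world_velocity(*twist, yaw=math.pi / 2)
```

Inside a Webots controller, the resulting list would be passed to `Supervisor().getSelf().setVelocity(velocity)` once per control step. Because the yaw rate is written directly rather than emerging from roller contacts, the body cannot accumulate yaw drift beyond what the policy commands.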

This is not a hack: it matches how most academic mecanum sims work. Gazebo's mecanum plugins, for example, use kinematic models by default, and ODE's contact solver does not handle the rolling-without-slipping constraint cleanly for 32 free hinges.

### Why n=5 mecanum fails (and n=10 passes)

The 360° LiDAR produces up to 8 detections per frame at n=5: up to 5 from real sheep plus 1–3 "phantom" clusters from gate posts, wall fragments, and pen rails. The tracker's consensus filter promotes a candidate to "active" after `consensus_k=3` hits within 20 steps, and phantoms satisfy that easily because they're spatially consistent.
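
The promotion rule can be sketched as a sliding-window hit counter. This is a hypothetical reconstruction, not the tracker's code: the class name `ConsensusFilter`, the `window` parameter, and the exact windowing rule are assumptions; only `consensus_k=3` and the 20-step window come from the description above.

```python
from collections import deque


class ConsensusFilter:
    """Promote a candidate to active after k hits inside a sliding step window."""

    def __init__(self, consensus_k=3, window=20):
        self.consensus_k = consensus_k
        self.window = window
        self.hits = deque()   # step indices of recent detections
        self.active = False

    def update(self, step, detected):
        if detected:
            self.hits.append(step)
        # Forget hits that have fallen out of the window.
        while self.hits and step - self.hits[0] >= self.window:
            self.hits.popleft()
        # Promotion is sticky in this sketch: once active, always active.
        if len(self.hits) >= self.consensus_k:
            self.active = True
        return self.active


# A gate post seen every 5 steps reaches 3 hits within 20 steps and is promoted.
post = ConsensusFilter()
post_states = [post.update(step, step % 5 == 0) for step in range(11)]

# A sporadic cluster seen every 30 steps never holds 3 hits in one window.
sporadic = ConsensusFilter()
sporadic_states = [sporadic.update(step, step % 30 == 0) for step in range(100)]
```

A filter of this shape rejects sporadic noise but cannot reject a static structure that returns a cluster on most scans, which is consistent with raising `consensus_k` to 5 making no difference.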

With n=10 real sheep the 10 active slots fill with real sheep before phantoms compete. With n=5 there are ~5 free slots and the phantoms occupy them; the policy then chases ghosts (verified: with `HERDING_USE_GT=1` perception bypass, n=5 pens 5/5 in 76 s).

We tried four fixes; none unlocked n=5:

| attempt                                             | result                                          |
|-----------------------------------------------------|-------------------------------------------------|
| Tighten consensus to `consensus_k=5`                | no change; `tracks_active=10` in 70% of frames  |
| Tighten `wall_reject=0.9`, `static_reject=1.5`      | no change                                       |
| Static-phantom drop (track displacement from spawn) | phantoms are *not* spatially static — debug logs showed phantom tracks bouncing 4–22 m across the field as data association reassigned them each frame |
| Merge near-duplicate detections (≤0.5 m)            | phantoms aren't fragmentation either            |

The phantom tracks are caused by **data-association noise**: when the tracker has more slots than real sheep, the leftover tracks attach themselves to whatever cluster is closest each frame, even if that cluster has nothing to do with their original spawn position. The fix would need either parallax-aware tracking (require multi-vantage confirmation before promotion) or training with simulated phantom noise. Both are real surgery; out of scope for the 2026-06-11 deadline.
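
The failure mode can be reproduced in a few lines with an ungated greedy nearest-neighbour matcher. This is an illustrative sketch only: the function `greedy_associate` and the coordinates are made up for the example, and the real tracker's association step may differ.

```python
import math


def greedy_associate(tracks, detections):
    """Ungated greedy nearest-neighbour association.

    Pairs are taken in order of increasing distance; each track and each
    detection is used at most once. There is no distance gate, so a spare
    track accepts whatever detection is left, however far away it is.
    Returns {track_index: (detection_index, distance)}.
    """
    pairs = sorted(
        (math.dist(t, d), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(detections)
    )
    assigned, used = {}, set()
    for dist, ti, di in pairs:
        if ti in assigned or di in used:
            continue
        assigned[ti] = (di, dist)
        used.add(di)
    return assigned


# Two tracks on real sheep plus one spare track; the third detection is a
# gate-post cluster far from all of them.
tracks = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
detections = [(0.1, 0.0), (1.1, 0.0), (12.0, 5.0)]
matches = greedy_associate(tracks, detections)
# The spare track (index 2) ends up on the gate-post cluster, about 11 m away.
```

Repeating this every frame with a shifting set of leftover clusters would produce the multi-metre jumps seen in the debug logs, and it suggests why a spawn-displacement check does not catch these tracks.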

**Workaround for the demo:** run n=10 in Webots, which pens 10/10 in every cell above. Because kinematic injection gives the same body motion in gym and Webots, the n=5 mecanum cells can be reported from the gym evaluation (success rate, time-to-pen), where the gym tracker doesn't accumulate phantoms.

## File map (what changed in this push)

```
herding/config.py                      mecanum presets keep matched
                                       strafe scaling (strafe_eff=0.26,
                                       bleed=-0.40) for kinematic injection
controllers/shepherd_dog/shepherd_dog.py
                                       Supervisor() + drive_mecanum kinematic
                                       injection via _self_node.setVelocity
protos/ShepherdDogMecanum.proto        supervisor TRUE; physics tuning
protos/ShepherdDogMecanum360.proto     reverted (ODE no longer load-bearing)
tools/gen_mecanum_wheels.py            wheels regen-script (clean)
tools/run_webots.sh                    contact-properties comment cleaned
training/{bc/collect,rl/train}.py      comment cleanup; preset selection unchanged
```

## Options for the remaining cleanup

1. **Keep matched preset (0.26, -0.40)**. Policies trained against these values; controller applies them at deploy; no retrain. *Current state*.
2. **Switch preset to textbook (1.0, 0.0) and retrain mecanum BC+RL** (~6h). Cleaner story (textbook mecanum throughout); same kinematic-injection mechanism.

Either is defensible. (1) ships faster; (2) is more "pure".