45 Commits

Author SHA1 Message Date
Johnny Fernandes 7ab69ab0f3 Rename multi-segment functions to two-concept names; polish docstrings
Naming pass: rename functions whose third+ segment is redundant or
implementation-detail, sticking to the codebase's preferred
``noun_verb`` / ``verb_noun`` two-concept idiom. Renames are atomic
across definitions, callers, and tests.

  is_penned_position        →  is_penned
  modulate_speed_near_sheep →  modulate_speed
  mecanum_kinematics_step   →  mecanum_step
  policy_forward_mean       →  forward_mean

Two-concept patterns like ``velocity_to_wheels`` / ``detections_from_scan``
/ ``make_strombom_predictor`` are left alone — they're idiomatic
converters / factories that read as a single concept, and the longer
form aids grep-ability.

Docstring polish:
* ``herding/config.py`` header drops the "previously lived as a
  module-level literal" historical framing — we ship as a single
  thing, so the refactor anecdote no longer earns its keep. The
  usage examples now mention both ``HERDING_WEBOTS`` and
  ``HERDING_MEC_WEBOTS`` presets.

126 pytest cases still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 01:58:15 +00:00
Johnny Fernandes ee77c8606c Match gym mecanum kinematics to Webots roller-hinge proto
Mecanum proto rewrite in b3cf990 made the wheels truly omnidirectional
in Webots, but with asymmetric slip: forward command produces ~89% of
textbook speed while strafe produces only ~38% plus a consistent
~28% backward bleed-through. v1 BC/RL trained on perfect mecanum
gym kinematics could not herd the new dynamics. To unblock that:

* `mecanum_kinematics_step` gains two parameters that scale the
  realised motion to match a deployed-platform calibration:
    - strafe_efficiency  ∈ (0, 1]  default 1.0
    - strafe_to_forward_bleed     default 0.0
  Forward motion is untouched (textbook X-pattern continues to apply
  to vx_body); only the lateral channel is scaled and bleed is added.
* `RobotConfig` exposes both as drive-config fields with the same
  pass-through defaults so existing diff-drive code and existing
  mecanum training pipelines see no behaviour change.
* `HERDING_MEC_WEBOTS` preset bakes in the values measured against the
  current Webots mecanum proto (strafe_efficiency=0.4,
  strafe_to_forward_bleed=-0.28). Training mecanum BC/RL with this
  preset produces policies that compensate for the imperfect
  physical mecanum at deploy.
* `HerdingEnv` plumbs `RobotConfig.strafe_*` through to
  `mecanum_kinematics_step` so the preset takes effect.
* tools/gen_mecanum_wheels.py is added so the proto's 32 roller
  hinges can be regenerated by editing a single set of constants
  rather than hand-editing 1500+ lines of VRML.
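
A minimal sketch of the calibration described above, not the repository's actual `mecanum_kinematics_step`. The helper name `apply_strafe_calibration` is hypothetical, and two details are assumptions: the bleed is taken relative to the commanded strafe speed, and it is applied with `abs()` so that it points backward for either strafe direction when the parameter is negative.

```python
from typing import Tuple


def apply_strafe_calibration(
    vx_body: float,
    vy_body: float,
    strafe_efficiency: float = 1.0,
    strafe_to_forward_bleed: float = 0.0,
) -> Tuple[float, float]:
    """Scale commanded body-frame velocity to the realised velocity.

    Hypothetical helper: only the lateral channel is scaled, and a
    fraction of the commanded strafe speed is added to the forward
    channel. The defaults are a pure pass-through.
    """
    # Lateral motion is attenuated by the measured efficiency.
    vy_real = strafe_efficiency * vy_body
    # Assumption: bleed is proportional to |commanded strafe|, so a
    # negative value drifts the robot backward whichever way it strafes.
    vx_real = vx_body + strafe_to_forward_bleed * abs(vy_body)
    return vx_real, vy_real


# Values quoted for the HERDING_MEC_WEBOTS preset in this commit.
vx, vy = apply_strafe_calibration(0.0, 1.0, 0.4, -0.28)
```

With these values a unit strafe command yields 0.4 of the lateral speed and a 0.28 backward drift, while a pure forward command passes through unchanged.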

Tests:
* 4 new mecanum_kinematics_step tests (default pass-through, strafe
  scaling, backward bleed, forward unaffected by strafe params).
* 3 new RobotConfig tests (defaults, validation, preset shape).
* Sanity check: gym strafe with HERDING_MEC_WEBOTS over 100 steps
  reproduces the Webots calibration to 2 decimal places.

126 unit tests pass (was 120).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 01:09:47 +00:00
Johnny Fernandes dd5ac669e5 Webots sim-to-real fixes, DAgger pipeline, 360° proto variant
Today's session worked across the full Webots delivery stack — found and
fixed a cluster of bugs blocking the BC/RL transfer, then explored
training-side mitigations for the residual perception gap.

Bug fixes:
- Makefile FP_RATE default 2.0 → 0.0: BC demos used fp_rate=0 but RL
  fine-tune defaulted to fp_rate=2, poisoning the BC obs distribution
  and stalling PPO at 0% success across 1.46M+ steps.
- controllers/{shepherd_dog,sheep}/runtime.ini: Webots was launching
  controllers under system python3 (no numpy) and they were crashing
  silently. Pinned to the conda tir env.
- herding/config.py HERDING_WEBOTS preset: pen_latch_depth 0.5 → 2.0,
  max_new_tracks_per_step 3 → 1, static_reject 0.8 → 1.2. Stops phantom
  FPs near the gate from latching as permanently-penned tracks.
- herding/perception/sheep_tracker.py: penned tracks now decay at
  forget_steps × 8 instead of living forever. Adds get_positions
  min_freshness filter for deploy-time use.
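
The penned-track decay from the last fix can be sketched as below. This is an illustration under stated assumptions, not the code in `sheep_tracker.py`: the `Track` dataclass, `prune_tracks`, and the `max_age_steps` argument are hypothetical names, and `max_age_steps` only stands in for the real `min_freshness` filter, whose exact semantics the message does not give.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# From the commit message: penned tracks decay at forget_steps x 8.
PENNED_FORGET_MULTIPLIER = 8


@dataclass
class Track:
    """Hypothetical track record used only for this sketch."""
    position: Tuple[float, float]
    steps_since_seen: int = 0
    penned: bool = False


def prune_tracks(tracks: List[Track], forget_steps: int) -> List[Track]:
    """Drop stale tracks; penned tracks get an 8x longer lifetime."""
    kept = []
    for track in tracks:
        limit = forget_steps
        if track.penned:
            limit = forget_steps * PENNED_FORGET_MULTIPLIER
        if track.steps_since_seen <= limit:
            kept.append(track)
    return kept


def get_positions(
    tracks: List[Track], max_age_steps: Optional[int] = None
) -> List[Tuple[float, float]]:
    """Return track positions, optionally only recently seen ones."""
    return [
        t.position
        for t in tracks
        if max_age_steps is None or t.steps_since_seen <= max_age_steps
    ]
```

The point of the longer lifetime is that a penned sheep is rarely re-observed, so it should persist longer than a free one, but a phantom that latched as penned still expires eventually.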

Training/eval matches deployment:
- training/bc/collect.py: --dagger-policy flag for DAgger rollouts
  (policy drives, teacher labels) + --use-webots-preset for matched
  140° tracker + DR config.
- controllers/shepherd_dog/shepherd_dog.py: scan-fallback (0, 0.6) when
  BC/RL sees empty sheep_positions — recovers from FOV gaps.
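
A minimal sketch of the DAgger rollout pattern named above (policy drives, teacher labels). The function `collect_dagger_rollout` and its callable arguments are hypothetical and do not reflect the interface of `training/bc/collect.py`. The environment is abstracted as plain `reset` / `step` callables.

```python
from typing import Callable, List, Tuple

Obs = Tuple[float, ...]
Action = Tuple[float, ...]


def collect_dagger_rollout(
    reset: Callable[[], Obs],
    step: Callable[[Action], Tuple[Obs, bool]],
    policy: Callable[[Obs], Action],
    teacher: Callable[[Obs], Action],
    max_steps: int,
) -> List[Tuple[Obs, Action]]:
    """Collect (observation, teacher action) pairs along the learner's path."""
    dataset: List[Tuple[Obs, Action]] = []
    obs = reset()
    for _ in range(max_steps):
        # The teacher labels the state the learner actually reached.
        dataset.append((obs, teacher(obs)))
        # The learner's own action drives the environment forward.
        obs, done = step(policy(obs))
        if done:
            break
    return dataset
```

The collected pairs would then be concatenated with the earlier demonstrations before the next behaviour-cloning fit, which is the collect + concat + bc sequence a DAgger round follows.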

Tooling:
- tools/dagger_round.sh: one-shot DAgger round (collect + concat + bc).
- tools/webots_sweep_gt.sh: full sweep with HERDING_USE_GT=1 for the
  perception-gap diagnosis matrix.
- protos/ShepherdDog360.proto: 360° FOV variant for the FOV-ablation
  comparison. Canonical proto stays at 140° per project spec.

Artifacts: v1 BC/RL policies for all 4 (drive × world) combos trained
in clean gym (success: diff/field 90-100%, diff/round 58%, mec/field
60-100%, mec/round 50-100%). DAgger r1/r2 BCs for diff/field show a
12%→38% progression on the gym HERDING_WEBOTS proxy but did not close
the gap to actual Webots LiDAR (0/5 throughout). Next: LSTM policy or
learned tracker per the project-state memory.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 17:21:02 +00:00
Johnny Fernandes 5c2ee4bba5 Checkpoint 8 2026-05-12 22:41:03 +01:00
Johnny Fernandes a01a5c9cef Checkpoint 7 2026-05-11 12:21:51 +01:00
Johnny Fernandes fce0e0c786 Checkpoint 6 2026-05-11 10:35:48 +01:00
Johnny Fernandes b457155538 Checkpoint 5 - incomplete 2026-05-11 10:35:39 +01:00
Johnny Fernandes 6688325d89 Checkpoint 4 2026-05-11 00:42:52 +01:00
Johnny Fernandes 1bb9415414 Checkpoint 2 2026-05-07 22:00:10 +01:00
Johnny Fernandes a2363d882f Trying attention method 2026-04-26 22:28:43 +01:00
Johnny Fernandes 57b1735e1a Mimics webots approach better + debug. Lucky number 2026-04-26 20:36:36 +01:00
Johnny Fernandes deeae3193e Mimics webots approach better + debug. Lucky number 2026-04-26 18:55:53 +01:00
Johnny Fernandes 1af7d03ce2 Mimic webots physics 2026-04-26 18:22:26 +01:00
Johnny Fernandes e2883212c5 Approach v3 w/ south penalty fix 2026-04-26 15:26:24 +01:00
Johnny Fernandes 11e13c6980 Approach v3 w/ south penalty 2026-04-26 14:55:13 +01:00
Johnny Fernandes 3cfd6b5e81 Approach refinement 2026-04-26 02:55:14 +01:00
Johnny Fernandes 287743709a Approach refinement 2026-04-26 02:02:25 +01:00
Johnny Fernandes 61f8a7db15 Cleanup and new approach 2026-04-26 01:50:01 +01:00
Johnny Fernandes b031473758 Behaviour refinement - fence penalty 2026-04-26 01:09:50 +01:00
Johnny Fernandes 6253850620 Behaviour refinement - fence penalty 2026-04-25 23:42:02 +01:00
Johnny Fernandes 7b87908410 Behaviour refinement 2026-04-25 21:35:23 +01:00
Johnny Fernandes 16878c5a0b Sheep training flock _ improver 2026-04-25 18:02:56 +01:00
Johnny Fernandes 438fa1be1d Sheep training flock _ improver 2026-04-25 13:24:52 +01:00
Johnny Fernandes f889dc78cc Sheep training flock _ improver 2026-04-25 12:50:06 +01:00
Johnny Fernandes 02b20fbdb4 Sheep training flock _ improver 2026-04-25 12:20:42 +01:00
Johnny Fernandes fbe76a0d04 Sheep training flock _ improver 2026-04-25 11:31:39 +01:00
Johnny Fernandes 7d5725cc3e Sheep training flock _ improver 2026-04-25 00:18:01 +01:00
Johnny Fernandes b77f36b713 Sheep training flock _ improver 2026-04-24 23:38:09 +01:00
Johnny Fernandes b3251fcca3 Sheep training flock _ improver 2026-04-24 22:46:51 +01:00
Johnny Fernandes d599181d22 Sheep training flock _ improver 2026-04-24 21:29:44 +01:00
Johnny Fernandes bf9fe902d9 Sheep training flock of 10 fix? 2026-04-24 17:49:42 +01:00
Johnny Fernandes 4d7f365358 Sheep training flock of 10 fix? 2026-04-24 17:31:11 +01:00
Johnny Fernandes 3574d57ba2 Sheep training flock of 10 fix? 2026-04-24 16:30:35 +01:00
Johnny Fernandes 58d773cb7c Sheep training flock of 10 fix? 2026-04-24 16:12:16 +01:00
Johnny Fernandes fe5174e0bd Sheep training flock of 10 fix? 2026-04-24 15:55:15 +01:00
Johnny Fernandes bdbe8ba1de Sheep training flock of 10 fix? 2026-04-24 15:10:36 +01:00
Johnny Fernandes fcfa2c35c8 Sheep training flock of 10 fix? 2026-04-24 14:54:20 +01:00
Johnny Fernandes 17eb25864e Sheep training flock of 10 fix? 2026-04-24 10:58:36 +01:00
Johnny Fernandes 4189cc8dba Sheep training flock of 10 fix? 2026-04-24 01:59:15 +01:00
Johnny Fernandes f68dea44da Sheep training flock of 10 fix? 2026-04-23 23:20:23 +01:00
Johnny Fernandes a13f5d0ff0 Sheep training flock of 10 fix? 2026-04-23 20:41:48 +01:00
Johnny Fernandes 81dc2aca01 Sheep training flock of 10 2026-04-23 19:22:39 +01:00
Johnny Fernandes ffbfaa3977 A more classical approach 2026-04-23 11:51:52 +01:00
Johnny Fernandes f9c5093211 Dog rewarding adjustment 2026-04-23 11:35:15 +01:00
Johnny Fernandes 00eaf47d1f RL training ready to test 2026-04-22 23:34:58 +01:00