Mecanum Webots via Supervisor kinematic injection

Replace the failing ODE-rolled mecanum chassis dynamics with a Supervisor.setVelocity call that applies the gym mecanum forward-kinematics formula directly. The wheel motors still spin, but only for visuals; chassis motion comes from the gym model, so training and deployment match by construction.

Results (seed=42, n=10 sheep): BC + RL mecanum pen 10/10 in both field and field_round. The n=5 mecanum cells are still 0/5 because of tracker phantoms anchored to wall corners under the 360° LiDAR; this is documented in docs/status.md as the remaining gap.

Cleanup: drop the deploy-time hacks (HERDING_HEADING_*, HERDING_OMEGA_CLAMP, HERDING_TRACKER_*) that were workarounds for the old ODE chaos; revert the proto inertiaMatrix, roller dampingConstant, and reduced motor torque, since they no longer carry load; refresh the comments around the mecanum config presets.
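
For readers unfamiliar with the technique, the kinematic injection described above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: it uses the textbook X-pattern mecanum forward kinematics, and the function names, wheel ordering (FL, FR, RL, RR), axis convention (x forward, y left, Z-up world) and geometry parameters are all assumptions. The repo's strafe_efficiency and strafe_to_forward_bleed scaling is left out because its exact definition is not shown in this diff.

```python
import math


def mecanum_forward_kinematics(w_fl, w_fr, w_rl, w_rr,
                               wheel_radius, half_length, half_width):
    """Body-frame chassis twist (vx, vy, omega) from wheel speeds in rad/s.

    Textbook X-pattern formula; x is forward, y is left, omega is CCW.
    All parameter names here are hypothetical.
    """
    k = half_length + half_width
    vx = wheel_radius * (w_fl + w_fr + w_rl + w_rr) / 4.0
    vy = wheel_radius * (-w_fl + w_fr + w_rl - w_rr) / 4.0
    omega = wheel_radius * (-w_fl + w_fr - w_rl + w_rr) / (4.0 * k)
    return vx, vy, omega


def body_to_world_twist(vx, vy, omega, heading):
    """Rotate a planar body-frame twist into a 6-element world-frame
    velocity [vx, vy, vz, wx, wy, wz], assuming a Z-up world."""
    c, s = math.cos(heading), math.sin(heading)
    return [c * vx - s * vy, s * vx + c * vy, 0.0, 0.0, 0.0, omega]


# In a Webots Supervisor controller, the resulting list would be handed to
# the robot node each step, e.g. robot_node.setVelocity(twist), instead of
# letting ODE integrate the roller contacts. That call is omitted here so
# the sketch runs without Webots.
if __name__ == "__main__":
    twist_body = mecanum_forward_kinematics(
        10.0, 10.0, 10.0, 10.0,
        wheel_radius=0.05, half_length=0.15, half_width=0.10)
    print(body_to_world_twist(*twist_body, heading=math.pi / 2))
```

Because the same formula drives both the gym step and the simulated chassis, any mismatch between the two has to come from sensing or timing rather than from contact dynamics, which is the point of the change.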
2026-05-18 22:46:37 +00:00
parent 1df84ae4b5
commit 27c0f65722
25 changed files with 2635 additions and 76 deletions
@@ -276,37 +276,53 @@ def main() -> None:
    print(f"[rl] drive_mode={drive_mode} (BC action_dim={bc_action_dim})")

    from herding.config import (
-        HerdingConfig, HERDING_MEC_WEBOTS, DomainRandomConfig, RobotConfig,
+        HerdingConfig, HERDING_MEC_WEBOTS_360, DomainRandomConfig, RobotConfig,
    )
    herding_cfg = None
-    # When fine-tuning a mecanum policy we always apply the Webots
-    # roller-hinge calibration to the gym kinematics (strafe efficiency
-    # and bleed). Without this, the RL agent updates against the
-    # textbook X-pattern and fails on deployment.
+    # Mecanum always trains under HERDING_MEC_WEBOTS_360 (360° LiDAR +
+    # kinematic-matched strafe scaling + small compass-noise DR).
    is_mecanum = (drive_mode == "mecanum")
    if is_mecanum or args.fp_rate > 0.0 or args.action_smooth > 0.0 or args.wheel_slip_std > 0.0:
        if is_mecanum:
-            base_robot = HERDING_MEC_WEBOTS.robot
-            strafe_eff = base_robot.strafe_efficiency
-            strafe_bleed = base_robot.strafe_to_forward_bleed
+            base = HERDING_MEC_WEBOTS_360
+            strafe_eff = base.robot.strafe_efficiency
+            strafe_bleed = base.robot.strafe_to_forward_bleed
+            compass_std = 0.1   # heading robustness DR
        else:
+            base = None
            strafe_eff = 1.0
            strafe_bleed = 0.0
-        herding_cfg = HerdingConfig(
-            domain_random=DomainRandomConfig(
-                fp_rate=args.fp_rate,
-                wheel_slip_std=args.wheel_slip_std,
-            ),
-            robot=RobotConfig(
-                action_smooth=args.action_smooth,
-                strafe_efficiency=strafe_eff,
-                strafe_to_forward_bleed=strafe_bleed,
-            ),
-        )
+            compass_std = 0.0
+        if is_mecanum:
+            herding_cfg = base.replace(
+                domain_random=DomainRandomConfig(
+                    fp_rate=args.fp_rate,
+                    wheel_slip_std=args.wheel_slip_std,
+                    compass_noise_std=compass_std,
+                ),
+                robot=RobotConfig(
+                    action_smooth=args.action_smooth,
+                    strafe_efficiency=strafe_eff,
+                    strafe_to_forward_bleed=strafe_bleed,
+                ),
+            )
+        else:
+            herding_cfg = HerdingConfig(
+                domain_random=DomainRandomConfig(
+                    fp_rate=args.fp_rate,
+                    wheel_slip_std=args.wheel_slip_std,
+                ),
+                robot=RobotConfig(
+                    action_smooth=args.action_smooth,
+                    strafe_efficiency=strafe_eff,
+                    strafe_to_forward_bleed=strafe_bleed,
+                ),
+            )
        print(f"[rl] domain-random: fp_rate={args.fp_rate}  "
              f"action_smooth={args.action_smooth}  "
              f"wheel_slip_std={args.wheel_slip_std}  "
-              f"strafe_eff={strafe_eff:.2f}  strafe_bleed={strafe_bleed:.2f}")
+              f"strafe_eff={strafe_eff:.2f}  strafe_bleed={strafe_bleed:.2f}  "
+              f"compass_noise={compass_std}")

    env_fns = [_make_env(i, args.seed, frame_stack, drive_mode,
                         difficulty=args.difficulty,