Mecanum Webots via Supervisor kinematic injection

Replace the failing ODE-simulated mecanum chassis dynamics with a
Supervisor.setVelocity call driven directly by the gym mecanum
forward-kinematics formula. Wheel motors still spin, but only for
visuals; chassis motion comes from the gym model, so training and
deployment match by construction.
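
As a rough illustration of the idea above (not the code in this commit), the
sketch below converts four wheel speeds into a planar body twist with the
textbook X-pattern mecanum forward kinematics, then rotates it into the world
frame as a 6-element velocity vector. The function names, the geometry
parameters, and applying strafe_efficiency as a plain scale on the lateral
term are all assumptions made for illustration; the gym model in this repo may
differ.

```python
import math


def mecanum_body_velocity(w_fl, w_fr, w_rl, w_rr,
                          wheel_radius, half_length, half_width,
                          strafe_efficiency=1.0):
    """Textbook X-pattern forward kinematics (x forward, y left, omega CCW).

    Wheel speeds are in rad/s. `strafe_efficiency` is applied here as a
    simple scale on the lateral velocity, which is an assumption about how
    such a calibration factor might be used.
    """
    vx = wheel_radius * (w_fl + w_fr + w_rl + w_rr) / 4.0
    vy = strafe_efficiency * wheel_radius * (-w_fl + w_fr + w_rl - w_rr) / 4.0
    omega = wheel_radius * (-w_fl + w_fr - w_rl + w_rr) / (
        4.0 * (half_length + half_width))
    return vx, vy, omega


def world_velocity(vx, vy, omega, yaw):
    """Rotate a planar body twist into the world frame.

    Returns [vx, vy, vz, wx, wy, wz], assuming a z-up world and flat ground
    (vz and the roll/pitch rates are forced to zero).
    """
    c, s = math.cos(yaw), math.sin(yaw)
    return [c * vx - s * vy, s * vx + c * vy, 0.0, 0.0, 0.0, omega]


# Hypothetical use inside a Webots Supervisor controller loop:
#   v6 = world_velocity(*mecanum_body_velocity(...), yaw)
#   robot_node.setVelocity(v6)   # robot_node obtained from the Supervisor
```

Because the same formula produces the chassis motion in both the gym
environment and the simulator, any calibration factor only has to be kept
consistent in one place.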

Results (seed=42, n=10 sheep): BC + RL mecanum pens 10/10 in both
field and field_round. The n=5 mecanum cells still pen 0/5 because of
tracker phantoms anchored to wall corners under the 360° LiDAR; this
is documented in docs/status.md as the remaining gap.

Cleanup: drop deploy-time hacks (HERDING_HEADING_*, HERDING_OMEGA_CLAMP,
HERDING_TRACKER_*) that were workarounds for the old ODE chaos;
revert the proto inertiaMatrix, roller dampingConstant, and reduced
motor torque since they no longer carry load; refresh comments
around the mecanum config presets.
Johnny Fernandes
2026-05-18 22:46:37 +00:00
parent 1df84ae4b5
commit 27c0f65722
25 changed files with 2635 additions and 76 deletions
+36 -20
@@ -276,37 +276,53 @@ def main() -> None:
print(f"[rl] drive_mode={drive_mode} (BC action_dim={bc_action_dim})")
from herding.config import (
HerdingConfig, HERDING_MEC_WEBOTS, DomainRandomConfig, RobotConfig,
HerdingConfig, HERDING_MEC_WEBOTS_360, DomainRandomConfig, RobotConfig,
)
herding_cfg = None
# When fine-tuning a mecanum policy we always apply the Webots
# roller-hinge calibration to the gym kinematics (strafe efficiency
# and bleed). Without this, the RL agent updates against the
# textbook X-pattern and fails on deployment.
# Mecanum always trains under HERDING_MEC_WEBOTS_360 (360° LiDAR +
# kinematic-matched strafe scaling + small compass-noise DR).
is_mecanum = (drive_mode == "mecanum")
if is_mecanum or args.fp_rate > 0.0 or args.action_smooth > 0.0 or args.wheel_slip_std > 0.0:
if is_mecanum:
base_robot = HERDING_MEC_WEBOTS.robot
strafe_eff = base_robot.strafe_efficiency
strafe_bleed = base_robot.strafe_to_forward_bleed
base = HERDING_MEC_WEBOTS_360
strafe_eff = base.robot.strafe_efficiency
strafe_bleed = base.robot.strafe_to_forward_bleed
compass_std = 0.1 # heading robustness DR
else:
base = None
strafe_eff = 1.0
strafe_bleed = 0.0
herding_cfg = HerdingConfig(
domain_random=DomainRandomConfig(
fp_rate=args.fp_rate,
wheel_slip_std=args.wheel_slip_std,
),
robot=RobotConfig(
action_smooth=args.action_smooth,
strafe_efficiency=strafe_eff,
strafe_to_forward_bleed=strafe_bleed,
),
)
compass_std = 0.0
if is_mecanum:
herding_cfg = base.replace(
domain_random=DomainRandomConfig(
fp_rate=args.fp_rate,
wheel_slip_std=args.wheel_slip_std,
compass_noise_std=compass_std,
),
robot=RobotConfig(
action_smooth=args.action_smooth,
strafe_efficiency=strafe_eff,
strafe_to_forward_bleed=strafe_bleed,
),
)
else:
herding_cfg = HerdingConfig(
domain_random=DomainRandomConfig(
fp_rate=args.fp_rate,
wheel_slip_std=args.wheel_slip_std,
),
robot=RobotConfig(
action_smooth=args.action_smooth,
strafe_efficiency=strafe_eff,
strafe_to_forward_bleed=strafe_bleed,
),
)
print(f"[rl] domain-random: fp_rate={args.fp_rate} "
f"action_smooth={args.action_smooth} "
f"wheel_slip_std={args.wheel_slip_std} "
f"strafe_eff={strafe_eff:.2f} strafe_bleed={strafe_bleed:.2f}")
f"strafe_eff={strafe_eff:.2f} strafe_bleed={strafe_bleed:.2f} "
f"compass_noise={compass_std}")
env_fns = [_make_env(i, args.seed, frame_stack, drive_mode,
difficulty=args.difficulty,