Mecanum Webots via Supervisor kinematic injection
Replace the failing ODE-rolled mecanum chassis dynamics with a Supervisor.setVelocity call that applies the gym mecanum forward-kinematics formula directly. The wheel motors still spin, but only for visuals; chassis motion comes from the gym model, so training and deployment match by construction.

Results (seed=42, n=10 sheep): BC + RL mecanum pen 10/10 in both field and field_round. The n=5 mecanum cells are still 0/5 because of tracker phantoms anchored to wall corners under the 360° LiDAR; this is documented in docs/status.md as the remaining gap.

Cleanup:
- Drop deploy-time hacks (HERDING_HEADING_*, HERDING_OMEGA_CLAMP, HERDING_TRACKER_*) that were workarounds for the old ODE chaos.
- Revert the proto inertiaMatrix, roller dampingConstant, and reduced motor torque, since they no longer carry load.
- Refresh comments around the mecanum config presets.
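For readers unfamiliar with the approach, the kinematic injection described above can be sketched roughly as follows. This is not the repository's code: it uses the textbook X-pattern mecanum forward kinematics, and the way `strafe_efficiency` and `strafe_to_forward_bleed` enter the formula is an assumption made for illustration, as are the function names, the geometry parameters, and the z-up world frame. The resulting 6-vector is the shape a Webots Supervisor node velocity setter expects (linear then angular velocity).

```python
import math


def mecanum_body_velocity(w_fl, w_fr, w_rl, w_rr, wheel_radius,
                          half_length, half_width,
                          strafe_efficiency=1.0, strafe_to_forward_bleed=0.0):
    """Map wheel angular speeds (rad/s) to body-frame (vx, vy, omega).

    Convention: x forward, y left, omega counter-clockwise.
    The two calibration terms are applied in an ASSUMED way:
    lateral speed is scaled, then a fraction of it leaks into vx.
    """
    r = wheel_radius
    vx = r / 4.0 * (w_fl + w_fr + w_rl + w_rr)
    vy = r / 4.0 * (-w_fl + w_fr + w_rl - w_rr)
    omega = r / (4.0 * (half_length + half_width)) * (-w_fl + w_fr - w_rl + w_rr)
    vy *= strafe_efficiency                # assumed: rollers lose lateral speed
    vx += strafe_to_forward_bleed * vy     # assumed: strafe bleeds into forward
    return vx, vy, omega


def world_velocity_vector(vx, vy, omega, heading):
    """Rotate body-frame velocity into a z-up world frame.

    Returns [vx, vy, vz, wx, wy, wz], which could then be handed to the
    Supervisor's node velocity setter each control step.
    """
    c, s = math.cos(heading), math.sin(heading)
    return [c * vx - s * vy, s * vx + c * vy, 0.0, 0.0, 0.0, omega]


# Pure strafe command with hypothetical calibration values.
vx, vy, omega = mecanum_body_velocity(
    -10.0, 10.0, 10.0, -10.0, wheel_radius=0.05,
    half_length=0.1, half_width=0.1,
    strafe_efficiency=0.8, strafe_to_forward_bleed=0.1)
velocity = world_velocity_vector(vx, vy, omega, heading=0.0)
```

Because the same function would drive both the gym step and the simulator chassis, any calibration error shows up identically in training and deployment, which is the "match by construction" property the commit message relies on.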
+36
-20
@@ -276,37 +276,53 @@ def main() -> None:
     print(f"[rl] drive_mode={drive_mode} (BC action_dim={bc_action_dim})")

     from herding.config import (
-        HerdingConfig, HERDING_MEC_WEBOTS, DomainRandomConfig, RobotConfig,
+        HerdingConfig, HERDING_MEC_WEBOTS_360, DomainRandomConfig, RobotConfig,
     )
     herding_cfg = None
-    # When fine-tuning a mecanum policy we always apply the Webots
-    # roller-hinge calibration to the gym kinematics (strafe efficiency
-    # and bleed). Without this, the RL agent updates against the
-    # textbook X-pattern and fails on deployment.
+    # Mecanum always trains under HERDING_MEC_WEBOTS_360 (360° LiDAR +
+    # kinematic-matched strafe scaling + small compass-noise DR).
     is_mecanum = (drive_mode == "mecanum")
     if is_mecanum or args.fp_rate > 0.0 or args.action_smooth > 0.0 or args.wheel_slip_std > 0.0:
         if is_mecanum:
-            base_robot = HERDING_MEC_WEBOTS.robot
-            strafe_eff = base_robot.strafe_efficiency
-            strafe_bleed = base_robot.strafe_to_forward_bleed
+            base = HERDING_MEC_WEBOTS_360
+            strafe_eff = base.robot.strafe_efficiency
+            strafe_bleed = base.robot.strafe_to_forward_bleed
+            compass_std = 0.1  # heading robustness DR
         else:
+            base = None
             strafe_eff = 1.0
             strafe_bleed = 0.0
-        herding_cfg = HerdingConfig(
-            domain_random=DomainRandomConfig(
-                fp_rate=args.fp_rate,
-                wheel_slip_std=args.wheel_slip_std,
-            ),
-            robot=RobotConfig(
-                action_smooth=args.action_smooth,
-                strafe_efficiency=strafe_eff,
-                strafe_to_forward_bleed=strafe_bleed,
-            ),
-        )
+            compass_std = 0.0
+        if is_mecanum:
+            herding_cfg = base.replace(
+                domain_random=DomainRandomConfig(
+                    fp_rate=args.fp_rate,
+                    wheel_slip_std=args.wheel_slip_std,
+                    compass_noise_std=compass_std,
+                ),
+                robot=RobotConfig(
+                    action_smooth=args.action_smooth,
+                    strafe_efficiency=strafe_eff,
+                    strafe_to_forward_bleed=strafe_bleed,
+                ),
+            )
+        else:
+            herding_cfg = HerdingConfig(
+                domain_random=DomainRandomConfig(
+                    fp_rate=args.fp_rate,
+                    wheel_slip_std=args.wheel_slip_std,
+                ),
+                robot=RobotConfig(
+                    action_smooth=args.action_smooth,
+                    strafe_efficiency=strafe_eff,
+                    strafe_to_forward_bleed=strafe_bleed,
+                ),
+            )
         print(f"[rl] domain-random: fp_rate={args.fp_rate} "
               f"action_smooth={args.action_smooth} "
               f"wheel_slip_std={args.wheel_slip_std} "
-              f"strafe_eff={strafe_eff:.2f} strafe_bleed={strafe_bleed:.2f}")
+              f"strafe_eff={strafe_eff:.2f} strafe_bleed={strafe_bleed:.2f} "
+              f"compass_noise={compass_std}")

     env_fns = [_make_env(i, args.seed, frame_stack, drive_mode,
                          difficulty=args.difficulty,
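One point worth checking when reviewing the hunk above: the mecanum branch passes a freshly constructed `RobotConfig(...)` to `base.replace(...)`. If the preset's robot section carries other calibrated fields, a fresh instance would reset them to their defaults. Whether that matters depends on how `HerdingConfig` is defined, which this diff does not show. The sketch below illustrates one way to override only selected fields, assuming the configs are frozen dataclasses. The class bodies, the `lidar_fov_deg` field, the preset values, and the `with_overrides` helper are all hypothetical stand-ins, not the repository's definitions.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class RobotConfig:  # hypothetical stand-in for herding.config.RobotConfig
    action_smooth: float = 0.0
    strafe_efficiency: float = 1.0
    strafe_to_forward_bleed: float = 0.0
    lidar_fov_deg: float = 180.0  # hypothetical extra preset field


@dataclass(frozen=True)
class DomainRandomConfig:  # hypothetical stand-in
    fp_rate: float = 0.0
    wheel_slip_std: float = 0.0
    compass_noise_std: float = 0.0


@dataclass(frozen=True)
class HerdingConfig:  # hypothetical stand-in
    robot: RobotConfig = field(default_factory=RobotConfig)
    domain_random: DomainRandomConfig = field(default_factory=DomainRandomConfig)


# Placeholder numbers only; the real preset values are not in this diff.
PRESET_360 = HerdingConfig(
    robot=RobotConfig(strafe_efficiency=0.7, strafe_to_forward_bleed=0.05,
                      lidar_fov_deg=360.0))


def with_overrides(base, *, action_smooth, fp_rate, wheel_slip_std,
                   compass_noise_std):
    """Return a copy of `base` with only the listed fields changed."""
    return replace(
        base,
        # Nested replace keeps every robot field that is not overridden.
        robot=replace(base.robot, action_smooth=action_smooth),
        domain_random=replace(base.domain_random, fp_rate=fp_rate,
                              wheel_slip_std=wheel_slip_std,
                              compass_noise_std=compass_noise_std),
    )


cfg = with_overrides(PRESET_360, action_smooth=0.3, fp_rate=0.05,
                     wheel_slip_std=0.02, compass_noise_std=0.1)
```

With this pattern the strafe calibration and any other preset fields flow through untouched, so the explicit `strafe_eff` / `strafe_bleed` copies would only be needed for logging.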