Training pipelines auto-select mecanum-Webots preset
* training/bc/collect.py: --use-webots-preset now picks the
  drive-matched variant. Mecanum drives get HERDING_MEC_WEBOTS
  (with the Webots-calibrated strafe efficiency and bleed) so the
  collected demos reflect the imperfect physical mecanum the
  deployed policy will see. Differential drives still use
  HERDING_WEBOTS (no behaviour change there).
* training/rl/train.py: mecanum fine-tune now *unconditionally*
  applies the HERDING_MEC_WEBOTS robot config to the PPO env (the
  policy must update against the same imperfect kinematics it
  deploys on). Differential-drive fine-tune is unchanged.
To retrain a mecanum policy end-to-end against the new proto:

    python -m training.bc.collect --drive-mode mecanum --world field \
        --use-webots-preset \
        --out training/bc/demos_mecanum_field_v2.npz
    python -m training.bc.pretrain --demos training/bc/demos_mecanum_field_v2.npz \
        --out training/runs/bc_mecanum_field_v2 ...
    python -m training.rl.train --bc training/runs/bc_mecanum_field_v2 \
        --out training/runs/rl_mecanum_field_v2 \
        --drive-mode mecanum --world field --use-webots-preset
The same flow applies to field_round / mecanum/round.
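
Since the retrain flow is the same three commands with only the world changing, it could be generated rather than retyped. The following is a minimal sketch, not part of this commit: `build_retrain_commands` and its `tag` parameter are hypothetical, and the output paths assume the `demos_mecanum_<world>_<tag>` naming pattern seen in the commands for `field`. The pretrain flags elided with `...` in the message are deliberately not guessed.

```python
import shlex


def build_retrain_commands(world: str, tag: str) -> list[list[str]]:
    """Return argv lists for collect, pretrain and train (hypothetical helper)."""
    # Path pattern is an assumption extrapolated from the `field` example.
    demos = f"training/bc/demos_mecanum_{world}_{tag}.npz"
    bc_run = f"training/runs/bc_mecanum_{world}_{tag}"
    rl_run = f"training/runs/rl_mecanum_{world}_{tag}"

    collect = [
        "python", "-m", "training.bc.collect",
        "--drive-mode", "mecanum", "--world", world,
        "--use-webots-preset",
        "--out", demos,
    ]
    # The commit message elides further pretrain flags; add them here as needed.
    pretrain = [
        "python", "-m", "training.bc.pretrain",
        "--demos", demos,
        "--out", bc_run,
    ]
    train = [
        "python", "-m", "training.rl.train",
        "--bc", bc_run,
        "--out", rl_run,
        "--drive-mode", "mecanum", "--world", world,
        "--use-webots-preset",
    ]
    return [collect, pretrain, train]


# Print the commands for inspection instead of executing them.
for argv in build_retrain_commands("field", "v2"):
    print(shlex.join(argv))
```

Building argv lists rather than shell strings keeps each step usable with `subprocess.run(argv, check=True)` once the commands have been reviewed.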
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
+19
-5
@@ -176,17 +176,31 @@ def main():
             print(f"[demos] WARNING: --world={args.world} but geometry is "
                   f"'{FIELD_SHAPE}'. This should not happen — file a bug.")

-    from herding.config import HerdingConfig, HERDING_WEBOTS, DomainRandomConfig, RobotConfig
+    from herding.config import (
+        HerdingConfig, HERDING_WEBOTS, HERDING_MEC_WEBOTS,
+        DomainRandomConfig, RobotConfig,
+    )
     if args.use_webots_preset:
-        herding_cfg = HERDING_WEBOTS.replace(
+        # Pick the drive-matched Webots preset — for mecanum we use the
+        # variant that simulates the physical-roller proto's strafe
+        # efficiency and forward bleed so the policy trains under the
+        # same imperfect kinematics it sees at deployment.
+        base = HERDING_MEC_WEBOTS if args.drive_mode == "mecanum" else HERDING_WEBOTS
+        herding_cfg = base.replace(
             domain_random=DomainRandomConfig(
                 fp_rate=args.fp_rate,
                 wheel_slip_std=args.wheel_slip_std,
             ),
-            robot=RobotConfig(action_smooth=args.action_smooth),
+            robot=RobotConfig(
+                action_smooth=args.action_smooth,
+                strafe_efficiency=base.robot.strafe_efficiency,
+                strafe_to_forward_bleed=base.robot.strafe_to_forward_bleed,
+            ),
         )
-        print(f"[demos] HERDING_WEBOTS preset + DR: fp_rate={args.fp_rate} "
-              f"action_smooth={args.action_smooth} wheel_slip_std={args.wheel_slip_std}")
+        preset_name = "HERDING_MEC_WEBOTS" if args.drive_mode == "mecanum" else "HERDING_WEBOTS"
+        print(f"[demos] {preset_name} preset + DR: fp_rate={args.fp_rate} "
+              f"action_smooth={args.action_smooth} wheel_slip_std={args.wheel_slip_std} "
+              f"strafe_eff={herding_cfg.robot.strafe_efficiency:.2f}")
     else:
         herding_cfg = None
         if args.fp_rate > 0.0 or args.action_smooth > 0.0 or args.wheel_slip_std > 0.0:
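
The hunk builds a fresh `RobotConfig` and copies `strafe_efficiency` and `strafe_to_forward_bleed` across by hand. If the presets later carry other calibrated robot fields, those would silently fall back to defaults. One possible alternative is to override only `action_smooth` on the preset's own robot config. The sketch below assumes the configs behave like frozen dataclasses, which this diff does not confirm. The classes are stand-ins for the real `herding.config` types, `select_preset` is a hypothetical helper, and the calibration numbers are placeholders.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class RobotConfig:
    # Stand-in fields; the real RobotConfig may define more.
    action_smooth: float = 0.0
    strafe_efficiency: float = 1.0
    strafe_to_forward_bleed: float = 0.0


@dataclass(frozen=True)
class HerdingConfig:
    robot: RobotConfig = field(default_factory=RobotConfig)


# Placeholder calibration numbers, not the values from the Webots proto.
HERDING_WEBOTS = HerdingConfig()
HERDING_MEC_WEBOTS = HerdingConfig(
    robot=RobotConfig(strafe_efficiency=0.8, strafe_to_forward_bleed=0.05)
)


def select_preset(drive_mode: str, action_smooth: float) -> HerdingConfig:
    """Pick the drive-matched preset and override only action_smooth."""
    base = HERDING_MEC_WEBOTS if drive_mode == "mecanum" else HERDING_WEBOTS
    # replace() copies every other robot field from the preset unchanged.
    robot = replace(base.robot, action_smooth=action_smooth)
    return replace(base, robot=robot)


mec = select_preset("mecanum", action_smooth=0.3)
diff = select_preset("differential", action_smooth=0.3)
print(mec.robot)
print(diff.robot)
```

With this shape, adding a new calibrated field to a preset would not require touching the call site, at the cost of relying on `dataclasses.replace` semantics that the real config classes may or may not share.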