Drop webots_quick target; mecanum BC demos now auto-use HERDING_MEC_WEBOTS
* Remove `webots_quick` Makefile target — `make webots` is the only
webots entry point now (it fires the interactive picker). The
positional non-interactive path is still available as
`bash tools/run_webots.sh N MODE DRIVE WORLD` for scripted use.
* Add `WEBOTS_PRESET_FLAG = --use-webots-preset` for mecanum drive
and pass it to the `bc.collect` recipe so demos are collected
under the gym kinematics that match the physical-roller Webots
mecanum. Without this, mecanum BC demos would record textbook
X-pattern teacher actions against textbook gym kinematics, and
the resulting policy would fail at deployment exactly the same
way the current v1 mecanum policies do.
* `rl/train.py` already auto-detects mecanum and applies
HERDING_MEC_WEBOTS internally (commit 3b4c99a), so the rl recipe
doesn't need the flag — a one-line comment in the Makefile makes
that intent explicit.

Diff drive keeps the existing recipe: no --use-webots-preset, so
BC demos are collected on HERDING_DEFAULT (360° gym, no FP). This is
the regime that produced the current diff/field and diff/round
policies that pen 5/5 in Webots LiDAR; retraining under the same
regime is the safest reproduction.
126 pytest cases still pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
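
As a rough illustration of the flag selection described above, the Makefile conditional can be mirrored in Python. This is only a sketch: the helper names `webots_preset_flag` and `collect_command` are hypothetical, the drive value `"diff"` is an assumed spelling for the non-mecanum case, and the remaining arguments of the real `bc.collect` recipe are left out.

```python
import shlex


def webots_preset_flag(drive: str) -> list[str]:
    """Mirror the Makefile conditional: only mecanum gets the flag."""
    return ["--use-webots-preset"] if drive == "mecanum" else []


def collect_command(drive: str, python: str = "python") -> list[str]:
    """Build a trimmed-down bc.collect command line (hypothetical helper).

    Only the module name and the preset flag come from the commit; the
    other recipe arguments (--max-steps, --fp-rate, ...) are omitted.
    """
    return [python, "-m", "training.bc.collect", *webots_preset_flag(drive)]


if __name__ == "__main__":
    # "diff" is an assumed name for the differential-drive setting.
    for drive in ("diff", "mecanum"):
        print(shlex.join(collect_command(drive)))
```

An empty list for every non-mecanum drive matches the `else` branch of the Makefile, where `WEBOTS_PRESET_FLAG` expands to nothing and the diff-drive recipe is unchanged.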
@@ -146,7 +146,7 @@ MODE ?= rl
 
 
 .PHONY: all bc_demos bc rl rl_fast eval eval_fast eval_all eval_all_fast \
-	test webots webots_quick webots_sweep clean clean_all help \
+	test webots webots_sweep clean clean_all help \
 	train_all train_diff_rect train_diff_round \
 	train_mec_rect train_mec_round \
 	train_all_fast train_diff_rect_fast train_diff_round_fast \
@@ -161,6 +161,17 @@ export HERDING_WORLD = $(WORLD)
 # the build is run under tee / nohup / tmux pipes.
 export PYTHONUNBUFFERED = 1
 
+# Mecanum needs --use-webots-preset so collect/rl pick up
+# HERDING_MEC_WEBOTS — the gym mecanum kinematics get the strafe
+# efficiency and forward-bleed match against the physical-roller
+# Webots proto. Without this flag the policy trains on textbook
+# X-pattern mecanum and fails on deployment.
+ifeq ($(DRIVE),mecanum)
+WEBOTS_PRESET_FLAG = --use-webots-preset
+else
+WEBOTS_PRESET_FLAG =
+endif
+
 bc_demos: $(BC_DEMOS)
 $(BC_DEMOS):
 	$(PY) -m training.bc.collect \
@@ -171,7 +182,8 @@ $(BC_DEMOS):
 		--max-steps $(DEMO_MAX_STEPS) \
 		--fp-rate $(FP_RATE) \
 		--action-smooth $(ACTION_SMOOTH_TRAIN) \
-		--wheel-slip-std $(WHEEL_SLIP_STD)
+		--wheel-slip-std $(WHEEL_SLIP_STD) \
+		$(WEBOTS_PRESET_FLAG)
 
 bc: $(BC_POLICY)
 $(BC_POLICY): $(BC_DEMOS)
@@ -190,6 +202,8 @@ $(RL_POLICY): $(BC_POLICY)
 		--fp-rate $(FP_RATE) \
 		--action-smooth $(ACTION_SMOOTH_TRAIN) \
 		--wheel-slip-std $(WHEEL_SLIP_STD)
+# (rl/train.py auto-applies HERDING_MEC_WEBOTS when drive=mecanum;
+# no --use-webots-preset flag needed.)
 
 eval: $(RL_POLICY)
 	$(PY) -m training.eval --policy $(RL_DIR) \
@@ -223,9 +237,6 @@ test:
 webots:
 	@bash tools/webots_menu.sh
 
-webots_quick:
-	tools/run_webots.sh $(N) $(MODE) $(DRIVE) $(WORLD)
-
 # Headless sweep across all modes × worlds × flock sizes.
 # Results are written to webots_sweep.log.
 # Set USE_GT=1 to bypass LiDAR tracker (isolate perception from policy).
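
The two recipes reach the mecanum preset by different routes: `bc.collect` is given an explicit flag, while `rl/train.py` is said to detect the drive on its own. The sketch below shows one way such a resolution could be written. It is not the project's code: `resolve_preset`, `build_parser`, the `--drive` option and the `auto_detect` parameter are hypothetical, and treating `HERDING_DEFAULT` as the fallback for every other case is an assumption based on the commit message.

```python
import argparse

# Preset names taken from the commit message; everything else is assumed.
MEC_PRESET = "HERDING_MEC_WEBOTS"
DEFAULT_PRESET = "HERDING_DEFAULT"


def resolve_preset(drive: str, use_webots_preset: bool = False,
                   auto_detect: bool = False) -> str:
    """Pick a kinematics preset name (hypothetical helper).

    auto_detect=True models a script that applies the mecanum preset by
    itself; auto_detect=False models a script that needs the flag.
    """
    if drive == "mecanum" and (use_webots_preset or auto_detect):
        return MEC_PRESET
    return DEFAULT_PRESET


def build_parser() -> argparse.ArgumentParser:
    """Minimal CLI with an assumed --drive option and the commit's flag."""
    parser = argparse.ArgumentParser(description="preset resolution sketch")
    parser.add_argument("--drive", default="diff")
    parser.add_argument("--use-webots-preset", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--drive", "mecanum", "--use-webots-preset"])
    print(resolve_preset(args.drive, args.use_webots_preset))
```

In this sketch, calling `resolve_preset("mecanum")` with neither the flag nor auto-detection falls back to the default preset, which corresponds to the failure mode the Makefile comment warns about.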