Webots sim-to-real fixes, DAgger pipeline, 360° proto variant

This session worked across the full Webots delivery stack: it found and
fixed a cluster of bugs blocking the BC/RL transfer, then explored
training-side mitigations for the residual perception gap.

Bug fixes:
- Makefile FP_RATE default 2.0 → 0.0: BC demos used fp_rate=0 but RL
  fine-tune defaulted to fp_rate=2, poisoning the BC obs distribution
  and stalling PPO at 0% success across 1.46M+ steps.
- controllers/{shepherd_dog,sheep}/runtime.ini: Webots was launching
  controllers under system python3 (no numpy) and they were crashing
  silently. Pinned to the conda tir env.
- herding/config.py HERDING_WEBOTS preset: pen_latch_depth 0.5 → 2.0,
  max_new_tracks_per_step 3 → 1, static_reject 0.8 → 1.2. Stops phantom
  FPs near the gate from latching as permanently-penned tracks.
- herding/perception/sheep_tracker.py: penned tracks now decay at
  forget_steps × 8 instead of living forever. Adds get_positions
  min_freshness filter for deploy-time use.
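
The penned-track decay rule can be sketched as below. This is a minimal
illustration, not the code in herding/perception/sheep_tracker.py: the
Track class, its age counter, and the reading of min_freshness as a 0-to-1
score that falls linearly with steps since the last detection are all
assumptions. Only forget_steps, the ×8 multiplier, and the
get_positions/min_freshness names come from the notes above.

```python
from __future__ import annotations

from dataclasses import dataclass

PENNED_FORGET_MULT = 8  # multiplier quoted in the commit message


@dataclass
class Track:  # hypothetical stand-in for the tracker's track record
    xy: tuple[float, float]
    penned: bool = False
    age: int = 0  # steps since the last matched detection (assumed bookkeeping)


class SheepTrackerSketch:  # hypothetical name, illustration only
    def __init__(self, forget_steps: int = 10):
        self.forget_steps = forget_steps
        self.tracks: list[Track] = []

    def _limit(self, t: Track) -> int:
        # Penned tracks live 8x longer than active ones, but no longer forever.
        return self.forget_steps * (PENNED_FORGET_MULT if t.penned else 1)

    def step_unmatched(self) -> None:
        """Age every track by one step and drop those past their limit."""
        for t in self.tracks:
            t.age += 1
        self.tracks = [t for t in self.tracks if t.age <= self._limit(t)]

    def freshness(self, t: Track) -> float:
        # Assumed definition: 1.0 when just seen, 0.0 at the forget limit.
        return max(0.0, 1.0 - t.age / self._limit(t))

    def get_positions(self, min_freshness: float = 0.0):
        return [t.xy for t in self.tracks if self.freshness(t) >= min_freshness]
```

Under these assumptions a deploy-time caller would pass a nonzero
min_freshness so that stale penned tracks stop feeding the policy before
they are actually forgotten.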

Training/eval matches deployment:
- training/bc/collect.py: --dagger-policy flag for DAgger rollouts
  (policy drives, teacher labels) + --use-webots-preset for matched
  140° tracker + DR config.
- controllers/shepherd_dog/shepherd_dog.py: scan-fallback (0, 0.6) when
  BC/RL sees empty sheep_positions, which recovers from FOV gaps.
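
The DAgger rollout ("policy drives, teacher labels") amounts to the loop
below. This is a minimal sketch on a hypothetical 1-D toy task, not the
code in training/bc/collect.py: dagger_rollout, teacher, and the toy
dynamics are illustrative names chosen for this example.

```python
import numpy as np


def teacher(obs: float) -> float:
    """Toy expert: step the state toward zero."""
    return -float(np.sign(obs))


def dagger_rollout(policy, n_steps: int, rng: np.random.Generator):
    """Roll out `policy`, recording the teacher's action at each visited state."""
    obs = float(rng.uniform(-5.0, 5.0))
    dataset = []
    for _ in range(n_steps):
        dataset.append((obs, teacher(obs)))  # label comes from the teacher
        obs = obs + policy(obs)              # state evolves under the learner
    return dataset


rng = np.random.default_rng(0)
bad_policy = lambda obs: 1.0  # an untrained learner that always moves right
data = dagger_rollout(bad_policy, n_steps=20, rng=rng)
```

The point of the split is that the dataset covers the states the learner
actually reaches, including its mistakes, while every label is still the
expert's action. Concatenating these pairs with the earlier demos and
refitting BC makes one DAgger round.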

Tooling:
- tools/dagger_round.sh: one-shot DAgger round (collect + concat + bc).
- tools/webots_sweep_gt.sh: full sweep with HERDING_USE_GT=1 for the
  perception-gap diagnosis matrix.
- protos/ShepherdDog360.proto: 360° FOV variant for the FOV-ablation
  comparison. Canonical proto stays at 140° per project spec.

Artifacts: v1 BC/RL policies for all 4 (drive × world) combos trained
in clean gym (success: diff/field 90-100%, diff/round 58%, mec/field
60-100%, mec/round 50-100%). DAgger r1/r2 BCs for diff/field show
12%→38% progression on the gym HERDING_WEBOTS proxy, but the gap to
actual Webots LiDAR did not close (0/5 throughout). Next: LSTM policy
or learned tracker per the project-state memory.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Johnny Fernandes
2026-05-16 17:21:02 +00:00
parent c61df91950
commit dd5ac669e5
34 changed files with 2336 additions and 188 deletions
+31 -11
@@ -2,20 +2,25 @@
Raycasts against sheep (discs) and static world geometry. For rectangular
fields this is axis-aligned walls + gate posts; for round fields it is a
-circular wall + gate posts. The env reproduces the false-positive cluster
-distribution Webots produces from real 3D geometry.
+circular wall + gate posts.
-Returns a range array matching the Webots Lidar device:
-180 rays, 140° FOV centred on forward, 12 m max range, 5 mm noise.
-See ``protos/ShepherdDog.proto``.
+The module-level constants (``LIDAR_N_RAYS``, ``LIDAR_FOV``, etc.) reflect
+the original 360°/360-ray oracle configuration. Pass a
+:class:`~herding.config.LidarConfig` to :func:`simulate_scan` to use a
+different spec (e.g. :data:`~herding.config.LIDAR_WEBOTS` for 180-ray/140°
+matching the ShepherdDog.proto hardware).
"""
from __future__ import annotations
import math
+from typing import TYPE_CHECKING
import numpy as np
+if TYPE_CHECKING:
+    from herding.config import LidarConfig
from herding.world.geometry import (
FIELD_SHAPE, FIELD_ROUND_R,
FIELD_X, FIELD_Y,
@@ -192,14 +197,30 @@ def simulate_scan(
noise: float = LIDAR_NOISE,
max_range: float = LIDAR_MAX_RANGE,
rng: np.random.Generator | None = None,
+lidar_cfg: "LidarConfig | None" = None,
) -> np.ndarray:
"""Return a (N,) float32 range array. No-hit entries equal ``max_range``.
``sheep_xy`` is every sheep (penned or active) in the scene.
+Pass ``lidar_cfg`` to override the module-level defaults for a single
+call (e.g. to use :data:`~herding.config.LIDAR_WEBOTS`).
"""
-ch, sh = math.cos(dog_heading), math.sin(dog_heading)
-cos_w = ch * _COS - sh * _SIN
-sin_w = sh * _COS + ch * _SIN
+if lidar_cfg is not None:
+    n_rays = lidar_cfg.n_rays
+    fov = lidar_cfg.fov_rad
+    max_range = lidar_cfg.max_range
+    noise = lidar_cfg.noise_std
+    sheep_r2 = lidar_cfg.sheep_radius ** 2
+    angles = ray_angles(n_rays, fov)
+    ch, sh = math.cos(dog_heading), math.sin(dog_heading)
+    cos_w = ch * np.cos(angles) - sh * np.sin(angles)
+    sin_w = sh * np.cos(angles) + ch * np.sin(angles)
+else:
+    sheep_r2 = SHEEP_RADIUS ** 2
+    ch, sh = math.cos(dog_heading), math.sin(dog_heading)
+    cos_w = ch * _COS - sh * _SIN
+    sin_w = sh * _COS + ch * _SIN
best = _raycast_static(dog_x, dog_y, cos_w, sin_w)
@@ -209,9 +230,8 @@ def simulate_scan(
t = np.outer(sx, cos_w) + np.outer(sy, sin_w)
s_dist2 = (sx ** 2 + sy ** 2)[:, None]
perp2 = s_dist2 - t ** 2
-R2 = SHEEP_RADIUS ** 2
-hit = (perp2 < R2) & (t > 0.0)
-half = np.sqrt(np.clip(R2 - perp2, 0.0, None))
+hit = (perp2 < sheep_r2) & (t > 0.0)
+half = np.sqrt(np.clip(sheep_r2 - perp2, 0.0, None))
candidate = np.where(hit, t - half, np.inf)
nearest = candidate.min(axis=0)
np.minimum(best, nearest, out=best)
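
The sheep-hit test in the last hunk is a vectorised ray/disc intersection:
project each disc centre onto each ray, compare the squared perpendicular
distance with the squared radius, and step back by the half-chord to get
the entry point. The sketch below reproduces that arithmetic on standalone
arrays so it can be run in isolation. ray_disc_ranges and its arguments are
illustrative names, not part of the module, and it assumes at least one
disc and disc positions already expressed relative to the dog.

```python
import numpy as np


def ray_disc_ranges(rel_xy, angles, heading, radius, max_range):
    """Nearest hit distance per ray against discs centred at rel_xy."""
    # World-frame ray directions via the angle-sum identities for heading + angle.
    ch, sh = np.cos(heading), np.sin(heading)
    cos_w = ch * np.cos(angles) - sh * np.sin(angles)
    sin_w = sh * np.cos(angles) + ch * np.sin(angles)

    sx, sy = rel_xy[:, 0], rel_xy[:, 1]
    t = np.outer(sx, cos_w) + np.outer(sy, sin_w)    # projection onto each ray
    perp2 = (sx ** 2 + sy ** 2)[:, None] - t ** 2    # squared perpendicular distance
    r2 = radius ** 2
    hit = (perp2 < r2) & (t > 0.0)                   # inside the disc and in front
    half = np.sqrt(np.clip(r2 - perp2, 0.0, None))   # half chord length
    candidate = np.where(hit, t - half, np.inf)      # entry point along the ray
    return np.minimum(candidate.min(axis=0), max_range)


# One disc 5 m straight ahead; one ray forward, one ray to the left.
ranges = ray_disc_ranges(
    rel_xy=np.array([[5.0, 0.0]]),
    angles=np.array([0.0, np.pi / 2]),
    heading=0.0,
    radius=0.3,
    max_range=12.0,
)
```

Precomputing the squared radius once, as the diff does with sheep_r2, lets
the same hit test serve both the module-level default and a per-call
LidarConfig.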