dd5ac669e5
Today's session worked across the full Webots delivery stack — found and
fixed a cluster of bugs blocking the BC/RL transfer, then explored
training-side mitigations for the residual perception gap.
Bug fixes:
- Makefile FP_RATE default 2.0 → 0.0: BC demos used fp_rate=0 but RL
fine-tune defaulted to fp_rate=2, poisoning the BC obs distribution
and stalling PPO at 0% success across 1.46M+ steps.
- controllers/{shepherd_dog,sheep}/runtime.ini: Webots was launching
controllers under system python3 (no numpy) and they were crashing
silently. Pinned to the conda tir env.
- herding/config.py HERDING_WEBOTS preset: pen_latch_depth 0.5 → 2.0,
max_new_tracks_per_step 3 → 1, static_reject 0.8 → 1.2. Stops phantom
FPs near the gate from latching as permanently-penned tracks.
- herding/perception/sheep_tracker.py: penned tracks now decay at
forget_steps × 8 instead of living forever. Adds get_positions
min_freshness filter for deploy-time use.
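The penned-track decay and freshness filter described above can be sketched as follows. This is a minimal illustration, not the code in herding/perception/sheep_tracker.py: the `Track` record, the `prune` helper, the `age` field, and the exact comparison (`age <= limit`) are assumptions made for the example; only the `forget_steps × 8` factor and the `min_freshness` parameter name come from the notes above.

```python
from dataclasses import dataclass

# From the commit note: penned tracks decay at forget_steps x 8.
PENNED_DECAY_FACTOR = 8


@dataclass
class Track:
    """Hypothetical minimal track record (not the real tracker class)."""
    x: float
    y: float
    age: int = 0          # steps since the track was last observed
    penned: bool = False


def prune(tracks, forget_steps):
    """Drop tracks whose staleness exceeds their budget.

    Active tracks get forget_steps; penned tracks get a budget 8x longer
    instead of living forever.
    """
    kept = []
    for t in tracks:
        limit = forget_steps * PENNED_DECAY_FACTOR if t.penned else forget_steps
        if t.age <= limit:
            kept.append(t)
    return kept


def get_positions(tracks, min_freshness=None):
    """Return (x, y) pairs, optionally only for tracks seen recently.

    min_freshness is assumed here to be a maximum age in steps.
    """
    return [
        (t.x, t.y)
        for t in tracks
        if min_freshness is None or t.age <= min_freshness
    ]


tracks = [
    Track(0.0, 0.0, age=10),
    Track(1.0, 1.0, age=130),
    Track(2.0, -17.0, age=130, penned=True),
    Track(3.0, -17.0, age=1000, penned=True),
]
alive = prune(tracks, forget_steps=120)
fresh = get_positions(alive, min_freshness=20)
```

With `forget_steps=120`, the stale active track and the very old penned track are dropped, while the penned track at age 130 survives because its budget is 960 steps.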
Training/eval matches deployment:
- training/bc/collect.py: --dagger-policy flag for DAgger rollouts
(policy drives, teacher labels) + --use-webots-preset for matched
140° tracker + DR config.
- controllers/shepherd_dog/shepherd_dog.py: scan-fallback (0, 0.6) when
BC/RL sees empty sheep_positions — recovers from FOV gaps.
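For readers unfamiliar with the "policy drives, teacher labels" scheme, here is a toy sketch of one DAgger round. It is not the logic of training/bc/collect.py or tools/dagger_round.sh: the 1-D environment, `teacher`, `collect_dagger`, `fit_threshold`, and `dagger_round` are all hypothetical stand-ins chosen so the example runs on its own.

```python
import random


def teacher(obs):
    """Hypothetical expert: always steer toward the origin."""
    return -1.0 if obs > 0 else 1.0


def collect_dagger(policy, n_episodes=5, horizon=20, seed=0):
    """Roll out the learner's policy, but record the teacher's action as label."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_episodes):
        obs = rng.uniform(-5.0, 5.0)
        for _ in range(horizon):
            action = policy(obs)              # the learner drives the rollout
            data.append((obs, teacher(obs)))  # the teacher labels the visited state
            obs += 0.5 * action
    return data


def fit_threshold(dataset):
    """Toy behaviour-cloning step: learn the decision threshold from labels."""
    pos = [o for o, a in dataset if a > 0]
    neg = [o for o, a in dataset if a < 0]
    thr = 0.5 * (max(pos) + min(neg)) if pos and neg else 0.0
    return lambda obs: -1.0 if obs > thr else 1.0


def dagger_round(dataset, policy, fit):
    """Collect with the current policy, aggregate, then refit on everything."""
    dataset = dataset + collect_dagger(policy)
    return dataset, fit(dataset)


# Start from a deliberately bad policy that always drives in the +x direction.
dataset, policy = dagger_round([], lambda obs: 1.0, fit_threshold)
```

The point of the scheme is that the labelled states come from the learner's own state distribution, so the refit policy sees the situations it actually gets itself into.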
Tooling:
- tools/dagger_round.sh: one-shot DAgger round (collect + concat + bc).
- tools/webots_sweep_gt.sh: full sweep with HERDING_USE_GT=1 for the
perception-gap diagnosis matrix.
- protos/ShepherdDog360.proto: 360° FOV variant for the FOV-ablation
comparison. Canonical proto stays at 140° per project spec.
Artifacts: v1 BC/RL policies for all 4 (drive × world) combos trained
in clean gym (success: diff/field 90-100%, diff/round 58%, mec/field
60-100%, mec/round 50-100%). DAgger r1/r2 BCs for diff/field improve
from 12% to 38% on the gym HERDING_WEBOTS proxy, but the gain did not
carry over to actual Webots LiDAR (0/5 throughout). Next: LSTM policy
or learned tracker per the project-state memory.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
336 lines
12 KiB
Python
"""Central configuration dataclasses for the herding simulation.

Every tunable constant that previously lived as a module-level literal in
perception/lidar_sim.py, perception/lidar_perception.py,
perception/sheep_tracker.py, world/geometry.py, or training/herding_env.py
is now represented here as a field with its original default value.

Usage: use the module defaults unchanged::

    env = HerdingEnv()  # same behaviour as before

Override a subset of parameters::

    from herding.config import HerdingConfig, TrackerConfig
    cfg = HerdingConfig(tracker=TrackerConfig(forget_steps=60))
    env = HerdingEnv(herding_cfg=cfg)

Use a named preset for Webots-matched training::

    from herding.config import HERDING_WEBOTS
    env = HerdingEnv(herding_cfg=HERDING_WEBOTS)

Design notes
------------
* All dataclasses are frozen: instances are immutable after construction.
* This module must not import from other ``herding.*`` packages to avoid
  import cycles.  Field-geometry constants (pen coordinates, field size)
  stay in ``herding.world.geometry`` because they depend on the world
  variant selected at runtime via ``HERDING_WORLD``.
"""

from __future__ import annotations

import math
from dataclasses import dataclass, field, replace


# ---------------------------------------------------------------------------
# LiDAR hardware spec
# ---------------------------------------------------------------------------

@dataclass(frozen=True)
class LidarConfig:
    """Parameters of the simulated / physical LiDAR sensor.

    The two canonical presets are :data:`LIDAR_FULL` (360°, oracle mode)
    and :data:`LIDAR_WEBOTS` (140°/180-ray, matches the ShepherdDog proto).
    """

    n_rays: int = 360
    """Number of rays in the scan."""

    fov_rad: float = 2.0 * math.pi
    """Full field-of-view in radians, centred on the robot's forward axis."""

    max_range: float = 12.0
    """Maximum detectable range in metres."""

    noise_std: float = 0.005
    """Gaussian standard deviation (metres) applied to each hit reading."""

    sheep_radius: float = 0.30
    """Effective disc radius of a sheep in the 2-D LiDAR plane (metres)."""

    post_radius: float = 0.25
    """Effective disc radius of gate / corner posts (metres)."""

    def __post_init__(self) -> None:
        if self.n_rays < 1:
            raise ValueError(f"n_rays must be ≥ 1, got {self.n_rays}")
        if not (0.0 < self.fov_rad <= 2.0 * math.pi):
            raise ValueError(f"fov_rad must be in (0, 2π], got {self.fov_rad:.4f}")
        if self.max_range <= 0.0:
            raise ValueError(f"max_range must be > 0, got {self.max_range}")


# Named presets -----------------------------------------------------------

LIDAR_FULL = LidarConfig(
    n_rays=360,
    fov_rad=2.0 * math.pi,
)
"""360° full-circle scan (oracle / ablation mode)."""

LIDAR_WEBOTS = LidarConfig(
    n_rays=180,
    fov_rad=math.radians(140.0),
)
"""Matches the ShepherdDog.proto Lidar device (180 rays, 140° FOV).

Training with this preset is meant to close the sim-to-real gap for the
sensor geometry.  Because the observation is built from tracker output (not
raw rays), a policy trained here can be deployed on a wider-FOV LiDAR (e.g.
240° or 360°) without retraining: more FOV means more true detections,
which should generally improve tracker quality.
"""


# ---------------------------------------------------------------------------
# Cluster-detection pipeline
# ---------------------------------------------------------------------------

@dataclass(frozen=True)
class DetectionConfig:
    """Parameters for the LiDAR-scan → detection clustering pipeline."""

    gap_threshold: float = 0.6
    """Adjacent hit-points farther apart than this (metres) start a new cluster."""

    max_cluster_span: float = 1.5
    """Clusters wider than this (metres) are rejected as walls / structures."""

    range_hit_eps: float = 0.05
    """A ray is considered a hit if ``range < max_range - range_hit_eps``."""

    split_range_gap: float = 0.20
    """Range increase within a cluster that triggers a multi-peak split."""

    wall_reject: float = 0.5
    """Drop detections within this distance (metres) of any field wall."""

    static_reject: float = 0.8
    """Drop detections within this distance (metres) of known static features
    (gate posts, field corners)."""

    def __post_init__(self) -> None:
        if self.wall_reject < 0.0:
            raise ValueError(f"wall_reject must be ≥ 0, got {self.wall_reject}")
        if self.static_reject < 0.0:
            raise ValueError(f"static_reject must be ≥ 0, got {self.static_reject}")


# ---------------------------------------------------------------------------
# Multi-target tracker
# ---------------------------------------------------------------------------

@dataclass(frozen=True)
class TrackerConfig:
    """Parameters for the nearest-neighbour sheep tracker."""

    gate_m: float = 2.5
    """Primary NN association gate in metres (recently observed tracks)."""

    reacquire_gate_m: float = 4.5
    """Wider gate used when re-acquiring tracks stale for ≥ ``reacquire_min_age`` steps."""

    reacquire_min_age: int = 20
    """Minimum staleness (steps) before the wider re-acquisition gate activates."""

    penned_gate_m: float = 4.0
    """Gate for matching new detections to already-penned tracks."""

    forget_steps: int = 200
    """Delete an active track that has not been observed for this many steps (~3.2 s)."""

    predict_steps: int = 120
    """Extrapolate a track's position using constant velocity for this many steps (~1.9 s)."""

    velocity_clamp: float = 1.0
    """Maximum predicted speed (m/s) used during extrapolation."""

    max_new_tracks_per_step: int = 10
    """Maximum number of new tracks that may be spawned in a single step.

    Capping this limits the damage from LiDAR false-positive bursts (e.g.
    wall reflections in Webots) that would otherwise flood the track set.
    The default (10 = MAX_SHEEP) preserves the original behaviour; reduce
    it for Webots deployment robustness (the HERDING_WEBOTS preset uses 1).
    """

    pen_latch_depth: float = 0.0
    """Minimum depth past the gate line (metres) before a track is latched
    as penned.  0.0 = original behaviour (latch at y ≤ GATE_Y).  Increase
    it for Webots (the HERDING_WEBOTS preset uses 2.0) to prevent
    gate-hardware LiDAR reflections near y=-15 from consuming tracker slots
    as false "penned" sheep.
    """

    def __post_init__(self) -> None:
        if self.forget_steps < 1:
            raise ValueError(f"forget_steps must be ≥ 1, got {self.forget_steps}")
        if self.max_new_tracks_per_step < 1:
            raise ValueError(
                f"max_new_tracks_per_step must be ≥ 1, got {self.max_new_tracks_per_step}"
            )


# ---------------------------------------------------------------------------
# Robot physical specification
# ---------------------------------------------------------------------------

@dataclass(frozen=True)
class RobotConfig:
    """Physical parameters of the shepherd-dog robot.

    Values mirror ``protos/ShepherdDog.proto`` and ``protos/ShepherdDogMecanum.proto``.
    """

    wheel_radius: float = 0.038
    """Wheel radius in metres."""

    wheel_base: float = 0.28
    """Axle-to-axle distance for differential drive (metres)."""

    wheel_base_x: float = 0.28
    """Front-to-back axle distance for mecanum drive (metres)."""

    wheel_base_y: float = 0.28
    """Left-to-right axle distance for mecanum drive (metres)."""

    max_wheel_omega: float = 70.0
    """Maximum wheel angular velocity (rad/s)."""

    action_smooth: float = 0.0
    """Exponential moving-average coefficient applied to actions inside the env.

    ``0.0`` means no smoothing (gym default).
    ``0.55`` matches the hard-coded EMA in ``shepherd_dog.py``; use this
    when training so the policy learns to act through the same filter it
    sees at deployment.
    """

    def __post_init__(self) -> None:
        if not (0.0 <= self.action_smooth < 1.0):
            raise ValueError(
                f"action_smooth must be in [0, 1), got {self.action_smooth}"
            )

    @property
    def max_linear(self) -> float:
        """Maximum achievable linear speed (m/s)."""
        return self.wheel_radius * self.max_wheel_omega


# ---------------------------------------------------------------------------
# Domain randomisation
# ---------------------------------------------------------------------------

@dataclass(frozen=True)
class DomainRandomConfig:
    """Parameters that inject physics / sensor noise for domain randomisation.

    All values default to 0 (disabled) so the base env is deterministic and
    backwards-compatible.  Enable them gradually to close the sim-to-real gap.
    """

    fp_rate: float = 0.0
    """Mean number of false-positive detections injected per step (Poisson λ).

    FPs are placed near static features (walls, posts) with positional
    noise ``fp_std_pos``, mimicking the spurious clusters Webots' physical
    LiDAR returns from 3D geometry.
    """

    fp_std_pos: float = 0.3
    """Positional standard deviation (metres) of injected false-positive clusters."""

    wheel_slip_std: float = 0.0
    """Gaussian noise standard deviation (rad/s) added to each wheel speed
    before kinematic integration.  Models real-world wheel slip and motor
    variation.  Suggested starting value: 0.05.
    """

    compass_noise_std: float = 0.0
    """Gaussian noise standard deviation (radians) added to the heading
    reading each step.  Models magnetometer drift in Webots.
    Suggested starting value: 0.02.
    """

    def __post_init__(self) -> None:
        if self.fp_rate < 0.0:
            raise ValueError(f"fp_rate must be ≥ 0, got {self.fp_rate}")
        if self.wheel_slip_std < 0.0:
            raise ValueError(f"wheel_slip_std must be ≥ 0, got {self.wheel_slip_std}")
        if self.compass_noise_std < 0.0:
            raise ValueError(f"compass_noise_std must be ≥ 0, got {self.compass_noise_std}")


# ---------------------------------------------------------------------------
# Aggregate config
# ---------------------------------------------------------------------------

@dataclass(frozen=True)
class HerdingConfig:
    """Root configuration object passed to :class:`~training.herding_env.HerdingEnv`.

    Sub-configs default to the original simulation parameters so that
    ``HerdingEnv()`` and ``HerdingEnv(herding_cfg=HerdingConfig())`` produce
    identical behaviour.
    """

    lidar: LidarConfig = field(default_factory=LidarConfig)
    detection: DetectionConfig = field(default_factory=DetectionConfig)
    tracker: TrackerConfig = field(default_factory=TrackerConfig)
    robot: RobotConfig = field(default_factory=RobotConfig)
    domain_random: DomainRandomConfig = field(default_factory=DomainRandomConfig)

    def replace(self, **kwargs) -> "HerdingConfig":
        """Return a new config with selected top-level sub-configs replaced.

        Example::

            cfg = HERDING_WEBOTS.replace(
                domain_random=DomainRandomConfig(fp_rate=2.0, wheel_slip_std=0.05)
            )
        """
        return replace(self, **kwargs)


# ---------------------------------------------------------------------------
# Named full-pipeline presets
# ---------------------------------------------------------------------------

HERDING_DEFAULT = HerdingConfig()
"""Original simulation defaults: zero behaviour change."""

HERDING_WEBOTS = HerdingConfig(
    lidar=LIDAR_WEBOTS,
    detection=DetectionConfig(wall_reject=0.5, static_reject=1.2),
    tracker=TrackerConfig(
        forget_steps=120,
        max_new_tracks_per_step=1,
        pen_latch_depth=2.0,
    ),
    robot=RobotConfig(action_smooth=0.55),
)
"""Webots-matched training preset.

Changes vs HERDING_DEFAULT:
  * LiDAR: 180 rays / 140° FOV matching ShepherdDog.proto hardware
  * Detection: wall_reject kept at 0.5 m (original default; static_reject
    handles post FPs; 1.0 m was too aggressive near the south gate)
    static_reject 0.8 → 1.2 (wider exclusion around gate posts / corners)
  * Tracker: forget_steps 200 → 120 (~1.9 s ghost-track lifetime)
    max_new_tracks_per_step 10 → 1 (rate-caps FP flooding)
    pen_latch_depth 0.0 → 2.0 (stops phantom FPs near the gate from
    latching as penned tracks)
  * Robot: action_smooth 0.0 → 0.55 (matches Webots controller EMA)
"""
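One consequence of the frozen, nested layout is worth illustrating: replacing a sub-config with a freshly constructed one resets every field you did not name back to its default, which silently discards a preset's other overrides. The sketch below uses cut-down stand-in classes (only two tracker fields, and a `WEBOTS` constant invented for the example) so that it runs without the `herding` package; the standard-library `dataclasses.replace` call is the only real API it relies on.

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace


@dataclass(frozen=True)
class TrackerConfig:
    """Stand-in with two of the real fields, for illustration only."""
    forget_steps: int = 200
    pen_latch_depth: float = 0.0


@dataclass(frozen=True)
class HerdingConfig:
    """Stand-in root config holding a single sub-config."""
    tracker: TrackerConfig = field(default_factory=TrackerConfig)


# Hypothetical preset mirroring the tracker overrides of HERDING_WEBOTS.
WEBOTS = HerdingConfig(tracker=TrackerConfig(forget_steps=120, pen_latch_depth=2.0))

# Change one nested field while keeping the preset's other tracker values.
tuned = replace(WEBOTS, tracker=replace(WEBOTS.tracker, forget_steps=60))

# A fresh TrackerConfig resets the unspecified fields to their defaults.
reset = replace(WEBOTS, tracker=TrackerConfig(forget_steps=60))

# Frozen instances reject in-place mutation.
try:
    WEBOTS.tracker.forget_steps = 60
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Here `tuned.tracker.pen_latch_depth` stays at 2.0 while `reset.tracker.pen_latch_depth` falls back to 0.0, so nesting `replace` calls is the safer way to tweak a single field of a preset.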