Sweep dir: runs/sweep_20260425_124021
Search space: ['W_PER_SHEEP', 'W_ALIGN', 'W_PEN_BONUS', 'W_STEP_COST', 'W_COMPLETE', 'W_COMPACT', 'ALIGN_SHAPE', 'ALIGN_GATED', 'ent_coef']
Per-trial: 1,000,000 steps train + 30 eval eps
Time budget: 0.5h

[Trial 1] {'W_PER_SHEEP': 1.0, 'W_ALIGN': 0.1, 'W_PEN_BONUS': 10.0, 'W_STEP_COST': 0.02, 'W_COMPLETE': 100.0, 'W_COMPACT': 3.0, 'ALIGN_SHAPE': 'standoff', 'ALIGN_GATED': False, 'ent_coef': 0.005}
... [trial 1 | 1 sheep | 50,000 steps | ret(last 32)=-8.33 sr=6%]
... [trial 1 | 1 sheep | 100,000 steps | ret(last 50)=-2.95 sr=6%]
... [trial 1 | 1 sheep | 150,000 steps | ret(last 50)=+12.68 sr=10%]
... [trial 1 | 1 sheep | 200,000 steps | ret(last 50)=+22.15 sr=22%]
... [trial 1 | 1 sheep | 250,000 steps | ret(last 50)=+22.47 sr=18%]
... [trial 1 | 1 sheep | 300,000 steps | ret(last 50)=+23.58 sr=24%]
... [trial 1 | 1 sheep | 350,000 steps | ret(last 50)=+23.42 sr=18%]
... [trial 1 | 1 sheep | 400,000 steps | ret(last 50)=+24.39 sr=32%]
... [trial 1 | 2 sheep | 409,608 steps | ret(last 0)=+nan sr=nan%]
... [trial 1 | 2 sheep | 459,608 steps | ret(last 35)=+15.39 sr=3%]
... [trial 1 | 2 sheep | 509,608 steps | ret(last 50)=+20.25 sr=0%]
... [trial 1 | 2 sheep | 559,608 steps | ret(last 50)=+23.24 sr=4%]
... [trial 1 | 2 sheep | 609,608 steps | ret(last 50)=+23.36 sr=4%]
... [trial 1 | 2 sheep | 659,608 steps | ret(last 50)=+25.32 sr=2%]
... [trial 1 | 2 sheep | 709,608 steps | ret(last 50)=+24.02 sr=4%]
... [trial 1 | 2 sheep | 759,608 steps | ret(last 50)=+24.66 sr=4%]
... [trial 1 | 2 sheep | 809,608 steps | ret(last 50)=+25.41 sr=4%]
... [trial 1 | 2 sheep | 859,608 steps | ret(last 50)=+24.27 sr=4%]
... [trial 1 | 2 sheep | 909,608 steps | ret(last 50)=+25.13 sr=8%]
... [trial 1 | 2 sheep | 959,608 steps | ret(last 50)=+25.10 sr=2%]
... [trial 1 | 2 sheep | 1,009,608 steps | ret(last 50)=+26.02 sr=2%]
... [trial 1 | eval n=1]
... [trial 1 | eval n=2]
... [trial 1 | eval n=3]
→ score=0.060 sr1=0.30 sr2=0.00 sr3=0.00 [308s]

============================================================================================
LEADERBOARD
============================================================================================
rank  score   sr1   sr2   sr3  config
----------------------------------------------------------------------------------------
   1  0.060  0.30  0.00  0.00  W_PER_SHEEP=1.0 W_ALIGN=0.1 W_PEN_BONUS=10.0 W_STEP_COST=0.02 W_COMPLETE=100.0 W_COMPACT=3.0 ALIGN_SHAPE=standoff ALIGN_GATED=False ent_coef=0.005

Best config saved to runs/sweep_20260425_124021/best.json
Total trials: 1 (1 successful, 0 failed)
Total time: 0.09h