TIR_PROJ/training/runs/sweep_smoke.log
Johnny Fernandes 4350c7d320 Test25_1600
2026-04-25 15:06:06 +00:00

Sweep dir: runs/sweep_20260425_124021
Search space: ['W_PER_SHEEP', 'W_ALIGN', 'W_PEN_BONUS', 'W_STEP_COST', 'W_COMPLETE', 'W_COMPACT', 'ALIGN_SHAPE', 'ALIGN_GATED', 'ent_coef']
Per-trial: 1,000,000 steps train + 30 eval eps
Time budget: 0.5h
[Trial 1] {'W_PER_SHEEP': 1.0, 'W_ALIGN': 0.1, 'W_PEN_BONUS': 10.0, 'W_STEP_COST': 0.02, 'W_COMPLETE': 100.0, 'W_COMPACT': 3.0, 'ALIGN_SHAPE': 'standoff', 'ALIGN_GATED': False, 'ent_coef': 0.005}
... [trial 1 | 1 sheep | 50,000 steps | ret(last 32)=-8.33 sr=6%]
... [trial 1 | 1 sheep | 100,000 steps | ret(last 50)=-2.95 sr=6%]
... [trial 1 | 1 sheep | 150,000 steps | ret(last 50)=+12.68 sr=10%]
... [trial 1 | 1 sheep | 200,000 steps | ret(last 50)=+22.15 sr=22%]
... [trial 1 | 1 sheep | 250,000 steps | ret(last 50)=+22.47 sr=18%]
... [trial 1 | 1 sheep | 300,000 steps | ret(last 50)=+23.58 sr=24%]
... [trial 1 | 1 sheep | 350,000 steps | ret(last 50)=+23.42 sr=18%]
... [trial 1 | 1 sheep | 400,000 steps | ret(last 50)=+24.39 sr=32%]
... [trial 1 | 2 sheep | 409,608 steps | ret(last 0)=+nan sr=nan%]
... [trial 1 | 2 sheep | 459,608 steps | ret(last 35)=+15.39 sr=3%]
... [trial 1 | 2 sheep | 509,608 steps | ret(last 50)=+20.25 sr=0%]
... [trial 1 | 2 sheep | 559,608 steps | ret(last 50)=+23.24 sr=4%]
... [trial 1 | 2 sheep | 609,608 steps | ret(last 50)=+23.36 sr=4%]
... [trial 1 | 2 sheep | 659,608 steps | ret(last 50)=+25.32 sr=2%]
... [trial 1 | 2 sheep | 709,608 steps | ret(last 50)=+24.02 sr=4%]
... [trial 1 | 2 sheep | 759,608 steps | ret(last 50)=+24.66 sr=4%]
... [trial 1 | 2 sheep | 809,608 steps | ret(last 50)=+25.41 sr=4%]
... [trial 1 | 2 sheep | 859,608 steps | ret(last 50)=+24.27 sr=4%]
... [trial 1 | 2 sheep | 909,608 steps | ret(last 50)=+25.13 sr=8%]
... [trial 1 | 2 sheep | 959,608 steps | ret(last 50)=+25.10 sr=2%]
... [trial 1 | 2 sheep | 1,009,608 steps | ret(last 50)=+26.02 sr=2%]
... [trial 1 | eval n=1]
... [trial 1 | eval n=2]
... [trial 1 | eval n=3]
→ score=0.060 sr1=0.30 sr2=0.00 sr3=0.00 [308s]
============================================================================================
LEADERBOARD
============================================================================================
rank  score   sr1   sr2   sr3  config
----------------------------------------------------------------------------------------
   1  0.060  0.30  0.00  0.00  W_PER_SHEEP=1.0 W_ALIGN=0.1 W_PEN_BONUS=10.0 W_STEP_COST=0.02 W_COMPLETE=100.0 W_COMPACT=3.0 ALIGN_SHAPE=standoff ALIGN_GATED=False ent_coef=0.005
Best config saved to runs/sweep_20260425_124021/best.json
Total trials: 1 (1 successful, 0 failed)
Total time: 0.09h