Config: {'W_PER_SHEEP': 1.0, 'W_ALIGN': 0.0, 'W_PEN_BONUS': 5.0, 'W_STEP_COST': 0.02, 'W_COMPLETE': 200.0, 'W_COMPACT': 1.5, 'ALIGN_SHAPE': 'standoff', 'ALIGN_GATED': False, 'ent_coef': 0.02}
Run dir: runs/expB_mixed
MIXED training: random n_sheep ∈ [1, 3], 3,000,000 total steps
[Mixed] training 3,000,000 steps
  ... [trial 1 | mixed | 100,000 steps | ret(last 50)=-13.68 sr=2%]
  ... [trial 1 | mixed | 200,000 steps | ret(last 50)=-14.08 sr=0%]
  ... [trial 1 | mixed | 300,000 steps | ret(last 50)=-9.80 sr=0%]
  ... [trial 1 | mixed | 400,000 steps | ret(last 50)=-11.20 sr=0%]
  ... [trial 1 | mixed | 500,000 steps | ret(last 50)=-10.61 sr=0%]
  ... [trial 1 | mixed | 600,000 steps | ret(last 50)=-11.19 sr=0%]
  ... [trial 1 | mixed | 700,000 steps | ret(last 50)=-14.22 sr=0%]
  ... [trial 1 | mixed | 800,000 steps | ret(last 50)=-6.31 sr=0%]
  ... [trial 1 | mixed | 900,000 steps | ret(last 50)=-12.68 sr=0%]
  ... [trial 1 | mixed | 1,000,000 steps | ret(last 50)=-11.06 sr=0%]
  ... [trial 1 | mixed | 1,100,000 steps | ret(last 50)=-13.39 sr=0%]
  ... [trial 1 | mixed | 1,200,000 steps | ret(last 50)=-14.20 sr=0%]
  ... [trial 1 | mixed | 1,300,000 steps | ret(last 50)=-11.33 sr=0%]
  ... [trial 1 | mixed | 1,400,000 steps | ret(last 50)=-10.73 sr=0%]
  ... [trial 1 | mixed | 1,500,000 steps | ret(last 50)=-10.91 sr=0%]
  ... [trial 1 | mixed | 1,600,000 steps | ret(last 50)=-10.44 sr=0%]
  ... [trial 1 | mixed | 1,700,000 steps | ret(last 50)=-10.56 sr=0%]
  ... [trial 1 | mixed | 1,800,000 steps | ret(last 50)=-15.74 sr=0%]
  ... [trial 1 | mixed | 1,900,000 steps | ret(last 50)=-13.46 sr=0%]
  ... [trial 1 | mixed | 2,000,000 steps | ret(last 50)=-9.86 sr=0%]
  ... [trial 1 | mixed | 2,100,000 steps | ret(last 50)=-13.07 sr=0%]
  ... [trial 1 | mixed | 2,200,000 steps | ret(last 50)=-9.86 sr=0%]
  ... [trial 1 | mixed | 2,300,000 steps | ret(last 50)=-9.73 sr=2%]
  ... [trial 1 | mixed | 2,400,000 steps | ret(last 50)=-12.21 sr=0%]
  ... [trial 1 | mixed | 2,500,000 steps | ret(last 50)=-14.27 sr=0%]
  ... [trial 1 | mixed | 2,600,000 steps | ret(last 50)=-10.90 sr=2%]
  ... [trial 1 | mixed | 2,700,000 steps | ret(last 50)=-9.67 sr=0%]
  ... [trial 1 | mixed | 2,800,000 steps | ret(last 50)=-14.29 sr=0%]
  ... [trial 1 | mixed | 2,900,000 steps | ret(last 50)=-9.08 sr=0%]
  ... [trial 1 | mixed | 3,000,000 steps | ret(last 50)=-11.62 sr=6%]
[Mixed] evaluating n=1, 30 eps
[Mixed] n_sheep=1 sr=0% mean_len=1500 mean_min_pen=12.1m mean_act=0.64
[Mixed] evaluating n=2, 30 eps
[Mixed] n_sheep=2 sr=0% mean_len=1500 mean_min_pen=13.6m mean_act=1.12
[Mixed] evaluating n=3, 30 eps
[Mixed] n_sheep=3 sr=0% mean_len=1500 mean_min_pen=13.3m mean_act=1.02
============================================================
REPLAY SUMMARY
============================================================
n_sheep=1  sr=  0%  len= 1500  min_pen= 12.1m  act=0.64
n_sheep=2  sr=  0%  len= 1500  min_pen= 13.6m  act=1.12
n_sheep=3  sr=  0%  len= 1500  min_pen= 13.3m  act=1.02
Total time: 20.6 min
Artefacts: runs/expB_mixed/
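
The progress entries above follow a regular pattern, so they can be pulled into structured rows for plotting or comparison across runs. The following is a minimal sketch, not part of the original tooling: the names `PROGRESS_RE` and `parse_progress` are hypothetical, and the pattern assumes the exact entry format seen in this log.

```python
import re

# Hypothetical helper: matches progress entries in the format seen in this log, e.g.
# [trial 1 | mixed | 100,000 steps | ret(last 50)=-13.68 sr=2%]
PROGRESS_RE = re.compile(
    r"\[trial (?P<trial>\d+) \| (?P<mode>\w+) \| (?P<steps>[\d,]+) steps \| "
    r"ret\(last 50\)=(?P<ret>-?\d+(?:\.\d+)?) sr=(?P<sr>\d+)%\]"
)

def parse_progress(text):
    """Return one dict per progress entry found anywhere in `text`."""
    rows = []
    for m in PROGRESS_RE.finditer(text):
        rows.append({
            "trial": int(m["trial"]),
            "mode": m["mode"],
            "steps": int(m["steps"].replace(",", "")),  # drop thousands separators
            "ret": float(m["ret"]),                     # mean return over last 50 episodes
            "sr": int(m["sr"]),                         # success rate in percent
        })
    return rows

# Three entries copied from the log above.
sample = (
    "... [trial 1 | mixed | 100,000 steps | ret(last 50)=-13.68 sr=2%]\n"
    "... [trial 1 | mixed | 800,000 steps | ret(last 50)=-6.31 sr=0%]\n"
    "... [trial 1 | mixed | 3,000,000 steps | ret(last 50)=-11.62 sr=6%]\n"
)
rows = parse_progress(sample)
best = max(rows, key=lambda r: r["ret"])
print(best["steps"], best["ret"])  # 800000 -6.31
```

Because `finditer` scans the whole string, the same function should work whether the log is stored one entry per line or flattened into a single line.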