Checkpoint 8
+6
-20
@@ -1,4 +1,9 @@
-# Training pipeline
+# Training and Evaluation Details
+
+This file is the command-level companion to the root README. It focuses
+on data collection, BC, PPO fine-tuning, evaluation flags, and generated
+artifacts; use the root README for the high-level architecture and
+Webots demo quick start.
 
 Two stages, strictly sequential:
 
@@ -26,16 +31,6 @@ runs/ — checkpoints (whitelisted entries in top-level .gitignore)
 run with ``python -m pytest tests/``.)
 ```
 
-## Setup
-
-```
-pip install -r requirements.txt
-```
-
-CPU is the default and recommended device — SB3 PPO with an MLP policy
-of this size runs faster on CPU than GPU because the bottleneck is
-rollout collection, not gradient compute.
-
 ## End-to-end pipeline
 
 The simplest way to run everything is the Makefile at the project
@@ -93,12 +88,3 @@ python -m training.eval --policy strombom --max-flock 10 --max-steps 15000 --
 python -m training.eval --policy sequential --max-flock 10 --max-steps 15000 --n-seeds 10
 ```
 
-## Webots inference
-
-```
-tools/run_webots.sh 10 bc # or rl, strombom, sequential
-```
-
-The dog controller loads `runs/bc` for `bc` mode and `runs/rl` for
-`rl` mode. Override with `HERDING_POLICY_DIR=…` for a specific
-checkpoint.
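
Note on the removed "Webots inference" text: the checkpoint lookup it describes (default `runs/bc` or `runs/rl` by mode, overridden by `HERDING_POLICY_DIR`) could look roughly like the sketch below. This is an illustration only, not the controller's actual code. The function name `resolve_policy_dir`, the `DEFAULT_POLICY_DIRS` table, and the choice to return `None` for the scripted `strombom` and `sequential` modes are all assumptions.

```python
import os
from pathlib import Path

# Assumed defaults, taken from the removed README paragraph.
DEFAULT_POLICY_DIRS = {"bc": Path("runs/bc"), "rl": Path("runs/rl")}


def resolve_policy_dir(mode, env=None):
    """Return the checkpoint directory for a mode (hypothetical helper).

    Modes without a learned policy (assumed here: anything other than
    "bc" and "rl") return None because there is nothing to load.
    """
    env = os.environ if env is None else env
    if mode not in DEFAULT_POLICY_DIRS:
        return None
    # A non-empty HERDING_POLICY_DIR takes precedence over the default.
    override = env.get("HERDING_POLICY_DIR")
    return Path(override) if override else DEFAULT_POLICY_DIRS[mode]
```

Passing the environment as a plain mapping keeps the lookup easy to check without touching the real process environment, for example `resolve_policy_dir("rl", {"HERDING_POLICY_DIR": "runs/rl_alt"})`, where `runs/rl_alt` is a made-up path.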