feat: retain best checkpoints and support alternate output roots
@@ -124,6 +124,33 @@ The launcher configures both:
This makes it easier to recover logs even if the original shell or tool session disappears.

## Moving outputs off the SSD

OpenGait writes checkpoints, TensorBoard summaries, best-checkpoint snapshots, and file logs under a run output root.

By default that root is `output/`, but you can override it per run with `output_root` in the engine config:

```yaml
trainer_cfg:
  output_root: /mnt/hddl/data/OpenGait-output

evaluator_cfg:
  output_root: /mnt/hddl/data/OpenGait-output
```

The final path layout stays the same under that root:

```text
<output_root>/<dataset>/<model>/<save_name>/
```
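
As a rough illustration of the layout above, the sketch below builds the run directory from a config dict. The helper `resolve_run_dir` and the dataset, model, and save names are hypothetical placeholders; OpenGait's actual resolution logic may differ.

```python
from pathlib import Path

DEFAULT_OUTPUT_ROOT = "output"  # default root named in the text above


def resolve_run_dir(cfg: dict, dataset: str, model: str, save_name: str) -> Path:
    """Build <output_root>/<dataset>/<model>/<save_name> for one run.

    Hypothetical helper for illustration only, not OpenGait's own code.
    """
    # Fall back to the default root when the config has no output_root.
    root = cfg.get("output_root") or DEFAULT_OUTPUT_ROOT
    return Path(root) / dataset / model / save_name


# Placeholder names; substitute the values from your own config.
trainer_cfg = {"output_root": "/mnt/hddl/data/OpenGait-output"}
print(resolve_run_dir(trainer_cfg, "MyDataset", "MyModel", "my_run"))
print(resolve_run_dir({}, "MyDataset", "MyModel", "my_run"))
```

With an override the run lands under the configured root; without one it stays under `output/`, and the three trailing components are the same in both cases.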

For long scoliosis runs, using an HDD-backed root is recommended so local SSD space is not consumed by:

- numbered checkpoints
- rolling resume checkpoints
- best-N retained checkpoints
- TensorBoard summary files
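
Before a long run it can help to check how much space the chosen root actually has. The following is a minimal sketch using only the Python standard library; the helper name `free_gib` is hypothetical and is not part of OpenGait.

```python
import shutil
from pathlib import Path


def free_gib(path: str) -> float:
    """Return free space in GiB on the filesystem that would hold `path`."""
    probe = Path(path)
    # Walk up to the nearest existing ancestor so a root that has not
    # been created yet can still be checked.
    while not probe.exists() and probe != probe.parent:
        probe = probe.parent
    return shutil.disk_usage(probe).free / 2**30


# Compare the default root with an alternate one before launching.
for root in ("output", "/mnt/hddl/data/OpenGait-output"):
    print(f"{root}: {free_gib(root):.1f} GiB free")
```

If the alternate root does not exist yet, the figure reported is for whichever parent directory does exist, so confirm the mount is in place before relying on it.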

## GPU selection

Prefer GPU UUIDs, not ordinal indices.
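
One way to obtain UUIDs is to parse the output of `nvidia-smi -L` and pass the result through `CUDA_VISIBLE_DEVICES`, which accepts `GPU-<uuid>` identifiers as well as indices. The sketch below works on an assumed sample listing with placeholder UUIDs, and `parse_gpu_uuids` is a hypothetical helper. How the launcher itself expects to receive the selection is not shown here.

```python
import os
import re

# Assumed sample of `nvidia-smi -L` output; the UUIDs are placeholders.
SAMPLE_LISTING = """\
GPU 0: NVIDIA GeForce RTX 3090 (UUID: GPU-11111111-2222-3333-4444-555555555555)
GPU 1: NVIDIA GeForce RTX 3090 (UUID: GPU-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee)
"""


def parse_gpu_uuids(listing: str) -> list[str]:
    """Extract the GPU-... UUIDs from an `nvidia-smi -L` style listing."""
    return re.findall(r"\(UUID: (GPU-[0-9a-fA-F-]+)\)", listing)


uuids = parse_gpu_uuids(SAMPLE_LISTING)

# Pin the process to the first listed GPU by UUID. This must be set
# before any CUDA context is created in the process.
os.environ["CUDA_VISIBLE_DEVICES"] = uuids[0]
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

A UUID keeps pointing at the same physical card even if the enumeration order of the devices changes, which is the main reason to prefer it over an index.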