feat: retain best checkpoints and support alternate output roots

2026-03-11 01:14:05 +08:00
parent 63e2ed1097
commit a0150c791f
14 changed files with 852 additions and 9 deletions
@@ -124,6 +124,33 @@ The launcher configures both:

 This makes it easier to recover logs even if the original shell or tool session disappears.

+## Moving outputs off the SSD
+
+OpenGait writes checkpoints, TensorBoard summaries, best-checkpoint snapshots, and file logs under a run output root.
+
+By default that root is `output/`, but you can override it per run with `output_root` in the engine config:
+
+```yaml
+trainer_cfg:
+  output_root: /mnt/hddl/data/OpenGait-output
+
+evaluator_cfg:
+  output_root: /mnt/hddl/data/OpenGait-output
+```
+
+The final path layout stays the same under that root:
+
+```text
+<output_root>/<dataset>/<model>/<save_name>/
+```
+
+For long scoliosis runs, an HDD-backed `output_root` is recommended so that the following artifacts do not consume local SSD space:
+
+- numbered checkpoints
+- rolling resume checkpoints
+- best-N retained checkpoints
+- TensorBoard summary files
+
 ## GPU selection

 Prefer GPU UUIDs, not ordinal indices.