Scoliosis: Next Experiments
This note is the short operational plan for the next scoliosis experiments. It is written to be runnable by someone who did not follow the full debugging history.
Related notes:
- ScoNet and DRF: Status, Architecture, and Reproduction Notes
- Scoliosis Training Change Log
- Scoliosis Reproducibility Audit
- systemd-run Training
Current Best Known Result
Current practical winner:
- model family: ScoNet-MT-ske
- split: 1:1:2 (Scoliosis1K_112.json)
- representation: body-only
- loss: plain CE + triplet
- optimizer path: later AdamW cosine finetune
Best verified checkpoints:
- best macro-F1:
  - checkpoint: artifact/scoliosis_sconet_112_bodyonly_plaince_adamw_cosine/ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_adamw_cosine_finetune_1gpu_80k-iter-27000-score-0.8870-scalar_test_f1.pt
  - full test: 92.38 Acc / 90.30 Prec / 87.39 Rec / 88.70 F1
- best accuracy:
  - checkpoint: artifact/scoliosis_sconet_112_bodyonly_plaince_adamw_cosine/ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_adamw_cosine_finetune_1gpu_80k-iter-64000-score-0.9425-scalar_test_accuracy.pt
  - full test: 94.25 Acc / 83.24 Prec / 95.76 Rec / 87.63 F1
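Checkpoint selection here keys on full-test macro-F1. Note that macro-F1 is the mean of per-class F1 scores, so it generally differs from an F1 computed from the reported macro precision and recall. A minimal plain-Python sketch of the metric (binary labels assumed; the real eval pipeline computes it during the standalone checkpoint eval):

```python
def per_class_f1(y_true, y_pred, cls):
    """F1 for one class, treating that class as the positive label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

def macro_f1(y_true, y_pred, classes=(0, 1)):
    # mean of per-class F1, so minority-class mistakes are not washed out
    return sum(per_class_f1(y_true, y_pred, c) for c in classes) / len(classes)
```

The class encoding (0 = negative, 1 = positive) is illustrative; the point is that each class contributes its F1 equally regardless of support.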
Branches Already Tried
These are the main branches already explored, so the next person does not rerun the same ideas blindly.
Representation branches
full-body: original all-joints skeleton-map path
- useful as a reference, but the practical winner did not come from this branch
body-only: removed face/head keypoints and head limbs
- this became the strongest practical representation
- the current best checkpoint family comes from this branch
head-lite: added back limited head information, mainly nose and shoulder-linked head context
- looked promising on the small fixed proxy subset
- lost on the full test set, so it is not the current winner
geom-fix: aspect-ratio-preserving geometry/padding experiment instead of the older square-warp behavior
- useful for debugging, but not the branch that produced the best practical result
no-cut: removed the silhouette-style width cut at runtime
- not the branch that ultimately won
Loss / optimization branches
weighted CE: helped on harder imbalance settings and some short proxies
- did not beat plain CE on the practical 1:1:2 path
plain CE: better than weighted CE on the working 1:1:2 setup
- remains the default practical choice
SGD bridge: gave the first strong practical skeleton baseline
AdamW multistep finetune: unstable and often worse
AdamW cosine finetune: the branch that finally produced the retained best checkpoints
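For reference, the cosine finetune schedule that won here follows the standard cosine-decay shape. A generic sketch (the actual base LR, floor, and step count live in the run configs; the values below are placeholders):

```python
import math

def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Standard cosine decay from base_lr down to min_lr over total_steps.

    Generic sketch of the schedule family used in the AdamW cosine
    finetune; not a copy of the training framework's scheduler.
    """
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

Compared with the old hard multistep decay, the LR falls smoothly, which is consistent with the multistep finetune being "unstable and often worse" above.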
Model branches
ScoNet-MT-ske: practical winner so far
DRF: implemented and tested
- underperformed the plain skeleton baseline in the first serious practical run
- still worth one final warm-start retry, but not the default winner
What Not To Do First
Do not start here:
- do not treat DRF as the mainline model yet
- do not go back to 1:1:8 first unless the goal is specifically hard-imbalance study
- do not spend time on more visualization work first
- do not keep changing preprocessing and optimizer at the same time
Reason:
- the current plain skeleton baseline is already strong
- the first practical DRF run underperformed that baseline badly
- broad exploratory tuning is lower-value than a few controlled confirmation experiments
Priority Order
Run experiments in this order.
1. Full-body control under the winning recipe
Goal:
- verify that body-only is truly the best practical representation
Keep fixed:
- split: 1:1:2
- optimizer path: same successful AdamW cosine finetune style
- loss: plain CE + triplet
- scheduler style: cosine, not the old hard multistep decay
Change only:
- representation: full-body instead of body-only
Success criterion:
- only keep full-body if it beats the current best body-only checkpoint on full-test macro-F1
If full-body loses:
- stop
- keep body-only as the default practical representation
2. One serious DRF retry, but only on top of the strong baseline recipe
Goal:
- test whether DRF can add value once the skeleton baseline is already strong
Recommended setup:
- split: 1:1:2
- representation: start from the same body-only skeleton maps
- initialization: warm-start from the strong skeleton model, not from random init
- optimizer: small-LR AdamW
- scheduler: cosine
Recommended strategy:
- restore the strong skeleton checkpoint weights
- initialize DRF from that visual branch
- use a smaller LR than the plain baseline finetune
- if needed, freeze or partially freeze the backbone for a short warmup so the PAV/PGA branch learns without immediately disturbing the already-good visual branch
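The restore-and-warm-start steps above can be sketched as a name-matched state-dict copy plus a backbone freeze. This is a framework-free sketch over plain name-to-value mappings, since the exact DRF module names are not pinned down in this note; the `backbone.` prefix is illustrative, and a real loader would also check tensor shapes before copying:

```python
def warm_start(drf_params, skeleton_ckpt):
    """Name-matched copy: every weight present in both models is taken from
    the strong skeleton checkpoint; DRF-only branches (e.g. the PAV/PGA
    heads) keep their fresh init. Sketch only, not the actual loader.
    """
    loaded, skipped = [], []
    for name, value in skeleton_ckpt.items():
        if name in drf_params:
            drf_params[name] = value
            loaded.append(name)
        else:
            skipped.append(name)
    return loaded, skipped

def freeze_backbone(requires_grad_flags, prefix="backbone."):
    # the short-warmup trick from the note: freeze the restored visual
    # branch so the new branch can learn without disturbing it
    for name in requires_grad_flags:
        if name.startswith(prefix):
            requires_grad_flags[name] = False
    return requires_grad_flags
```

After the warmup, the frozen flags would be flipped back on and the whole model finetuned at the small LR.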
Why:
- the earlier DRF bridge peaked early and then degraded
- local evidence suggests the prior branch is currently weak or noisy
- DRF deserves one fair test from a strong starting point, not another scratch run
Success criterion:
- DRF must beat the current best plain skeleton checkpoint on full-test macro-F1
- if it does not, stop treating DRF as the practical winner
3. Only after that, optional optimizer confirmation
Goal:
- confirm that the AdamW cosine win was not just a lucky branch
Options:
- rerun the winning body-only recipe once more with the same finetune schedule
- or compare one cleaner SGD continuation versus AdamW cosine finetune from the same checkpoint
This is lower priority because:
- we already have a verified strong artifact checkpoint
- the practical problem is solved well enough for use
Recommended Commands Pattern
For long detached runs, use systemd-run Training.
Use:
- output_root: /mnt/hddl/data/OpenGait-output
- save_iter: 500
- eval_iter: 1000
- best_ckpt_cfg.keep_n: 3
- metrics: scalar/test_f1/ and scalar/test_accuracy/
This avoids losing a strong checkpoint between evals and keeps large checkpoints off the SSD.
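The keep-n-best retention behind best_ckpt_cfg.keep_n can be sketched as a simple rank-and-prune over evaluated checkpoints. This is a generic re-implementation of the policy for illustration, not the actual OpenGait code:

```python
def retain_best(scored_ckpts, keep_n=3):
    """Given (path, score) pairs from eval (score being the tracked metric,
    e.g. scalar/test_f1), return the keep_n checkpoints to keep and the
    rest to delete. Generic sketch of a keep-n-best retention policy.
    """
    ranked = sorted(scored_ckpts, key=lambda ps: ps[1], reverse=True)
    return ranked[:keep_n], ranked[keep_n:]
```

With save_iter 500 and eval_iter 1000, every saved checkpoint gets scored before it can be pruned, which is what prevents losing a strong checkpoint between evals.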
Decision Rules
Use these rules so the experiment search does not drift again.
Promote a new run only if:
- it improves full-test macro-F1 over the current best retained checkpoint
- and the gain survives a standalone eval rerun from the saved checkpoint
Stop a branch if:
- it is clearly below the current best baseline
- and its later checkpoints are not trending upward
Do not declare a winner from:
- a proxy subset alone
- TensorBoard scalars alone
- an unsaved checkpoint implied only by logs
Always verify the claimed winner by:
- standalone eval from the checkpoint file
- writing the result into the changelog and audit
- copying the final selected checkpoint into artifact/ if it becomes the new best
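The promote/stop rules above can be written down as small predicates so they are applied the same way every time. A sketch; the tolerance, the 0.02 margin, and the three-checkpoint trend window are illustrative judgment calls, not values fixed in this note:

```python
def should_promote(candidate_f1, best_f1, rerun_f1=None, tol=1e-4):
    """Promote only if full-test macro-F1 beats the retained best AND the
    gain survives a standalone eval rerun from the saved checkpoint."""
    if candidate_f1 <= best_f1 + tol:
        return False
    return rerun_f1 is not None and rerun_f1 > best_f1 + tol

def should_stop_branch(recent_f1s, best_f1, margin=0.02):
    """Stop a branch if it is clearly below the current best and its later
    checkpoints are not trending upward. Margin and window are illustrative."""
    if len(recent_f1s) < 3:
        return False
    clearly_below = max(recent_f1s) < best_f1 - margin
    trending_up = recent_f1s[-1] > recent_f1s[0]
    return clearly_below and not trending_up
```

A result seen only in TensorBoard scalars or on a proxy subset never reaches should_promote: the rerun_f1 argument forces the standalone eval from the checkpoint file first.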
Short Recommendation
If only one more experiment is going to be run, run this:
- full-body under the same successful 1:1:2 + plain CE + AdamW cosine recipe
If two more experiments are going to be run, run:
- full-body control
- DRF warm-start finetune from the strong body-only skeleton checkpoint
That is the highest-value next work from the current state.