OpenGait/docs/scoliosis_next_experiments.md
Scoliosis: Next Experiments

This note is the short operational plan for the next scoliosis experiments. It is written to be runnable by someone who did not follow the full debugging history.

Related notes:

Current Best Known Result

Current practical winner:

  • model family: ScoNet-MT-ske
  • split: 1:1:2 (Scoliosis1K_112.json)
  • representation: body-only
  • loss: plain CE + triplet
  • optimizer path: AdamW cosine finetune (on top of the earlier SGD-bridge baseline)

Best verified checkpoints:

  • best macro-F1:
    • checkpoint: artifact/scoliosis_sconet_112_bodyonly_plaince_adamw_cosine/ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_adamw_cosine_finetune_1gpu_80k-iter-27000-score-0.8870-scalar_test_f1.pt
    • full test: 92.38 Acc / 90.30 Prec / 87.39 Rec / 88.70 F1
  • best accuracy:
    • checkpoint: artifact/scoliosis_sconet_112_bodyonly_plaince_adamw_cosine/ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_adamw_cosine_finetune_1gpu_80k-iter-64000-score-0.9425-scalar_test_accuracy.pt
    • full test: 94.25 Acc / 83.24 Prec / 95.76 Rec / 87.63 F1

Branches Already Tried

These are the main branches already explored, so the next person does not rerun the same ideas blindly.

Representation branches

  • full-body
    • original all-joints skeleton-map path
    • useful as a reference, but the practical winner did not come from this branch
  • body-only
    • removed face/head keypoints and head limbs
    • this became the strongest practical representation
    • current best checkpoint family comes from this branch
  • head-lite
    • added back limited head information, mainly nose and shoulder-linked head context
    • looked promising on the small fixed proxy subset
    • lost on the full test set, so it is not the current winner
  • geom-fix
    • aspect-ratio-preserving geometry/padding experiment instead of the older square-warp behavior
    • useful for debugging, but not the branch that produced the best practical result
  • no-cut
    • removed the silhouette-style width cut at runtime
    • not the branch that ultimately won

Loss / optimization branches

  • weighted CE
    • helped on harder imbalance settings and some short proxy runs
    • did not beat plain CE on the practical 1:1:2 path
  • plain CE
    • better than weighted CE on the working 1:1:2 setup
    • remains the default practical choice
  • SGD bridge
    • gave the first strong practical skeleton baseline
  • AdamW multistep finetune
    • unstable and often worse
  • AdamW cosine finetune
    • this is the branch that finally produced the retained best checkpoints

Model branches

  • ScoNet-MT-ske
    • practical winner so far
  • DRF
    • implemented and tested
    • underperformed the plain skeleton baseline in the first serious practical run
    • still worth one final warm-start retry, but not the default winner

What Not To Do First

Do not start here:

  • do not treat DRF as the mainline model yet
  • do not go back to 1:1:8 first unless the goal is specifically hard-imbalance study
  • do not spend time on more visualization work first
  • do not keep changing preprocessing and optimizer at the same time

Reason:

  • the current plain skeleton baseline is already strong
  • the first practical DRF run underperformed that baseline badly
  • broad exploratory tuning is lower-value than a few controlled confirmation experiments

Priority Order

Run experiments in this order.

1. Full-body control under the winning recipe

Goal:

  • verify that body-only is truly the best practical representation

Keep fixed:

  • split: 1:1:2
  • optimizer path: the same AdamW cosine finetune that produced the current best checkpoints
  • loss: plain CE + triplet
  • scheduler style: cosine, not the old hard multistep decay

Change only:

  • representation: full-body instead of body-only

Success criterion:

  • only keep full-body if it beats the current best body-only checkpoint on full-test macro-F1

If full-body loses:

  • stop
  • keep body-only as the default practical representation
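
To keep this a true single-variable control, the run config can be derived from the winning config with exactly one field changed. A minimal sketch of that discipline (the key names here are illustrative, not the actual OpenGait YAML keys):

```python
# Hypothetical config for the current winning recipe.
# Key names are illustrative; map them onto the real OpenGait config.
baseline = {
    "split": "1:1:2",
    "representation": "body-only",
    "loss": "plain CE + triplet",
    "optimizer": "AdamW",
    "scheduler": "cosine",
}

# Derive the control run by changing exactly one field.
control = dict(baseline, representation="full-body")

# Sanity check: the control differs from the baseline in one key only.
changed = {k for k in baseline if baseline[k] != control[k]}
assert changed == {"representation"}
```

If the diff between the two run configs ever grows beyond that one key, the comparison is no longer a clean representation test.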

2. One serious DRF retry, but only on top of the strong baseline recipe

Goal:

  • test whether DRF can add value once the skeleton baseline is already strong

Recommended setup:

  • split: 1:1:2
  • representation: start from the same body-only skeleton maps
  • initialization: warm-start from the strong skeleton model, not from random init
  • optimizer: small-LR AdamW
  • scheduler: cosine

Recommended strategy:

  1. restore the strong skeleton checkpoint weights
  2. initialize DRF from that visual branch
  3. use a smaller LR than the plain baseline finetune
  4. if needed, freeze or partially freeze the backbone for a short warmup so the PAV/PGA branch learns without immediately disturbing the already-good visual branch
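
Steps 1-4 can be sketched in PyTorch. The module and checkpoint contents below are placeholders, not the real ScoNet/DRF classes; the point is the strict=False warm-start load plus a temporary backbone freeze:

```python
import torch
import torch.nn as nn

# Placeholder modules standing in for the real ScoNet backbone and DRF head.
class TinyBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Linear(8, 8)

class DRFModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = TinyBackbone()       # shared with the skeleton model
        self.prior_branch = nn.Linear(8, 2)  # new PAV/PGA-style branch

model = DRFModel()

# 1-2. Warm-start: load the strong skeleton weights. strict=False tolerates
# the extra keys the DRF-only branch adds on top of the shared backbone.
skeleton_ckpt = {"backbone.conv.weight": torch.zeros(8, 8),
                 "backbone.conv.bias": torch.zeros(8)}
missing, unexpected = model.load_state_dict(skeleton_ckpt, strict=False)

# 4. Short warmup: freeze the already-good visual branch.
for p in model.backbone.parameters():
    p.requires_grad = False

# 3. Small-LR AdamW over the trainable (new-branch) parameters only.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-5)
```

After the warmup, the backbone can be unfrozen and the whole model finetuned at the same small LR.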

Why:

  • the earlier DRF bridge peaked early and then degraded
  • local evidence suggests the prior branch is currently weak or noisy
  • DRF deserves one fair test from a strong starting point, not another scratch run

Success criterion:

  • DRF must beat the current best plain skeleton checkpoint on full-test macro-F1
  • if it does not, stop treating DRF as the practical winner

3. Only after that, optional optimizer confirmation

Goal:

  • confirm that the AdamW cosine win was not just a lucky branch

Options:

  • rerun the winning body-only recipe once more with the same finetune schedule
  • or compare one cleaner SGD continuation versus AdamW cosine finetune from the same checkpoint
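
For reference, the cosine finetune schedule being confirmed here is the standard half-cosine decay with no hard multistep drops. A dependency-free sketch (lr_max, lr_min, and total_iters are placeholders, not the recorded run settings):

```python
import math

def cosine_lr(step, total_iters, lr_max=1e-4, lr_min=1e-6):
    """Half-cosine decay from lr_max at step 0 to lr_min at total_iters."""
    t = min(step, total_iters) / total_iters
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t))

# Starts at the peak LR and decays smoothly to the floor.
assert abs(cosine_lr(0, 80_000) - 1e-4) < 1e-12
assert abs(cosine_lr(80_000, 80_000) - 1e-6) < 1e-12
```

torch.optim.lr_scheduler.CosineAnnealingLR implements the same curve if the comparison is run inside the existing training loop.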

This is lower priority because:

  • we already have a verified strong artifact checkpoint
  • the practical problem is solved well enough for use

Training

For long detached runs, launch training under systemd-run so the job does not depend on an interactive shell.

Use:

  • output_root: /mnt/hddl/data/OpenGait-output
  • save_iter: 500
  • eval_iter: 1000
  • best_ckpt_cfg.keep_n: 3
  • metrics:
    • scalar/test_f1/
    • scalar/test_accuracy/

This avoids losing a strong checkpoint between evals and keeps large checkpoints off the SSD.
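
The keep_n: 3 setting matters because it bounds disk use while never discarding the current best checkpoint for a tracked metric. A sketch of the assumed retention behavior (an illustration of the intent, not the actual best_ckpt_cfg implementation):

```python
def retain_best(checkpoints, keep_n=3):
    """Keep the keep_n highest-scoring checkpoints for one metric.

    checkpoints: list of (score, path) pairs, one per saved eval.
    Returns the retained pairs, best first.
    """
    return sorted(checkpoints, key=lambda sp: sp[0], reverse=True)[:keep_n]

saved = [(0.8512, "iter-20000.pt"), (0.8870, "iter-27000.pt"),
         (0.8641, "iter-31000.pt"), (0.8233, "iter-35000.pt")]
kept = retain_best(saved, keep_n=3)
# The weakest checkpoint is dropped; the 0.8870 best is always retained.
```

With two tracked metrics (scalar/test_f1 and scalar/test_accuracy), this retention runs per metric, which is why both a best-F1 and a best-accuracy checkpoint survive.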

Decision Rules

Use these rules so the experiment search does not drift again.

Promote a new run only if:

  • it improves full-test macro-F1 over the current best retained checkpoint
  • and the gain survives a standalone eval rerun from the saved checkpoint
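
"Full-test macro-F1" here means the unweighted mean of per-class F1 scores, which is not the same as the F1 of macro-averaged precision and recall. A self-contained reference implementation, useful for sanity-checking reported scores (this is not the project's actual eval code):

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Two classes: class 1 has F1 = 2/3, class 0 has F1 = 0.8 -> macro ~0.7333.
score = macro_f1([1, 1, 0, 0], [1, 0, 0, 0])
```

Running this over the full-test predictions from a standalone eval is the number that decides promotion.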

Stop a branch if:

  • it is clearly below the current best baseline
  • and its later checkpoints are not trending upward

Do not declare a winner from:

  • a proxy subset alone
  • TensorBoard scalars alone
  • an unsaved checkpoint implied only by logs

Always verify the claimed winner by:

  1. standalone eval from the checkpoint file
  2. writing the result into the changelog and audit
  3. copying the final selected checkpoint into artifact/ if it becomes the new best

Short Recommendation

If only one more experiment is going to be run, run this:

  • full-body under the same successful 1:1:2 + plain CE + AdamW cosine recipe

If two more experiments are going to be run, run:

  1. full-body control
  2. DRF warm-start finetune from the strong body-only skeleton checkpoint

That is the highest-value next work from the current state.