OpenGait/docs/scoliosis_training_change_log.md

# Scoliosis Training Change Log

This file is the single run-by-run changelog for Scoliosis1K training and evaluation in this repo.

Use it for:
- what changed between runs
- which dataset/config/checkpoint was used
- what the resulting metrics were
- whether a run is still in progress

## Conventions

- Add one entry before launching a new training run.
- Update the same entry after training/eval completes.
- Record only the delta from the previous relevant run, not a full config dump.
- For skeleton-map control runs, use plain-text `ScoNet-MT-ske` naming in the notes even though the code class is `ScoNet`.

## Runs

| Date | Run | Model | Dataset | Main change vs previous relevant run | Status | Eval result |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 2026-03-07 | `DRF` | DRF | `Scoliosis1K-drf-pkl-118` | First OpenGait DRF integration on paper `1:1:8` split using shared OpenGait skeleton/PAV path | complete | `58.08 Acc / 78.80 Prec / 60.22 Rec / 56.99 F1` |
| 2026-03-08 | `DRF_paper` | DRF | `Scoliosis1K-drf-pkl-118-paper` | More paper-literal preprocessing: summed/sparser heatmap path, dataset-level PAV normalization, body-prior path refinement | complete | `51.67 Acc / 72.37 Prec / 56.22 Rec / 50.92 F1` |
| 2026-03-08 | `ScoNet_skeleton_118` | ScoNet-MT-ske control | `Scoliosis1K-drf-pkl-118-paper` | Plain skeleton-map baseline on the paper-literal export to isolate DRF vs skeleton-path failure | complete | `38.85 Acc / 61.23 Prec / 46.75 Rec / 35.96 F1` |
| 2026-03-08 | `ScoNet_skeleton_118_sigma8` | ScoNet-MT-ske control | `Scoliosis1K_sigma_8.0/pkl` | Reused upstream/default sigma-8 heatmap export instead of the DRF paper-literal export | complete | `36.45 Acc / 69.17 Prec / 43.82 Rec / 32.78 F1` |
| 2026-03-08 | `ScoNet_skeleton_118_sigma15_bs12x8` | ScoNet-MT-ske control | `Scoliosis1K-drf-pkl-118-sigma15` | Lowered skeleton-map sigma from `8.0` to `1.5` to tighten the pose rasterization | complete | `46.33 Acc / 68.09 Prec / 51.92 Rec / 44.69 F1` |
| 2026-03-09 | `ScoNet_skeleton_118_sigma15_joint8_sharedalign_2gpu_bs12x8` | ScoNet-MT-ske control | `Scoliosis1K-drf-pkl-118-sigma15-joint8-sharedalign` | Fixed limb/joint channel misalignment, used mixed sigma `limb=1.5 / joint=8.0`, kept SGD | complete | `50.47 Acc / 69.31 Prec / 54.58 Rec / 48.63 F1` |
| 2026-03-09 | `ScoNet_skeleton_118_sigma15_joint8_limb4_adamw_2gpu_bs12x8` | ScoNet-MT-ske control | `Scoliosis1K-drf-pkl-118-sigma15-joint8-sharedalign-limb4` | Rebalanced channel intensity with `limb_gain=4.0`; switched optimizer from `SGD` to `AdamW` | complete | `48.60 Acc / 65.97 Prec / 53.19 Rec / 46.41 F1` |
| 2026-03-09 | `ScoNet_skeleton_118_sigma15_joint8_sharedalign_nocut_adamw_1gpu_bs8x8` | ScoNet-MT-ske control | `Scoliosis1K-drf-pkl-118-sigma15-joint8-sharedalign` | Switched runtime transform from `BaseSilCuttingTransform` to `BaseSilTransform` (`no-cut`), kept `AdamW`, reduced `8x8` due to 5070 Ti OOM at `12x8` | interrupted | superseded by proxy route before eval |
| 2026-03-09 | `ScoNet_skeleton_118_sigma15_joint8_sharedalign_nocut_adamw_proxy_1gpu` | ScoNet-MT-ske proxy | `Scoliosis1K-drf-pkl-118-sigma15-joint8-sharedalign` | Fast proxy route: `no-cut`, `AdamW`, `8x8`, `total_iter=2000`, `eval_iter=500`, `test_seq_subset_size=128` | interrupted | superseded by geometry-fixed proxy before completion |
| 2026-03-10 | `ScoNet_skeleton_118_sigma15_joint8_geomfix_proxy_1gpu` | ScoNet-MT-ske proxy | `Scoliosis1K-drf-pkl-118-sigma15-joint8-geomfix` | Geometry ablation: aspect-ratio-preserving crop+pad instead of square-warp resize; `AdamW`, `no-cut`, `8x8`, `total_iter=2000`, `eval_iter=500`, fixed test subset seed `118` | complete | proxy subset unstable: `500 24.22/8.07/33.33/13.00`, `1000 60.16/68.05/58.13/55.25`, `1500 26.56/58.33/35.64/17.68`, `2000 27.34/63.96/37.02/20.14` (Acc/Prec/Rec/F1) |
| 2026-03-10 | `ScoNet_skeleton_118_sigma15_joint8_sharedalign_weightedce_proxy_1gpu` | ScoNet-MT-ske proxy | `Scoliosis1K-drf-pkl-118-sigma15-joint8-sharedalign` | Training-side imbalance ablation: kept the current best shared-align geometry, restored `SGD` baseline settings, and applied weighted CE with class weights `[1.0, 4.0, 4.0]`; `total_iter=2000`, `eval_iter=500`, fixed test subset seed `118` | complete | `500 24.22/8.07/33.33/13.00`, `1000 71.09/48.12/53.93/50.19`, `1500 46.09/52.26/52.34/43.72`, `2000 37.50/47.03/45.45/34.28` (Acc/Prec/Rec/F1) |
| 2026-03-10 | `ScoNet_skeleton_118_sigma15_joint8_bodyonly_proxy_1gpu` | ScoNet-MT-ske proxy | `Scoliosis1K-drf-pkl-118-sigma15-joint8-bodyonly` | Representation ablation: dropped face keypoints and head limbs from the skeleton-map export while keeping the current shared-align `sigma_limb=1.5 / sigma_joint=8.0` setup; `SGD`, `total_iter=2000`, `eval_iter=500`, fixed test subset seed `118` | complete | `500 53.91/40.60/50.68/40.26`, `1000 65.62/42.86/56.30/47.36`, `1500 28.12/50.28/36.55/19.53`, `2000 26.56/74.93/35.64/17.69` (Acc/Prec/Rec/F1) |
| 2026-03-10 | `ScoNet_skeleton_118_sigma15_joint8_bodyonly_weightedce_proxy_1gpu` | ScoNet-MT-ske proxy | `Scoliosis1K-drf-pkl-118-sigma15-joint8-bodyonly` | Combined the two strongest partial clues: body-only skeleton map plus weighted CE with class weights `[1.0, 4.0, 4.0]`; `SGD`, `total_iter=2000`, `eval_iter=500`, fixed test subset seed `118` | interrupted | superseded at `100` iterations by the 2-GPU proxy variant |
| 2026-03-10 | `ScoNet_skeleton_118_sigma15_joint8_bodyonly_weightedce_proxy_2gpu` | ScoNet-MT-ske proxy | `Scoliosis1K-drf-pkl-118-sigma15-joint8-bodyonly` | Same body-only + weighted-CE proxy, relaunched on the `5070 Ti + 3090` pair for faster iteration while keeping the fixed subset seed `118` | complete | `500 68.75/44.46/58.12/49.59`, `1000 57.81/52.49/34.41/26.42`, `1500 41.41/53.84/51.46/40.77`, `2000 39.84/53.76/52.10/39.24` (Acc/Prec/Rec/F1) |
| 2026-03-10 | `ScoNet_skeleton_112_sigma15_joint8_bodyonly_weightedce_bridge_2gpu_10k` | ScoNet-MT-ske bridge | `Scoliosis1K-drf-pkl-118-sigma15-joint8-bodyonly` + `Scoliosis1K_112.json` | Diagnostic bridge on the easier `1:1:2` split using the current best skeleton recipe (`body-only + weighted CE`), extended to `10000` iterations with proportional LR milestones and eval/save every `1000` | complete | best proxy subset at `7000`: `82.03/66.53/88.84/64.91`; full test at `7000`: `81.82/66.21/88.50/65.96` (Acc/Prec/Rec/F1) |
| 2026-03-10 | `ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_bridge_2gpu_10k` | ScoNet-MT-ske bridge | `Scoliosis1K-drf-pkl-118-sigma15-joint8-bodyonly` + `Scoliosis1K_112.json` | Same `1:1:2` body-only bridge as above, but removed weighted CE to test whether class weighting was suppressing precision on the easier split | interrupted | superseded before meaningful progress by the user-requested 1-GPU rerun on the `5070 Ti` |
| 2026-03-10 | `ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_bridge_1gpu_10k` | ScoNet-MT-ske bridge | `Scoliosis1K-drf-pkl-118-sigma15-joint8-bodyonly` + `Scoliosis1K_112.json` | Same plain-CE `1:1:2` bridge, relaunched on the `5070 Ti` only per user request | complete | best proxy subset at `7000`: `88.28/69.12/74.15/68.80`; full test at `7000`: `83.16/68.24/80.02/68.47`; final proxy at `10000`: `75.00/65.00/63.41/54.55` (Acc/Prec/Rec/F1) |
| 2026-03-10 | `ScoNet_skeleton_112_sigma15_joint8_headlite_plaince_bridge_1gpu_10k` | ScoNet-MT-ske bridge | `Scoliosis1K-drf-pkl-112-sigma15-joint8-headlite` + `Scoliosis1K_112.json` | Added `head-lite` structure (nose plus shoulder links, no eyes/ears) on top of the plain-CE `1:1:2` bridge; first `3090` launch OOMed due unrelated occupancy, then relaunched on the UUID-pinned `5070 Ti` | complete | best proxy subset at `7000`: `86.72/70.15/89.00/70.44`; full test at `7000`: `78.07/65.42/80.50/62.08` (Acc/Prec/Rec/F1) |
| 2026-03-10 | `DRF_skeleton_112_sigma15_joint8_bodyonly_plaince_bridge_1gpu_10k` | DRF bridge | `Scoliosis1K-drf-pkl-118-sigma15-joint8-bodyonly` + `Scoliosis1K_112.json` | First practical DRF run on the winning `1:1:2` skeleton recipe: `body-only`, plain CE, SGD, `10k` bridge schedule, fixed proxy subset seed `112` | complete | best proxy subset at `2000`: `88.28/61.79/60.31/60.93`; full test at `2000`: `80.21/58.92/59.23/57.84` (Acc/Prec/Rec/F1) |
| 2026-03-14 | `DRF_author_eval_112_1gpu` | DRF author checkpoint compat | `Scoliosis1K-drf-pkl` + `Scoliosis1K_112.json` | Re-evaluated the author-provided checkpoint after adding legacy module-name remap and correcting the author class order; kept the stale `112` path to test whether the provided YAML was trustworthy | complete | `85.19/57.98/56.65/57.30` (Acc/Prec/Rec/F1) |
| 2026-03-14 | `DRF_author_eval_118_splitroot_1gpu` | DRF author checkpoint compat | `Scoliosis1K-drf-pkl-118` + `Scoliosis1K_118.json` | Same author-checkpoint compat path, but switched to the `118` split-specific local DRF dataset root while keeping `BaseSilCuttingTransform` and author class order | complete | `77.17/73.61/72.59/72.98` (Acc/Prec/Rec/F1) |
| 2026-03-14 | `DRF_author_eval_118_aligned_1gpu` | DRF author checkpoint compat | `Scoliosis1K-drf-pkl-118-aligned` + `Scoliosis1K_118.json` | Same author-checkpoint compat path, but evaluated on the aligned `118` DRF export; this is currently the best recovered author-checkpoint runtime contract | complete | `80.24/76.73/76.40/76.56` (Acc/Prec/Rec/F1) |
| 2026-03-14 | `DRF_author_eval_118_paper_1gpu` | DRF author checkpoint compat | `Scoliosis1K-drf-pkl-118-paper` + `Scoliosis1K_118.json` | Tested the author checkpoint against the local paper-literal summed-heatmap export with `BaseSilTransform` to see whether it matched the later paper-style preprocessing branch | complete | `27.24/9.08/33.33/14.27` (Acc/Prec/Rec/F1) |
| 2026-03-10 | `ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_main_1gpu_20k` | ScoNet-MT-ske mainline | `Scoliosis1K-drf-pkl-118-sigma15-joint8-bodyonly` + `Scoliosis1K_112.json` | Promoted the winning practical skeleton recipe to a longer `20k` run with full `TEST_SET` eval and checkpoint save every `1000`; no proxy subset, same plain CE + SGD setup | interrupted | superseded by the true-resume continuation below |
| 2026-03-10 | `ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_resume_1gpu_20k` | ScoNet-MT-ske mainline | `Scoliosis1K-drf-pkl-118-sigma15-joint8-bodyonly` + `Scoliosis1K_112.json` | True continuation of the earlier plain-CE `1:1:2` `10k` bridge from its `latest.pt`, extended to `20k` with full `TEST_SET` eval and checkpoint save every `1000` | interrupted | superseded by the AdamW finetune branch below |
| 2026-03-10 | `ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_adamw_finetune_1gpu_20k` | ScoNet-MT-ske finetune | `Scoliosis1K-drf-pkl-118-sigma15-joint8-bodyonly` + `Scoliosis1K_112.json` | AdamW finetune from the earlier plain-CE `1:1:2` `10k` checkpoint; restores model weights only, resets optimizer/scheduler state, keeps full `TEST_SET` eval and checkpoint save every `1000` | interrupted | superseded by the longer overnight 40k finetune below |
| 2026-03-10 | `ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_adamw_finetune_1gpu_40k` | ScoNet-MT-ske finetune | `Scoliosis1K-drf-pkl-118-sigma15-joint8-bodyonly` + `Scoliosis1K_112.json` | Longer overnight AdamW finetune from the same `10k` plain-CE checkpoint; restores model weights only, resets optimizer/scheduler state, extends to `40000` total iterations with full `TEST_SET` eval every `1000` | interrupted | superseded by the cosine `80k` finetune below |
| 2026-03-10 to 2026-03-11 | `ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_adamw_cosine_finetune_1gpu_80k` | ScoNet-MT-ske finetune | `Scoliosis1K-drf-pkl-118-sigma15-joint8-bodyonly` + `Scoliosis1K_112.json` | Practical long-run finetune from the same `10k` plain-CE checkpoint, but switched to `AdamW` with cosine decay, HDD-backed `output_root`, `save_iter=500`, `eval_iter=1000`, and best-N checkpoint retention | complete | final `80000` eval: `90.64/72.87/93.19/75.74`; verified best retained full-test checkpoint at `27000`: `92.38/90.30/87.39/88.70` (Acc/Prec/Rec/F1) |

## Current best skeleton baseline

Current best `ScoNet-MT-ske`-style result:
- practical best on the easier `1:1:2` split:
  - `ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_adamw_cosine_finetune_1gpu_80k` at retained best checkpoint `27000`
  - verified standalone full-test eval: `92.38 Acc / 90.30 Prec / 87.39 Rec / 88.70 F1`
- best result retained on the harder `1:1:8` split:
  - `ScoNet_skeleton_118_sigma15_joint8_sharedalign_2gpu_bs12x8`
  - `50.47 Acc / 69.31 Prec / 54.58 Rec / 48.63 F1`

## Notes

- `ckpt/ScoNet-20000-better.pt` is intentionally not listed here because it is a silhouette checkpoint, not a skeleton-map run.
- `DRF` runs are included because they are part of the same reproduction/debugging loop, but this log should stay focused on train/eval changes, not broader code refactors.
- The long `ScoNet_skeleton_118_sigma15_joint8_sharedalign_nocut_adamw_1gpu_bs8x8` run was intentionally interrupted and superseded by the shorter proxy run once fast-iteration support was added.
- The geometry-fixed proxy run fit the train split quickly but did not produce a stable proxy validation curve, so it should not be promoted to a full 20k run.
- The weighted-CE proxy briefly improved the proxy peak at `1000` iterations, but it also collapsed afterward, so class weighting alone is not a sufficient fix for the skeleton branch.
- The body-only proxy improved the early `500`-iteration proxy result relative to some earlier runs, but it still collapsed badly after `1000`, so removing face/head structure alone is also not a sufficient fix.
- Combining `body-only` with `weighted CE` gave the best `500`-iteration proxy seen so far (`68.75 Acc / 49.59 F1` on the fixed 128-sequence subset), but it still degraded substantially by `1000+`, which points more toward schedule/imbalance dynamics than a single missing representation tweak.
- Full-test check on retained checkpoints from the combined `body-only + weighted CE` run did not confirm a simple early-stop win: the retained `1000` checkpoint scored `56.61 Acc / 52.11 Prec / 34.15 Rec / 25.61 F1` on the full test split, while `2000` scored `41.52 Acc / 56.75 Prec / 54.75 Rec / 40.09 F1`. That means the small proxy subset is useful for screening but not reliable enough to choose the final stopping point by itself.
- The `1:1:2` bridge run is the strongest evidence so far that the skeleton branch is learnable: with the current best skeleton recipe (`body-only + weighted CE`), the full test score at `7000` reached `81.82 Acc / 66.21 Prec / 88.50 Rec / 65.96 F1`. That still trails the ScoNet paper’s stronger `1:1:2` result, but it is dramatically better than the `1:1:8` skeleton runs and makes class distribution a first-order factor in the reproduction gap.
- Removing weighted CE on the `1:1:2` bridge improved the current best full-test result further: `body-only + plain CE` reached `83.16 Acc / 68.24 Prec / 80.02 Rec / 68.47 F1` at `7000`, so weighted CE does not currently look beneficial on the easier split.
- A later full-test rerun of the retained `body-only + plain CE` `7000` checkpoint reproduced the same `83.16 / 68.24 / 80.02 / 68.47` result exactly, so that number is now explicitly reconfirmed rather than just carried forward from the original run log.
- `Head-lite` looked stronger than `body-only` on the fixed 128-sequence proxy subset at `7000`, but it did not transfer to the full test set: `78.07 Acc / 65.42 Prec / 80.50 Rec / 62.08 F1`, which is clearly below the `body-only + plain CE` full-test result.
- The first practical DRF bridge on the winning `1:1:2` recipe did not beat the plain skeleton baseline. Its best retained checkpoint (`2000`) reached only `80.21 Acc / 58.92 Prec / 59.23 Rec / 57.84 F1` on the full test set, versus `83.16 / 68.24 / 80.02 / 68.47` for `body-only + plain CE` at `7000`. The working local interpretation is that the added PAV/PGA path is currently injecting a weak or noisy prior rather than a useful complementary signal.
- The author-provided DRF checkpoint turned out to be partially recoverable once the local integration contract was corrected. The main hidden bug was class-order mismatch: the author stub uses `negative=0, positive=1, neutral=2`, while the repo evaluator assumes `negative=0, neutral=1, positive=2`. After adding label-order compatibility and the legacy `attention_layer -> PGA` remap, the checkpoint became meaningfully usable. The best recovered path so far is the aligned `118` export, which reached `80.24 Acc / 76.73 Prec / 76.40 Rec / 76.56 F1`. The stale author YAML is therefore not a trustworthy source of the original runtime contract; the checkpoint aligns much better with an `118` aligned/OpenGait-like path than with the later local `118-paper` export.
- Extending the practical `1:1:2` body-only plain-CE baseline with an `AdamW` cosine finetune changed the picture substantially. The final `80000` checkpoint still evaluated well (`90.64 / 72.87 / 93.19 / 75.74`), but the real win was an earlier retained checkpoint: `27000` reproduced exactly at `92.38 Acc / 90.30 Prec / 87.39 Rec / 88.70 F1` on a standalone full-test rerun. So for this practical path, the best result is no longer the original SGD bridge checkpoint but the retained `AdamW` cosine finetune checkpoint.