docs: refresh scoliosis status page
# ScoNet and DRF: Status, Architecture, and Reproduction Notes

This note is the high-level status page for Scoliosis1K work in this repo. It records what is implemented, what currently works best in practice, and how to interpret the local DRF/ScoNet results.

For the stricter paper-vs-local breakdown, see [scoliosis_reproducibility_audit.md](scoliosis_reproducibility_audit.md).

For the concrete experiment queue, see [scoliosis_next_experiments.md](scoliosis_next_experiments.md).

For the author-checkpoint compatibility recovery, see [drf_author_checkpoint_compat.md](drf_author_checkpoint_compat.md).

For the recommended long-running local launch workflow, see [systemd-run-training.md](systemd-run-training.md).

## Current status
- `opengait/modeling/models/drf.py` is now implemented as a standalone DRF model in this repo.
- Logging supports TensorBoard and optional Weights & Biases through `opengait/utils/msg_manager.py`.

## Current bottom line

- The current practical winner is the skeleton-map ScoNet path, not DRF.
- The best verified local checkpoint is:
  - `ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_adamw_cosine_finetune_1gpu_80k`
  - retained best checkpoint at `27000`
  - verified full-test result: `92.38 Acc / 90.30 Prec / 87.39 Rec / 88.70 F1`
- The strongest practical recipe behind that checkpoint is:
  - split: `1:1:2`
  - representation: `body-only`
  - losses: plain CE + triplet
  - baseline training: `SGD`
  - later finetune: `AdamW` + cosine decay
- A local DRF run trained from scratch on the same practical recipe did not improve over the plain skeleton baseline.
- The author-provided DRF checkpoint is now usable in-tree after compatibility fixes, but only under the recovered `118-aligned` runtime contract.

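The loss pairing in that recipe (plain CE + triplet) is the standard classification-plus-metric-learning combination. As a minimal plain-Python sketch of the idea, not the repo's actual loss code (the margin value, the Euclidean distance, and the 1:1 term weighting here are all assumptions):

```python
import math

def cross_entropy(logits, label):
    # Plain CE: negative log-softmax probability of the true class.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Hinge triplet term: the anchor embedding should sit closer to the
    # positive than to the negative by at least `margin`.
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def combined_loss(logits, label, anchor, positive, negative):
    # Equal weighting of the two terms is an assumption for illustration.
    return cross_entropy(logits, label) + triplet_loss(anchor, positive, negative)
```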
## Naming clarification
The name `ScoNet` is overloaded across the paper, config files, and checkpoints. Use the mapping below when reading this repo:
The main findings so far are:
- a later full-test rerun confirmed the `body-only + plain CE` `7000` result exactly
- an `AdamW` cosine finetune from that same plain-CE checkpoint improved the practical best further; the retained `27000` checkpoint reproduced at `92.38%` accuracy and `88.70%` macro-F1 on the full test set
- a `head-lite + plain CE` variant looked promising on the fixed proxy subset but underperformed on the full test set at `7000` (`78.07%` accuracy, `62.08%` macro-F1)
- The first practical DRF bridge on that same winning `1:1:2` recipe did not improve on the plain skeleton baseline:
  - best retained DRF checkpoint (`2000`) on the full test set: `80.21 Acc / 58.92 Prec / 59.23 Rec / 57.84 F1`
  - practical plain skeleton checkpoint (`7000`) on the full test set: `83.16 Acc / 68.24 Prec / 80.02 Rec / 68.47 F1`
- The author-provided DRF checkpoint initially looked unusable in this fork, but that turned out to be a compatibility problem, not a pure weight problem:
  - after recovering the legacy runtime contract, the best compatible path was `Scoliosis1K-drf-pkl-118-aligned`
  - recovered author-checkpoint result: `80.24 Acc / 76.73 Prec / 76.40 Rec / 76.56 F1`

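The `Acc / Prec / Rec / F1` quadruples quoted throughout are accuracy plus macro-averaged precision, recall, and F1 over the classes. As a reading aid, here is a minimal sketch of that aggregation; it is assumed to match the local evaluator's definitions, not copied from it:

```python
def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision/recall/F1 over all true classes."""
    classes = sorted(set(y_true))
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precs, recs, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precs.append(prec)
        recs.append(rec)
        f1s.append(f1)
    n = len(classes)
    # Macro averaging weights every class equally, regardless of support.
    return acc, sum(precs) / n, sum(recs) / n, sum(f1s) / n
```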
The current working conclusion is:
- the core ScoNet trainer is not the problem
- the strong silhouette checkpoint is not evidence that the skeleton-map path works
- the biggest historical problem was the skeleton-map/runtime contract, not just the optimizer
- for practical model development, `1:1:2` is currently the better working split than `1:1:8`
- for practical model development, the current best skeleton recipe is `body-only + plain CE`, and the current best retained checkpoint comes from a later `AdamW` cosine finetune on `1:1:2`
- for practical use, DRF is still behind the local ScoNet skeleton winner
- for paper-compatibility analysis, the author checkpoint demonstrates that our earlier DRF failure was partly caused by contract mismatch

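The `AdamW` cosine finetune referred to above follows the usual cosine-annealing learning-rate shape. A sketch of that schedule, where the base/minimum rates and step counts are illustrative stand-ins, not the run's actual values:

```python
import math

def cosine_lr(step, total_steps, base_lr=1e-4, min_lr=1e-6):
    """Cosine decay from base_lr at step 0 down to min_lr at total_steps."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

# In a PyTorch-style finetune loop this value would be written into
# optimizer.param_groups[i]["lr"] each step (sketch, not the repo's code).
```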
For readability in this repo's docs, `ScoNet-MT-ske` refers to the skeleton-map variant that the DRF paper writes as `ScoNet-MT^{ske}`.
## DRF compatibility note
There are now two different DRF stories in this repo:

1. The local-from-scratch DRF branch.
   - This is the branch trained directly in our fork on the current practical recipe.
   - It did not beat the plain skeleton baseline.
2. The author-checkpoint compatibility branch.
   - This uses the author-supplied checkpoint plus in-tree compatibility fixes.
   - The main recovered issues were:
     - legacy module naming drift: `attention_layer.*` vs `PGA.*`
     - class-order mismatch between the author stub and our evaluator assumptions
     - stale/internally inconsistent author YAML
     - preprocessing/runtime mismatch, where `118-aligned` matched much better than the paper-literal export

That distinction matters. It means:

- "our DRF training branch underperformed" is true
- "the author DRF checkpoint is unusable" is false
- "the author result was drop-in reproducible from the handed-over YAML" is also false

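The naming-drift issue (`attention_layer.*` vs `PGA.*`) is the familiar state-dict key-rename problem. A minimal sketch of the kind of remap involved; the prefix direction and the commented loading code are assumptions, not the exact in-tree fix:

```python
def remap_legacy_keys(state_dict, old_prefix="attention_layer.", new_prefix="PGA."):
    """Rename legacy checkpoint keys so they match the current module names."""
    remapped = {}
    for key, value in state_dict.items():
        if key.startswith(old_prefix):
            key = new_prefix + key[len(old_prefix):]
        remapped[key] = value
    return remapped

# Typical use with a torch checkpoint (sketch; names are hypothetical):
# ckpt = torch.load("drf_author_checkpoint.pt", map_location="cpu")
# model.load_state_dict(remap_legacy_keys(ckpt["model"]), strict=True)
```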
## Architecture mapping
`ScoNet` in this repo maps to the paper as follows:
The skeleton-map control used the same recipe, except for the modality-specific changes listed above.
## Recommended next checks

1. Train a pure silhouette `1:1:8` baseline from the upstream ScoNet config as a clean sanity control.
2. Treat skeleton-map preprocessing as the primary debugging target until a `ScoNet-MT-ske`-style run gets close to the paper.
3. Only after the skeleton baseline is credible should DRF/PAV-specific conclusions be treated as decisive.

## Practical conclusion
For practical use in this repo, the current winning path is:
So the current local recommendation is:
- keep `1:1:2` as the main practical split
- treat DRF as an optional research branch, not the mainline model
If the goal is practical deployment/use, use the retained best skeleton checkpoint family first.
If the goal is paper audit or author-checkpoint verification, use the dedicated DRF compatibility configs instead.
## Remaining useful experiments
At this point, there are only a few experiments that still look high-value:
1. one clean `full-body` finetune under the same successful `1:1:2` recipe, just to confirm that `body-only` is really the best practical representation
2. one DRF warm-start rerun on top of the now-stronger practical baseline recipe, only if the goal is to test whether DRF can add value once the skeleton branch is already strong
3. a final packaging/evaluation pass around the retained best checkpoints, rather than more broad preprocessing churn
Everything else looks lower value than simply using the retained best `27000` checkpoint.