OpenGait/docs/sconet-drf-status-and-training.md

ScoNet and DRF: Status, Architecture, and Reproduction Notes

This note is the high-level status page for Scoliosis1K work in this repo. It records what is implemented, what currently works best in practice, and how to interpret the local DRF/ScoNet results.

For the stricter paper-vs-local breakdown, see scoliosis_reproducibility_audit.md. For the concrete experiment queue, see scoliosis_next_experiments.md. For the author-checkpoint compatibility recovery, see drf_author_checkpoint_compat.md. For the recommended long-running local launch workflow, see systemd-run-training.md.

Current status

  • opengait/modeling/models/sconet.py is still the standard Scoliosis1K baseline in this repo.
  • The class is named ScoNet, but it is functionally the paper's multi-task variant (ScoNet-MT), because training uses both CrossEntropyLoss and TripletLoss.
  • opengait/modeling/models/drf.py is now implemented as a standalone DRF model in this repo.
  • Logging supports TensorBoard and optional Weights & Biases through opengait/utils/msg_manager.py.

Current bottom line

  • The current practical winner is the skeleton-map ScoNet path, not DRF.
  • The best verified local checkpoint is:
    • ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_adamw_cosine_finetune_1gpu_80k
    • retained best checkpoint at 27000
    • verified full-test result: 92.38 Acc / 90.30 Prec / 87.39 Rec / 88.70 F1
  • The strongest practical recipe behind that checkpoint is:
    • split: 1:1:2
    • representation: body-only
    • losses: plain CE + triplet
    • baseline training: SGD
    • later finetune: AdamW + cosine decay
  • A local DRF run trained from scratch on the same practical recipe did not improve over the plain skeleton baseline.
  • The author-provided DRF checkpoint is now usable in-tree after compatibility fixes, but only under the recovered 118-aligned runtime contract.

Naming clarification

The name ScoNet is overloaded across the paper, config files, and checkpoints. Use the mapping below when reading this repo:

| Local name | What it means here | Closest paper name |
| --- | --- | --- |
| ScoNet model class | opengait/modeling/models/sconet.py with both CE and triplet losses | ScoNet-MT |
| configs/sconet/sconet_scoliosis1k.yaml | standard Scoliosis1K silhouette training recipe in this repo | ScoNet-MT training recipe |
| ScoNet-*.pt checkpoint filenames | local checkpoint naming inherited from the repo/config | usually ScoNet-MT if trained with the default config |
| ScoNet-MT-ske in these docs | the same ScoNet code path, but fed 2-channel skeleton maps | paper notation ScoNet-MT^{ske} |
| DRF | ScoNet-MT-ske plus PGA/PAV guidance | DRF |

So:

  • paper ScoNet means the single-task CE-only model
  • repo ScoNet usually means the multi-task variant unless someone explicitly removes triplet loss
  • a checkpoint named ScoNet-...pt is not enough to tell the modality by itself; check input channels and dataset root

Important modality note

The strongest local ScoNet checkpoint we checked, ckpt/ScoNet-20000-better.pt, is a silhouette checkpoint, not a skeleton-map checkpoint.

Evidence:

  • its first convolution weight has shape (64, 1, 3, 3), so it expects 1-channel input
  • the matching eval config points to Scoliosis1K-sil-pkl
  • the skeleton-map configs in this repo use in_channel: 2

This matters because a good result from ScoNet-20000-better.pt only validates the silhouette path. It does not validate the heatmap/skeleton-map preprocessing used by DRF or by a ScoNet-MT-ske-style control.
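The same channel check can be run on any local checkpoint before trusting its modality. A minimal sketch, assuming the first 4-D weight tensor in the state_dict belongs to the first convolution (true for plain ResNet-style backbones); the toy state_dict below stands in for a real torch.load of a .pt file, and the key prefix is illustrative:

```python
# Sketch: infer checkpoint modality from the first conv's input channels.
# In practice, replace the toy dict with torch.load("ckpt/...pt")["model"]
# (or whatever key the checkpoint actually uses).
import torch
import torch.nn as nn


def first_conv_in_channels(state_dict):
    """Return the in-channel count of the first 4-D conv weight found."""
    for name, tensor in state_dict.items():
        if name.endswith("weight") and tensor.dim() == 4:
            return tensor.shape[1]
    raise ValueError("no conv weight found in state_dict")


# Toy stand-in for a real checkpoint: a 1-channel first conv.
toy = nn.Conv2d(1, 64, 3).state_dict()
sd = {"backbone.conv1." + k: v for k, v in toy.items()}
channels = first_conv_in_channels(sd)
print({1: "silhouette", 2: "skeleton-map"}.get(channels, "unknown"))
```

A (64, 1, 3, 3) weight means a 1-channel silhouette model; (64, 2, 3, 3) means the 2-channel skeleton-map path.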

What was checked against upstream commit f754f6f3831e9f83bb28f4e2f63dd43d8bcf9dc4

The upstream ScoNet training recipe itself is effectively unchanged:

  • configs/sconet/sconet_scoliosis1k.yaml is unchanged
  • opengait/modeling/models/sconet.py is unchanged
  • opengait/main.py, opengait/modeling/base_model.py, opengait/data/dataset.py, opengait/data/collate_fn.py, and opengait/evaluation/evaluator.py only differ in import cleanup and logging hooks

So the poor skeleton-map results below are not explained by a changed optimizer, scheduler, sampler, train loop, or evaluator.

For the skeleton-map control, the only required functional drift from the upstream ScoNet config was:

  • use a heatmap dataset root instead of Scoliosis1K-sil-pkl
  • switch the partition to Scoliosis1K_118.json
  • set model_cfg.backbone_cfg.in_channel: 2
  • reduce test batch_size to match the local 2-GPU DDP evaluator constraint
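Expressed as a config fragment, that drift amounts to something like the following; the key nesting follows OpenGait conventions, and the dataset-root path and batch size shown here are illustrative assumptions, not values copied from the real config:

```yaml
# Hypothetical sketch of the skeleton-map control's drift from the
# upstream ScoNet config; paths and batch size are placeholders.
data_cfg:
  dataset_root: /path/to/Scoliosis1K-heatmap-pkl   # instead of Scoliosis1K-sil-pkl
  dataset_partition: ./datasets/Scoliosis1K/Scoliosis1K_118.json
model_cfg:
  backbone_cfg:
    in_channel: 2        # 2-channel skeleton maps instead of 1-channel silhouettes
evaluator_cfg:
  sampler:
    batch_size: 4        # reduced for the local 2-GPU DDP evaluator constraint
```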

Local reproduction findings

The main findings so far are:

  • ScoNet-20000-better.pt on the 1:1:2 silhouette split reproduced cleanly at 95.05% accuracy and 85.12% macro-F1.
  • The 1:1:8 skeleton-map control trained with healthy optimization metrics but evaluated very poorly.
  • A recent ScoNet-MT-ske-style control on Scoliosis1K_sigma_8.0/pkl finished with 36.45% accuracy and 32.78% macro-F1.
  • That result is far below the paper's 1:1:8 ScoNet-MT range and far below the silhouette baseline behavior.
  • On the easier 1:1:2 split, the skeleton branch is clearly learnable:
    • body-only + weighted CE reached 81.82% accuracy and 65.96% macro-F1 on the full test set at 7000
    • body-only + plain CE improved that to 83.16% accuracy and 68.47% macro-F1 at 7000
    • a later full-test rerun confirmed the body-only + plain CE 7000 result exactly
    • an AdamW cosine finetune from that same plain-CE checkpoint improved the practical best further; the retained 27000 checkpoint reproduced at 92.38% accuracy and 88.70% macro-F1 on the full test set
    • a head-lite + plain CE variant looked promising on the fixed proxy subset but underperformed on the full test set at 7000 (78.07% accuracy, 62.08% macro-F1)
  • The first practical DRF bridge on that same winning 1:1:2 recipe did not improve on the plain skeleton baseline:
    • best retained DRF checkpoint (2000) on the full test set: 80.21 Acc / 58.92 Prec / 59.23 Rec / 57.84 F1
    • practical plain skeleton checkpoint (7000) on the full test set: 83.16 Acc / 68.24 Prec / 80.02 Rec / 68.47 F1
  • The author-provided DRF checkpoint initially looked unusable in this fork, but that turned out to be a compatibility problem, not a pure weight problem.
    • after recovering the legacy runtime contract, the best compatible path was Scoliosis1K-drf-pkl-118-aligned
    • recovered author-checkpoint result: 80.24 Acc / 76.73 Prec / 76.40 Rec / 76.56 F1
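For reference, the Acc / Prec / Rec / F1 quadruples quoted above combine overall accuracy with per-class averaged precision, recall, and F1. A minimal sketch of that computation, assuming macro averaging over the screening classes and scikit-learn availability; the labels are toy data, not real Scoliosis1K predictions:

```python
# Sketch: accuracy plus macro-averaged precision/recall/F1 of the kind
# reported in this doc, computed with scikit-learn on toy labels.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [0, 0, 1, 1, 2, 2, 2, 0]
y_pred = [0, 0, 1, 2, 2, 2, 1, 0]

acc = accuracy_score(y_true, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"{acc:.2%} Acc / {prec:.2%} Prec / {rec:.2%} Rec / {f1:.2%} F1")
```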

The current working conclusion is:

  • the core ScoNet trainer is not the problem
  • the strong silhouette checkpoint is not evidence that the skeleton-map path works
  • the biggest historical problem was the skeleton-map/runtime contract, not just the optimizer
  • for practical model development, 1:1:2 is currently the better working split than 1:1:8
  • for practical model development, the current best skeleton recipe is body-only + plain CE, and the current best retained checkpoint comes from a later AdamW cosine finetune on 1:1:2
  • for practical use, DRF is still behind the local ScoNet skeleton winner
  • for paper-compatibility analysis, the author checkpoint demonstrates that our earlier DRF failure was partly caused by contract mismatch

For readability in this repo's docs, ScoNet-MT-ske refers to the skeleton-map variant that the DRF paper writes as ScoNet-MT^{ske}.

DRF compatibility note

There are now two different DRF stories in this repo:

  1. The local-from-scratch DRF branch.

    • This is the branch trained directly in our fork on the current practical recipe.
    • It did not beat the plain skeleton baseline.
  2. The author-checkpoint compatibility branch.

    • This uses the author-supplied checkpoint plus in-tree compatibility fixes.
    • The main recovered issues were:
      • legacy module naming drift: attention_layer.* vs PGA.*
      • class-order mismatch between the author stub and our evaluator assumptions
      • stale/internally inconsistent author YAML
      • preprocessing/runtime mismatch, where 118-aligned matched much better than the paper-literal export
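Of those, the module naming drift is the most mechanical to patch. A minimal sketch of the kind of state_dict key remapping involved, assuming a simple prefix rename from attention_layer.* to PGA.*; the individual key names below are illustrative, not the checkpoint's real keys:

```python
# Sketch: remap legacy "attention_layer.*" keys from the author checkpoint
# to the "PGA.*" names this fork expects, leaving other keys untouched.
def remap_legacy_keys(state_dict, old_prefix="attention_layer.", new_prefix="PGA."):
    remapped = {}
    for key, value in state_dict.items():
        if key.startswith(old_prefix):
            key = new_prefix + key[len(old_prefix):]
        remapped[key] = value
    return remapped


legacy = {"attention_layer.q_proj.weight": 1, "backbone.conv1.weight": 2}
fixed = remap_legacy_keys(legacy)
print(sorted(fixed))  # keys now use the PGA.* prefix
```

The remapped dict can then be passed to load_state_dict; the class-order and YAML issues still need separate fixes.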

That distinction matters. It means:

  • "our DRF training branch underperformed" is true
  • "the author DRF checkpoint is unusable" is false
  • "the author result was drop-in reproducible from the handed-over YAML" is also false

Architecture mapping

ScoNet in this repo maps to the paper as follows:

| Paper Component | Code Reference | Description |
| --- | --- | --- |
| Backbone | ResNet9 in opengait/modeling/backbones/resnet.py | Four residual stages with channels [64, 128, 256, 512]. |
| Temporal aggregation | PackSequenceWrapper(torch.max) | Temporal max pooling over frames. |
| Spatial pooling | HorizontalPoolingPyramid | 16-bin horizontal partition. |
| Feature mapping | SeparateFCs | Maps pooled features into the embedding space. |
| Classification head | SeparateBNNecks | Produces screening logits. |
| Losses | TripletLoss + CrossEntropyLoss | This is why the repo implementation is functionally ScoNet-MT. |
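Read end to end, that mapping describes the following tensor flow. The sketch below is shapes-only: a single conv stands in for ResNet9, a single 16-bin scale stands in for the pooling pyramid, and the per-part FC dimensions are assumptions, not the repo's exact modules:

```python
# Shape-level sketch of the ScoNet forward path:
# backbone -> temporal max -> horizontal strips -> per-part FC embedding.
import torch
import torch.nn as nn

torch.manual_seed(0)
n, c_in, t, h, w = 2, 1, 30, 64, 44                    # batch, channels, frames, H, W
frames = torch.randn(n, t, c_in, h, w)

backbone = nn.Conv2d(c_in, 512, 3, padding=1)          # stand-in for ResNet9
feats = backbone(frames.reshape(n * t, c_in, h, w))    # [n*t, 512, h, w]
feats = feats.reshape(n, t, 512, h, w)

pooled_t = feats.max(dim=1).values                     # temporal max over frames
bins = pooled_t.reshape(n, 512, 16, -1)                # 16 horizontal strips
parts = bins.max(dim=-1).values + bins.mean(dim=-1)    # max+mean per strip, [n, 512, 16]

fc = torch.randn(16, 512, 256)                         # SeparateFCs stand-in, one map per part
embed = torch.einsum("ncp,pcd->npd", parts, fc)        # [n, 16, 256] part embeddings
print(parts.shape, embed.shape)
```

The max-plus-mean strip pooling mirrors OpenGait's usual HorizontalPoolingPyramid behavior; BNNecks would sit after the embedding to produce the screening logits.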

Training path summary

The standard Scoliosis1K ScoNet recipe is:

  • sampler: TripletSampler
  • train batch layout: 8 x 8
  • train sample type: fixed_unordered
  • train frames: 30
  • transform: BaseSilCuttingTransform
  • optimizer: SGD(lr=0.1, momentum=0.9, weight_decay=5e-4)
  • scheduler: MultiStepLR with milestones [10000, 14000, 18000]
  • total iterations: 20000

The skeleton-map control used the same recipe, except for the modality-specific changes listed above.
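In plain PyTorch terms, the optimizer/scheduler half of that recipe looks like the sketch below; model is a stand-in module, the loop body is elided, and gamma=0.1 is PyTorch's MultiStepLR default, which the recipe above does not pin explicitly:

```python
# Sketch of the listed SGD + MultiStepLR training settings.
import torch
import torch.nn as nn

model = nn.Linear(8, 2)   # stand-in for the ScoNet model
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4
)
# gamma=0.1 is the PyTorch default decay factor (assumed, not stated above).
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[10000, 14000, 18000], gamma=0.1
)

for iteration in range(20000):       # total iterations: 20000
    # forward pass, loss computation, and loss.backward() would go here
    optimizer.step()
    scheduler.step()

print(optimizer.param_groups[0]["lr"])   # decayed three times: 0.1 -> 1e-4
```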

Practical conclusion

For practical use in this repo, the current winning path is:

  • split: 1:1:2
  • representation: body-only skeleton map
  • losses: plain CE + triplet
  • baseline training: SGD
  • best retained finetune: AdamW + cosine decay

The strongest verified checkpoint so far is:

  • ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_adamw_cosine_finetune_1gpu_80k
  • retained best checkpoint at 27000
  • verified full-test result: 92.38 Acc / 90.30 Prec / 87.39 Rec / 88.70 F1

So the current local recommendation is:

  • keep body-only as the default practical skeleton representation
  • keep 1:1:2 as the main practical split
  • treat DRF as an optional research branch, not the mainline model

If the goal is practical deployment/use, use the retained best skeleton checkpoint family first. If the goal is paper audit or author-checkpoint verification, use the dedicated DRF compatibility configs instead.

Remaining useful experiments

At this point, there are only a few experiments that still look high-value:

  1. one clean full-body finetune under the same successful 1:1:2 recipe, just to confirm that body-only is really the best practical representation
  2. one DRF warm-start rerun on top of the now-stronger practical baseline recipe, only if the goal is to test whether DRF can add value once the skeleton branch is already strong
  3. a final packaging/evaluation pass around the retained best checkpoints, rather than more broad preprocessing churn

Everything else looks lower value than simply using the retained best 27000 checkpoint.