OpenGait/docs/sconet-drf-status-and-training.md

ScoNet and DRF: Status, Architecture, and Reproduction Notes

This note is the high-level status page for Scoliosis1K work in this repo. It records what is implemented, what currently works best in practice, and how to interpret the local DRF/ScoNet results.

For the stricter paper-vs-local breakdown, see scoliosis_reproducibility_audit.md. For the concrete experiment queue, see scoliosis_next_experiments.md. For the author-checkpoint compatibility recovery, see drf_author_checkpoint_compat.md. For the recommended long-running local launch workflow, see systemd-run-training.md.

Current status

  • opengait/modeling/models/sconet.py is still the standard Scoliosis1K baseline in this repo.
  • The class is named ScoNet, but it is functionally the paper's multi-task variant (ScoNet-MT), because training uses both CrossEntropyLoss and TripletLoss.
  • opengait/modeling/models/drf.py is now implemented as a standalone DRF model in this repo.
  • Logging supports TensorBoard and optional Weights & Biases through opengait/utils/msg_manager.py.

Current bottom line

  • The current practical winner is the skeleton-map ScoNet path, not DRF.
  • The best verified local checkpoint is:
    • ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_adamw_cosine_finetune_1gpu_80k
    • retained best checkpoint at 27000
    • verified full-test result: 92.38 Acc / 90.30 Prec / 87.39 Rec / 88.70 F1
  • The strongest practical recipe behind that checkpoint is:
    • split: 1:1:2
    • representation: body-only
    • losses: plain CE + triplet
    • baseline training: SGD
    • later finetune: AdamW + cosine decay
  • A local DRF run trained from scratch on the same practical recipe did not improve over the plain skeleton baseline.
  • The author-provided DRF checkpoint is now usable in-tree after compatibility fixes, but only under the recovered 118-aligned runtime contract.

Naming clarification

The name ScoNet is overloaded across the paper, config files, and checkpoints. Use the mapping below when reading this repo:

| Local name | What it means here | Closest paper name |
| --- | --- | --- |
| ScoNet model class | opengait/modeling/models/sconet.py with both CE and triplet losses | ScoNet-MT |
| configs/sconet/sconet_scoliosis1k.yaml | standard Scoliosis1K silhouette training recipe in this repo | ScoNet-MT training recipe |
| ScoNet-*.pt checkpoint filenames | local checkpoint naming inherited from the repo/config | usually ScoNet-MT if trained with the default config |
| ScoNet-MT-ske in these docs | the same ScoNet code path, but fed 2-channel skeleton maps | paper notation ScoNet-MT^{ske} |
| DRF | ScoNet-MT-ske plus PGA/PAV guidance | DRF |

So:

  • paper ScoNet means the single-task CE-only model
  • repo ScoNet usually means the multi-task variant unless someone explicitly removes triplet loss
  • a checkpoint named ScoNet-...pt is not enough to tell the modality by itself; check input channels and dataset root

Important modality note

The strongest local ScoNet checkpoint we checked, ckpt/ScoNet-20000-better.pt, is a silhouette checkpoint, not a skeleton-map checkpoint.

Evidence:

  • its first convolution weight has shape (64, 1, 3, 3), so it expects 1-channel input
  • the matching eval config points to Scoliosis1K-sil-pkl
  • the skeleton-map configs in this repo use in_channel: 2

This matters because a good result from ScoNet-20000-better.pt only validates the silhouette path. It does not validate the heatmap/skeleton-map preprocessing used by DRF or by a ScoNet-MT-ske-style control.
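The same channel check can be run on any local checkpoint before trusting its modality. A minimal sketch, assuming the first 4-D weight tensor in the state_dict belongs to the first convolution (true for plain ResNet-style backbones); the toy state_dict below stands in for a real torch.load of a .pt file, and the key prefix is illustrative:

```python
# Sketch: infer checkpoint modality from the first conv's input channels.
# In practice, replace the toy dict with torch.load("ckpt/...pt")["model"]
# (or whatever key the checkpoint actually uses).
import torch
import torch.nn as nn


def first_conv_in_channels(state_dict):
    """Return the in-channel count of the first 4-D conv weight found."""
    for name, tensor in state_dict.items():
        if name.endswith("weight") and tensor.dim() == 4:
            return tensor.shape[1]
    raise ValueError("no conv weight found in state_dict")


# Toy stand-in for a real checkpoint: a 1-channel first conv.
toy = nn.Conv2d(1, 64, 3).state_dict()
sd = {"backbone.conv1." + k: v for k, v in toy.items()}
channels = first_conv_in_channels(sd)
print({1: "silhouette", 2: "skeleton-map"}.get(channels, "unknown"))
```

A (64, 1, 3, 3) weight means a 1-channel silhouette model; (64, 2, 3, 3) means the 2-channel skeleton-map path.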

What was checked against upstream commit f754f6f3831e9f83bb28f4e2f63dd43d8bcf9dc4

The upstream ScoNet training recipe itself is effectively unchanged:

  • configs/sconet/sconet_scoliosis1k.yaml is unchanged
  • opengait/modeling/models/sconet.py is unchanged
  • opengait/main.py, opengait/modeling/base_model.py, opengait/data/dataset.py, opengait/data/collate_fn.py, and opengait/evaluation/evaluator.py only differ in import cleanup and logging hooks

So the poor skeleton-map results below are not explained by a changed optimizer, scheduler, sampler, train loop, or evaluator.

For the skeleton-map control, the only required functional drift from the upstream ScoNet config was:

  • use a heatmap dataset root instead of Scoliosis1K-sil-pkl
  • switch the partition to Scoliosis1K_118.json
  • set model_cfg.backbone_cfg.in_channel: 2
  • reduce test batch_size to match the local 2-GPU DDP evaluator constraint
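Expressed as a config fragment, that drift amounts to something like the following; the key nesting follows OpenGait conventions, and the dataset-root path and batch size shown here are illustrative assumptions, not values copied from the real config:

```yaml
# Hypothetical sketch of the skeleton-map control's drift from the
# upstream ScoNet config; paths and batch size are placeholders.
data_cfg:
  dataset_root: /path/to/Scoliosis1K-heatmap-pkl   # instead of Scoliosis1K-sil-pkl
  dataset_partition: ./datasets/Scoliosis1K/Scoliosis1K_118.json
model_cfg:
  backbone_cfg:
    in_channel: 2        # 2-channel skeleton maps instead of 1-channel silhouettes
evaluator_cfg:
  sampler:
    batch_size: 4        # reduced for the local 2-GPU DDP evaluator constraint
```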

Local reproduction findings

The main findings so far are:

  • ScoNet-20000-better.pt on the 1:1:2 silhouette split reproduced cleanly at 95.05% accuracy and 85.12% macro-F1.
  • The 1:1:8 skeleton-map control trained with healthy optimization metrics but evaluated very poorly.
  • A recent ScoNet-MT-ske-style control on Scoliosis1K_sigma_8.0/pkl finished with 36.45% accuracy and 32.78% macro-F1.
  • That result is far below the paper's 1:1:8 ScoNet-MT range and far below the silhouette baseline behavior.
  • On the easier 1:1:2 split, the skeleton branch is clearly learnable:
    • body-only + weighted CE reached 81.82% accuracy and 65.96% macro-F1 on the full test set at 7000
    • body-only + plain CE improved that to 83.16% accuracy and 68.47% macro-F1 at 7000
    • a later full-test rerun confirmed the body-only + plain CE 7000 result exactly
    • an AdamW cosine finetune from that same plain-CE checkpoint improved the practical best further; the retained 27000 checkpoint reproduced at 92.38% accuracy and 88.70% macro-F1 on the full test set
    • a head-lite + plain CE variant looked promising on the fixed proxy subset but underperformed on the full test set at 7000 (78.07% accuracy, 62.08% macro-F1)
  • The first practical DRF bridge on that same winning 1:1:2 recipe did not improve on the plain skeleton baseline:
    • best retained DRF checkpoint (2000) on the full test set: 80.21 Acc / 58.92 Prec / 59.23 Rec / 57.84 F1
    • practical plain skeleton checkpoint (7000) on the full test set: 83.16 Acc / 68.24 Prec / 80.02 Rec / 68.47 F1
  • The author-provided DRF checkpoint initially looked unusable in this fork, but that turned out to be a compatibility problem, not a pure weight problem.
    • after recovering the legacy runtime contract, the best compatible path was Scoliosis1K-drf-pkl-118-aligned
    • recovered author-checkpoint result: 80.24 Acc / 76.73 Prec / 76.40 Rec / 76.56 F1
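For reference, the Acc / Prec / Rec / F1 quadruples quoted above combine overall accuracy with per-class averaged precision, recall, and F1. A minimal sketch of that computation, assuming macro averaging over the screening classes and scikit-learn availability; the labels are toy data, not real Scoliosis1K predictions:

```python
# Sketch: accuracy plus macro-averaged precision/recall/F1 of the kind
# reported in this doc, computed with scikit-learn on toy labels.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [0, 0, 1, 1, 2, 2, 2, 0]
y_pred = [0, 0, 1, 2, 2, 2, 1, 0]

acc = accuracy_score(y_true, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"{acc:.2%} Acc / {prec:.2%} Prec / {rec:.2%} Rec / {f1:.2%} F1")
```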

The current working conclusion is:

  • the core ScoNet trainer is not the problem
  • the strong silhouette checkpoint is not evidence that the skeleton-map path works
  • the biggest historical problem was the skeleton-map/runtime contract, not just the optimizer
  • for practical model development, 1:1:2 is currently the better working split than 1:1:8
  • for practical model development, the current best skeleton recipe is body-only + plain CE, and the current best retained checkpoint comes from a later AdamW cosine finetune on 1:1:2
  • for practical use, DRF is still behind the local ScoNet skeleton winner
  • for paper-compatibility analysis, the author checkpoint demonstrates that our earlier DRF failure was partly caused by contract mismatch

For readability in this repo's docs, ScoNet-MT-ske refers to the skeleton-map variant that the DRF paper writes as ScoNet-MT^{ske}.

DRF compatibility note

There are now two different DRF stories in this repo:

  1. The local-from-scratch DRF branch.

    • This is the branch trained directly in our fork on the current practical recipe.
    • It did not beat the plain skeleton baseline.
  2. The author-checkpoint compatibility branch.

    • This uses the author-supplied checkpoint plus in-tree compatibility fixes.
    • The main recovered issues were:
      • legacy module naming drift: attention_layer.* vs PGA.*
      • class-order mismatch between the author stub and our evaluator assumptions
      • stale/internally inconsistent author YAML
      • preprocessing/runtime mismatch, where 118-aligned matched much better than the paper-literal export
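Of those, the module naming drift is the most mechanical to patch. A minimal sketch of the kind of state_dict key remapping involved, assuming a simple prefix rename from attention_layer.* to PGA.*; the individual key names below are illustrative, not the checkpoint's real keys:

```python
# Sketch: remap legacy "attention_layer.*" keys from the author checkpoint
# to the "PGA.*" names this fork expects, leaving other keys untouched.
def remap_legacy_keys(state_dict, old_prefix="attention_layer.", new_prefix="PGA."):
    remapped = {}
    for key, value in state_dict.items():
        if key.startswith(old_prefix):
            key = new_prefix + key[len(old_prefix):]
        remapped[key] = value
    return remapped


legacy = {"attention_layer.q_proj.weight": 1, "backbone.conv1.weight": 2}
fixed = remap_legacy_keys(legacy)
print(sorted(fixed))  # keys now use the PGA.* prefix
```

The remapped dict can then be passed to load_state_dict; the class-order and YAML issues still need separate fixes.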

That distinction matters. It means:

  • "our DRF training branch underperformed" is true
  • "the author DRF checkpoint is unusable" is false
  • "the author result was drop-in reproducible from the handed-over YAML" is also false

Architecture mapping

ScoNet in this repo maps to the paper as follows:

| Paper Component | Code Reference | Description |
| --- | --- | --- |
| Backbone | ResNet9 in opengait/modeling/backbones/resnet.py | Four residual stages with channels [64, 128, 256, 512]. |
| Temporal aggregation | PackSequenceWrapper(torch.max) | Temporal max pooling over frames. |
| Spatial pooling | HorizontalPoolingPyramid | 16-bin horizontal partition. |
| Feature mapping | SeparateFCs | Maps pooled features into the embedding space. |
| Classification head | SeparateBNNecks | Produces screening logits. |
| Losses | TripletLoss + CrossEntropyLoss | This is why the repo implementation is functionally ScoNet-MT. |
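Read end to end, that mapping describes the following tensor flow. The sketch below is shapes-only: a single conv stands in for ResNet9, a single 16-bin scale stands in for the pooling pyramid, and the per-part FC dimensions are assumptions, not the repo's exact modules:

```python
# Shape-level sketch of the ScoNet forward path:
# backbone -> temporal max -> horizontal strips -> per-part FC embedding.
import torch
import torch.nn as nn

torch.manual_seed(0)
n, c_in, t, h, w = 2, 1, 30, 64, 44                    # batch, channels, frames, H, W
frames = torch.randn(n, t, c_in, h, w)

backbone = nn.Conv2d(c_in, 512, 3, padding=1)          # stand-in for ResNet9
feats = backbone(frames.reshape(n * t, c_in, h, w))    # [n*t, 512, h, w]
feats = feats.reshape(n, t, 512, h, w)

pooled_t = feats.max(dim=1).values                     # temporal max over frames
bins = pooled_t.reshape(n, 512, 16, -1)                # 16 horizontal strips
parts = bins.max(dim=-1).values + bins.mean(dim=-1)    # max+mean per strip, [n, 512, 16]

fc = torch.randn(16, 512, 256)                         # SeparateFCs stand-in, one map per part
embed = torch.einsum("ncp,pcd->npd", parts, fc)        # [n, 16, 256] part embeddings
print(parts.shape, embed.shape)
```

The max-plus-mean strip pooling mirrors OpenGait's usual HorizontalPoolingPyramid behavior; BNNecks would sit after the embedding to produce the screening logits.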

Training path summary

The standard Scoliosis1K ScoNet recipe is:

  • sampler: TripletSampler
  • train batch layout: 8 x 8
  • train sample type: fixed_unordered
  • train frames: 30
  • transform: BaseSilCuttingTransform
  • optimizer: SGD(lr=0.1, momentum=0.9, weight_decay=5e-4)
  • scheduler: MultiStepLR with milestones [10000, 14000, 18000]
  • total iterations: 20000

The skeleton-map control used the same recipe, except for the modality-specific changes listed above.
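In plain PyTorch terms, the optimizer/scheduler half of that recipe looks like the sketch below; model is a stand-in module, the loop body is elided, and gamma=0.1 is PyTorch's MultiStepLR default, which the recipe above does not pin explicitly:

```python
# Sketch of the listed SGD + MultiStepLR training settings.
import torch
import torch.nn as nn

model = nn.Linear(8, 2)   # stand-in for the ScoNet model
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4
)
# gamma=0.1 is the PyTorch default decay factor (assumed, not stated above).
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[10000, 14000, 18000], gamma=0.1
)

for iteration in range(20000):       # total iterations: 20000
    # forward pass, loss computation, and loss.backward() would go here
    optimizer.step()
    scheduler.step()

print(optimizer.param_groups[0]["lr"])   # decayed three times: 0.1 -> 1e-4
```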

Practical conclusion

For practical use in this repo, the current winning path is:

  • split: 1:1:2
  • representation: body-only skeleton map
  • losses: plain CE + triplet
  • baseline training: SGD
  • best retained finetune: AdamW + cosine decay

The strongest verified checkpoint so far is:

  • ScoNet_skeleton_112_sigma15_joint8_bodyonly_plaince_adamw_cosine_finetune_1gpu_80k
  • retained best checkpoint at 27000
  • verified full-test result: 92.38 Acc / 90.30 Prec / 87.39 Rec / 88.70 F1

So the current local recommendation is:

  • keep body-only as the default practical skeleton representation
  • keep 1:1:2 as the main practical split
  • treat DRF as an optional research branch, not the mainline model

If the goal is practical deployment/use, use the retained best skeleton checkpoint family first. If the goal is paper audit or author-checkpoint verification, use the dedicated DRF compatibility configs instead.

Remaining useful experiments

At this point, there are only a few experiments that still look high-value:

  1. one clean full-body finetune under the same successful 1:1:2 recipe, just to confirm that body-only is really the best practical representation
  2. one DRF warm-start rerun on top of the now-stronger practical baseline recipe, only if the goal is to test whether DRF can add value once the skeleton branch is already strong
  3. a final packaging/evaluation pass around the retained best checkpoints, rather than more broad preprocessing churn

Everything else looks lower value than simply using the retained best 27000 checkpoint.