ScoNet and DRF: Status, Architecture, and Reproduction Notes

This note records the current Scoliosis1K implementation status in this repo and the main conclusions from the recent reproduction/debugging work.

For a stricter paper-vs-local reproducibility breakdown, see scoliosis_reproducibility_audit.md.

Current status

  • opengait/modeling/models/sconet.py is still the standard Scoliosis1K baseline in this repo.
  • The class is named ScoNet, but functionally it is the paper's multi-task variant (ScoNet-MT) because training uses both CrossEntropyLoss and TripletLoss.
  • opengait/modeling/models/drf.py is now implemented as a standalone DRF model in this repo.
  • Logging supports TensorBoard and optional Weights & Biases through opengait/utils/msg_manager.py.

Naming clarification

The name ScoNet is overloaded across the paper, config files, and checkpoints. Use the mapping below when reading this repo:

| Local name | What it means here | Closest paper name |
| --- | --- | --- |
| ScoNet model class | opengait/modeling/models/sconet.py with both CE and triplet losses | ScoNet-MT |
| configs/sconet/sconet_scoliosis1k.yaml | standard Scoliosis1K silhouette training recipe in this repo | ScoNet-MT training recipe |
| ScoNet-*.pt checkpoint filenames | local checkpoint naming inherited from the repo/config | usually ScoNet-MT if trained with the default config |
| ScoNet-MT-ske in these docs | same ScoNet code path, but fed 2-channel skeleton maps | paper notation ScoNet-MT^{ske} |
| DRF | ScoNet-MT-ske plus PGA/PAV guidance | DRF |

So:

  • paper ScoNet means the single-task CE-only model
  • repo ScoNet usually means the multi-task variant unless someone explicitly removes triplet loss
  • a checkpoint filename of the form ScoNet-...pt does not reveal the modality by itself; check the input channels and the dataset root
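
The naming rules above can be written as a small lookup. This is an illustrative sketch only: `closest_paper_name` is a hypothetical helper that does not exist in this repo, and it assumes the loss class names and the backbone `in_channel` have already been read from a config. DRF is a separate model class and is not covered.

```python
def closest_paper_name(loss_types, in_channel):
    """Map a local ScoNet training setup to the closest paper name.

    Hypothetical helper: `loss_types` is the collection of loss class names
    used in training, `in_channel` is the backbone input channel count.
    """
    losses = set(loss_types)
    if "CrossEntropyLoss" not in losses:
        raise ValueError("expected CrossEntropyLoss among the losses")
    if "TripletLoss" not in losses:
        # CE-only training is what the paper calls plain ScoNet.
        return "ScoNet"
    # CE + triplet is the multi-task variant; the channel count separates
    # 1-channel silhouettes from 2-channel skeleton maps.
    if in_channel == 1:
        return "ScoNet-MT"
    if in_channel == 2:
        return "ScoNet-MT^{ske}"
    raise ValueError(f"unexpected in_channel: {in_channel}")
```

For the default config described here (both losses, silhouette input), the helper returns ScoNet-MT, which is why a repo "ScoNet" checkpoint should usually be read that way.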

Important modality note

The strongest local ScoNet checkpoint we checked, ckpt/ScoNet-20000-better.pt, is a silhouette checkpoint, not a skeleton-map checkpoint.

Evidence:

  • its first convolution weight has shape (64, 1, 3, 3), so it expects 1-channel input
  • the matching eval config points to Scoliosis1K-sil-pkl
  • the skeleton-map configs in this repo use in_channel: 2

This matters because a good result from ScoNet-20000-better.pt only validates the silhouette path. It does not validate the heatmap/skeleton-map preprocessing used by DRF or by a ScoNet-MT-ske-style control.
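
The first-convolution check can be scripted. The sketch below is a minimal, torch-free helper (hypothetical, not part of this repo) that scans a state dict for its first 4-D weight. It assumes the state dict preserves module registration order, so that the first 4-D weight belongs to the first convolution.

```python
def first_conv_in_channels(state_dict):
    """Return (key, in_channels) for the first 4-D weight in a state dict.

    Works on any mapping whose values expose `.shape`, so a PyTorch state
    dict can be passed in without importing torch here.
    """
    for key, tensor in state_dict.items():
        shape = tuple(getattr(tensor, "shape", ()))
        if key.endswith("weight") and len(shape) == 4:
            # Conv2d weights are laid out as (out_ch, in_ch, kH, kW).
            return key, shape[1]
    raise ValueError("no 4-D convolution weight found")
```

With PyTorch available, the input would typically come from `torch.load("ckpt/ScoNet-20000-better.pt", map_location="cpu")`. Whether the weights sit at the top level or under a key such as `"model"` is an assumption to verify against the actual checkpoint. A result of 1 indicates silhouette input; 2 would indicate skeleton maps.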

What was checked against f754f6f3831e9f83bb28f4e2f63dd43d8bcf9dc4

The upstream ScoNet training recipe itself is effectively unchanged:

  • configs/sconet/sconet_scoliosis1k.yaml is unchanged
  • opengait/modeling/models/sconet.py is unchanged
  • opengait/main.py, opengait/modeling/base_model.py, opengait/data/dataset.py, opengait/data/collate_fn.py, and opengait/evaluation/evaluator.py only differ in import cleanup and logging hooks

So the poor skeleton-map evaluation results described below are not explained by a changed optimizer, scheduler, sampler, train loop, or evaluator.

For the skeleton-map control, the only functional changes required relative to the upstream ScoNet config were:

  • use a heatmap dataset root instead of Scoliosis1K-sil-pkl
  • switch the partition to Scoliosis1K_118.json
  • set model_cfg.backbone_cfg.in_channel: 2
  • reduce test batch_size to match the local 2-GPU DDP evaluator constraint
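
The four changes above can be expressed as dotted-key overrides on a nested config dict. This is a sketch under assumptions: `apply_overrides` is a hypothetical helper, only `model_cfg.backbone_cfg.in_channel` and the partition file name come from these notes, and the other key names, the paths, and the batch size value are placeholders.

```python
import copy


def apply_overrides(cfg, overrides):
    """Return a copy of a nested config dict with dotted-key overrides applied."""
    result = copy.deepcopy(cfg)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            # Create intermediate sections if the base config lacks them.
            node = node.setdefault(name, {})
        node[leaf] = value
    return result


# Placeholders: the paths, the data_cfg/evaluator_cfg key names, and the
# batch size are assumptions, not values taken from the actual run.
SKELETON_MAP_OVERRIDES = {
    "data_cfg.dataset_root": "<path-to-heatmap-pkl-root>",
    "data_cfg.dataset_partition": "<path-to>/Scoliosis1K_118.json",
    "model_cfg.backbone_cfg.in_channel": 2,
    "evaluator_cfg.sampler.batch_size": 2,  # placeholder value
}
```

Keeping the overrides in one place makes it easy to confirm that nothing else (optimizer, scheduler, sampler) differs from the upstream recipe.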

Local reproduction findings

The main findings so far are:

  • ScoNet-20000-better.pt on the 1:1:2 silhouette split reproduced cleanly at 95.05% accuracy and 85.12% macro-F1.
  • The 1:1:8 skeleton-map control trained with healthy optimization metrics but evaluated very poorly.
  • A recent ScoNet-MT-ske-style control on Scoliosis1K_sigma_8.0/pkl finished with 36.45% accuracy and 32.78% macro-F1.
  • That result is far below the paper's 1:1:8 ScoNet-MT range and far below the silhouette baseline behavior.
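
Because both accuracy and macro-F1 are quoted, it helps to keep their definitions side by side. The following is a minimal pure-Python sketch using the standard definitions; it is not the repo's evaluator code, and the toy labels in the usage note are made up for illustration.

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Compute accuracy and unweighted (macro) F1 over all observed labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    f1_scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return accuracy, sum(f1_scores) / len(f1_scores)
```

On toy labels such as `y_true = [0, 0, 0, 0, 1, 2]` with every prediction equal to 0, accuracy stays at 4/6 while macro-F1 drops to about 0.27, which shows why the two numbers should be read together on an imbalanced split.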

The current working conclusion is:

  • the core ScoNet trainer is not the problem
  • the strong silhouette checkpoint is not evidence that the skeleton-map path works
  • the main remaining suspect is the skeleton-map representation and preprocessing path

For readability in this repo's docs, ScoNet-MT-ske refers to the skeleton-map variant that the DRF paper writes as ScoNet-MT^{ske}.

Architecture mapping

ScoNet in this repo maps to the paper as follows:

| Paper Component | Code Reference | Description |
| --- | --- | --- |
| Backbone | ResNet9 in opengait/modeling/backbones/resnet.py | Four residual stages with channels [64, 128, 256, 512]. |
| Temporal aggregation | PackSequenceWrapper(torch.max) | Temporal max pooling over frames. |
| Spatial pooling | HorizontalPoolingPyramid | 16-bin horizontal partition. |
| Feature mapping | SeparateFCs | Maps pooled features into the embedding space. |
| Classification head | SeparateBNNecks | Produces screening logits. |
| Losses | TripletLoss + CrossEntropyLoss | This is why the repo implementation is functionally ScoNet-MT. |
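
To make the temporal and spatial pooling rows concrete, here is a pure-Python sketch on a single-channel feature map. It is not the repo's implementation: the function names are hypothetical, and combining each bin as max plus mean is an assumption about how the horizontal pooling behaves.

```python
def temporal_max(frames):
    """Element-wise max over frames; `frames` is a [T][H][W] nested list."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [max(frame[r][c] for frame in frames) for c in range(width)]
        for r in range(height)
    ]


def horizontal_bins(feature_map, num_bins=16):
    """Split an [H][W] map into `num_bins` horizontal strips and pool each.

    Assumption: each strip is summarized as max + mean of its values.
    """
    height = len(feature_map)
    if height % num_bins != 0:
        raise ValueError("height must be divisible by num_bins")
    rows_per_bin = height // num_bins
    pooled = []
    for b in range(num_bins):
        strip = feature_map[b * rows_per_bin:(b + 1) * rows_per_bin]
        values = [v for row in strip for v in row]
        pooled.append(max(values) + sum(values) / len(values))
    return pooled
```

Applied per channel, a sequence of T frames collapses to one map, and that map collapses to 16 values, so the later stages operate on one feature vector per horizontal part.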

Training path summary

The standard Scoliosis1K ScoNet recipe is:

  • sampler: TripletSampler
  • train batch layout: 8 x 8
  • train sample type: fixed_unordered
  • train frames: 30
  • transform: BaseSilCuttingTransform
  • optimizer: SGD(lr=0.1, momentum=0.9, weight_decay=5e-4)
  • scheduler: MultiStepLR with milestones [10000, 14000, 18000]
  • total iterations: 20000

The skeleton-map control used the same recipe, except for the modality-specific changes listed above.
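
The schedule in the recipe can be sanity-checked without a training run. This sketch mirrors the step-decay rule of a multi-step scheduler in plain Python. The decay factor `GAMMA = 0.1` is an assumption, since this note does not state it, and reading the 8 x 8 layout as 8 labels times 8 sequences per label is also an assumption.

```python
from bisect import bisect_right

BASE_LR = 0.1
MILESTONES = [10000, 14000, 18000]
GAMMA = 0.1  # assumption: the decay factor is not stated in this note


def lr_at(iteration, base_lr=BASE_LR, milestones=MILESTONES, gamma=GAMMA):
    """Learning rate after `iteration` scheduler steps under step decay."""
    # Each milestone already reached multiplies the rate by gamma once.
    return base_lr * gamma ** bisect_right(milestones, iteration)


def sequences_per_batch(layout=(8, 8)):
    """Total sequences in one training batch for a (P, K) layout."""
    p, k = layout
    return p * k
```

Under these assumptions the rate is 0.1 until iteration 10000, then drops by a factor of 10 at each milestone, and every training batch holds 64 sequences.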

Recommended next steps

  1. Train a pure silhouette 1:1:8 baseline from the upstream ScoNet config as a clean sanity control.
  2. Treat skeleton-map preprocessing as the primary debugging target until a ScoNet-MT-ske-style run gets close to the paper.
  3. Only after the skeleton baseline is credible should DRF/PAV-specific conclusions be treated as decisive.