ScoNet and DRF: Status, Architecture, and Reproduction Notes

This note records the current Scoliosis1K implementation status in this repo and the main conclusions from the recent reproduction/debugging work.

For a stricter paper-vs-local reproducibility breakdown, see scoliosis_reproducibility_audit.md.

Current status

opengait/modeling/models/sconet.py is still the standard Scoliosis1K baseline in this repo.
The class is named ScoNet, but functionally it is the paper's multi-task variant (ScoNet-MT), because training uses both CrossEntropyLoss and TripletLoss.
opengait/modeling/models/drf.py is now implemented as a standalone DRF model in this repo.
Logging supports TensorBoard and optional Weights & Biases through opengait/utils/msg_manager.py.

Naming clarification

The name ScoNet is overloaded across the paper, config files, and checkpoints. Use the mapping below when reading this repo:

| Local name | What it means here | Closest paper name |
| --- | --- | --- |
| `ScoNet` model class | `opengait/modeling/models/sconet.py` with both CE and triplet losses | `ScoNet-MT` |
| `configs/sconet/sconet_scoliosis1k.yaml` | standard Scoliosis1K silhouette training recipe in this repo | `ScoNet-MT` training recipe |
| `ScoNet-*.pt` checkpoint filenames | local checkpoint naming inherited from the repo/config | usually `ScoNet-MT` if trained with the default config |
| `ScoNet-MT-ske` in these docs | same ScoNet code path, but fed 2-channel skeleton maps | paper notation `ScoNet-MT^{ske}` |
| `DRF` | `ScoNet-MT-ske` plus PGA/PAV guidance | `DRF` |

So:

paper ScoNet means the single-task CE-only model
repo ScoNet usually means the multi-task variant unless someone explicitly removes triplet loss
a checkpoint named ScoNet-...pt is not enough to tell the modality by itself; check input channels and dataset root
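
The naming rules above can be written down as a small lookup. This is a hypothetical helper for illustration only (`describe_run` is not part of the repo); it assumes the loss names appear as the class names `CrossEntropyLoss` and `TripletLoss`, and that 1 input channel means silhouettes while 2 means skeleton maps, as described in this note.

```python
def describe_run(loss_types, in_channel):
    """Return (closest paper name, modality) for a ScoNet-style run.

    Hypothetical helper: it only encodes the naming rules from this note.
    """
    losses = set(loss_types)
    if "CrossEntropyLoss" not in losses:
        raise ValueError("every ScoNet variant discussed here uses CrossEntropyLoss")
    # CE only -> paper "ScoNet"; CE + triplet -> paper "ScoNet-MT".
    name = "ScoNet-MT" if "TripletLoss" in losses else "ScoNet"
    modalities = {1: "silhouette", 2: "skeleton map"}
    if in_channel not in modalities:
        raise ValueError(f"unexpected in_channel: {in_channel}")
    return name, modalities[in_channel]


print(describe_run(["TripletLoss", "CrossEntropyLoss"], 1))  # default repo recipe
print(describe_run(["CrossEntropyLoss"], 1))                 # paper's single-task model
print(describe_run(["TripletLoss", "CrossEntropyLoss"], 2))  # skeleton-map control
```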

Important modality note

The strongest local ScoNet checkpoint we checked, ckpt/ScoNet-20000-better.pt, is a silhouette checkpoint, not a skeleton-map checkpoint.

Evidence:

its first convolution weight has shape (64, 1, 3, 3), so it expects 1-channel input
the matching eval config points to Scoliosis1K-sil-pkl
the skeleton-map configs in this repo use in_channel: 2

This matters because a good result from ScoNet-20000-better.pt only validates the silhouette path. It does not validate the heatmap/skeleton-map preprocessing used by DRF or by a ScoNet-MT-ske-style control.
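
The first piece of evidence (the shape of the first convolution weight) can be checked with a few lines. The sketch below is a minimal illustration, not a script from the repo: `first_conv_in_channels` is a hypothetical helper, and it assumes that the first 4-D tensor in the state dict's iteration order is the stem convolution. The commented usage further assumes the weights may be nested under a `"model"` key in the checkpoint file.

```python
def first_conv_in_channels(state_dict):
    """Return (parameter name, in_channels) for the first 4-D weight found.

    Convolution weights are laid out as (out_channels, in_channels, kH, kW),
    so index 1 of the shape is the number of input channels.
    """
    for name, tensor in state_dict.items():
        shape = tuple(tensor.shape)
        if len(shape) == 4:
            return name, shape[1]
    raise ValueError("no 4-D (convolution) weight found")


# Usage against a real checkpoint (requires PyTorch, so it is left commented out):
# import torch
# ckpt = torch.load("ckpt/ScoNet-20000-better.pt", map_location="cpu")
# state = ckpt.get("model", ckpt)  # assumption: weights may sit under "model"
# print(first_conv_in_channels(state))
```

A result of 1 input channel points to a silhouette checkpoint; 2 would be consistent with the skeleton-map configs.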

What was checked against upstream commit `f754f6f3831e9f83bb28f4e2f63dd43d8bcf9dc4`

The upstream ScoNet training recipe itself is effectively unchanged:

configs/sconet/sconet_scoliosis1k.yaml is unchanged
opengait/modeling/models/sconet.py is unchanged
opengait/main.py, opengait/modeling/base_model.py, opengait/data/dataset.py, opengait/data/collate_fn.py, and opengait/evaluation/evaluator.py only differ in import cleanup and logging hooks

So the poor skeleton-map results described under "Local reproduction findings" are not explained by a changed optimizer, scheduler, sampler, train loop, or evaluator.

For the skeleton-map control, the only functional changes required relative to the upstream ScoNet config were:

use a heatmap dataset root instead of Scoliosis1K-sil-pkl
switch the partition to Scoliosis1K_118.json
set model_cfg.backbone_cfg.in_channel: 2
reduce test batch_size to match the local 2-GPU DDP evaluator constraint
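
The four changes above can be expressed as dotted-key overrides on the parsed config. This is a sketch under assumptions: `apply_overrides` is a hypothetical helper, the `data_cfg` and `evaluator_cfg.sampler` key paths are my guess at the config layout (only `model_cfg.backbone_cfg.in_channel` is named in this note), the dataset root is just the one mentioned elsewhere in this note, and `TEST_BATCH_SIZE` is a placeholder because the value actually used is not recorded here.

```python
import copy


def apply_overrides(config, overrides):
    """Return a deep copy of `config` with dotted-key overrides applied."""
    result = copy.deepcopy(config)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return result


TEST_BATCH_SIZE = 2  # placeholder: the note does not state the value used

# Stand-in for the parsed upstream config; only the touched keys are shown.
upstream = {
    "data_cfg": {"dataset_root": "Scoliosis1K-sil-pkl"},
    "model_cfg": {"backbone_cfg": {"in_channel": 1}},
}

SKELETON_OVERRIDES = {
    "data_cfg.dataset_root": "Scoliosis1K_sigma_8.0/pkl",   # heatmap dataset root
    "data_cfg.dataset_partition": "Scoliosis1K_118.json",   # assumed key name
    "model_cfg.backbone_cfg.in_channel": 2,                 # 2-channel skeleton maps
    "evaluator_cfg.sampler.batch_size": TEST_BATCH_SIZE,    # assumed key path
}

control = apply_overrides(upstream, SKELETON_OVERRIDES)
print(control["model_cfg"]["backbone_cfg"]["in_channel"])
```

Keeping the overrides in one mapping makes it easy to confirm that nothing else drifted between the silhouette recipe and the skeleton-map control.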

Local reproduction findings

The main findings so far are:

ScoNet-20000-better.pt on the 1:1:2 silhouette split reproduced cleanly at 95.05% accuracy and 85.12% macro-F1.
The 1:1:8 skeleton-map control trained with healthy optimization metrics but evaluated very poorly.
A recent ScoNet-MT-ske-style control on Scoliosis1K_sigma_8.0/pkl finished with 36.45% accuracy and 32.78% macro-F1.
That result is far below the paper's 1:1:8 ScoNet-MT range and far below the silhouette baseline behavior.
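
Accuracy and macro-F1 are both reported because they can diverge sharply on imbalanced data. The sketch below uses the standard definitions (macro-F1 as the unweighted mean of per-class F1); it is not the repo's evaluator, whose exact implementation is not described in this note. The class names and the toy 1:1:8 label mix are made up purely to illustrate imbalance.

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Compute accuracy and macro-averaged F1 from two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return accuracy, sum(f1_scores) / len(f1_scores)


# Toy 1:1:8 label mix where the model always predicts the majority class.
y_true = ["class_a"] + ["class_b"] + ["class_c"] * 8
y_pred = ["class_c"] * 10
acc, macro_f1 = accuracy_and_macro_f1(y_true, y_pred)
print(f"accuracy={acc:.2%}  macro-F1={macro_f1:.2%}")
```

In this toy case a majority-class predictor still gets high accuracy while macro-F1 collapses, which is why a run with low values on both metrics, like the skeleton-map control, is a stronger failure signal than low macro-F1 alone.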

The current working conclusion is:

the core ScoNet trainer is not the problem
the strong silhouette checkpoint is not evidence that the skeleton-map path works
the main remaining suspect is the skeleton-map representation and preprocessing path

For readability in this repo's docs, ScoNet-MT-ske refers to the skeleton-map variant that the DRF paper writes as ScoNet-MT^{ske}.

Architecture mapping

ScoNet in this repo maps to the paper as follows:

| Paper Component | Code Reference | Description |
| --- | --- | --- |
| Backbone | `ResNet9` in `opengait/modeling/backbones/resnet.py` | Four residual stages with channels `[64, 128, 256, 512]`. |
| Temporal aggregation | `PackSequenceWrapper(torch.max)` | Temporal max pooling over frames. |
| Spatial pooling | `HorizontalPoolingPyramid` | 16-bin horizontal partition. |
| Feature mapping | `SeparateFCs` | Maps pooled features into the embedding space. |
| Classification head | `SeparateBNNecks` | Produces screening logits. |
| Losses | `TripletLoss` + `CrossEntropyLoss` | This is why the repo implementation is functionally ScoNet-MT. |
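
To make the temporal aggregation and spatial pooling rows concrete, here is a dependency-free sketch on a single channel. It is an illustration, not the repo's code: the function names are hypothetical, and the per-strip reduction (mean plus max) is an assumption about how the horizontal pooling module combines values, which this note does not state.

```python
def temporal_max(frames):
    """Element-wise max over a list of equally shaped H x W feature maps."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [max(frame[r][c] for frame in frames) for c in range(width)]
        for r in range(height)
    ]


def horizontal_bins(feature_map, bins):
    """Split rows into `bins` equal horizontal strips and pool each strip.

    Assumption: each strip is reduced to mean + max of its values.
    """
    height = len(feature_map)
    if height % bins:
        raise ValueError("height must be divisible by the number of bins")
    rows_per_bin = height // bins
    pooled = []
    for b in range(bins):
        strip_rows = feature_map[b * rows_per_bin:(b + 1) * rows_per_bin]
        values = [v for row in strip_rows for v in row]
        pooled.append(sum(values) / len(values) + max(values))
    return pooled


# Two 2x2 "frames" of one channel; the real model uses 16 bins, 2 are used here.
frames = [
    [[1, 2], [3, 4]],
    [[5, 0], [1, 1]],
]
aggregated = temporal_max(frames)
print(aggregated)
print(horizontal_bins(aggregated, bins=2))
```

In the real model the same idea runs per channel, so each sequence yields one vector per horizontal strip before `SeparateFCs` maps it into the embedding space.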

Training path summary

The standard Scoliosis1K ScoNet recipe is:

sampler: TripletSampler
train batch layout: 8 x 8
train sample type: fixed_unordered
train frames: 30
transform: BaseSilCuttingTransform
optimizer: SGD(lr=0.1, momentum=0.9, weight_decay=5e-4)
scheduler: MultiStepLR with milestones [10000, 14000, 18000]
total iterations: 20000

The skeleton-map control used the same recipe, except for the modality-specific changes listed above.
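
The step schedule implied by the recipe can be tabulated without PyTorch. This sketch mirrors the usual `MultiStepLR` rule (multiply the learning rate by `gamma` at each milestone). The decay factor `gamma=0.1` is an assumption, since the recipe above lists only the milestones; `lr_at` is a hypothetical helper.

```python
from bisect import bisect_right

BASE_LR = 0.1
MILESTONES = [10000, 14000, 18000]
GAMMA = 0.1  # assumption: not stated in the recipe above


def lr_at(iteration, base_lr=BASE_LR, milestones=MILESTONES, gamma=GAMMA):
    """Learning rate in effect at `iteration` under a multi-step decay."""
    # Number of milestones already reached determines how often lr was decayed.
    return base_lr * gamma ** bisect_right(milestones, iteration)


for iteration in (0, 9999, 10000, 14000, 18000, 19999):
    print(f"iter {iteration:>5}: lr = {lr_at(iteration):.0e}")
```

Under that assumption the last 2000 of the 20000 iterations run at a learning rate three decay steps below the initial 0.1, which is worth keeping in mind when comparing checkpoints saved at different iterations.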

Recommended next checks

Train a pure silhouette 1:1:8 baseline from the upstream ScoNet config as a clean sanity control.
Treat skeleton-map preprocessing as the primary debugging target until a ScoNet-MT-ske-style run gets close to the paper.
Only after the skeleton baseline is credible should DRF/PAV-specific conclusions be treated as decisive.
