ScoNet and DRF: Status, Architecture, and Training Guide
This document provides a technical overview of the scoliosis screening models in OpenGait and maps the concepts from the papers to their implementation status in the repository.
DRF implementation status in OpenGait
As of the current version, the Dual Representation Framework (DRF) described in the MICCAI 2025 paper "Pose as Clinical Prior: Learning Dual Representations for Scoliosis Screening" is not yet explicitly implemented as a standalone model in this repository.
Current State
- ScoNet-MT (Functional Implementation): While the class in `opengait/modeling/models/sconet.py` is named `ScoNet`, it is functionally the ScoNet-MT (Multi-Task) variant described in the MICCAI 2024 paper. It uses both classification and triplet losses.
- Dual Representation (DRF): While `opengait/modeling/models/skeletongait++.py` implements a dual-representation (silhouette + pose heatmap) architecture for gait recognition, the specific DRF screening model (MICCAI 2025) is not yet explicitly implemented as a standalone class.
- Naming Note: The repository uses the base name `ScoNet` for the multi-task implementation, as it is the high-performance variant recommended for use.
Implementation Blueprint for DRF
To implement DRF within the OpenGait framework, follow this structure:
- Model Location: Create `opengait/modeling/models/drf.py` inheriting from `BaseModel`.
- Input Handling: Extend `inputs_pretreament` to handle both silhouettes and pose heatmaps (refer to `SkeletonGaitPP.inputs_pretreament` in `skeletongait++.py`).
- Dual-Branch Backbone: Use separate early layers for silhouette and skeleton map streams, then fuse via `AttentionFusion` (from `skeletongait++.py:135`) or a PAV-Guided Attention module as described in the DRF paper.
- Forward Contract:
  - `training_feat`: Must include `triplet` (for identity/feature consistency) and `softmax` (for screening classification).
  - `visual_summary`: Include `image/sils` and `image/heatmaps` for TensorBoard visualization.
  - `inference_feat`: Return `logits` for classification.
- Config: Create `configs/drf/drf_scoliosis1k.yaml` specifying `model: DRF` and configuring the dual-stream backbone.
- Evaluator: Use `eval_func: evaluate_scoliosis` in the config to leverage the existing screening metrics (Accuracy, Precision, Recall, F1).
- Dataset: Requires the Scoliosis1K-Pose dataset which provides 17 anatomical keypoints in MS-COCO format alongside the existing silhouettes.
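The Forward Contract above can be sketched as a plain dictionary. This is a minimal illustration, not code from the repository: the helper name `build_drf_retval` is hypothetical, and the inner key names (`embeddings`, `logits`, `labels`) are assumptions. In a real model the values would be tensors produced by the two branches, and the key under `inference_feat` must match whatever the evaluator actually reads.

```python
def build_drf_retval(embeddings, logits, labels, sils, heatmaps):
    """Assemble the three-part output described in the Forward Contract.

    Hypothetical helper: the inner key names are assumptions, not a
    confirmed OpenGait API.
    """
    return {
        "training_feat": {
            # Input to the triplet loss (feature consistency).
            "triplet": {"embeddings": embeddings, "labels": labels},
            # Input to the cross-entropy loss (screening classification).
            "softmax": {"logits": logits, "labels": labels},
        },
        "visual_summary": {
            # Tags under which images would appear in TensorBoard.
            "image/sils": sils,
            "image/heatmaps": heatmaps,
        },
        # Assumed key name; align it with the evaluator in practice.
        "inference_feat": {"logits": logits},
    }


retval = build_drf_retval(
    embeddings=[[0.1, 0.2]],
    logits=[[0.2, 0.3, 0.5]],
    labels=[2],
    sils="sil-batch-placeholder",
    heatmaps="heatmap-batch-placeholder",
)
print(sorted(retval))
```

Keeping the `triplet` and `softmax` entries separate mirrors the dual-loss setup described for ScoNet-MT, so each loss can select its own inputs by key.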
ScoNet/ScoNet-MT architecture mapping
Important
Naming Clarification: The implementation in this repository is ScoNet-MT, not the single-task ScoNet.
- ScoNet (Single-Task): Defined in the paper as using only CrossEntropyLoss.
- ScoNet-MT (Multi-Task): Defined as using $L_{total} = L_{ce} + L_{triplet}$.

Evidence for ScoNet-MT in this repo:
- Dual Loss Configuration: `configs/sconet/sconet_scoliosis1k.yaml` (lines 24-33) defines both `TripletLoss` (margin: 0.2) and `CrossEntropyLoss`.
- Dual-Key Forward Pass: `sconet.py` (lines 42-46) returns both `'triplet'` and `'softmax'` keys in the `training_feat` dictionary.
- Triplet Sampling: The trainer uses `TripletSampler` with `batch_size: [8, 8]` (P=8, K=8) to support triplet mining (config lines 92-99).

A "pure" ScoNet implementation would require removing the `TripletLoss`, switching to a standard `InferenceSampler`, and removing the `triplet` key from the model's `forward` return.
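To make the difference between the two objectives concrete, the following sketch computes the multi-task total (cross-entropy plus triplet) for a single sample and a single triplet. It is written in plain Python for illustration only. The function names are hypothetical, equal loss weights are assumed from the formula, and the triplet term is the textbook margin form on Euclidean distances. OpenGait's `TripletLoss` mines triplets across the whole batch, which this sketch does not attempt to reproduce.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Margin-based triplet loss for one (anchor, positive, negative) triplet."""
    d_ap = math.dist(anchor, positive)
    d_an = math.dist(anchor, negative)
    return max(d_ap - d_an + margin, 0.0)


def total_loss(logits, target, anchor, positive, negative, margin=0.2):
    """Multi-task objective: cross-entropy term plus triplet term."""
    return cross_entropy(logits, target) + triplet_loss(
        anchor, positive, negative, margin
    )


# Single-task ScoNet would optimize only the first term.
ce_only = cross_entropy([0.0, 0.0, 0.0], target=1)
both = total_loss(
    [0.0, 0.0, 0.0], 1,
    anchor=[0.0, 0.0], positive=[1.0, 0.0], negative=[0.0, 0.0],
)
print(ce_only, both)
```

The margin default of 0.2 matches the value quoted from the config; everything else in the example inputs is made up for demonstration.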
The ScoNet (functionally ScoNet-MT) implementation in opengait/modeling/models/sconet.py maps to the paper as follows:
| Paper Component | Code Reference | Description |
|---|---|---|
| Backbone | ResNet9 in backbones/resnet.py | A customized ResNet with 4 layers and configurable channels. |
| Temporal Aggregation | self.TP (Temporal Pooling) | Uses PackSequenceWrapper(torch.max) to aggregate frame features. |
| Spatial Features | self.HPP (Horizontal Pooling) | HorizontalPoolingPyramid with bin_num: 16. |
| Feature Mapping | self.FCs (SeparateFCs) | Maps pooled features to a latent embedding space. |
| Classification Head | self.BNNecks (SeparateBNNecks) | Produces logits for the 3-class screening task. |
| Label Mapping | sconet.py lines 21-23 | negative: 0, neutral: 1, positive: 2. |
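The pooling stages and the label mapping in the table can be illustrated on a single-channel feature map with plain Python lists. This is a simplified sketch rather than the repository code: the function names are hypothetical, and summarizing each horizontal strip as max plus mean is an assumption about how `HorizontalPoolingPyramid` combines its statistics. The real modules operate on batched multi-channel tensors.

```python
LABEL_MAP = {"negative": 0, "neutral": 1, "positive": 2}  # from the table
ID_TO_LABEL = {index: name for name, index in LABEL_MAP.items()}


def temporal_max_pool(frames):
    """Element-wise max over frames; each frame is an H x W list of lists."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [max(frame[r][c] for frame in frames) for c in range(width)]
        for r in range(height)
    ]


def horizontal_pool(feature_map, bin_num=16):
    """Split rows into bin_num equal strips and summarize each strip.

    Assumption: each strip is reduced to (max + mean) of its values.
    """
    height = len(feature_map)
    if height % bin_num != 0:
        raise ValueError("height must be divisible by bin_num")
    rows_per_bin = height // bin_num
    pooled = []
    for b in range(bin_num):
        rows = feature_map[b * rows_per_bin:(b + 1) * rows_per_bin]
        strip = [value for row in rows for value in row]
        pooled.append(max(strip) + sum(strip) / len(strip))
    return pooled


def predict_label(logits):
    """Map the arg-max of 3-class logits back to its label name."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return ID_TO_LABEL[best]


frames = [
    [[0.0, 0.0] for _ in range(16)],            # frame 1: all zeros
    [[float(r), float(r)] for r in range(16)],  # frame 2: row index as value
]
parts = horizontal_pool(temporal_max_pool(frames), bin_num=16)
print(len(parts), predict_label([0.1, 0.2, 0.7]))
```

With `bin_num: 16`, each sequence yields 16 part-level descriptors per channel, which the separate fully connected layers and BNNecks then process part by part.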
Training guide (dataloader, optimizer, logging)
Dataloader Setup
The training configuration is defined in configs/sconet/sconet_scoliosis1k.yaml:
- Sampler: `TripletSampler` (standard for OpenGait).
- Batch Size: `[8, 8]` (8 identities, 8 sequences per identity).
- Sequence Sampling: `fixed_unordered` with `frames_num_fixed: 30`.
- Transform: `BaseSilCuttingTransform` for silhouette preprocessing.
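The batch composition described above (P identities, K sequences each, 30 unordered frames per sequence) can be sketched with the standard library. This is an illustrative approximation, not the `TripletSampler` source: the function names and the `indices_by_id` structure are hypothetical, and falling back to sampling with replacement when too few items are available is an assumption.

```python
import random


def sample_pk_batch(indices_by_id, p=8, k=8, rng=random):
    """Pick P identities, then K sequence indices for each identity."""
    identities = rng.sample(sorted(indices_by_id), p)
    batch = []
    for identity in identities:
        pool = indices_by_id[identity]
        if len(pool) >= k:
            batch.extend(rng.sample(pool, k))
        else:
            # Assumption: sample with replacement when an identity is short.
            batch.extend(rng.choices(pool, k=k))
    return batch


def sample_fixed_unordered(num_frames, frames_num_fixed=30, rng=random):
    """Draw a fixed number of frame indices without preserving order."""
    all_frames = list(range(num_frames))
    if num_frames >= frames_num_fixed:
        return rng.sample(all_frames, frames_num_fixed)
    # Assumption: short sequences are padded by repeating frames.
    return rng.choices(all_frames, k=frames_num_fixed)


rng = random.Random(0)
toy_dataset = {f"id{i}": list(range(i * 10, i * 10 + 10)) for i in range(12)}
batch = sample_pk_batch(toy_dataset, p=8, k=8, rng=rng)
print(len(batch), len(sample_fixed_unordered(100, rng=rng)))
```

With `[8, 8]`, every batch holds 64 sequences, and each identity contributes several positives for the triplet loss to pair against negatives from the other seven identities.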
Optimizer and Scheduler
- Optimizer: SGD
  - `lr: 0.1`
  - `momentum: 0.9`
  - `weight_decay: 0.0005`
- Scheduler: `MultiStepLR`
  - `milestones: [10000, 14000, 18000]`
  - `gamma: 0.1`
- Total Iterations: 20,000.
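The effect of these settings on the learning rate can be checked with a few lines of Python. The sketch below uses the closed-form step-decay rule (base rate multiplied by `gamma` once per milestone reached); the helper name `lr_at` is hypothetical, and the schedule is assumed to be stepped once per training iteration.

```python
from bisect import bisect_right


def lr_at(iteration, base_lr=0.1, milestones=(10000, 14000, 18000), gamma=0.1):
    """Learning rate at a given iteration under a multi-step decay schedule."""
    milestones_reached = bisect_right(milestones, iteration)
    return base_lr * gamma ** milestones_reached


for it in (0, 9999, 10000, 14000, 18000, 19999):
    print(it, lr_at(it))
```

Under these assumptions, training spends the first 10,000 iterations at 0.1 and then drops by a factor of ten at each milestone, finishing the last 2,000 iterations at roughly 1e-4.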
Logging
- TensorBoard: OpenGait natively supports TensorBoard logging. Training losses (`triplet`, `softmax`) and accuracies are logged every 100 iterations (`log_iter: 100`).
- WandB: There is no native Weights & Biases (WandB) integration in the current codebase. Users wishing to use WandB must manually integrate it into `opengait/utils/msg_manager.py` or `opengait/main.py`.
- Evaluation: Metrics (Accuracy, Precision, Recall, F1) are computed by `evaluate_scoliosis` in `opengait/evaluation/evaluator.py` and logged to the console/file.
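For reference, the four screening metrics can be reproduced by hand on a small set of predictions. The sketch below is not the `evaluate_scoliosis` implementation: the function name is hypothetical, and macro-averaging over the three classes is an assumption, so the numbers may differ if the evaluator averages differently.

```python
def macro_metrics(y_true, y_pred, num_classes=3):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / num_classes,
        "recall": sum(recalls) / num_classes,
        "f1": sum(f1s) / num_classes,
    }


# Labels follow the mapping negative: 0, neutral: 1, positive: 2.
print(macro_metrics(y_true=[0, 1, 2, 2], y_pred=[0, 1, 2, 1]))
```

Macro-averaging weights the three classes equally, which matters when the class distribution in the evaluation split is unbalanced.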
Evidence References
- Model Implementation: `opengait/modeling/models/sconet.py`
- Training Config: `configs/sconet/sconet_scoliosis1k.yaml`
- Evaluation Logic: `opengait/evaluation/evaluator.py::evaluate_scoliosis`
- Backbone Definition: `opengait/modeling/backbones/resnet.py::ResNet9`