Document ScoNet naming and Scoliosis eval configs

2026-03-08 17:34:33 +08:00
parent 295d951206
commit 4e0b0a18dc
5 changed files with 315 additions and 74 deletions
@@ -0,0 +1,101 @@
+data_cfg:
+  dataset_name: Scoliosis1K
+  dataset_root: /mnt/public/data/Scoliosis1K/Scoliosis1K-sil-pkl
+  dataset_partition: /mnt/public/data/Scoliosis1K/Scoliosis1K_1116.json
+  num_workers: 1
+  remove_no_gallery: false
+  test_dataset_name: Scoliosis1K
+evaluator_cfg:
+  enable_float16: true
+  restore_ckpt_strict: true
+  restore_hint: ./ckpt/ScoNet-20000-better.pt
+  save_name: ScoNet_better
+  eval_func: evaluate_scoliosis
+  sampler:
+    batch_shuffle: false
+    batch_size: 2
+    sample_type: all_ordered
+    frames_all_limit: 720
+  metric: euc
+  transform:
+    - type: BaseSilCuttingTransform
+loss_cfg:
+  - loss_term_weight: 1.0
+    margin: 0.2
+    type: TripletLoss
+    log_prefix: triplet
+  - loss_term_weight: 1.0
+    scale: 16
+    type: CrossEntropyLoss
+    log_prefix: softmax
+    log_accuracy: true
+model_cfg:
+  model: ScoNet
+  backbone_cfg:
+    type: ResNet9
+    block: BasicBlock
+    channels:
+      - 64
+      - 128
+      - 256
+      - 512
+    layers:
+      - 1
+      - 1
+      - 1
+      - 1
+    strides:
+      - 1
+      - 2
+      - 2
+      - 1
+    maxpool: false
+  SeparateFCs:
+    in_channels: 512
+    out_channels: 256
+    parts_num: 16
+  SeparateBNNecks:
+    class_num: 3
+    in_channels: 256
+    parts_num: 16
+  bin_num:
+    - 16
+optimizer_cfg:
+  lr: 0.1
+  momentum: 0.9
+  solver: SGD
+  weight_decay: 0.0005
+scheduler_cfg:
+  gamma: 0.1
+  milestones:
+    - 10000
+    - 14000
+    - 18000
+  scheduler: MultiStepLR
+trainer_cfg:
+  enable_float16: true
+  fix_BN: false
+  with_test: false
+  log_iter: 100
+  restore_ckpt_strict: true
+  restore_hint: 0
+  save_iter: 20000
+  save_name: ScoNet_better
+  sync_BN: true
+  total_iter: 20000
+  sampler:
+    batch_shuffle: true
+    batch_size:
+      - 8
+      - 8
+    frames_num_fixed: 30
+    sample_type: fixed_unordered
+    type: TripletSampler
+  transform:
+    - type: BaseSilCuttingTransform
@@ -0,0 +1,101 @@
+data_cfg:
+  dataset_name: Scoliosis1K
+  dataset_root: /mnt/public/data/Scoliosis1K/Scoliosis1K-sil-pkl
+  dataset_partition: /mnt/public/data/Scoliosis1K/Scoliosis1K_112.json
+  num_workers: 1
+  remove_no_gallery: false
+  test_dataset_name: Scoliosis1K
+evaluator_cfg:
+  enable_float16: true
+  restore_ckpt_strict: true
+  restore_hint: ./ckpt/ScoNet-20000-better.pt
+  save_name: ScoNet_better_112
+  eval_func: evaluate_scoliosis
+  sampler:
+    batch_shuffle: false
+    batch_size: 2
+    sample_type: all_ordered
+    frames_all_limit: 720
+  metric: euc
+  transform:
+    - type: BaseSilCuttingTransform
+loss_cfg:
+  - loss_term_weight: 1.0
+    margin: 0.2
+    type: TripletLoss
+    log_prefix: triplet
+  - loss_term_weight: 1.0
+    scale: 16
+    type: CrossEntropyLoss
+    log_prefix: softmax
+    log_accuracy: true
+model_cfg:
+  model: ScoNet
+  backbone_cfg:
+    type: ResNet9
+    block: BasicBlock
+    channels:
+      - 64
+      - 128
+      - 256
+      - 512
+    layers:
+      - 1
+      - 1
+      - 1
+      - 1
+    strides:
+      - 1
+      - 2
+      - 2
+      - 1
+    maxpool: false
+  SeparateFCs:
+    in_channels: 512
+    out_channels: 256
+    parts_num: 16
+  SeparateBNNecks:
+    class_num: 3
+    in_channels: 256
+    parts_num: 16
+  bin_num:
+    - 16
+optimizer_cfg:
+  lr: 0.1
+  momentum: 0.9
+  solver: SGD
+  weight_decay: 0.0005
+scheduler_cfg:
+  gamma: 0.1
+  milestones:
+    - 10000
+    - 14000
+    - 18000
+  scheduler: MultiStepLR
+trainer_cfg:
+  enable_float16: true
+  fix_BN: false
+  with_test: false
+  log_iter: 100
+  restore_ckpt_strict: true
+  restore_hint: 0
+  save_iter: 20000
+  save_name: ScoNet_better_112
+  sync_BN: true
+  total_iter: 20000
+  sampler:
+    batch_shuffle: true
+    batch_size:
+      - 8
+      - 8
+    frames_num_fixed: 30
+    sample_type: fixed_unordered
+    type: TripletSampler
+  transform:
+    - type: BaseSilCuttingTransform
@@ -1,10 +1,7 @@
 data_cfg:
   dataset_name: Scoliosis1K
-  dataset_root: /mnt/public/data/Scoliosis1K/Scoliosis1K-drf-pkl-118-aligned
+  dataset_root: /mnt/public/data/Scoliosis1K/Scoliosis1K_sigma_8.0/pkl
   dataset_partition: ./datasets/Scoliosis1K/Scoliosis1K_118.json
-  data_in_use:
-    - true
-    - false
   num_workers: 1
   remove_no_gallery: false
   test_dataset_name: Scoliosis1K
@@ -13,7 +10,7 @@ evaluator_cfg:
   enable_float16: true
   restore_ckpt_strict: true
   restore_hint: 20000
-  save_name: ScoNet_skeleton_118
+  save_name: ScoNet_skeleton_118_sigma8
   eval_func: evaluate_scoliosis
   sampler:
     batch_shuffle: false
@@ -90,7 +87,7 @@ trainer_cfg:
   restore_ckpt_strict: true
   restore_hint: 0
   save_iter: 20000
-  save_name: ScoNet_skeleton_118
+  save_name: ScoNet_skeleton_118_sigma8
   sync_BN: true
   total_iter: 20000
   sampler:
+20
@@ -69,6 +69,26 @@ python -m torch.distributed.launch --nproc_per_node=4 \
 opengait/main.py --cfgs configs/sconet/sconet_scoliosis1k.yaml --phase test --log_to_file
 ```
+### Modality sanity check
+The silhouette and skeleton-map pipelines are different experiments and should not be mixed when you interpret results.
+* `Scoliosis1K-sil-pkl` is the silhouette modality used by the standard ScoNet configs.
+* pose-derived heatmap roots such as `Scoliosis1K_sigma_8.0/pkl` or DRF exports are skeleton-map inputs and require `in_channel: 2`.
+Naming note:
+* in this repo, the local `ScoNet` training config and model class are usually the paper's `ScoNet-MT`, not the CE-only paper `ScoNet`
+* in these docs, `ScoNet-MT-ske` means the skeleton-map variant of that same model path
+* checkpoint filenames like `ScoNet-20000-better.pt` do not identify the modality by name alone
+A strong silhouette checkpoint does not validate the skeleton-map path. In particular, `ckpt/ScoNet-20000-better.pt` is a silhouette checkpoint:
+* its first convolution expects 1-channel input
+* the matching eval config points to `Scoliosis1K-sil-pkl`
+So if you are debugging DRF or `ScoNet-MT-ske` reproduction, do not use `ScoNet-20000-better.pt` as evidence that the heatmap preprocessing is correct.
 ---
 ## Pose-to-Heatmap Conversion
+90 -68
@@ -1,86 +1,108 @@
-# ScoNet and DRF: Status, Architecture, and Training Guide
-This document provides a technical overview of the Scoliosis screening models in OpenGait, mapping paper concepts to the repository's implementation status.
-## DRF implementation status in OpenGait
-As of the current version, the **Dual Representation Framework (DRF)** described in the MICCAI 2025 paper *"Pose as Clinical Prior: Learning Dual Representations for Scoliosis Screening"* is **not yet explicitly implemented** as a standalone model in this repository.
-### Current State
-- **ScoNet-MT (Functional Implementation)**: While the class in `opengait/modeling/models/sconet.py` is named `ScoNet`, it is functionally the **ScoNet-MT** (Multi-Task) variant described in the MICCAI 2024 paper. It utilizes both classification and triplet losses.
-- **Dual Representation (DRF)**: While `opengait/modeling/models/skeletongait++.py` implements a dual-representation (silhouette + pose heatmap) architecture for gait recognition, the specific DRF screening model (MICCAI 2025) is not yet explicitly implemented as a standalone class.
-- **Naming Note**: The repository uses the base name `ScoNet` for the multi-task implementation, as it is the high-performance variant recommended for use.
-### Implementation Blueprint for DRF
-To implement DRF within the OpenGait framework, follow this structure:
-1. **Model Location**: Create `opengait/modeling/models/drf.py` inheriting from `BaseModel`.
-2. **Input Handling**: Extend `inputs_pretreament` to handle both silhouettes and pose heatmaps (refer to `SkeletonGaitPP.inputs_pretreament` in `skeletongait++.py`).
-3. **Dual-Branch Backbone**: Use separate early layers for silhouette and skeleton map streams, then fuse via `AttentionFusion` (from `skeletongait++.py:135`) or a PAV-Guided Attention module as described in the DRF paper.
-4. **Forward Contract**:
-    - `training_feat`: Must include `triplet` (for identity/feature consistency) and `softmax` (for screening classification).
-    - `visual_summary`: Include `image/sils` and `image/heatmaps` for TensorBoard visualization.
-    - `inference_feat`: Return `logits` for classification.
-5. **Config**: Create `configs/drf/drf_scoliosis1k.yaml` specifying `model: DRF` and configuring the dual-stream backbone.
-6. **Evaluator**: Use `eval_func: evaluate_scoliosis` in the config to leverage the existing screening metrics (Accuracy, Precision, Recall, F1).
-7. **Dataset**: Requires the **Scoliosis1K-Pose** dataset which provides 17 anatomical keypoints in MS-COCO format alongside the existing silhouettes.
----
-## ScoNet/ScoNet-MT architecture mapping
-> [!IMPORTANT]
-> **Naming Clarification**: The implementation in this repository is **ScoNet-MT**, not the single-task ScoNet.
-> - **ScoNet (Single-Task)**: Defined in the paper as using only CrossEntropyLoss.
-> - **ScoNet-MT (Multi-Task)**: Defined as using $L_{total} = L_{ce} + L_{triplet}$.
->
-> **Evidence for ScoNet-MT in this repo:**
-> 1. **Dual Loss Configuration**: `configs/sconet/sconet_scoliosis1k.yaml` (lines 24-33) defines both `TripletLoss` (margin: 0.2) and `CrossEntropyLoss`.
-> 2. **Dual-Key Forward Pass**: `sconet.py` (lines 42-46) returns both `'triplet'` and `'softmax'` keys in the `training_feat` dictionary.
-> 3. **Triplet Sampling**: The trainer uses `TripletSampler` with `batch_size: [8, 8]` (P=8, K=8) to support triplet mining (config lines 92-99).
->
-> A "pure" ScoNet implementation would require removing the `TripletLoss`, switching to a standard `InferenceSampler`, and removing the `triplet` key from the model's `forward` return.
-The `ScoNet` (functionally ScoNet-MT) implementation in `opengait/modeling/models/sconet.py` maps to the paper as follows:
+# ScoNet and DRF: Status, Architecture, and Reproduction Notes
+This note records the current Scoliosis1K implementation status in this repo and the main conclusions from the recent reproduction/debugging work.
+## Current status
+- `opengait/modeling/models/sconet.py` is still the standard Scoliosis1K baseline in this repo.
+- The class is named `ScoNet`, but functionally it is the paper's multi-task variant because training uses both `CrossEntropyLoss` and `TripletLoss`.
+- `opengait/modeling/models/drf.py` is now implemented as a standalone DRF model in this repo.
+- Logging supports TensorBoard and optional Weights & Biases through `opengait/utils/msg_manager.py`.
+## Naming clarification
+The name `ScoNet` is overloaded across the paper, config files, and checkpoints. Use the mapping below when reading this repo:
+| Local name | What it means here | Closest paper name |
+| :--- | :--- | :--- |
+| `ScoNet` model class | `opengait/modeling/models/sconet.py` with both CE and triplet losses | `ScoNet-MT` |
+| `configs/sconet/sconet_scoliosis1k.yaml` | standard Scoliosis1K silhouette training recipe in this repo | `ScoNet-MT` training recipe |
+| `ScoNet-*.pt` checkpoint filenames | local checkpoint naming inherited from the repo/config | usually `ScoNet-MT` if trained with the default config |
+| `ScoNet-MT-ske` in these docs | same ScoNet code path, but fed 2-channel skeleton maps | paper notation `ScoNet-MT^{ske}` |
+| `DRF` | `ScoNet-MT-ske` plus PGA/PAV guidance | `DRF` |
+So:
+- paper `ScoNet` means the single-task CE-only model
+- repo `ScoNet` usually means the multi-task variant unless someone explicitly removes triplet loss
+- a checkpoint named `ScoNet-...pt` is not enough to tell the modality by itself; check input channels and dataset root
+## Important modality note
+The strongest local ScoNet checkpoint we checked, `ckpt/ScoNet-20000-better.pt`, is a silhouette checkpoint, not a skeleton-map checkpoint.
+Evidence:
+- its first convolution weight has shape `(64, 1, 3, 3)`, so it expects 1-channel input
+- the matching eval config points to `Scoliosis1K-sil-pkl`
+- the skeleton-map configs in this repo use `in_channel: 2`
+This matters because a good result from `ScoNet-20000-better.pt` only validates the silhouette path. It does not validate the heatmap/skeleton-map preprocessing used by DRF or by a `ScoNet-MT-ske`-style control.
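The channel check listed as evidence can be scripted. The following is a minimal sketch, not code from this repo: `first_conv_in_channels` and `modality_from_channels` are hypothetical helper names, and the sketch assumes that the first 4-D weight in the state dict belongs to the first convolution.

```python
from typing import Mapping, Optional, Tuple


def first_conv_in_channels(state_dict: Mapping[str, object]) -> Optional[Tuple[str, int]]:
    """Return (parameter name, input channels) of the first 4-D weight found.

    A Conv2d weight has shape (out_channels, in_channels, kH, kW), so index 1
    is the number of input channels that layer expects.
    Assumption: the mapping iterates in layer order, as a state dict normally does.
    """
    for name, tensor in state_dict.items():
        shape = tuple(getattr(tensor, "shape", ()))
        if len(shape) == 4:
            return name, int(shape[1])
    return None


def modality_from_channels(in_channels: int) -> str:
    # Mapping taken from the notes: 1 channel = silhouette, 2 channels = skeleton map.
    return {1: "silhouette", 2: "skeleton-map"}.get(in_channels, "unknown")
```

With PyTorch installed, the input would be the loaded weights, for example the result of `torch.load(path, map_location="cpu")`, unwrapped from a `model` key if the checkpoint stores one (an assumption about the save layout). A result of `1` for `ScoNet-20000-better.pt` would be consistent with the `(64, 1, 3, 3)` shape reported here.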
+## What was checked against `f754f6f3831e9f83bb28f4e2f63dd43d8bcf9dc4`
+The upstream ScoNet training recipe itself is effectively unchanged:
+- `configs/sconet/sconet_scoliosis1k.yaml` is unchanged
+- `opengait/modeling/models/sconet.py` is unchanged
+- `opengait/main.py`, `opengait/modeling/base_model.py`, `opengait/data/dataset.py`, `opengait/data/collate_fn.py`, and `opengait/evaluation/evaluator.py` only differ in import cleanup and logging hooks
+So the current failure is not explained by a changed optimizer, scheduler, sampler, train loop, or evaluator.
+For the skeleton-map control, the only required functional drift from the upstream ScoNet config was:
+- use a heatmap dataset root instead of `Scoliosis1K-sil-pkl`
+- switch the partition to `Scoliosis1K_118.json`
+- set `model_cfg.backbone_cfg.in_channel: 2`
+- reduce test `batch_size` to match the local 2-GPU DDP evaluator constraint
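One way to confirm that a control config drifts from the upstream recipe only in these four places is to diff the two configs as nested dicts. This is a hedged sketch: `make_skeleton_control` and `changed_keys` are hypothetical helpers, and the configs are assumed to be already loaded into plain Python dicts with the key layout shown in the YAML files of this commit.

```python
import copy


def make_skeleton_control(base_cfg, heatmap_root, partition, test_batch_size):
    """Copy a ScoNet config and apply only the modality-specific overrides."""
    cfg = copy.deepcopy(base_cfg)
    cfg["data_cfg"]["dataset_root"] = heatmap_root
    cfg["data_cfg"]["dataset_partition"] = partition
    cfg["model_cfg"]["backbone_cfg"]["in_channel"] = 2  # skeleton maps are 2-channel
    cfg["evaluator_cfg"]["sampler"]["batch_size"] = test_batch_size
    return cfg


def changed_keys(old, new, prefix=""):
    """List dotted key paths whose values differ between two nested dicts."""
    keys = []
    for k in sorted(set(old) | set(new)):
        path = f"{prefix}{k}"
        if isinstance(old.get(k), dict) and isinstance(new.get(k), dict):
            keys += changed_keys(old[k], new[k], path + ".")
        elif old.get(k) != new.get(k):
            keys.append(path)
    return keys
```

Any path reported by `changed_keys` beyond the four overrides (a different `save_name` aside) would be unintended drift worth inspecting before blaming the preprocessing.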
+## Local reproduction findings
+The main findings so far are:
+- `ScoNet-20000-better.pt` on the `1:1:2` silhouette split reproduced cleanly at `95.05%` accuracy and `85.12%` macro-F1.
+- The `1:1:8` skeleton-map control trained with healthy optimization metrics but evaluated very poorly.
+- A recent `ScoNet-MT-ske`-style control on `Scoliosis1K_sigma_8.0/pkl` finished with `36.45%` accuracy and `32.78%` macro-F1.
+- That result is far below the paper's `1:1:8` ScoNet-MT range and far below the silhouette baseline behavior.
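For reading the accuracy and macro-F1 figures side by side, the sketch below shows how the two metrics are commonly computed for a 3-class task. It is illustrative only: it does not reproduce `evaluate_scoliosis`, whose exact averaging and zero-division handling may differ, and the integer class ids `0`, `1`, `2` are an assumption.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int = 3) -> float:
    """Unweighted mean of per-class F1; every class counts equally."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / num_classes
```

Because macro-F1 weights each class equally, a model that collapses toward one class is penalized on the classes it never predicts, which is why the two numbers are worth reporting together on an imbalanced split.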
+The current working conclusion is:
+- the core ScoNet trainer is not the problem
+- the strong silhouette checkpoint is not evidence that the skeleton-map path works
+- the main remaining suspect is the skeleton-map representation and preprocessing path
+For readability in this repo's docs, `ScoNet-MT-ske` refers to the skeleton-map variant that the DRF paper writes as `ScoNet-MT^{ske}`.
+## Architecture mapping
+`ScoNet` in this repo maps to the paper as follows:
 | Paper Component | Code Reference | Description |
 | :--- | :--- | :--- |
-| **Backbone** | `ResNet9` in `backbones/resnet.py` | A customized ResNet with 4 layers and configurable channels. |
-| **Temporal Aggregation** | `self.TP` (Temporal Pooling) | Uses `PackSequenceWrapper(torch.max)` to aggregate frame features. |
-| **Spatial Features** | `self.HPP` (Horizontal Pooling) | `HorizontalPoolingPyramid` with `bin_num: 16`. |
-| **Feature Mapping** | `self.FCs` (`SeparateFCs`) | Maps pooled features to a latent embedding space. |
-| **Classification Head** | `self.BNNecks` (`SeparateBNNecks`) | Produces logits for the 3-class screening task. |
-| **Label Mapping** | `sconet.py` lines 21-23 | `negative: 0`, `neutral: 1`, `positive: 2`. |
+| Backbone | `ResNet9` in `opengait/modeling/backbones/resnet.py` | Four residual stages with channels `[64, 128, 256, 512]`. |
+| Temporal aggregation | `PackSequenceWrapper(torch.max)` | Temporal max pooling over frames. |
+| Spatial pooling | `HorizontalPoolingPyramid` | 16-bin horizontal partition. |
+| Feature mapping | `SeparateFCs` | Maps pooled features into the embedding space. |
+| Classification head | `SeparateBNNecks` | Produces screening logits. |
+| Losses | `TripletLoss` + `CrossEntropyLoss` | This is why the repo implementation is functionally ScoNet-MT. |
----
-## Training guide (dataloader, optimizer, logging)
-### Dataloader Setup
-The training configuration is defined in `configs/sconet/sconet_scoliosis1k.yaml`:
-- **Sampler**: `TripletSampler` (standard for OpenGait).
-- **Batch Size**: `[8, 8]` (8 identities, 8 sequences per identity).
-- **Sequence Sampling**: `fixed_unordered` with `frames_num_fixed: 30`.
-- **Transform**: `BaseSilCuttingTransform` for silhouette preprocessing.
+## Training path summary
+The standard Scoliosis1K ScoNet recipe is:
+- sampler: `TripletSampler`
+- train batch layout: `8 x 8`
+- train sample type: `fixed_unordered`
+- train frames: `30`
+- transform: `BaseSilCuttingTransform`
+- optimizer: `SGD(lr=0.1, momentum=0.9, weight_decay=5e-4)`
+- scheduler: `MultiStepLR` with milestones `[10000, 14000, 18000]`
+- total iterations: `20000`
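The learning-rate schedule implied by this recipe can be written in closed form. The sketch below is a hypothetical helper (`lr_at` is not a repo function) that assumes the scheduler is stepped once per training iteration and uses `gamma: 0.1` from the config: the rate is the base rate times `gamma` raised to the number of milestones already reached.

```python
from bisect import bisect_right
from typing import Sequence


def lr_at(
    iteration: int,
    base_lr: float = 0.1,
    milestones: Sequence[int] = (10000, 14000, 18000),
    gamma: float = 0.1,
) -> float:
    """Learning rate in effect at a given iteration under a multi-step decay.

    bisect_right counts how many milestones are <= iteration, so the rate
    drops by a factor of gamma exactly when a milestone is reached.
    """
    return base_lr * gamma ** bisect_right(list(milestones), iteration)


if __name__ == "__main__":
    for it in (0, 9999, 10000, 14000, 18000, 19999):
        print(it, lr_at(it))
```

Under these assumptions the run spends its first 10000 iterations at `0.1`, then decays tenfold at each milestone and finishes at roughly `1e-4`, which is a quick sanity reference when comparing logged learning rates between the silhouette and skeleton-map runs.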
-### Optimizer and Scheduler
-- **Optimizer**: SGD
-  - `lr: 0.1`
-  - `momentum: 0.9`
-  - `weight_decay: 0.0005`
-- **Scheduler**: `MultiStepLR`
-  - `milestones: [10000, 14000, 18000]`
-  - `gamma: 0.1`
-- **Total Iterations**: 20,000.
+The skeleton-map control used the same recipe, except for the modality-specific changes listed above.
-### Logging
-- **TensorBoard**: OpenGait natively supports TensorBoard logging. Training losses (`triplet`, `softmax`) and accuracies are logged every `log_iter: 100`.
-- **WandB**: There is **no native Weights & Biases (WandB) integration** in the current codebase. Users wishing to use WandB must manually integrate it into `opengait/utils/msg_manager.py` or `opengait/main.py`.
-- **Evaluation**: Metrics (Accuracy, Precision, Recall, F1) are computed by `evaluate_scoliosis` in `opengait/evaluation/evaluator.py` and logged to the console/file.
----
-## Evidence References
-- **Model Implementation**: `opengait/modeling/models/sconet.py`
-- **Training Config**: `configs/sconet/sconet_scoliosis1k.yaml`
-- **Evaluation Logic**: `opengait/evaluation/evaluator.py::evaluate_scoliosis`
-- **Backbone Definition**: `opengait/modeling/backbones/resnet.py::ResNet9`
+## Recommended next checks
+1. Train a pure silhouette `1:1:8` baseline from the upstream ScoNet config as a clean sanity control.
+2. Treat skeleton-map preprocessing as the primary debugging target until a `ScoNet-MT-ske`-style run gets close to the paper.
+3. Only after the skeleton baseline is credible should DRF/PAV-specific conclusions be treated as decisive.