# Tutorial for [Scoliosis1K](https://zhouzi180.github.io/Scoliosis1K)

## Download the Scoliosis1K Dataset

You can download the dataset from the [official website](https://zhouzi180.github.io/Scoliosis1K).
The dataset is provided as four compressed files:

* `Scoliosis1K-sil-raw.zip`
* `Scoliosis1K-sil-pkl.zip`
* `Scoliosis1K-pose-raw.zip`
* `Scoliosis1K-pose-pkl.zip`

We recommend using the provided pickle (`.pkl`) files for convenience.
Decompress them with the following commands:

```bash
unzip -P <password> Scoliosis1K-sil-pkl.zip
unzip -P <password> Scoliosis1K-pose-pkl.zip
```

> **Note**: To obtain the `<password>`, sign the [release agreement](https://zhouzi180.github.io/Scoliosis1K/static/resources/Scoliosis1k_release_agreement.pdf) and send it to **[12331257@mail.sustech.edu.cn](mailto:12331257@mail.sustech.edu.cn)**.

### Dataset Structure

After decompression, you will get the following structure:

```text
├── Scoliosis1K-sil-pkl
│   ├── 00000                      # Identity
│   │   ├── Positive               # Class
│   │   │   ├── 000_180            # View
│   │   │   │   └── 000_180.pkl    # Estimated Silhouette (PP-HumanSeg v2)
│
├── Scoliosis1K-pose-pkl
│   ├── 00000                      # Identity
│   │   ├── Positive               # Class
│   │   │   ├── 000_180            # View
│   │   │   │   └── 000_180.pkl    # Estimated 2D Pose (ViTPose)
```

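To sanity-check a decompressed root, the sketch below walks the `<identity>/<class>/<view>/<file>.pkl` layout and loads one sequence. The helper names `iter_sequences` and `load_sequence` are hypothetical (they are not part of OpenGait), and the sketch assumes each view directory directly contains its `.pkl` file.

```python
import pickle
from pathlib import Path


def iter_sequences(dataset_root):
    """Yield (identity, label, view, pkl_path) for every sequence pickle.

    Assumes the <identity>/<class>/<view>/<file>.pkl layout.
    """
    root = Path(dataset_root)
    for pkl_path in sorted(root.glob("*/*/*/*.pkl")):
        identity, label, view = pkl_path.relative_to(root).parts[:3]
        yield identity, label, view, pkl_path


def load_sequence(pkl_path):
    """Unpickle one sequence. Only unpickle files from a source you trust."""
    with open(pkl_path, "rb") as handle:
        return pickle.load(handle)
```

Calling `next(iter_sequences("Scoliosis1K-sil-pkl"))` and passing the returned path to `load_sequence` is usually enough to confirm that the archive was extracted to the expected depth and to inspect the type and shape of the stored object.
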
### Processing from RAW Dataset (optional)

If you prefer, you can process the raw dataset into `.pkl` format.

```bash
# For silhouette raw data
python datasets/pretreatment.py --input_path=<path_to_raw_silhouettes> --output_path=<output_path>

# For pose raw data
python datasets/pretreatment.py --input_path=<path_to_raw_pose> --output_path=<output_path> --pose --dataset=OUMVLP
```

---

## Training and Testing

Before training or testing, set the `dataset_root` field in
`configs/sconet/sconet_scoliosis1k.yaml` to the path of your local dataset.

Then run the following commands:

```bash
# Training
CUDA_VISIBLE_DEVICES=0,1,2,3 \
python -m torch.distributed.launch --nproc_per_node=4 \
opengait/main.py --cfgs configs/sconet/sconet_scoliosis1k.yaml --phase train --log_to_file

# Testing
CUDA_VISIBLE_DEVICES=0,1,2,3 \
python -m torch.distributed.launch --nproc_per_node=4 \
opengait/main.py --cfgs configs/sconet/sconet_scoliosis1k.yaml --phase test --log_to_file
```

### Fixed-pool ratio comparison

If you want to compare the `1:1:2` and `1:1:8` (positive:neutral:negative) training ratios
without changing the evaluation pool, do not compare `Scoliosis1K_112.json` against
`Scoliosis1K_118.json` directly. Those two files differ substantially in train/test membership.

For a cleaner same-pool comparison, use:

* `datasets/Scoliosis1K/Scoliosis1K_118.json`
  * original `1:1:8` split
* `datasets/Scoliosis1K/Scoliosis1K_118_fixedpool_train112.json`
  * same `TEST_SET` as `118`
  * same positive/neutral `TRAIN_SET` ids as `118`
  * `TRAIN_SET` negatives downsampled to `148`, giving train counts
    `74 positive / 74 neutral / 148 negative`

The helper used to generate that derived partition is:

```bash
uv run python scripts/build_scoliosis_fixedpool_partition.py \
    --base-partition datasets/Scoliosis1K/Scoliosis1K_118.json \
    --dataset-root /mnt/public/data/Scoliosis1K/Scoliosis1K-sil-pkl \
    --negative-multiplier 2 \
    --output-path datasets/Scoliosis1K/Scoliosis1K_118_fixedpool_train112.json \
    --seed 118
```

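For readers who want to see what such a derivation involves, here is a minimal sketch of the downsampling idea. It is not the repository script: `build_fixedpool_partition` and the `labels` mapping are hypothetical, and the sketch assumes the negative target is `negative_multiplier` times the number of positive training ids (which matches `2 x 74 = 148` above).

```python
import random


def build_fixedpool_partition(base, labels, negative_multiplier=2, seed=118):
    """Return a copy of `base` whose TRAIN_SET negatives are downsampled.

    base:   {"TRAIN_SET": [ids...], "TEST_SET": [ids...]}
    labels: {id: "positive" | "neutral" | "negative"} (hypothetical input;
            the real script reads classes from the dataset root instead)
    """
    train = base["TRAIN_SET"]
    positives = [i for i in train if labels[i] == "positive"]
    neutrals = [i for i in train if labels[i] == "neutral"]
    negatives = [i for i in train if labels[i] == "negative"]

    # Assumed rule: keep `negative_multiplier` negatives per positive id.
    target = negative_multiplier * len(positives)
    if target > len(negatives):
        raise ValueError("not enough negatives to reach the requested ratio")

    # A seeded generator makes the sampled subset reproducible.
    kept_negatives = random.Random(seed).sample(negatives, target)
    kept = set(positives) | set(neutrals) | set(kept_negatives)

    return {
        "TRAIN_SET": [i for i in train if i in kept],  # original order preserved
        "TEST_SET": list(base["TEST_SET"]),            # evaluation pool unchanged
    }
```

Because `TEST_SET` is copied unchanged and only negatives are removed from `TRAIN_SET`, any metric difference between the two partitions can be attributed to the training ratio rather than to a different evaluation pool.
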
### Modality sanity check

The silhouette and skeleton-map pipelines are different experiments and should not be mixed when you interpret results.

* `Scoliosis1K-sil-pkl` is the silhouette modality used by the standard ScoNet configs.
* pose-derived heatmap roots such as `Scoliosis1K_sigma_8.0/pkl` or DRF exports are skeleton-map inputs and require `in_channel: 2`.
* DRF does **not** use the silhouette stream as an input. It uses `0_heatmap.pkl` plus `1_pav.pkl`.

Naming note:

* in this repo, the local `ScoNet` training config and model class are usually the paper's `ScoNet-MT`, not the CE-only paper `ScoNet`
* in these docs, `ScoNet-MT-ske` means the skeleton-map variant of that same model path
* checkpoint filenames like `ScoNet-20000-better.pt` do not identify the modality by name alone

A strong silhouette checkpoint does not validate the skeleton-map path. In particular, `ckpt/ScoNet-20000-better.pt` is a silhouette checkpoint:

* its first convolution expects 1-channel input
* the matching eval config points to `Scoliosis1K-sil-pkl`

So if you are debugging DRF or `ScoNet-MT-ske` reproduction, do not use `ScoNet-20000-better.pt` as evidence that the heatmap preprocessing is correct.

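Since the filename does not reveal the modality, one way to check is to read the input-channel count from the checkpoint weights. The sketch below is a hypothetical helper (`first_conv_in_channels` is not an OpenGait function). It relies on the fact that convolution weights are stored as `(out_channels, in_channels, ...)`, and it assumes that the first convolution weight in the state dict belongs to the input layer, which is usual but not guaranteed.

```python
def first_conv_in_channels(state_dict):
    """Return (name, in_channels) for the first convolution-like weight.

    Works on any mapping whose values expose a `.shape`, such as a PyTorch
    state dict. Conv weights have at least 4 dimensions, with the input
    channel count in position 1.
    """
    for name, tensor in state_dict.items():
        shape = tuple(getattr(tensor, "shape", ()))
        if len(shape) >= 4:
            return name, shape[1]
    raise ValueError("no convolution weight found in state dict")
```

With PyTorch installed, load the file with `torch.load(path, map_location="cpu")` and pass the weight mapping to the helper. If the checkpoint nests the weights under a key such as `model` (an assumption about the file layout), pass that nested mapping instead. A result of `1` indicates a silhouette checkpoint, while skeleton-map models should report `2`.
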
### Overlay caveat

Do not treat a direct overlay between `Scoliosis1K-sil-pkl` and pose-derived skeleton maps as a valid alignment test.

Reason:

* the released silhouette modality is an estimated segmentation output from `PP-HumanSeg v2`
* the released pose modality is an estimated keypoint output from `ViTPose`
* the two modalities are normalized by different preprocessing pipelines before they reach OpenGait

So a silhouette-vs-skeleton mismatch in a debug figure is usually a cross-modality frame-of-reference issue, not proof that the raw dataset is bad. The more important check for skeleton-map debugging is whether the **limb and joint channels align with each other** inside `0_heatmap.pkl`.

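One rough way to probe that limb/joint alignment is to compare the intensity-weighted centroids of the two channels frame by frame. This is only a sketch: the helper names are hypothetical, and it assumes the sequence is indexed as `[frame][channel][row][col]`. Limb and joint maps distribute their mass differently, so small offsets are expected even when the channels are registered; what matters is a large or erratic offset.

```python
def centroid(frame):
    """Intensity-weighted (row, col) centroid of one 2-D map, or None if empty."""
    total = row_sum = col_sum = 0.0
    for r, row in enumerate(frame):
        for c, value in enumerate(row):
            v = float(value)
            total += v
            row_sum += r * v
            col_sum += c * v
    if total <= 0:
        return None
    return row_sum / total, col_sum / total


def channel_offsets(sequence):
    """Per-frame (d_row, d_col) from the channel-0 to the channel-1 centroid.

    Assumes `sequence` is indexed as [frame][channel][row][col]; frames in
    which either channel is empty yield None.
    """
    offsets = []
    for frame in sequence:
        first, second = centroid(frame[0]), centroid(frame[1])
        if first is None or second is None:
            offsets.append(None)
        else:
            offsets.append((second[0] - first[0], second[1] - first[1]))
    return offsets
```

The functions iterate with plain Python loops, so they accept nested lists as well as array objects that support row-wise iteration.
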
---

## Pose-to-Heatmap Conversion

*From our paper: **Pose as Clinical Prior: Learning Dual Representations for Scoliosis Screening (MICCAI 2025)***

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 \
python -m torch.distributed.launch --nproc_per_node=4 \
datasets/pretreatment_heatmap.py \
    --pose_data_path=<path_to_pose_pkl> \
    --save_root=<output_path> \
    --dataset_name=OUMVLP
```

## DRF Preprocessing

For the DRF model, OpenGait expects a combined runtime dataset with:

* `0_heatmap.pkl`: the two-channel skeleton map sequence
* `1_pav.pkl`: the paper-style Postural Asymmetry Vector (PAV), repeated along the sequence axis so it matches OpenGait's multi-input loader contract

The PAV pass follows the procedure described in the paper:

1. convert pose to COCO17 if needed
2. pad missing joints
3. pelvis-center and height-normalize the sequence
4. compute vertical, midline, and angular deviations for the 8 symmetric joint pairs
5. apply IQR filtering per metric
6. average over time
7. min-max normalize across the full dataset (paper default), or across `TRAIN_SET` only when `--stats_partition` is provided (an anti-leakage variant)

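Steps 5 to 7 can be illustrated with a small standard-library sketch. It is not the repository implementation: the function names are hypothetical, the `1.5 x IQR` fence is an assumed default, and no clipping is applied to values that fall outside the fitted range.

```python
from statistics import mean, quantiles


def iqr_filter(values, k=1.5):
    """Step 5: keep values inside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is assumed)."""
    if len(values) < 4:
        return list(values)
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    kept = [v for v in values if low <= v <= high]
    return kept or list(values)


def temporal_average(per_frame_metric):
    """Steps 5-6: filter one metric's per-frame values, then average over time."""
    return mean(iqr_filter(per_frame_metric))


def fit_min_max(reference_column):
    """Step 7a: take min/max from the reference pool (full dataset or TRAIN_SET)."""
    return min(reference_column), max(reference_column)


def apply_min_max(value, low, high):
    """Step 7b: scale a value with previously fitted statistics."""
    if high == low:
        return 0.0
    return (value - low) / (high - low)
```

Splitting the normalization into a fit and an apply step is what makes the anti-leakage variant possible: the statistics come from `TRAIN_SET` sequences only, and the same `low` and `high` are then applied to every sequence, including the test ones.
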
Run the preprocessing script:

```bash
uv run python datasets/pretreatment_scoliosis_drf.py \
    --pose_data_path=<path_to_pose_pkl> \
    --output_path=<path_to_drf_pkl>
```

The script uses `configs/drf/pretreatment_heatmap_drf.yaml` by default.
That config keeps the upstream OpenGait/SkeletonGait heatmap behavior from
commit `f754f6f3831e9f83bb28f4e2f63dd43d8bcf9dc4` for the skeleton-map
branch while still building the DRF-specific two-channel output.

If you explicitly want the more paper-literal summed heatmap ablation, add:

```bash
--heatmap_reduction=sum
```

If you explicitly want train-only PAV min-max statistics, add:

```bash
--stats_partition=./datasets/Scoliosis1K/Scoliosis1K_118.json
```

### Heatmap debugging notes

Current confirmed findings from local debugging:

* the raw pose dataset itself looks healthy; poor `ScoNet-MT-ske` results are not explained by obvious missing-joint collapse
* a larger heatmap sigma can materially blur away the articulated structure; `sigma=8` was much broader than the silhouette geometry, while smaller sigma values recovered more structure
* an earlier bug aligned the limb and joint channels separately; that made the two channels of `0_heatmap.pkl` slightly misregistered
* the heatmap path is now patched so limb and joint channels share one alignment crop
* the heatmap aligner now also supports `align_args.scope: sequence`, which applies one shared crop box to the whole sequence instead of recomputing it frame by frame
* the heatmap config can also rebalance the two channels after alignment with `channel_gain_limb` / `channel_gain_joint`; this keeps the crop geometry fixed while changing limb-vs-joint strength

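To get a feel for why a large sigma blurs the skeleton, it helps to convert sigma into the full width at half maximum (FWHM) of a Gaussian blob, which is `2 * sqrt(2 * ln 2) * sigma`. The sketch below assumes sigma is expressed in pixels of the final map; if the heatmap is rendered on a larger canvas and then resized, the widths scale accordingly.

```python
import math


def gaussian_fwhm(sigma):
    """Full width at half maximum of a Gaussian: 2 * sqrt(2 * ln 2) * sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma


if __name__ == "__main__":
    # Assumption: sigma is measured in pixels of a 64-pixel-wide map.
    for sigma in (2.0, 4.0, 8.0):
        width = gaussian_fwhm(sigma)
        print(f"sigma={sigma}: FWHM={width:.1f} px ({width / 64:.0%} of 64 px)")
```

Under that assumption, `sigma=8` gives each joint a blob roughly 18.8 pixels wide at half height, close to 30% of a 64-pixel map, which is consistent with the observation that the articulated structure washes out.
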
Remaining caution:

* the exported skeleton map is stored as `64x64`
* if the runtime config uses `BaseSilCuttingTransform`, the network actually sees `64x44`
* that symmetric left/right crop is not automatically wrong, but it is still a meaningful ablation point for skeleton-map experiments

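To judge how much that crop matters for a given export, one can measure the share of heatmap intensity that falls in the removed margins. The sketch below uses hypothetical helper names and assumes the crop removes equal margins on both sides, which for `64 -> 44` means 10 columns per side. It mimics only the geometry of the crop, not any other processing the transform may apply.

```python
def symmetric_width_crop(frame, out_width=44):
    """Remove equal left/right margins from one 2-D frame (a list of rows)."""
    width = len(frame[0])
    if width < out_width or (width - out_width) % 2:
        raise ValueError("width difference must be a non-negative even number")
    margin = (width - out_width) // 2
    return [row[margin:width - margin] for row in frame]


def cropped_mass_fraction(frame, out_width=44):
    """Fraction of total intensity that lies in the cropped-away margins."""
    total = sum(sum(row) for row in frame)
    if total == 0:
        return 0.0
    kept = sum(sum(row) for row in symmetric_width_crop(frame, out_width))
    return 1.0 - kept / total
```

A fraction near zero across a sequence suggests that the crop removes only background. A noticeably larger value means that arms or legs reach into the margins, so the crop is worth including in the ablation.
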
The output layout of `datasets/pretreatment_scoliosis_drf.py` is:

```text
<path_to_drf_pkl>/
├── pav_stats.pkl
├── 00000/
│   ├── Positive/
│   │   ├── 000_180/
│   │   │   ├── 0_heatmap.pkl
│   │   │   └── 1_pav.pkl
```

Set `data_cfg.dataset_root` in `configs/drf/drf_scoliosis1k.yaml` to this output directory before training or testing.

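Before launching a run, it can be worth confirming that every view directory contains both inputs. The following sketch uses a hypothetical helper, `find_incomplete_views`, and assumes the `<identity>/<class>/<view>/` layout of the DRF export.

```python
from pathlib import Path

REQUIRED_FILES = ("0_heatmap.pkl", "1_pav.pkl")


def find_incomplete_views(drf_root):
    """Return {view_dir: [missing file names]} for the <id>/<class>/<view>/ layout."""
    incomplete = {}
    for view_dir in sorted(Path(drf_root).glob("*/*/*")):
        if not view_dir.is_dir():
            continue
        missing = [name for name in REQUIRED_FILES if not (view_dir / name).is_file()]
        if missing:
            incomplete[str(view_dir)] = missing
    return incomplete
```

An empty result means every view has both files. Top-level files such as `pav_stats.pkl` are ignored because the glob pattern only matches entries three levels below the root.
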
## DRF Training and Testing

```bash
# Training
CUDA_VISIBLE_DEVICES=0,1,2,3 \
uv run python -m torch.distributed.launch --nproc_per_node=4 \
opengait/main.py --cfgs configs/drf/drf_scoliosis1k.yaml --phase train

# Testing
CUDA_VISIBLE_DEVICES=0,1,2,3 \
uv run python -m torch.distributed.launch --nproc_per_node=4 \
opengait/main.py --cfgs configs/drf/drf_scoliosis1k.yaml --phase test
```