# DRF Author Checkpoint Compatibility Note

This note records what happened when evaluating the author-provided DRF bundle in this repo:

- checkpoint: `artifact/scoliosis_drf_author_118_compat/DRF_118_unordered_iter2w_lr0.001_8830-08000.pt`
- config: `ckpt/drf_author/drf_scoliosis1k_20000.yaml`

The short version:
- the weight file is real and structurally usable
- the provided YAML is not a reliable source of truth
- the main problem was a mismatch in the integration contract (split, class order, module names, preprocessing), not a broken checkpoint

## What Was Wrong

The author bundle was internally inconsistent in several ways.

### 1. Split mismatch

The DRF paper says the main experiment uses `1:1:8`, i.e. the `118` split.

But the provided YAML pointed to:
- `./datasets/Scoliosis1K/Scoliosis1K_112.json`

while the checkpoint filename itself says:
- `DRF_118_...`

So the bundle contradicts itself: the YAML points at the `112` partition, while the paper and the checkpoint filename both indicate `118`.
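This kind of disagreement is cheap to catch before running an eval. The sketch below compares the split tag embedded in the checkpoint filename with the one in the partition path. The helper names `split_tag` and `check_split_consistency` are hypothetical and not part of the repo, and the regex assumes the tag is a three-digit group delimited by `_` on the left and `_` or `.` on the right, which holds for the two names quoted above but is not a general naming rule.

```python
import re

# Values copied from this note.
CHECKPOINT = "DRF_118_unordered_iter2w_lr0.001_8830-08000.pt"
YAML_PARTITION = "./datasets/Scoliosis1K/Scoliosis1K_112.json"


def split_tag(name):
    """Return the first '_NNN_' or '_NNN.' three-digit tag in a name, or None."""
    match = re.search(r"_(\d{3})[_.]", name)
    return match.group(1) if match else None


def check_split_consistency(checkpoint, partition):
    """Return (checkpoint_tag, partition_tag, tags_agree)."""
    ckpt_tag = split_tag(checkpoint)
    part_tag = split_tag(partition)
    return ckpt_tag, part_tag, ckpt_tag == part_tag


print(check_split_consistency(CHECKPOINT, YAML_PARTITION))
# -> ('118', '112', False)
```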

### 2. Class-order mismatch

The biggest hidden bug was class ordering.

The current repo evaluator assumes:
- `negative = 0`
- `neutral = 1`
- `positive = 2`

But the author stub in `research/drf.py` uses:
- `negative = 0`
- `positive = 1`
- `neutral = 2`

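Indices 1 and 2 are swapped between the two conventions, so every `positive` prediction is scored as `neutral` and vice versa. That means an otherwise good checkpoint can look very bad if its logits are interpreted in the wrong class order.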
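A minimal sketch of the reordering, in plain Python rather than tensors. The two label orders are the ones listed above. The function names and the example logits are illustrative assumptions, and the in-tree implementation in `opengait/modeling/models/drf.py` may be structured differently.

```python
EVALUATOR_ORDER = ["negative", "neutral", "positive"]  # repo evaluator
AUTHOR_ORDER = ["negative", "positive", "neutral"]     # author stub


def canonicalize(logits, source_order, target_order=EVALUATOR_ORDER):
    """Reorder one row of logits from source_order into target_order."""
    if sorted(source_order) != sorted(target_order):
        raise ValueError("label orders must contain the same class names")
    return [logits[source_order.index(name)] for name in target_order]


def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


# Made-up author-order logits that confidently predict "positive" (index 1).
author_logits = [0.1, 2.5, -0.3]

# Read as-is with the evaluator's order, index 1 is taken to mean "neutral".
naive = EVALUATOR_ORDER[argmax(author_logits)]

# After reordering, the same logits decode to "positive".
fixed = EVALUATOR_ORDER[argmax(canonicalize(author_logits, AUTHOR_ORDER))]

print(naive, fixed)  # -> neutral positive
```

Only the `neutral` and `positive` positions move, so `negative` predictions are unaffected by the bug. A broken eval can therefore still score well above zero.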

### 3. Legacy module-name mismatch

The author checkpoint stores PGA weights under:
- `attention_layer.*`

The current repo uses:
- `PGA.*`

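This is a small compatibility issue, but the legacy keys must be remapped before the checkpoint is loaded; otherwise the PGA weights do not match any parameter name in the current model.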
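The remap amounts to a prefix rename over the checkpoint's state dict. The sketch below uses a plain dict with string placeholders instead of tensors. The function name `remap_legacy_keys` and the toy key suffixes (`fc.weight`, `backbone.conv.weight`) are hypothetical. It also assumes the legacy prefix sits at the start of each key, which may not hold if the checkpoint nests the module under another prefix.

```python
from collections import OrderedDict

LEGACY_PREFIX = "attention_layer."
CURRENT_PREFIX = "PGA."


def remap_legacy_keys(state_dict):
    """Return a copy of state_dict with 'attention_layer.*' renamed to 'PGA.*'."""
    remapped = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(LEGACY_PREFIX):
            key = CURRENT_PREFIX + key[len(LEGACY_PREFIX):]
        if key in remapped:
            # Refuse to silently overwrite a weight that already uses the new name.
            raise KeyError(f"duplicate key after remap: {key}")
        remapped[key] = value
    return remapped


# Toy stand-in for a loaded checkpoint; the values would be tensors in practice.
legacy = OrderedDict([
    ("attention_layer.fc.weight", "w0"),
    ("attention_layer.fc.bias", "b0"),
    ("backbone.conv.weight", "w1"),
])

print(list(remap_legacy_keys(legacy)))
# -> ['PGA.fc.weight', 'PGA.fc.bias', 'backbone.conv.weight']
```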

### 4. Preprocessing/runtime-contract mismatch

The author checkpoint does not line up with the stale YAML’s full runtime contract.

Most importantly, it did **not** work well with the more paper-literal local export:
- `Scoliosis1K-drf-pkl-118-paper`

It worked much better with the more OpenGait-like aligned export:
- `Scoliosis1K-drf-pkl-118-aligned`

That strongly suggests the checkpoint was trained against a preprocessing/runtime path closer to the aligned OpenGait integration than to the later local “paper-literal” summed-heatmap ablation.

## What Was Added In-Tree

The current repo now has a small compatibility layer in:
- `opengait/modeling/models/drf.py`

It does two things:
- remaps legacy keys `attention_layer.* -> PGA.*`
- supports configurable `model_cfg.label_order`

The model also canonicalizes inference logits back into the repo’s evaluator order, so author checkpoints can be evaluated without modifying the evaluator itself.

## Tested Compatibility Results

### Best usable author-checkpoint path

Config:
- `configs/drf/drf_author_eval_118_aligned_1gpu.yaml`

Dataset/runtime:
- dataset root: `Scoliosis1K-drf-pkl-118-aligned`
- partition: `Scoliosis1K_118.json`
- transform: `BaseSilCuttingTransform`
- label order:
  - `negative`
  - `positive`
  - `neutral`

Result:
- `80.24 Acc / 76.73 Prec / 76.40 Rec / 76.56 F1`

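This is the strongest recovered path that uses a single dataset root. The Hybrid 2 ablation reported later in this note scores marginally higher (`80.37 Acc / 76.80 F1`), but it mixes two roots.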

### Verified provenance of `Scoliosis1K-drf-pkl-118-aligned`

The `118-aligned` root is no longer just an informed guess. It was verified
directly against the raw pose source:
- `/mnt/public/data/Scoliosis1K/Scoliosis1K-pose-pkl`

The matching preprocessing path is:
- `datasets/pretreatment_scoliosis_drf.py`
- default heatmap config:
  - `configs/drf/pretreatment_heatmap_drf.yaml`
- archived equivalent config:
  - `configs/drf/pretreatment_heatmap_drf_118_aligned.yaml`

That means the aligned root was produced with:
- shared `sigma: 8.0`
- `align: True`
- `final_img_size: 64`
- default `heatmap_reduction=upstream`
- no `--stats_partition`, i.e. dataset-level PAV min-max stats

Equivalent command:

```bash
uv run python datasets/pretreatment_scoliosis_drf.py \
  --pose_data_path /mnt/public/data/Scoliosis1K/Scoliosis1K-pose-pkl \
  --output_path /mnt/public/data/Scoliosis1K/Scoliosis1K-drf-pkl-118-aligned
```

Verification evidence:
- a regenerated `0_heatmap.pkl` sample from the raw pose input matched the stored
  `Scoliosis1K-drf-pkl-118-aligned` sample exactly (`array_equal == True`)
- a full recomputation of `pav_stats.pkl` from the raw pose input matched the
  stored `pav_min`, `pav_max`, and `stats_partition=None` exactly

So `118-aligned` is the old default OpenGait-style DRF export, not the later:
- `118-paper` paper-literal summed-heatmap export
- `118` train-only-stats splitroot export
- `sigma15` / `sigma15_joint8` exports

### Targeted preprocessing ablations around the recovered path

After verifying the aligned root provenance, a few focused runtime/data ablations
were tested against the author checkpoint to see which part of the contract still
mattered most.

Baseline:
- `118-aligned`
- `BaseSilCuttingTransform`
- result:
  - `80.24 Acc / 76.73 Prec / 76.40 Rec / 76.56 F1`

Hybrid 1:
- aligned heatmap + splitroot PAV
- result:
  - `77.30 Acc / 73.70 Prec / 73.04 Rec / 73.28 F1`

Hybrid 2:
- splitroot heatmap + aligned PAV
- result:
  - `80.37 Acc / 77.16 Prec / 76.48 Rec / 76.80 F1`

Runtime ablation:
- `118-aligned` + `BaseSilTransform` (`no-cut`)
- result:
  - `49.93 Acc / 50.49 Prec / 51.58 Rec / 47.75 F1`

What these ablations suggest:
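- `BaseSilCuttingTransform` is necessary; `no-cut` drops accuracy from 80.24 to 49.93 and breaks the checkpoint badly
- dataset-level PAV stats (`stats_partition=None`) matter more than the exact
  aligned-vs-splitroot heatmap writer: swapping in splitroot PAV stats costs about
  3 points of accuracy (Hybrid 1), while swapping in the splitroot heatmap leaves
  the result essentially unchanged (Hybrid 2)
- the heatmap export is still part of the contract, but it is no longer the
  dominant remaining mismatch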
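To see why the PAV stats source matters, consider how the same raw PAV vector is scaled under dataset-level versus train-only statistics. The sketch below assumes plain per-feature min-max scaling, `(x - min) / (max - min)`. That formula, the function names, and all numbers are illustrative assumptions and are not taken from `datasets/pretreatment_scoliosis_drf.py`.

```python
def fit_min_max(vectors):
    """Per-feature (min, max) over a list of equal-length vectors."""
    columns = list(zip(*vectors))
    return [min(c) for c in columns], [max(c) for c in columns]


def min_max_normalize(vector, lo, hi, eps=1e-8):
    """Scale each feature to [0, 1] using the supplied per-feature stats."""
    return [(x - l) / max(h - l, eps) for x, l, h in zip(vector, lo, hi)]


# Toy 2-feature PAV vectors (made-up values).
train_pool = [[0.25, 1.0], [0.5, 3.0]]
test_pool = [[0.0, 5.0], [1.0, 1.0]]
sample = [0.375, 2.0]

# Dataset-level stats (the stats_partition=None case): fit on everything.
dataset_lo, dataset_hi = fit_min_max(train_pool + test_pool)
# Train-only stats (the splitroot case): fit on the training pool alone.
train_lo, train_hi = fit_min_max(train_pool)

print(min_max_normalize(sample, dataset_lo, dataset_hi))  # -> [0.375, 0.25]
print(min_max_normalize(sample, train_lo, train_hi))      # -> [0.5, 0.5]
```

The raw vector is identical in both calls, but the model sees different inputs. A checkpoint trained against one set of stats is effectively evaluated on shifted features under the other.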

### Other tested paths

`configs/drf/drf_author_eval_118_splitroot_1gpu.yaml`
- dataset root: `Scoliosis1K-drf-pkl-118`
- result:
  - `77.17 Acc / 73.61 Prec / 72.59 Rec / 72.98 F1`

`configs/drf/drf_author_eval_112_1gpu.yaml`
- dataset root: `Scoliosis1K-drf-pkl`
- partition: `Scoliosis1K_112.json`
- result:
  - `85.19 Acc / 57.98 Prec / 56.65 Rec / 57.30 F1`

`configs/drf/drf_author_eval_118_paper_1gpu.yaml`
- dataset root: `Scoliosis1K-drf-pkl-118-paper`
- transform: `BaseSilTransform`
- result:
  - `27.24 Acc / 9.08 Prec / 33.33 Rec / 14.27 F1`
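The `118-paper` numbers have the signature of a collapsed classifier: `33.33` macro recall is exactly what results when every sample is assigned to one class. The sketch below checks that reading on a synthetic test set. It assumes macro-averaged metrics (the mean of per-class values, with precision counted as 0 for classes that are never predicted) and a predicted class that makes up 27.24% of the samples. The class sizes and the choice of class index are made up, so this shows the reported row is consistent with a single-class collapse, not which class was predicted.

```python
def macro_metrics(y_true, y_pred, num_classes=3):
    """Return (accuracy, macro precision, macro recall, macro F1) as fractions."""
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for c in range(num_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0  # never-predicted class -> 0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    acc = sum(1 for t, p in pairs if t == p) / len(pairs)
    n = num_classes
    return acc, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Synthetic labels: class 0 holds 27.24% of 10,000 samples (sizes are made up).
y_true = [0] * 2724 + [1] * 3638 + [2] * 3638
y_pred = [0] * len(y_true)  # the model always answers class 0

print([round(100 * m, 2) for m in macro_metrics(y_true, y_pred)])
# -> [27.24, 9.08, 33.33, 14.27]
```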

## Interpretation

What these results mean:

- the checkpoint is not garbage
- the original “very bad” local eval was mostly a compatibility failure
- the largest single hidden bug was the class-order mismatch
- the author checkpoint is also sensitive to which local DRF dataset root is used
- the recovered runtime is now good enough to make the checkpoint believable, but
  preprocessing alone did not recover the paper DRF headline row

What they do **not** mean:

- we have perfectly reconstructed the author’s original training path
- the provided YAML is trustworthy as-is
- the paper’s full DRF claim is fully reproduced here

One practical caveat on `1:1:2` vs `1:1:8` comparisons in this repo:
- local `Scoliosis1K_112.json` and `Scoliosis1K_118.json` are not the same train/test
  split with only a different class ratio
- they differ substantially in membership
- so local `112` vs `118` results should not be overinterpreted as a pure
  class-balance ablation unless the train/test pool is explicitly held fixed

To support a clean same-pool comparison, the repo now also includes:
- `datasets/Scoliosis1K/Scoliosis1K_118_fixedpool_train112.json`

That partition keeps the full `118` `TEST_SET` unchanged and keeps the same
positive/neutral `TRAIN_SET` ids as `118`, but downsamples `TRAIN_SET` negatives
to `148` so the train ratio becomes `74 / 74 / 148` (`1:1:2`).
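A fixed-pool partition of this kind can be derived along the following lines. This is a sketch only: the `label_of` lookup, the id strings, the seed, and the toy `118` class counts (`74 / 74 / 592`, implied by the `1:1:8` ratio rather than read from the real file) are assumptions. It does not claim to reproduce how `Scoliosis1K_118_fixedpool_train112.json` was actually generated.

```python
import random


def downsample_negatives(partition, label_of, n_negative=148, seed=0):
    """Return a new partition with TRAIN_SET negatives sampled down to n_negative.

    TEST_SET and all non-negative TRAIN_SET ids are kept unchanged.
    """
    train_ids = partition["TRAIN_SET"]
    negatives = sorted(i for i in train_ids if label_of[i] == "negative")
    keep = set(random.Random(seed).sample(negatives, n_negative))
    new_train = [i for i in train_ids if label_of[i] != "negative" or i in keep]
    return {"TRAIN_SET": new_train, "TEST_SET": list(partition["TEST_SET"])}


# Toy partition with hypothetical ids and a 74 / 74 / 592 train pool.
label_of = {}
for cls, count in (("positive", 74), ("neutral", 74), ("negative", 592)):
    for k in range(count):
        label_of[f"{cls}_{k:04d}"] = cls
toy_118 = {"TRAIN_SET": sorted(label_of), "TEST_SET": ["test_0000", "test_0001"]}

fixed_pool = downsample_negatives(toy_118, label_of)
counts = {c: sum(1 for i in fixed_pool["TRAIN_SET"] if label_of[i] == c)
          for c in ("positive", "neutral", "negative")}
print(counts)  # -> {'positive': 74, 'neutral': 74, 'negative': 148}
```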

The strongest recovered result on the recommended path:
- `80.24 Acc / 76.73 Prec / 76.40 Rec / 76.56 F1`

This is close to the paper’s reported `ScoNet-MT^ske` F1 and much better than our earlier broken compat evals, but it is still below the paper’s DRF headline result:
- paper DRF: `86.0 Acc / 84.1 Prec / 79.2 Rec / 80.8 F1`

## Practical Recommendation

If someone wants to use the author checkpoint in this repo today, the recommended path is:

1. use `configs/drf/drf_author_eval_118_aligned_1gpu.yaml`
2. keep the author label order:
   - `negative, positive, neutral`
3. keep the legacy `attention_layer -> PGA` remap in the model
4. do **not** assume the stale `112` YAML is the correct training/eval contract

If someone wants to push this further, the highest-value next step is:
- finetune from the author checkpoint on the aligned `118` path instead of starting DRF from scratch

## How To Run

Recommended eval:

```bash
CUDA_VISIBLE_DEVICES=GPU-9cc7b26e-90d4-0c49-4d4c-060e528ffba6 \
uv run torchrun --nproc_per_node=1 --master_port=29693 \
  opengait/main.py \
  --cfgs ./configs/drf/drf_author_eval_118_aligned_1gpu.yaml \
  --phase test
```

Other compatibility checks:

```bash
CUDA_VISIBLE_DEVICES=GPU-9cc7b26e-90d4-0c49-4d4c-060e528ffba6 \
uv run torchrun --nproc_per_node=1 --master_port=29695 \
  opengait/main.py \
  --cfgs ./configs/drf/drf_author_eval_112_1gpu.yaml \
  --phase test

CUDA_VISIBLE_DEVICES=GPU-9cc7b26e-90d4-0c49-4d4c-060e528ffba6 \
uv run torchrun --nproc_per_node=1 --master_port=29696 \
  opengait/main.py \
  --cfgs ./configs/drf/drf_author_eval_118_splitroot_1gpu.yaml \
  --phase test

CUDA_VISIBLE_DEVICES=GPU-9cc7b26e-90d4-0c49-4d4c-060e528ffba6 \
uv run torchrun --nproc_per_node=1 --master_port=29697 \
  opengait/main.py \
  --cfgs ./configs/drf/drf_author_eval_118_paper_1gpu.yaml \
  --phase test
```

If someone wants to reproduce this on another machine, replace the machine-specific GPU UUID in `CUDA_VISIBLE_DEVICES` and update the usual config paths:
- `data_cfg.dataset_root`
- `data_cfg.dataset_partition`
- `evaluator_cfg.restore_hint`
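These overrides can also be applied programmatically to a loaded config instead of editing the YAML by hand. The sketch below works on a nested dict. The `override` helper and every `/path/to/...` value are hypothetical placeholders, and loading or saving the YAML itself is left out.

```python
import copy


def override(cfg, dotted_key, value):
    """Return a copy of cfg with the nested entry named by 'a.b.c' set to value."""
    out = copy.deepcopy(cfg)
    node = out
    *parents, leaf = dotted_key.split(".")
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value
    return out


# Stand-in for a parsed eval config; only the three keys named above are shown.
cfg = {
    "data_cfg": {"dataset_root": "OLD_ROOT", "dataset_partition": "OLD_PARTITION"},
    "evaluator_cfg": {"restore_hint": "OLD_CHECKPOINT"},
}

# Placeholder paths; substitute the locations on the target machine.
local = override(cfg, "data_cfg.dataset_root", "/path/to/Scoliosis1K-drf-pkl-118-aligned")
local = override(local, "data_cfg.dataset_partition", "/path/to/Scoliosis1K_118.json")
local = override(local, "evaluator_cfg.restore_hint", "/path/to/author_checkpoint.pt")

print(local["data_cfg"]["dataset_root"])
```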

The archived artifact bundle is:
- `artifact/scoliosis_drf_author_118_compat`