DATASET PREP KNOWLEDGE BASE
OVERVIEW
datasets/ is a script-heavy preprocessing workspace. It transforms raw benchmarks into OpenGait’s required pickle layout and partition metadata.
STRUCTURE
datasets/
├── pretreatment.py # generic image->pkl pipeline (and pose mode)
├── pretreatment_heatmap.py # heatmap generation for skeleton workflows
├── <DatasetName>/README.md # dataset-specific acquisition + conversion steps
├── <DatasetName>/*.json # train/test partition files
└── <DatasetName>/*.py # extract/rearrange/convert scripts
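The scripts above all converge on the id/type/view/*.pkl layout. A minimal roundtrip sketch, assuming one pickle per sequence named `<view>.pkl` (verify the filename convention against your dataset's README); a plain nested list stands in here for the stacked silhouette array the real scripts produce:

```python
import os
import pickle
import tempfile

def save_sequence(root, subject_id, seq_type, view, frames):
    """Write one sequence into the id/type/view/<view>.pkl layout
    that the runtime expects. `frames` stands in for the real
    silhouette array."""
    seq_dir = os.path.join(root, subject_id, seq_type, view)
    os.makedirs(seq_dir, exist_ok=True)
    pkl_path = os.path.join(seq_dir, f"{view}.pkl")
    with open(pkl_path, "wb") as f:
        pickle.dump(frames, f)
    return pkl_path

root = tempfile.mkdtemp()
frames = [[[0] * 4] * 4 for _ in range(3)]  # three toy 4x4 "silhouettes"
path = save_sequence(root, "001", "nm-01", "090", frames)

with open(path, "rb") as f:
    loaded = pickle.load(f)
print(path.replace(root, "<root>"))  # <root>/001/nm-01/090/090.pkl
```

The subject/type/view names here are illustrative, not from any specific benchmark.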
WHERE TO LOOK
| Task | Location | Notes |
|---|---|---|
| Generic preprocessing | pretreatment.py | handles multiple datasets, pose switch |
| OUMVLP pose index flow | OUMVLP/README.md, OUMVLP/pose_index_extractor.py | required for temporal consistency |
| Heatmap + skeleton prep | pretreatment_heatmap.py, ln_sil_heatmap.py, configs/skeletongait/README.md | multi-step pipeline |
| Dataset splits | <Dataset>/<Dataset>.json | consumed by runtime data_cfg.dataset_partition |
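The partition files in the last row are small JSONs mapping subject IDs to splits. A hedged sketch of writing and consuming one, assuming the TRAIN_SET/TEST_SET key names read via data_cfg.dataset_partition (confirm against an actual <Dataset>.json); the IDs and filename are made up:

```python
import json
import os
import tempfile

# Hypothetical partition content; real files list the benchmark's subject IDs.
partition = {
    "TRAIN_SET": ["001", "002", "003"],
    "TEST_SET": ["004", "005"],
}

path = os.path.join(tempfile.mkdtemp(), "ToyDataset.json")
with open(path, "w") as f:
    json.dump(partition, f, indent=2)

# What the runtime loader effectively does with the partition file:
with open(path) as f:
    split = json.load(f)
print(sorted(split))  # ['TEST_SET', 'TRAIN_SET']
```

Keeping the two lists disjoint is worth asserting in any conversion script, since an overlap silently leaks test identities into training.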
CONVENTIONS
- Final runtime-ready format is id/type/view/*.pkl.
- Many dataset folders provide both rearrange and extraction scripts; follow README ordering strictly.
- Some pipelines require auxiliary artifacts (e.g., OUMVLP pose match indices) before pretreatment.
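The first convention can be checked mechanically after a conversion run. A minimal validator sketch (the three-level depth comes from the id/type/view convention above; the helper name is ours, not an OpenGait API):

```python
import os
import pickle
import tempfile

def find_layout_violations(root):
    """Return .pkl files that do not sit exactly three directory
    levels below root, i.e. outside the id/type/view/*.pkl layout."""
    bad = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".pkl"):
                continue
            rel = os.path.relpath(dirpath, root)
            if rel == "." or len(rel.split(os.sep)) != 3:
                bad.append(os.path.join(dirpath, name))
    return bad

root = tempfile.mkdtemp()
good_dir = os.path.join(root, "001", "nm-01", "090")
os.makedirs(good_dir)
with open(os.path.join(good_dir, "090.pkl"), "wb") as f:
    pickle.dump([], f)
with open(os.path.join(root, "stray.pkl"), "wb") as f:  # wrong depth
    pickle.dump([], f)

violations = find_layout_violations(root)
print(len(violations))  # 1
```

Running such a check right after the rearrange step catches ordering mistakes before training ever starts.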
ANTI-PATTERNS
- Don’t point runtime to raw image trees; training expects pkl-converted structure.
- Don’t skip dataset-specific rearrange steps; many raw layouts are incompatible with the runtime parser.
- Don’t ignore documented optional/required flags in per-dataset README commands.
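The first anti-pattern (pointing runtime at a raw image tree) can be caught with a fail-fast guard. A hedged sketch; the extension list and helper are assumptions of ours, not part of OpenGait:

```python
import os
import tempfile

RAW_IMAGE_EXTS = (".png", ".jpg", ".jpeg")  # assumed typical raw frame formats

def assert_pkl_converted(root):
    """Raise if the dataset root still contains raw image frames
    instead of the pkl-converted structure training expects."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(RAW_IMAGE_EXTS):
                raise ValueError(
                    f"raw image found under {dirpath}; run pretreatment first"
                )

root = tempfile.mkdtemp()
seq_dir = os.path.join(root, "001", "nm-01", "090")
os.makedirs(seq_dir)
open(os.path.join(seq_dir, "0001.png"), "wb").close()  # leftover raw frame

try:
    assert_pkl_converted(root)
    caught = False
except ValueError:
    caught = True
print(caught)  # True
```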