# DATASET PREP KNOWLEDGE BASE
## OVERVIEW
`datasets/` is a script-heavy preprocessing workspace. It transforms raw benchmarks into OpenGait's required pickle layout and partition metadata.
## STRUCTURE
```text
datasets/
├── pretreatment.py # generic image->pkl pipeline (and pose mode)
├── pretreatment_heatmap.py # heatmap generation for skeleton workflows
├── <DatasetName>/README.md # dataset-specific acquisition + conversion steps
├── <DatasetName>/*.json # train/test partition files
└── <DatasetName>/*.py # extract/rearrange/convert scripts
```
## WHERE TO LOOK
| Task | Location | Notes |
|------|----------|-------|
| Generic preprocessing | `pretreatment.py` | handles multiple datasets, pose switch |
| OUMVLP pose index flow | `OUMVLP/README.md`, `OUMVLP/pose_index_extractor.py` | required for temporal consistency |
| Heatmap + skeleton prep | `pretreatment_heatmap.py`, `ln_sil_heatmap.py`, `configs/skeletongait/README.md` | multi-step pipeline |
| Dataset splits | `<Dataset>/<Dataset>.json` | consumed by runtime `data_cfg.dataset_partition` |
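
As a rough illustration of how a split file might be inspected before training, the sketch below loads a partition JSON and checks that no ID appears in both splits. The key names `TRAIN_SET` and `TEST_SET` and the helper `load_partition` are assumptions made for this example; confirm them against the actual `<Dataset>/<Dataset>.json` you use.

```python
import json


def load_partition(json_path):
    """Read a partition file and return (train_ids, test_ids) as lists of strings.

    Hypothetical helper, not part of OpenGait.
    """
    with open(json_path, "r", encoding="utf-8") as f:
        partition = json.load(f)

    # Assumed key names; check the JSON file of the dataset you are preparing.
    train_ids = [str(i) for i in partition["TRAIN_SET"]]
    test_ids = [str(i) for i in partition["TEST_SET"]]

    # A subject ID listed in both splits would leak test data into training.
    overlap = set(train_ids) & set(test_ids)
    if overlap:
        raise ValueError(f"IDs appear in both splits: {sorted(overlap)}")
    return train_ids, test_ids
```

Running this once against a new or hand-edited split file is a cheap way to catch overlapping IDs before they reach `data_cfg.dataset_partition`.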
## CONVENTIONS
- Final runtime-ready format is `id/type/view/*.pkl`.
- Many dataset folders provide both rearrange and extraction scripts; follow README ordering strictly.
- Some pipelines require auxiliary artifacts (e.g., OUMVLP pose match indices) before pretreatment.
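
The `id/type/view/*.pkl` convention above can be checked with a small script after pretreatment finishes. This is a minimal sketch, assuming every `.pkl` sits exactly three directories below the output root; the function names are hypothetical and not part of OpenGait.

```python
from pathlib import Path


def find_layout_violations(root):
    """Return .pkl paths under root that are not located at id/type/view/<file>.pkl."""
    root = Path(root)
    violations = []
    for pkl in root.rglob("*.pkl"):
        # Relative path should have exactly 4 parts: id, type, view, filename.
        if len(pkl.relative_to(root).parts) != 4:
            violations.append(pkl)
    return sorted(violations)


def list_sequences(root):
    """Return sorted (id, type, view) tuples that contain at least one .pkl file."""
    root = Path(root)
    return sorted({p.relative_to(root).parts[:3] for p in root.glob("*/*/*/*.pkl")})
```

An empty result from `find_layout_violations` together with a non-empty result from `list_sequences` suggests the output tree matches the expected depth; it does not validate the contents of the pickles.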
## ANTI-PATTERNS
- Don't point the runtime at raw image trees; training expects the pkl-converted structure.
- Don't skip dataset-specific rearrange steps; many raw layouts are incompatible with the runtime parser.
- Don't ignore documented optional/required flags in per-dataset README commands.
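
The first anti-pattern can be caught with a quick pre-flight check on the configured dataset root. The sketch below is illustrative only: the set of image extensions and both function names are assumptions, and the heuristic simply flags roots that contain images but no `.pkl` files.

```python
from pathlib import Path

# Assumed set of raw-image extensions; extend it for your dataset if needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def count_file_kinds(root):
    """Count .pkl files and image files anywhere under root."""
    counts = {"pkl": 0, "image": 0}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix == ".pkl":
            counts["pkl"] += 1
        elif suffix in IMAGE_SUFFIXES:
            counts["image"] += 1
    return counts


def check_runtime_root(root):
    """Raise ValueError if root looks like a raw image tree rather than pkl output."""
    counts = count_file_kinds(root)
    if counts["pkl"] == 0 and counts["image"] > 0:
        raise ValueError(
            f"{root} holds {counts['image']} image files and no .pkl files; "
            "run the dataset's conversion steps first."
        )
    return counts
```

Calling `check_runtime_root` on the dataset root before launching training turns a confusing loader failure into an explicit message.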