Add comprehensive knowledge base documentation across multiple domains
@@ -0,0 +1,32 @@
# DATASET PREP KNOWLEDGE BASE

## OVERVIEW

`datasets/` is a script-heavy preprocessing workspace. It transforms raw benchmarks into OpenGait’s required pickle layout and partition metadata.

## STRUCTURE
```text
datasets/
├── pretreatment.py            # generic image->pkl pipeline (and pose mode)
├── pretreatment_heatmap.py    # heatmap generation for skeleton workflows
├── <DatasetName>/README.md    # dataset-specific acquisition + conversion steps
├── <DatasetName>/*.json       # train/test partition files
└── <DatasetName>/*.py         # extract/rearrange/convert scripts
```

## WHERE TO LOOK

| Task | Location | Notes |
|------|----------|-------|
| Generic preprocessing | `pretreatment.py` | handles multiple datasets; has a pose switch |
| OUMVLP pose index flow | `OUMVLP/README.md`, `OUMVLP/pose_index_extractor.py` | required for temporal consistency |
| Heatmap + skeleton prep | `pretreatment_heatmap.py`, `ln_sil_heatmap.py`, `configs/skeletongait/README.md` | multi-step pipeline |
| Dataset splits | `<DatasetName>/<DatasetName>.json` | consumed at runtime via `data_cfg.dataset_partition` |
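
Since the partition file and the converted tree have to agree at runtime, a quick consistency check can catch mismatches early. The sketch below is illustrative only: it assumes the JSON holds `TRAIN_SET` and `TEST_SET` lists of subject IDs and that each subject ID is a top-level directory of the converted dataset. The function name `missing_partition_ids` is hypothetical and not part of the repository.

```python
import json
from pathlib import Path


def missing_partition_ids(partition_path, dataset_root):
    """Return, per split, the IDs listed in the partition file that have no directory on disk."""
    with open(partition_path, "r", encoding="utf-8") as f:
        partition = json.load(f)

    # Subject IDs present on disk: the names of the top-level directories.
    on_disk = {p.name for p in Path(dataset_root).iterdir() if p.is_dir()}

    missing = {}
    # ASSUMPTION: the split key names below; adjust them to the actual partition file.
    for split in ("TRAIN_SET", "TEST_SET"):
        ids = partition.get(split, [])
        missing[split] = sorted(str(i) for i in ids if str(i) not in on_disk)
    return missing
```

An empty list for both splits means every listed ID has a matching directory; anything else points at an incomplete conversion or the wrong partition file.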

## CONVENTIONS

- The final runtime-ready layout is `id/type/view/*.pkl`.
- Many dataset folders provide both rearrange and extraction scripts; run them in exactly the order given in that dataset's README.
- Some pipelines need auxiliary artifacts (e.g., OUMVLP pose match indices) to be generated before pretreatment.
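
To make the `id/type/view/*.pkl` convention concrete, here is a minimal sketch that walks a converted dataset and groups pickle files by sequence. The helper names `iter_sequences` and `load_pkl` are hypothetical. What each pickle contains depends on the dataset and the pretreatment mode, so the sketch makes no assumption about it.

```python
import pickle
from pathlib import Path


def iter_sequences(dataset_root):
    """Yield (subject_id, seq_type, view, pkl_paths) for each id/type/view directory."""
    root = Path(dataset_root)
    # Exactly three directory levels below the root: id / type / view.
    for view_dir in sorted(root.glob("*/*/*")):
        if not view_dir.is_dir():
            continue
        pkl_paths = sorted(view_dir.glob("*.pkl"))
        if pkl_paths:
            subject_id, seq_type, view = view_dir.relative_to(root).parts
            yield subject_id, seq_type, view, pkl_paths


def load_pkl(path):
    """Load one pickle file. Only unpickle files you produced yourself."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Iterating over `iter_sequences(root)` and loading the first file of a few sequences is a cheap way to confirm that a conversion produced the expected tree before starting a training run.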

## ANTI-PATTERNS

- Don’t point the runtime at raw image trees; training expects the pkl-converted structure.
- Don’t skip dataset-specific rearrange steps; many raw layouts are incompatible with the runtime parser.
- Don’t ignore the optional and required flags documented in per-dataset README commands.
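
The first two anti-patterns can be screened for before training. The following is a rough pre-flight sketch, not a tool shipped with the repository: the function name `preflight` and the list of image suffixes are assumptions made for illustration. It flags leftover image files and any `.pkl` file that does not sit exactly at `id/type/view/` depth.

```python
from pathlib import Path

# ASSUMPTION: an illustrative set of raw image suffixes; extend as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def preflight(dataset_root):
    """Return a list of human-readable problems found under dataset_root."""
    root = Path(dataset_root)
    problems = []
    n_pkl = 0
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        # id/type/view/file.pkl has exactly four path components below the root.
        depth = len(path.relative_to(root).parts)
        suffix = path.suffix.lower()
        if suffix in IMAGE_SUFFIXES:
            problems.append(f"raw image found: {path}")
        elif suffix == ".pkl":
            n_pkl += 1
            if depth != 4:
                problems.append(f"pkl outside id/type/view: {path}")
    if n_pkl == 0:
        problems.append("no .pkl files found")
    return problems
```

An empty result only shows that the tree has the expected shape; it does not prove that the dataset-specific rearrange steps ran correctly.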