crosstyan/OpenGait

crosstyan 0fdd35bd78 Add comprehensive knowledge base documentation across multiple domains

2026-02-12 14:36:37 +08:00

DATA PIPELINE KNOWLEDGE BASE

OVERVIEW

opengait/data/ converts preprocessed dataset trees into training/evaluation batches for all models.

WHERE TO LOOK

| Task | Location | Notes |
| --- | --- | --- |
| Dataset parsing + file loading | `dataset.py` | expects partition JSON and `.pkl` sequence files |
| Sequence sampling strategy | `collate_fn.py` | fixed/unfixed/all + ordered/unordered behavior |
| Augmentations/transforms | `transform.py` | transform factories resolved from config |
| Batch identity sampling | `sampler.py` | sampler types referenced from config |
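
The sampling-strategy row can be illustrated with a minimal sketch. It assumes `sample_type` is a `<sampler>_<order>` string, as names such as `fixed_unordered` and `all_ordered` suggest. The helpers `parse_sample_type` and `pick_frame_indices`, and the parameters `frames_num` and `frames_range`, are hypothetical names chosen for illustration; this is not the code in `collate_fn.py`.

```python
import random

VALID_SAMPLERS = {"fixed", "unfixed", "all"}
VALID_ORDERS = {"ordered", "unordered"}


def parse_sample_type(sample_type):
    """Split e.g. 'fixed_unordered' into ('fixed', False)."""
    sampler, _, order = sample_type.partition("_")
    if sampler not in VALID_SAMPLERS or order not in VALID_ORDERS:
        raise ValueError(f"unsupported sample_type: {sample_type!r}")
    return sampler, order == "ordered"


def pick_frame_indices(seq_len, sample_type, frames_num=30,
                       frames_range=(20, 40), rng=None):
    """Return the frame indices to keep from a sequence of seq_len frames."""
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")
    if rng is None:
        rng = random.Random()
    sampler, ordered = parse_sample_type(sample_type)

    if sampler == "all":
        # Keep every frame; the output length varies per sequence.
        return list(range(seq_len))

    # 'fixed' uses a constant length, 'unfixed' draws one from a range.
    n = frames_num if sampler == "fixed" else rng.randint(*frames_range)

    if ordered:
        # Contiguous window, wrapping around when the sequence is short.
        start = rng.randrange(seq_len)
        return [(start + i) % seq_len for i in range(n)]

    # Unordered: draw without replacement when enough frames exist,
    # otherwise fall back to drawing with replacement.
    if seq_len >= n:
        return rng.sample(range(seq_len), n)
    return rng.choices(range(seq_len), k=n)
```

The point of the sketch is the shape consequence: `fixed` always yields `frames_num` indices, while `unfixed` and `all` yield variable lengths that downstream code has to handle.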

CONVENTIONS

Dataset root layout is id/type/view/*.pkl after preprocessing.
A dataset_partition JSON file with TRAIN_SET and TEST_SET keys is required.
sample_type drives control flow (fixed_unordered, all_ordered, etc.) and shape semantics downstream.
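
The layout and partition conventions above can be sketched as follows. This is a minimal illustration, not the logic in `dataset.py`: it assumes that TRAIN_SET and TEST_SET each map to a list of subject-ID directory names, and `load_partition` and `index_sequences` are hypothetical helper names.

```python
import json
from pathlib import Path

REQUIRED_KEYS = ("TRAIN_SET", "TEST_SET")


def load_partition(partition_path):
    """Read the partition JSON and return (train_ids, test_ids)."""
    with open(partition_path, "r", encoding="utf-8") as f:
        partition = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in partition]
    if missing:
        raise KeyError(f"partition file is missing keys: {missing}")
    # Assumption: each key maps to a list of subject-ID directory names.
    return list(partition["TRAIN_SET"]), list(partition["TEST_SET"])


def index_sequences(dataset_root, subject_ids):
    """Collect (id, type, view, [pkl paths]) for each id/type/view directory."""
    root = Path(dataset_root)
    entries = []
    for sid in sorted(subject_ids):
        subject_dir = root / sid
        if not subject_dir.is_dir():
            continue  # ID listed in the partition but absent on disk
        for type_dir in sorted(p for p in subject_dir.iterdir() if p.is_dir()):
            for view_dir in sorted(p for p in type_dir.iterdir() if p.is_dir()):
                pkl_files = sorted(view_dir.glob("*.pkl"))
                if pkl_files:
                    entries.append(
                        (sid, type_dir.name, view_dir.name, pkl_files)
                    )
    return entries
```

Calling `index_sequences(root, train_ids)` and `index_sequences(root, test_ids)` separately keeps the two splits disjoint as long as the partition file lists each ID only once.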

ANTI-PATTERNS

Never pass non-.pkl sequence files; dataset.py raises a ValueError.
Don’t pass a scalar batch_size to triplet samplers; they expect a [P, K] list (P identities, K sequences per identity).
Don’t assume all models use the same number of features per sample; collate is feature-index sensitive.
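
The first two anti-patterns can be made concrete with a short sketch. It is a hedged illustration under stated assumptions, not a copy of `dataset.py` or `sampler.py`: `load_sequence` and `sample_pk_batch` are hypothetical names, the error messages are invented, and drawing K items with replacement is a simplification chosen so that identities with fewer than K sequences still work.

```python
import pickle
import random
from pathlib import Path


def load_sequence(path):
    """Load one sequence file, rejecting anything that is not .pkl."""
    path = Path(path)
    if path.suffix != ".pkl":
        raise ValueError(f"only .pkl sequence files are supported: {path.name!r}")
    with open(path, "rb") as f:
        return pickle.load(f)


def sample_pk_batch(label_to_indices, batch_size, rng=None):
    """Draw P identities and K sample indices per identity.

    label_to_indices maps an identity label to the dataset indices
    that belong to it; batch_size must be a two-element [P, K] list.
    """
    if not isinstance(batch_size, (list, tuple)) or len(batch_size) != 2:
        raise ValueError(f"batch_size should be [P, K], got {batch_size!r}")
    p, k = batch_size
    if p > len(label_to_indices):
        raise ValueError("P exceeds the number of available identities")
    if rng is None:
        rng = random.Random()

    # Pick P distinct identities, then K indices for each of them.
    labels = rng.sample(sorted(label_to_indices), p)
    batch = []
    for label in labels:
        # Assumption: sample with replacement so small identities still fill K.
        batch.extend(rng.choices(label_to_indices[label], k=k))
    return batch
```

A batch drawn this way always has P × K entries grouped by identity, which is the structure triplet-style losses rely on when forming positive and negative pairs.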