Add comprehensive knowledge base documentation across multiple domains
@@ -0,0 +1,33 @@
# OPENGAIT RUNTIME KNOWLEDGE BASE

## OVERVIEW
`opengait/` is the runtime package: distributed launch entry, model lifecycle orchestration, and data/evaluation integration.

## STRUCTURE
```text
opengait/
├── main.py         # DDP entrypoint + config load + model dispatch
├── modeling/       # BaseModel + model/backbone/loss registries
├── data/           # dataset parser + sampler/collate/transform
├── evaluation/     # benchmark-specific evaluation functions
└── utils/          # config merge, DDP passthrough, logging helpers
```

## WHERE TO LOOK
| Task | Location | Notes |
|------|----------|-------|
| Start train/test flow | `main.py` | parses `--cfgs`/`--phase`, initializes DDP |
| Resolve model name from YAML | `modeling/models/__init__.py` | class auto-registration via `iter_modules` |
| Build full train loop | `modeling/base_model.py` | loaders, optimizer/scheduler, ckpt, inference |
| Merge config with defaults | `utils/common.py::config_loader` | overlays onto `configs/default.yaml` |
| Shared logging | `utils/msg_manager.py` | global message manager |

## CONVENTIONS
- Imports resolve against the `opengait/` directory at runtime (`from modeling...`, `from data...`, `from utils...`) because `opengait/main.py` is launched as the script target.
- Runtime is DDP-first; code that assumes a non-DDP run is usually invalid.
- Losses and models are selected by name from config, not imported directly in `main.py`.
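
Selecting classes by name rather than by import can be pictured with a small, self-contained sketch. It writes a throwaway package to a temporary directory, scans it with `pkgutil.iter_modules`, and indexes the classes it finds so that a string from YAML can pick one. The package `demo_zoo`, its modules, and the helper names are hypothetical; the real `modeling/models/__init__.py` may differ in detail.

```python
import importlib
import inspect
import pkgutil
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> str:
    """Write a tiny hypothetical package with one class per module."""
    pkg = root / "demo_zoo"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "alpha.py").write_text("class Alpha:\n    pass\n")
    (pkg / "beta.py").write_text("class Beta:\n    pass\n")
    return "demo_zoo"


def discover_classes(package_name: str) -> dict:
    """Import every module in the package and index its classes by name."""
    importlib.invalidate_caches()  # the package was created moments ago
    package = importlib.import_module(package_name)
    registry = {}
    for info in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module(f"{package_name}.{info.name}")
        for name, obj in inspect.getmembers(module, inspect.isclass):
            # Keep only classes defined in this module, not re-exports.
            if obj.__module__ == module.__name__:
                registry[name] = obj
    return registry


with tempfile.TemporaryDirectory() as tmp:
    sys.path.insert(0, tmp)
    try:
        registry = discover_classes(build_demo_package(Path(tmp)))
    finally:
        sys.path.remove(tmp)

model_name = "Beta"  # stands in for the string read from `model_cfg.model`
model_cls = registry[model_name]
```

In this sketch a class that lives outside the scanned package never reaches `registry`, which is why a name-based lookup cannot find it.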

## ANTI-PATTERNS
- Don’t bypass `config_loader`; every module expects the merge with the default config.
- Don’t instantiate models outside the registry path (`modeling/models`), or the YAML `model_cfg.model` lookup breaks.
- Don’t bypass `get_ddp_module`; downstream code reaches methods of the wrapped module through its attribute passthrough wrapper.
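
To see why skipping the default merge is risky, here is a minimal sketch of a recursive overlay, assuming the config is a nested dict after YAML parsing. The function `merge_onto_defaults` and all keys and values below are illustrative; they are not taken from `config_loader` or `configs/default.yaml`.

```python
import copy


def merge_onto_defaults(defaults: dict, overrides: dict) -> dict:
    """Return a new dict with `overrides` recursively overlaid on `defaults`."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge key by key instead of replacing.
            merged[key] = merge_onto_defaults(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Stand-ins for parsed YAML; keys and values are illustrative only.
defaults = {
    "trainer_cfg": {"log_iter": 100, "save_iter": 2000},
    "model_cfg": {"model": "Baseline"},
}
experiment = {
    "trainer_cfg": {"save_iter": 500},
    "model_cfg": {"model": "ScoNet"},
}

cfg = merge_onto_defaults(defaults, experiment)
```

Code that read `experiment` directly would find no `log_iter` at all, while the merged `cfg` carries both the override and the inherited default.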
@@ -0,0 +1,22 @@
# DATA PIPELINE KNOWLEDGE BASE

## OVERVIEW
`opengait/data/` converts preprocessed dataset trees into training/evaluation batches for all models.

## WHERE TO LOOK
| Task | Location | Notes |
|------|----------|-------|
| Dataset parsing + file loading | `dataset.py` | expects partition JSON and `.pkl` sequence files |
| Sequence sampling strategy | `collate_fn.py` | fixed/unfixed/all + ordered/unordered behavior |
| Augmentations/transforms | `transform.py` | transform factories resolved from config |
| Batch identity sampling | `sampler.py` | sampler types referenced from config |

## CONVENTIONS
- Dataset root layout is `id/type/view/*.pkl` after preprocessing.
- `dataset_partition` JSON with `TRAIN_SET` / `TEST_SET` is required.
- `sample_type` drives control flow (`fixed_unordered`, `all_ordered`, etc.) and shape semantics downstream.
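
The layout and partition conventions above can be sketched as follows. This is an illustration only: it assumes each partition entry is a list of subject IDs, and the function `index_sequences`, the subject IDs, sequence types, and views are all made up for the example. It is not how `dataset.py` is written.

```python
import json
import tempfile
from pathlib import Path


def index_sequences(root: Path, partition: dict) -> dict:
    """Group `.pkl` files under root/id/type/view by partition split."""
    split_of = {}
    for split in ("TRAIN_SET", "TEST_SET"):
        for subject_id in partition[split]:
            split_of[subject_id] = split

    index = {"TRAIN_SET": [], "TEST_SET": []}
    for pkl in sorted(root.glob("*/*/*/*.pkl")):
        subject_id, seq_type, view = pkl.relative_to(root).parts[:3]
        if subject_id in split_of:
            index[split_of[subject_id]].append((subject_id, seq_type, view, pkl.name))
    return index


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical subjects, types, and views; files are empty placeholders.
    for subject_id, seq_type, view in [("001", "nm-01", "090"), ("002", "bg-01", "000")]:
        seq_dir = root / subject_id / seq_type / view
        seq_dir.mkdir(parents=True)
        (seq_dir / f"{view}.pkl").touch()

    partition = json.loads('{"TRAIN_SET": ["001"], "TEST_SET": ["002"]}')
    index = index_sequences(root, partition)
```

A subject directory that appears in neither split is simply skipped here; the real loader may treat that case differently.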

## ANTI-PATTERNS
- Never pass non-`.pkl` sequence files (`dataset.py` raises a `ValueError`).
- Don’t violate the expected `batch_size` semantics for triplet samplers (a `[P, K]` list).
- Don’t assume all models use identical feature counts; collate is feature-index sensitive.
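
As a rough illustration of the `[P, K]` convention, the sketch below draws P identities and then K sequences for each, so the batch holds P × K items. The function `sample_pk_batch` and the toy data are hypothetical and single-process; the samplers in `sampler.py` additionally have to cooperate with DDP.

```python
import random


def sample_pk_batch(ids_to_seqs: dict, batch_size: list, rng: random.Random) -> list:
    """Pick P identities, then K sequences per identity (with replacement)."""
    if not (isinstance(batch_size, (list, tuple)) and len(batch_size) == 2):
        raise ValueError("batch_size must be a [P, K] pair")
    p, k = batch_size
    chosen_ids = rng.sample(sorted(ids_to_seqs), p)
    batch = []
    for subject_id in chosen_ids:
        # `choices` samples with replacement, so K may exceed the available count.
        for seq in rng.choices(ids_to_seqs[subject_id], k=k):
            batch.append((subject_id, seq))
    return batch


# Six made-up identities with three sequences each.
ids_to_seqs = {f"{i:03d}": [f"seq-{j}" for j in range(3)] for i in range(1, 7)}
batch = sample_pk_batch(ids_to_seqs, [4, 2], random.Random(0))
```

Passing a plain integer such as `8` fails fast in this sketch, which mirrors why a scalar `batch_size` does not fit a triplet-style sampler.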
@@ -0,0 +1,33 @@
# MODELING DOMAIN KNOWLEDGE BASE

## OVERVIEW
`opengait/modeling/` defines model contracts and algorithm implementations: `BaseModel`, loss aggregation, backbones, and concrete model classes.

## STRUCTURE
```text
opengait/modeling/
├── base_model.py       # canonical train/test lifecycle
├── loss_aggregator.py  # training_feat -> weighted summed loss
├── modules.py          # shared NN building blocks
├── backbones/          # backbone registry + implementations
├── losses/             # loss registry + implementations
└── models/             # concrete methods (Baseline, ScoNet, DeepGaitV2, ...)
```

## WHERE TO LOOK
| Task | Location | Notes |
|------|----------|-------|
| Add new model | `models/*.py` + `docs/4.how_to_create_your_model.md` | must inherit `BaseModel` |
| Add new loss | `losses/*.py` | expose via dynamic registry |
| Change training lifecycle | `base_model.py` | affects every model |
| Debug feature/loss key mismatches | `loss_aggregator.py` | checks `training_feat` keys vs `loss_cfg.log_prefix` |

## CONVENTIONS
- The `forward()` output contract is a fixed dict with the keys `training_feat`, `visual_summary`, and `inference_feat`.
- `training_feat` subkeys must align with the configured `loss_cfg[*].log_prefix`.
- Backbones/losses/models are discovered dynamically via the package `__init__.py`; filenames matter operationally.
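
The output contract and the key alignment can be checked with a few lines of plain Python. The sketch below assumes `loss_cfg` is a list of dicts that each carry a `log_prefix`, and it enforces exact equality between the two key sets, which may be stricter than what `loss_aggregator.py` actually does. The function `check_forward_output` and the sample values are hypothetical.

```python
REQUIRED_KEYS = {"training_feat", "visual_summary", "inference_feat"}


def check_forward_output(retval: dict, loss_cfg: list) -> None:
    """Raise KeyError if the output dict or its loss keys break the contract."""
    missing = REQUIRED_KEYS - set(retval)
    if missing:
        raise KeyError(f"forward() output is missing keys: {sorted(missing)}")

    feat_keys = set(retval["training_feat"])
    prefixes = {cfg["log_prefix"] for cfg in loss_cfg}
    if feat_keys != prefixes:
        raise KeyError(
            f"training_feat keys {sorted(feat_keys)} do not match "
            f"log_prefix values {sorted(prefixes)}"
        )


# Illustrative config and output; tensors are replaced by placeholders.
loss_cfg = [
    {"type": "TripletLoss", "log_prefix": "triplet"},
    {"type": "CrossEntropyLoss", "log_prefix": "softmax"},
]
retval = {
    "training_feat": {"triplet": {"embeddings": None}, "softmax": {"logits": None}},
    "visual_summary": {},
    "inference_feat": {"embeddings": None},
}

check_forward_output(retval, loss_cfg)  # passes silently
```

Running a check like this on the first batch during development surfaces a renamed `log_prefix` before it turns into a confusing failure deeper in the loss code.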

## ANTI-PATTERNS
- Do not return arbitrary forward outputs; `LossAggregator` and the evaluator assume the fixed contract.
- Do not put model classes outside `models/`; config lookup by `getattr(models, name)` depends on the registry.
- Do not ignore DDP loss wrapping (`get_ddp_module`) in loss construction.
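
The attribute passthrough idea behind the DDP wrapping can be shown without PyTorch. In the sketch below, `Wrapper` is a stand-in for a DDP-style container that hides the wrapped object, and `PassthroughWrapper` forwards unknown attributes to it. Every class here is hypothetical; the sketch illustrates the pattern only and is not the code in `utils/common.py`.

```python
class Wrapper:
    """Stand-in for a DDP-style wrapper that hides the wrapped object."""

    def __init__(self, module):
        self.module = module

    def __call__(self, *args, **kwargs):
        return self.module(*args, **kwargs)


class PassthroughWrapper(Wrapper):
    """Falls back to the wrapped object for attributes the wrapper lacks."""

    def __getattr__(self, name):
        # Called only when normal lookup fails. Guard `module` itself to
        # avoid infinite recursion before __init__ has set it.
        if name == "module":
            raise AttributeError(name)
        return getattr(self.module, name)


class ToyLoss:
    """Hypothetical loss object with an extra attribute callers rely on."""

    log_prefix = "triplet"

    def __call__(self, x):
        return x * 2


plain = Wrapper(ToyLoss())
wrapped = PassthroughWrapper(ToyLoss())
```

With the plain wrapper, callers would have to write `plain.module.log_prefix`; with the passthrough variant, `wrapped.log_prefix` keeps working whether or not the object was wrapped.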
@@ -0,0 +1,23 @@
# MODEL ZOO IMPLEMENTATION KNOWLEDGE BASE

## OVERVIEW
This directory is the algorithm zoo. Each file usually contributes one `BaseModel` subclass selected by `model_cfg.model`.

## WHERE TO LOOK
| Task | Location | Notes |
|------|----------|-------|
| Baseline pattern | `baseline.py` | minimal template for silhouette models |
| Scoliosis pipeline | `sconet.py` | label remapping + screening-specific head |
| Large-model fusion | `BiggerGait_DINOv2.py`, `BigGait.py` | external pretrained dependencies |
| Diffusion/noise handling | `denoisinggait.py`, `diffgait_utils/` | high-complexity flow/feature fusion |
| Skeleton variants | `skeletongait++.py`, `gaitgraph1.py`, `gaitgraph2.py` | pose-map/graph assumptions |

## CONVENTIONS
- Most models follow: preprocess input -> backbone -> temporal pooling -> horizontal pooling -> neck/head -> contract dict.
- Input modality assumptions differ by model (silhouette / RGB / pose / multimodal); the config and the preprocess script must match.
- Many models rely on utilities from `modeling/modules.py`; changes there have a high blast radius.

## ANTI-PATTERNS
- Don’t mix modality assumptions silently (e.g., pose tensor layout vs silhouette layout).
- Don’t rename classes without updating `model_cfg.model` references in configs.
- Don’t treat `BigGait_utils`/`diffgait_utils` as generic utilities; they are model-family specific.