Add comprehensive knowledge base documentation across multiple domains

2026-02-12 14:36:37 +08:00
parent f754f6f383
commit 0fdd35bd78
8 changed files with 336 additions and 0 deletions
@@ -0,0 +1,30 @@
+# CONFIG SURFACE KNOWLEDGE BASE
+
+## OVERVIEW
+`configs/` is the operational API for experiments. Runtime behavior is primarily configured here, not hardcoded.
+
+## STRUCTURE
+```text
+configs/
+├── default.yaml          # base config merged into every run
+├── <model-family>/*.yaml # experiment overlays
+└── */README.md           # family-specific instructions (when present)
+```
+
+## WHERE TO LOOK
+| Task | Location | Notes |
+|------|----------|-------|
+| Global defaults | `default.yaml` | base for all runs |
+| Model selection | `model_cfg.model` | must match a class name in `opengait/modeling/models` |
+| Data split binding | `data_cfg.dataset_partition` | points to `datasets/*/*.json` |
+| Sampler behavior | `trainer_cfg.sampler`, `evaluator_cfg.sampler` | directly controls collate/sampler path |
+
+## CONVENTIONS
+- Config files are overlays merged on top of `default.yaml` via `MergeCfgsDict`; a key set in an overlay replaces the default value for that key.
+- Keys accepted by classes/functions are validated at runtime; unknown keys are logged as unexpected.
+- Paths and names here directly determine output directory keying (`output/<dataset>/<model>/<save_name>`).
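+
+The overlay convention above can be illustrated with a minimal sketch. It assumes a plain recursive dictionary merge; `merge_overlay` and every config value below are hypothetical stand-ins for illustration, not the actual `MergeCfgsDict` implementation or real defaults.
+
+```python
+import copy
+
+
+def merge_overlay(base, overlay):
+    """Return a new dict with `overlay` recursively merged on top of `base`."""
+    merged = copy.deepcopy(base)
+    for key, value in overlay.items():
+        if isinstance(value, dict) and isinstance(merged.get(key), dict):
+            # Both sides are mappings: merge key by key instead of replacing.
+            merged[key] = merge_overlay(merged[key], value)
+        else:
+            # Scalars, lists, and new keys: the overlay value wins.
+            merged[key] = copy.deepcopy(value)
+    return merged
+
+
+# Hypothetical values, chosen only to show the merge behavior.
+default_cfg = {
+    "model_cfg": {"model": "BaseModel"},
+    "trainer_cfg": {"sampler": {"batch_size": [8, 16]}, "save_name": "tmp"},
+}
+overlay_cfg = {
+    "model_cfg": {"model": "MyModel"},
+    "trainer_cfg": {"save_name": "my_run"},
+}
+
+cfg = merge_overlay(default_cfg, overlay_cfg)
+```
+
+In this sketch the overlay only needs to state what differs from the base: `trainer_cfg.sampler` is inherited untouched, while `model_cfg.model` and `trainer_cfg.save_name` are replaced.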
+
+## ANTI-PATTERNS
+- Don’t use model names not registered in `opengait/modeling/models`.
+- Don’t treat `batch_size` as a scalar in triplet training regimes when the config expects a `[P, K]` pair.
+- Don’t bypass dataset partition files; the loader expects explicit train/test pid sets.
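+
+The `[P, K]` point can be checked early with a small guard. This is a sketch under the common triplet-sampling reading that `P` is the number of identities per batch and `K` the number of sequences per identity; that reading and the helper `effective_batch_size` are assumptions for illustration, not part of the codebase.
+
+```python
+def effective_batch_size(batch_size):
+    """Validate a [P, K] batch spec and return the total sample count P * K."""
+    if not isinstance(batch_size, (list, tuple)) or len(batch_size) != 2:
+        raise ValueError(
+            f"expected batch_size as [P, K], got {batch_size!r}"
+        )
+    p, k = batch_size
+    if not (isinstance(p, int) and isinstance(k, int)) or p <= 0 or k <= 0:
+        raise ValueError(f"P and K must be positive integers, got {batch_size!r}")
+    return p * k
+
+
+total = effective_batch_size([8, 16])  # 8 * 16 = 128 samples per batch
+```
+
+Passing a bare scalar such as `128` raises `ValueError` here, which surfaces the misconfiguration before a sampler is built.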