chore: update demo runtime, tests, and agent docs

2026-03-02 12:33:17 +08:00
parent 1f8f959ad7
commit cbb3284c13
14 changed files with 1491 additions and 236 deletions (this file: +180 -73)
# OpenGait Agent Guide
**Generated:** 2026-02-11T10:53:29Z
**Commit:** f754f6f
**Branch:** master
This file is for autonomous coding agents working in this repository.
Use it as the default playbook for commands, conventions, and safety checks.
## OVERVIEW
OpenGait is a research-grade, config-driven gait analysis framework centered on distributed PyTorch training/testing.
Core runtime lives in `opengait/`; `configs/` and `datasets/` are first-class operational surfaces, not just support folders.
## STRUCTURE
```text
OpenGait/
├── opengait/ # runtime package (train/test, model/data/eval pipelines)
├── configs/ # model- and dataset-specific experiment specs
├── datasets/ # preprocessing/rearrangement scripts + partitions
├── docs/ # user workflow docs
├── train.sh # launch patterns (DDP)
└── test.sh # eval launch patterns (DDP)
```
- Repository: `OpenGait`
- Runtime package: `opengait/`
- Primary entrypoint: `opengait/main.py`
- Package/runtime tool: `uv`
## WHERE TO LOOK
| Task | Location | Notes |
|------|----------|-------|
| Train/test entry | `opengait/main.py` | DDP init + config load + model dispatch |
| Model registration | `opengait/modeling/models/__init__.py` | dynamic class import/registration |
| Backbone/loss registration | `opengait/modeling/backbones/__init__.py`, `opengait/modeling/losses/__init__.py` | same dynamic pattern |
| Config merge behavior | `opengait/utils/common.py::config_loader` | merges into `configs/default.yaml` |
| Data loading contract | `opengait/data/dataset.py`, `opengait/data/collate_fn.py` | `.pkl` only, sequence sampling modes |
| Evaluation dispatch | `opengait/evaluation/evaluator.py` | dataset-specific eval routines |
| Dataset preprocessing | `datasets/pretreatment.py` + dataset subdirs | many standalone CLI tools |
## Scope and Ground Truth
Critical source-of-truth rule:
- `opengait/demo` is an implementation layer and may contain project-specific behavior.
- When asked to “refer to the paper” or verify methodology, use the paper and official citations as ground truth.
- Do not treat demo/runtime behavior as proof of paper method unless explicitly cited by the paper.
## CODE MAP
| Symbol / Module | Type | Location | Refs | Role |
|-----------------|------|----------|------|------|
| `config_loader` | function | `opengait/utils/common.py` | high | YAML merge + default overlay |
| `get_ddp_module` | function | `opengait/utils/common.py` | high | wraps modules with DDP passthrough |
| `BaseModel` | class | `opengait/modeling/base_model.py` | high | canonical train/test lifecycle |
| `LossAggregator` | class | `opengait/modeling/loss_aggregator.py` | medium | consumes `training_feat` contract |
| `DataSet` | class | `opengait/data/dataset.py` | high | dataset partition + sequence loading |
| `CollateFn` | class | `opengait/data/collate_fn.py` | high | fixed/unfixed/all sampling policy |
| `evaluate_*` funcs | functions | `opengait/evaluation/evaluator.py` | medium | metric/report orchestration |
| `models` package registry | dynamic module | `opengait/modeling/models/__init__.py` | high | config string → model class |
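The fixed/unfixed/all sampling policy noted for `CollateFn` amounts to per-sequence frame-index selection. A minimal sketch of those semantics, not the actual `opengait/data/collate_fn.py` code; `sample_frames` is a hypothetical helper:

```python
import random

def sample_frames(seq_len, sample_type, frames_num=30):
    """Illustrative frame-index selection for one sequence."""
    indices = list(range(seq_len))
    if sample_type == "all":
        return indices                       # every frame, original order
    if sample_type == "fixed":
        # fixed-length clip; sample with replacement when the clip is short
        if seq_len >= frames_num:
            return sorted(random.sample(indices, frames_num))
        return sorted(random.choices(indices, k=frames_num))
    if sample_type == "unfixed":
        # variable-length subset: modeled here as a random contiguous window
        k = random.randint(1, seq_len)
        start = random.randint(0, seq_len - k)
        return indices[start:start + k]
    raise ValueError(f"unknown sample_type: {sample_type!r}")
```

In the real collate function the mode comes from the sampler config (`fixed_unordered`, `all_ordered`, etc.); check `opengait/data/collate_fn.py` for the exact names and behavior.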
## CONVENTIONS
- Launch pattern is DDP-first (`python -m torch.distributed.launch ... opengait/main.py --cfgs ... --phase ...`).
- DDP Constraints: `world_size` must equal number of visible GPUs; test `evaluator_cfg.sampler.batch_size` must equal `world_size`.
- Model/loss/backbone discoverability is filesystem-driven via package-level dynamic imports.
- Experiment config semantics: custom YAML overlays `configs/default.yaml` (local key precedence).
- Outputs are keyed by config identity: `output/${dataset_name}/${model}/${save_name}`.
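The overlay semantics above (custom YAML keys take local precedence over `configs/default.yaml`) amount to a recursive dict merge. A sketch under that assumption, not the actual `config_loader` implementation:

```python
def overlay(default: dict, custom: dict) -> dict:
    """Recursively merge custom over default; custom keys win."""
    merged = dict(default)
    for key, value in custom.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = overlay(merged[key], value)  # descend into nested sections
        else:
            merged[key] = value                        # local key takes precedence
    return merged

base = {"trainer_cfg": {"log_iter": 100, "save_iter": 10000}}
exp = {"trainer_cfg": {"log_iter": 50}}
cfg = overlay(base, exp)
# cfg["trainer_cfg"] == {"log_iter": 50, "save_iter": 10000}
```

See `opengait/utils/common.py::config_loader` for the authoritative behavior.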
## ANTI-PATTERNS (THIS PROJECT)
- Do not feed non-`.pkl` sequence files into runtime loaders (`opengait/data/dataset.py`).
- Do not violate sampler shape assumptions (`trainer_cfg.sampler.batch_size` is `[P, K]` for triplet regimes).
- Do not ignore DDP cleanup guidance; abnormal exits can leave zombie processes (`misc/clean_process.sh`).
- Do not add unregistered model/loss classes outside expected directories (`opengait/modeling/models`, `opengait/modeling/losses`).
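The `[P, K]` sampler shape above means P identities per batch with K sequences per identity, so the effective batch size is P × K. A sketch of the resulting batch composition (`make_triplet_batch` and the label pool are illustrative, not repo code):

```python
import random

def make_triplet_batch(labels_pool, p, k):
    """Pick p identities, then k sequence slots per identity -> p*k samples."""
    ids = random.sample(sorted(set(labels_pool)), p)
    return [(pid, slot) for pid in ids for slot in range(k)]

# e.g. trainer_cfg.sampler.batch_size: [8, 16] -> 128 sequences per step
batch = make_triplet_batch(labels_pool=list(range(100)), p=8, k=16)
```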
## UNIQUE STYLES
- `datasets/` is intentionally script-heavy (rearrange/extract/pretreat), not a pure library package.
- Research model zoo is broad; many model files co-exist as first-class references.
- Recent repo trajectory includes scoliosis screening models (ScoNet lineage), not only person-ID gait benchmarks.
## COMMANDS
```bash
# install (uv)
uv sync --extra torch
# preprocess (generic)
uv run python datasets/pretreatment.py --input_path <raw_or_rearranged> --output_path <pkl_root>
```
## NOTES
- The LSP symbol map relies on the `basedpyright` dev dependency; the `basedpyright` and `basedpyright-langserver` executables are available in `.venv` after `uv sync`.
- `train.sh` / `test.sh` are canonical launch examples across datasets/models.
- Academic-use-only restriction is stated in repository README.
## Environment Setup
Notes from `pyproject.toml`:
- Python requirement: `>=3.10`
- Dev tooling includes `pytest` and `basedpyright`
- Optional extras include `torch` and `parquet`
## Build / Run Commands
Train (DDP):
```bash
CUDA_VISIBLE_DEVICES=0,1 uv run python -m torch.distributed.launch \
--nproc_per_node=2 opengait/main.py \
--cfgs ./configs/baseline/baseline.yaml --phase train
```
Test (DDP):
```bash
CUDA_VISIBLE_DEVICES=0,1 uv run python -m torch.distributed.launch \
--nproc_per_node=2 opengait/main.py \
--cfgs ./configs/baseline/baseline.yaml --phase test
```
Single-GPU eval example:
```bash
CUDA_VISIBLE_DEVICES=0 uv run python -m torch.distributed.launch \
--nproc_per_node=1 opengait/main.py \
--cfgs ./configs/sconet/sconet_scoliosis1k_local_eval_1gpu.yaml --phase test
```
Demo CLI entry:
```bash
uv run python -m opengait.demo --help
```
## DDP Constraints (Important)
- `--nproc_per_node` must match visible GPU count in `CUDA_VISIBLE_DEVICES`.
- Evaluator sampling is strict: test `evaluator_cfg.sampler.batch_size` must equal the world size, or evaluation fails.
- If an interrupted DDP run leaves stale processes:
```bash
sh misc/clean_process.sh
```
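The first constraint can be sanity-checked before launching; a minimal sketch (helper names are illustrative, not part of the repo — the only assumption is the comma-separated `CUDA_VISIBLE_DEVICES` format):

```python
import os

def visible_gpu_count(env=os.environ):
    """Count devices listed in CUDA_VISIBLE_DEVICES (comma-separated IDs)."""
    devices = env.get("CUDA_VISIBLE_DEVICES", "")
    return len([d for d in devices.split(",") if d.strip() != ""])

def check_ddp_launch(nproc_per_node, env=os.environ):
    """Fail fast when --nproc_per_node disagrees with the visible GPU count."""
    n_gpus = visible_gpu_count(env)
    if nproc_per_node != n_gpus:
        raise SystemExit(
            f"--nproc_per_node={nproc_per_node} but {n_gpus} GPU(s) visible in "
            f"CUDA_VISIBLE_DEVICES={env.get('CUDA_VISIBLE_DEVICES', '')!r}"
        )

check_ddp_launch(2, {"CUDA_VISIBLE_DEVICES": "0,1"})  # passes silently
```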
## Test Commands (especially single test)
Run all tests:
```bash
uv run pytest tests
```
Run one file:
```bash
uv run pytest tests/demo/test_pipeline.py -v
```
Run one test function:
```bash
uv run pytest tests/demo/test_pipeline.py::test_resolve_stride_modes -v
```
Run by keyword:
```bash
uv run pytest tests/demo/test_window.py -k "stride" -v
```
## Lint / Typecheck
Typecheck with basedpyright:
```bash
uv run basedpyright opengait tests
```
Project currently has no enforced formatter config in root tool files.
Follow existing local formatting and keep edits minimal.
## High-Value Paths
- `opengait/main.py` — runtime bootstrap
- `opengait/modeling/base_model.py` — model lifecycle contract
- `opengait/modeling/models/` — model zoo implementations
- `opengait/data/dataset.py` — dataset loading rules
- `opengait/data/collate_fn.py` — frame sampling behavior
- `opengait/evaluation/evaluator.py` — evaluation dispatch
- `configs/` — experiment definitions
- `datasets/` — preprocessing and partitions
## Code Style Guidelines
### Imports
- Keep ordering consistent: stdlib, third-party, local.
- Prefer explicit imports; avoid wildcard imports.
- Avoid introducing heavy imports in hot paths unless needed.
### Formatting
- Match surrounding file style (spacing, wrapping, structure).
- Avoid unrelated formatting churn.
- Keep diffs surgical.
### Types
- Add type annotations for new public APIs and non-trivial helpers.
- Reuse established typing style: `typing`, `numpy.typing`, `jaxtyping` where already used.
- Do not suppress type safety with blanket casts; keep unavoidable casts narrow.
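For example, a small annotated helper in the `numpy.typing` style mentioned above (the function itself is purely illustrative, not repo code):

```python
import numpy as np
import numpy.typing as npt

def normalize_silhouette(frame: npt.NDArray[np.uint8]) -> npt.NDArray[np.float32]:
    """Scale a uint8 silhouette frame into [0, 1] float32."""
    return frame.astype(np.float32) / np.float32(255.0)
```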
### Naming
- `snake_case` for functions/variables
- `PascalCase` for classes
- `UPPER_SNAKE_CASE` for constants
- Preserve existing config key names and schema conventions
### Error Handling
- Raise explicit, actionable errors on invalid inputs.
- Fail fast for missing files, bad args, invalid shapes, and runtime preconditions.
- Never swallow exceptions silently.
- Preserve CLI error semantics (clear messages, non-zero exits).
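A sketch of the fail-fast style for a hypothetical loader helper (names and messages are illustrative):

```python
from pathlib import Path

def load_sequence(path: str):
    """Validate inputs with actionable messages before any expensive work."""
    p = Path(path)
    if p.suffix != ".pkl":
        raise ValueError(f"expected a .pkl sequence file, got: {p.name}")
    if not p.exists():
        raise FileNotFoundError(
            f"sequence file not found: {p} (did you run datasets/pretreatment.py?)"
        )
    ...  # actual loading would follow
```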
### Logging
- Use module-level logger pattern already in codebase.
- Keep logs concise and operational.
- Avoid excessive per-frame logging in realtime/demo loops.
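The module-level logger pattern, with per-frame logging throttled (the throttle interval and loop body are illustrative choices):

```python
import logging

logger = logging.getLogger(__name__)  # module-level logger; app configures handlers

def process_stream(frames):
    for i, frame in enumerate(frames):
        ...  # per-frame work
        if i % 100 == 0:  # throttle: never log every frame in realtime loops
            logger.info("processed %d frames", i)
```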
## Model and Config Contracts
- New models should conform to `BaseModel` expectations.
- Respect forward output dictionary contract used by loss/evaluator pipeline.
- Keep model registration/discovery patterns consistent with current package layout.
- Respect sampler semantics from config (`fixed_unordered`, `all_ordered`, etc.).
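A sketch of the forward-output dictionary shape the loss/evaluator pipeline consumes. The exact keys below (`training_feat`, `inference_feat`, per-loss sub-dicts) are an assumption based on the `LossAggregator` note in the code map; verify against `opengait/modeling/base_model.py` before relying on them:

```python
def forward_contract_ok(retval: dict) -> bool:
    """Check the assumed output dict: per-loss groups under 'training_feat',
    embeddings under 'inference_feat'."""
    if not {"training_feat", "inference_feat"}.issubset(retval):
        return False
    # each training_feat entry is keyed by loss name and holds that loss's inputs
    return all(isinstance(v, dict) for v in retval["training_feat"].values())

retval = {
    "training_feat": {"triplet": {"embeddings": None, "labels": None}},  # tensors in practice
    "inference_feat": {"embeddings": None},
}
```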
## Data Contracts
- Runtime data expects preprocessed `.pkl` sequence files.
- Partition JSON files are required for train/test split behavior.
- Do not mix modalities accidentally (silhouette / pose / pointcloud) across pipelines.
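The two on-disk contracts can be sketched with synthetic stand-ins (the (frames, height, width) silhouette shape and the partition key names are illustrative assumptions — check your dataset's actual files):

```python
import json
import pickle
import tempfile
from pathlib import Path

tmp = Path(tempfile.mkdtemp())

# 1) a runtime sequence is a pickled array; here a nested-list stand-in
#    shaped like (frames x height x width) silhouettes
seq = [[[0] * 44 for _ in range(64)] for _ in range(30)]
(tmp / "00.pkl").write_bytes(pickle.dumps(seq))

# 2) a partition file maps split names to subject IDs
partition = {"TRAIN_SET": ["001", "002"], "TEST_SET": ["003"]}
(tmp / "partition.json").write_text(json.dumps(partition))

loaded = pickle.loads((tmp / "00.pkl").read_bytes())
```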
## Research-Verification Policy
When answering methodology questions:
- Prefer primary sources (paper PDF, official project docs, official code tied to publication).
- Quote/cite paper statements when concluding method behavior.
- If local implementation differs from paper, state divergence explicitly.
- For this repo specifically, remember: `opengait/demo` may differ from paper intent.
## Cursor / Copilot Rules Check
Checked these paths:
- `.cursor/rules/`
- `.cursorrules`
- `.github/copilot-instructions.md`
Current status: no Cursor/Copilot instruction files found.
## Agent Checklist Before Finishing
- Commands executed with `uv run ...` where applicable
- Targeted tests for changed files pass
- Typecheck is clean for modified code
- Behavior/documentation updated together for user-facing changes
- Paper-vs-implementation claims clearly separated when relevant