# Real-Time Scoliosis Screening Pipeline (ScoNet)

## TL;DR

> **Quick Summary**: Build a production-oriented, real-time scoliosis screening pipeline inside the OpenGait repo. Reads video from cv-mmap shared memory or OpenCV, runs YOLO11n-seg for person detection/segmentation/tracking, extracts 64×44 binary silhouettes, feeds a sliding window of 30 frames into a DDP-free ScoNet wrapper, and publishes classification results (positive/neutral/negative) as JSON over NATS.
>
> **Deliverables**:
> - `ScoNetDemo` — standalone `nn.Module` wrapper for ScoNet inference (no DDP)
> - Silhouette preprocessing module — deterministic mask→64×44 float tensor pipeline
> - Ring buffer / sliding window manager — per-track frame accumulation with reset logic
> - Input adapters — cv-mmap async client + OpenCV VideoCapture fallback
> - NATS publisher — JSON result output
> - Main pipeline application — orchestrates all components
> - pytest test suite — preprocessing, windowing, single-person policy, recovery
> - Sample video for smoke testing
>
> **Estimated Effort**: Large
> **Parallel Execution**: YES — 4 waves
> **Critical Path**: Task 3 (Silhouette Preprocess) → Task 5 (Ring Buffer) → Task 9 (Pipeline App) → Task 12 (Integration Tests)

---

## Context

### Original Request
Build a camera-to-classification pipeline for scoliosis screening using OpenGait's ScoNet model, targeting Jetson AGX Orin deployment, with cv-mmap shared-memory video framework integration.

### Interview Summary
**Key Discussions**:
- **Input**: Both camera (via cv-mmap) and video files (OpenCV fallback). Single person only.
- **CV Stack**: YOLO11n-seg with built-in ByteTrack (replaces paper's BYTETracker + PP-HumanSeg)
- **Inference**: Sliding window of 30 frames, continuous classification
- **Output**: JSON over NATS (decided over binary protocol — simpler, cross-language)
- **DDP Bypass**: Create `ScoNetDemo(nn.Module)` following All-in-One-Gait's `BaselineDemo` pattern
- **Build Location**: Inside repo (opengait lacks `__init__.py`, config system hardcodes paths)
- **Test Strategy**: pytest, tests after implementation
- **Hardware**: Dev on i7-14700KF + RTX 5070 Ti/3090; deploy on Jetson AGX Orin

**Research Findings**:
- ScoNet input: `[N, 1, S, 64, 44]` float32 [0,1]. Output: `logits [N, 3, 16]` → `argmax(mean(-1))` → class index
- `.pkl` preprocessing: grayscale → crop vertical → resize h=64 → center-crop w=64 → cut 10px sides → /255.0
- `BaseSilCuttingTransform`: cuts `int(W // 64) * 10` px each side + divides by 255
- All-in-One-Gait `BaselineDemo`: extends `nn.Module`, uses `torch.load()` + `load_state_dict()`, `training=False`
- YOLO11n-seg: 6MB, ~50-60 FPS, `model.track(frame, persist=True)` → bbox + mask + track_id
- cv-mmap Python client: `async for im, meta in CvMmapClient("name")` — zero-copy numpy
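
The `logits [N, 3, 16]` → `argmax(mean(-1))` rule above can be sketched in plain Python for a single sample. This is a minimal illustration, not the repo's code: the helper name `predict_from_logits` is hypothetical, the label order follows the class mapping stated later in this plan, and nested lists stand in for a tensor so the arithmetic is visible.

```python
import math

# Label order taken from the class mapping stated in this plan.
LABELS = {0: "negative", 1: "neutral", 2: "positive"}


def predict_from_logits(logits):
    """Classify one sample from logits shaped [3 classes][16 parts]."""
    # Mean over the part axis gives one score per class.
    class_scores = [sum(parts) / len(parts) for parts in logits]
    # Numerically stable softmax over the class axis.
    peak = max(class_scores)
    exps = [math.exp(score - peak) for score in class_scores]
    total = sum(exps)
    probs = [value / total for value in exps]
    # Argmax over classes; the matching probability is the confidence.
    index = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[index], probs[index]


# Synthetic logits: class index 2 scores highest in every part.
example = [[0.0] * 16, [1.0] * 16, [3.0] * 16]
label, confidence = predict_from_logits(example)
```

Averaging over the 16 parts before the softmax means the confidence reflects agreement across body parts rather than the single strongest part.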

### Metis Review
**Identified Gaps** (addressed):
- **Single-person policy undefined** → Defined: largest-bbox selection, ignore others, reset window on ID change
- **Sliding window stride undefined** → Defined: stride=1 (every frame updates buffer), classify every N frames (configurable)
- **No-detection / empty mask handling** → Defined: skip frame, don't reset window unless gap exceeds threshold
- **Mask quality / partial body** → Defined: minimum mask area threshold to accept frame
- **Track ID reset / re-identification** → Defined: reset ring buffer on track ID change
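- **YOLO letterboxing** → Defined: use masks in original frame coords, not letterboxed. Note that `result.masks.data` is at inference (letterboxed) resolution unless `retina_masks=True` is passed, so either pass that flag or rescale the mask before cropping it with `result.boxes.xyxy`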
- **Async/sync impedance** → Defined: synchronous pull-process-publish loop (no async queues in MVP)
- **Scope creep lockdown** → Explicitly excluded: TensorRT, DeepStream, multi-person, GUI, recording, auto-tuning
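
The window rules above (stride 1, classify every N frames, tolerate short gaps, reset on track ID change) can be sketched as a small bounded buffer. This is a hedged illustration only: the class name `SilhouetteWindow` and the parameters `classify_every` and `max_gap` are assumptions, since the plan fixes the behavior but not the names or default values.

```python
from collections import deque


class SilhouetteWindow:
    """Bounded per-track frame buffer (hypothetical names and defaults)."""

    def __init__(self, window=30, classify_every=5, max_gap=15):
        self.window = window
        self.classify_every = classify_every
        self.max_gap = max_gap
        self.frames = deque(maxlen=window)  # oldest frame is dropped when full
        self.track_id = None
        self.gap = 0
        self.since_classify = 0

    def reset(self):
        self.frames.clear()
        self.gap = 0
        self.since_classify = 0

    def push(self, track_id, silhouette):
        """Add one frame; return True when a classification should run."""
        if track_id is None or silhouette is None:
            # Missed detection: skip the frame, reset only after a long gap.
            self.gap += 1
            if self.gap > self.max_gap:
                self.reset()
                self.track_id = None
            return False
        if track_id != self.track_id:
            # A different person is being tracked: start a fresh window.
            self.reset()
            self.track_id = track_id
        self.gap = 0
        self.frames.append(silhouette)
        self.since_classify += 1
        ready = len(self.frames) == self.window
        if ready and self.since_classify >= self.classify_every:
            self.since_classify = 0
            return True
        return False


# Tiny walkthrough with a 3-frame window that classifies every 2 frames.
demo = SilhouetteWindow(window=3, classify_every=2, max_gap=2)
decisions = [demo.push(1, name) for name in "abcde"]
```

Using `deque(maxlen=...)` keeps the buffer bounded by construction, which matches the "no unbounded buffers" guardrail without extra bookkeeping.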

---

## Work Objectives

### Core Objective
Create a self-contained scoliosis screening pipeline that runs standalone (no DDP, no OpenGait training infrastructure) while reusing ScoNet's trained weights and matching its exact preprocessing contract.

### Prerequisites (already present in repo)
- **Checkpoint**: `./ckpt/ScoNet-20000.pt` — trained ScoNet weights (verified working, 80.88% accuracy on Scoliosis1K eval). Already exists in the repo, no download needed.
- **Config**: `./configs/sconet/sconet_scoliosis1k.yaml` — ScoNet architecture config. Already exists.

### Concrete Deliverables
- `opengait/demo/sconet_demo.py` — ScoNetDemo nn.Module wrapper
- `opengait/demo/preprocess.py` — Silhouette extraction and normalization
- `opengait/demo/window.py` — Sliding window / ring buffer manager
- `opengait/demo/input.py` — Input adapters (cv-mmap + OpenCV)
- `opengait/demo/output.py` — NATS JSON publisher
- `opengait/demo/pipeline.py` — Main pipeline orchestrator
- `opengait/demo/__main__.py` — CLI entry point
- `tests/demo/test_preprocess.py` — Preprocessing unit tests
- `tests/demo/test_window.py` — Ring buffer + single-person policy tests
- `tests/demo/test_pipeline.py` — Integration / smoke tests

### Definition of Done
- [x] `uv run python -m opengait.demo --source ./assets/sample.mp4 --checkpoint ./ckpt/ScoNet-20000.pt --max-frames 120` exits 0 and prints predictions (no NATS by default when `--nats-url` not provided)
- [x] `uv run pytest tests/demo/ -q` passes all tests
- [x] Pipeline processes ≥15 FPS on desktop GPU with 720p input
- [x] JSON schema validated: `{"frame": int, "track_id": int, "label": str, "confidence": float, "window": int, "timestamp_ns": int}`
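
One way to check a published message against the schema above is a small field-and-type validator. This is a sketch under assumptions: the function name `validate_result` is hypothetical, and it uses only the standard library rather than a schema package.

```python
import json

# Field names and types copied from the JSON schema in the Definition of Done.
RESULT_FIELDS = {
    "frame": int,
    "track_id": int,
    "label": str,
    "confidence": float,
    "window": int,
    "timestamp_ns": int,
}


def validate_result(payload):
    """Parse a JSON payload and check it has exactly the expected fields and types."""
    message = json.loads(payload)
    if set(message) != set(RESULT_FIELDS):
        raise ValueError(f"unexpected fields: {sorted(message)}")
    for name, expected in RESULT_FIELDS.items():
        value = message[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError(f"{name} should be {expected.__name__}")
    return message


sample = json.dumps({
    "frame": 42,
    "track_id": 1,
    "label": "neutral",
    "confidence": 0.73,
    "window": 30,
    "timestamp_ns": 1700000000000000000,
}).encode()
checked = validate_result(sample)
```

The same function can back both the NATS subscriber check and the console-output fallback, since both carry the same JSON body.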

### Must Have
- Deterministic preprocessing matching ScoNet training data exactly (64×44, float32, [0,1])
- Single-person selection (largest bbox) with consistent tracking
- Sliding window of 30 frames with reset on track loss/ID change
- Graceful handling of: no detection, end of video, cv-mmap disconnect
- CLI with `--source`, `--checkpoint`, `--device`, `--window`, `--stride`, `--nats-url`, `--max-frames` flags (using `click`)
- Works without NATS server when `--nats-url` is omitted (console output fallback)
- All tensor/array function signatures annotated with `jaxtyping` types (e.g., `Float[Tensor, 'batch 1 seq 64 44']`) and checked at runtime with `beartype` via `@jaxtyped(typechecker=beartype)` decorators
- Generator-based input adapters — any `Iterable[tuple[np.ndarray, dict]]` works as a source
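
The single-person rule in this list (largest bbox wins, everyone else is ignored) reduces to an area comparison. The sketch below assumes detections arrive as `(track_id, (x1, y1, x2, y2))` tuples; that layout and the name `select_largest` are illustrative, not the Ultralytics result format.

```python
def select_largest(detections):
    """Return the detection with the largest box area, or None if there are none.

    Each detection is (track_id, (x1, y1, x2, y2)) in pixel coordinates.
    """
    def area(detection):
        _, (x1, y1, x2, y2) = detection
        # Clamp at zero so a malformed box never wins with a negative product.
        return max(0, x2 - x1) * max(0, y2 - y1)

    return max(detections, key=area, default=None)


people = [
    (7, (10, 10, 60, 200)),     # 50 x 190
    (3, (100, 20, 260, 420)),   # 160 x 400, the largest
    (9, (0, 0, 20, 20)),        # 20 x 20
]
chosen = select_largest(people)
```

The returned track ID is what the window manager would compare against its current ID to decide whether to reset.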

### Must NOT Have (Guardrails)
- **No DDP**: Demo must never import or call `torch.distributed` anything
- **No BaseModel subclassing**: ScoNetDemo extends `nn.Module` directly
- **No repo restructuring**: Don't touch existing opengait training/eval/data code
- **No TensorRT/DeepStream**: Jetson acceleration is out of MVP scope
- **No multi-person**: Single tracked person only
- **No GUI/visualization**: Output is JSON, not rendered frames
- **No dataset recording/auto-labeling**: This is inference only
- **No OpenCV GStreamer builds**: Use pip-installed OpenCV
- **No magic preprocessing**: Every transform step must be explicit and testable
- **No unbounded buffers**: Every queue/buffer has a max size and drop policy

---

## Verification Strategy

> **ZERO HUMAN INTERVENTION** — ALL verification is agent-executed. No exceptions.

### Test Decision
- **Infrastructure exists**: NO (creating with this plan)
- **Automated tests**: Tests after implementation (pytest)
- **Framework**: pytest (via `uv run pytest`)
- **Setup**: Add pytest to dev dependencies in pyproject.toml

### QA Policy
Every task MUST include agent-executed QA scenarios.
Evidence saved to `.sisyphus/evidence/task-{N}-{scenario-slug}.{ext}`.

- **CLI/Pipeline**: Use Bash — run pipeline with sample video, validate output
- **Unit Tests**: Use Bash — `uv run pytest` specific test files
- **NATS Integration**: Use Bash — start NATS container, run pipeline, subscribe and validate JSON

---

## Execution Strategy

### Parallel Execution Waves

```
Wave 1 (Foundation — all independent, start immediately):
├── Task 1: Project scaffolding (opengait/demo/, tests/demo/, deps) [quick]
├── Task 2: ScoNetDemo nn.Module wrapper [deep]
├── Task 3: Silhouette preprocessing module [deep]
└── Task 4: Input adapters (cv-mmap + OpenCV) [unspecified-high]

Wave 2 (Core logic — depends on Wave 1 foundations):
├── Task 5: Ring buffer / sliding window manager (depends: 3) [unspecified-high]
├── Task 6: NATS JSON publisher (depends: 1) [quick]
├── Task 7: Unit tests — preprocessing (depends: 3) [unspecified-high]
└── Task 8: Unit tests — ScoNetDemo forward pass (depends: 2) [unspecified-high]

Wave 3 (Integration — combines all components):
├── Task 9: Main pipeline application + CLI (depends: 2,3,4,5,6) [deep]
├── Task 10: Single-person policy tests (depends: 5) [unspecified-high]
└── Task 11: Sample video acquisition (depends: 1) [quick]

Wave 4 (Verification — end-to-end):
├── Task 12: Integration tests + smoke test (depends: 9,11) [deep]
└── Task 13: NATS integration test (depends: 9,6) [unspecified-high]

Wave FINAL (Independent review — 4 parallel):
├── Task F1: Plan compliance audit (oracle)
├── Task F2: Code quality review (unspecified-high)
├── Task F3: Real manual QA (unspecified-high)
└── Task F4: Scope fidelity check (deep)

Critical Path: Task 3 → Task 5 → Task 9 → Task 12 → F1-F4
Parallel Speedup: ~60% faster than sequential
Max Concurrent: 4 (Waves 1 & 2)
```

### Dependency Matrix

| Task | Depends On | Blocks | Wave |
|------|-----------|--------|------|
| 1 | — | 6, 11 | 1 |
| 2 | — | 8, 9 | 1 |
| 3 | — | 5, 7, 9 | 1 |
| 4 | — | 9 | 1 |
| 5 | 3 | 9, 10 | 2 |
| 6 | 1 | 9, 13 | 2 |
| 7 | 3 | — | 2 |
| 8 | 2 | — | 2 |
| 9 | 2, 3, 4, 5, 6 | 12, 13 | 3 |
| 10 | 5 | — | 3 |
| 11 | 1 | 12 | 3 |
| 12 | 9, 11 | F1-F4 | 4 |
| 13 | 9, 6 | F1-F4 | 4 |
| F1-F4 | 12, 13 | — | FINAL |
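
The matrix can be checked mechanically: every task should sit in a later wave than each of its dependencies, and the longest dependency chain gives a critical path. The sketch below encodes the numbered rows of the table (the F1-F4 review row is left out); the helper names `wave_violations` and `longest_chain` are hypothetical.

```python
# "Depends On" and "Wave" columns of the matrix, numbered tasks only.
DEPENDS_ON = {
    1: [], 2: [], 3: [], 4: [],
    5: [3], 6: [1], 7: [3], 8: [2],
    9: [2, 3, 4, 5, 6], 10: [5], 11: [1],
    12: [9, 11], 13: [9, 6],
}
WAVE = {1: 1, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2, 7: 2, 8: 2,
        9: 3, 10: 3, 11: 3, 12: 4, 13: 4}


def wave_violations(depends_on, wave):
    """List (task, dependency) pairs where the dependency is not in an earlier wave."""
    return [
        (task, dep)
        for task, deps in depends_on.items()
        for dep in deps
        if wave[dep] >= wave[task]
    ]


def longest_chain(depends_on):
    """Return one longest dependency chain, ordered from first task to last."""
    memo = {}

    def chain(task):
        if task not in memo:
            # Longest chain among the dependencies, then this task on the end.
            best = max((chain(dep) for dep in depends_on[task]), key=len, default=[])
            memo[task] = best + [task]
        return memo[task]

    return max((chain(task) for task in depends_on), key=len)


violations = wave_violations(DEPENDS_ON, WAVE)
critical = longest_chain(DEPENDS_ON)
```

Several chains tie at four tasks (for example through Task 6 instead of Task 5), so the function returns one of them rather than the only one.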

### Agent Dispatch Summary

- **Wave 1**: **4** — T1 → `quick`, T2 → `deep`, T3 → `deep`, T4 → `unspecified-high`
- **Wave 2**: **4** — T5 → `unspecified-high`, T6 → `quick`, T7 → `unspecified-high`, T8 → `unspecified-high`
- **Wave 3**: **3** — T9 → `deep`, T10 → `unspecified-high`, T11 → `quick`
- **Wave 4**: **2** — T12 → `deep`, T13 → `unspecified-high`
- **FINAL**: **4** — F1 → `oracle`, F2 → `unspecified-high`, F3 → `unspecified-high`, F4 → `deep`

---

## TODOs

> Implementation + Test = ONE Task. Never separate.
> EVERY task MUST have: Recommended Agent Profile + Parallelization info + QA Scenarios.

---

- [x] 1. Project Scaffolding + Dependencies

**What to do**:
- Create `opengait/demo/__init__.py` (empty, makes it a package)
- Create `opengait/demo/__main__.py` (stub: `from .pipeline import main; main()`)
- Create `tests/demo/__init__.py` and `tests/__init__.py` if missing
- Create `tests/demo/conftest.py` with shared fixtures (sample tensor, mock frame)
- Add dev dependencies to `pyproject.toml`: `pytest`, `nats-py`, `ultralytics`, `jaxtyping`, `beartype`, `click`
- Verify: `uv sync --extra torch` succeeds with new deps
- Verify: `uv run python -c "import ultralytics; import nats; import jaxtyping; import beartype; import click"` works

**Must NOT do**:
- Don't modify existing opengait code or imports
- Don't add runtime deps that aren't needed (no flask, no fastapi, etc.)

**Recommended Agent Profile**:
- **Category**: `quick`
  - Reason: Boilerplate file creation and dependency management, no complex logic
- **Skills**: []
- **Skills Evaluated but Omitted**:
  - `explore`: Not needed — we know exactly what files to create

**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 1 (with Tasks 2, 3, 4)
- **Blocks**: Tasks 6, 11
- **Blocked By**: None (can start immediately)

**References**:

**Pattern References**:
- `opengait/modeling/models/__init__.py` — Example of package init in this repo
- `pyproject.toml` — Current dependency structure; add to `[project.optional-dependencies]` or `[dependency-groups]`

**External References**:
- ultralytics pip package: `pip install ultralytics` (includes YOLO + ByteTrack)
- nats-py: `pip install nats-py` (async NATS client)

**WHY Each Reference Matters**:
- `pyproject.toml`: Must match existing dep management style (uv + groups) to avoid breaking `uv sync`
- `opengait/modeling/models/__init__.py`: Shows the repo's package init convention (dynamic imports vs empty)

**Acceptance Criteria**:
- [x] `opengait/demo/__init__.py` exists
- [x] `opengait/demo/__main__.py` exists with stub entry point
- [x] `tests/demo/conftest.py` exists with at least one fixture
- [x] `uv sync` succeeds without errors
- [x] `uv run python -c "import ultralytics; import nats; import jaxtyping; import beartype; import click; print('OK')"` prints OK

**QA Scenarios:**

```
Scenario: Dependencies install correctly
  Tool: Bash
  Preconditions: Clean uv environment
  Steps:
    1. Run `uv sync --extra torch`
    2. Run `uv run python -c "import ultralytics; import nats; import pytest; import jaxtyping; import beartype; import click; print('ALL_DEPS_OK')"`
  Expected Result: Both commands exit 0, second prints 'ALL_DEPS_OK'
  Failure Indicators: ImportError, uv sync failure, missing package
  Evidence: .sisyphus/evidence/task-1-deps-install.txt

Scenario: Package structure is importable
  Tool: Bash
  Preconditions: uv sync completed
  Steps:
    1. Run `uv run python -c "from opengait.demo import __main__; print('IMPORT_OK')"`
  Expected Result: Prints 'IMPORT_OK' without errors
  Failure Indicators: ModuleNotFoundError, ImportError
  Evidence: .sisyphus/evidence/task-1-import-check.txt
```

**Commit**: YES
- Message: `chore(demo): scaffold demo package and test infrastructure`
- Files: `opengait/demo/__init__.py`, `opengait/demo/__main__.py`, `tests/demo/conftest.py`, `tests/demo/__init__.py`, `tests/__init__.py`, `pyproject.toml`
- Pre-commit: `uv sync && uv run python -c "import ultralytics; import nats; import jaxtyping; import beartype; import click"`

- [x] 2. ScoNetDemo — DDP-Free Inference Wrapper

**What to do**:
- Create `opengait/demo/sconet_demo.py`
- Class `ScoNetDemo(nn.Module)` — NOT a BaseModel subclass
- Constructor takes `cfg_path: str`, `checkpoint_path: str`, and `device: str` (used as `map_location` when loading the checkpoint)
- Use `config_loader` from `opengait/utils/common.py` to parse YAML config
- Build the ScoNet architecture layers directly:
  - `Backbone` (ResNet9 from `opengait/modeling/backbones/resnet.py`)
  - `TemporalPool` (from `opengait/modeling/modules.py`)
  - `HorizontalPoolingPyramid` (from `opengait/modeling/modules.py`)
  - `SeparateFCs` (from `opengait/modeling/modules.py`)
  - `SeparateBNNecks` (from `opengait/modeling/modules.py`)
- Load checkpoint: `torch.load(checkpoint_path, map_location=device)` → extract state_dict → `load_state_dict()`
- Handle checkpoint format: may be `{'model': state_dict, ...}` or plain state_dict
- Strip `module.` prefix from DDP-wrapped keys if present
- All public methods decorated with `@jaxtyped(typechecker=beartype)` for runtime shape checking
- `forward(sils: Float[Tensor, 'batch 1 seq 64 44']) -> dict` where seq=30 (window size)
  - Use jaxtyping: `from jaxtyping import Float, Int, jaxtyped`
  - Use beartype: `from beartype import beartype`
  - Returns `{'logits': Float[Tensor, 'batch 3 16'], 'label': int, 'confidence': float}`
- `predict(sils: Float[Tensor, 'batch 1 seq 64 44']) -> tuple[str, float]` convenience method: returns `('positive'|'neutral'|'negative', confidence)`
- Prediction logic: `argmax(logits.mean(dim=-1), dim=-1)` → index → label string
- Confidence: `softmax(logits.mean(dim=-1), dim=-1).max()` — probability of chosen class
- Class mapping: `{0: 'negative', 1: 'neutral', 2: 'positive'}`
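
The checkpoint handling described above (accept either a wrapped `{'model': state_dict, ...}` or a plain state dict, then strip the DDP `module.` prefix) can be sketched on ordinary dictionaries. The helper name `extract_state_dict` is hypothetical, and the input is assumed to be whatever `torch.load` returned; `torch` itself is left out so the key handling can be read and tested in isolation.

```python
def extract_state_dict(checkpoint):
    """Return a state dict with any leading 'module.' prefix removed from keys."""
    # Wrapped checkpoints keep the weights under the 'model' key.
    inner = checkpoint.get("model")
    state = inner if isinstance(inner, dict) else checkpoint
    prefix = "module."
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state.items()
    }


# Stand-in values; a real checkpoint would hold tensors here.
wrapped = {
    "model": {"module.Backbone.conv1.weight": 1, "module.FCs.fc_bin": 2},
    "iteration": 20000,
}
plain = {"Backbone.conv1.weight": 1, "FCs.fc_bin": 2}
cleaned = extract_state_dict(wrapped)
```

Only a leading prefix is removed, so a parameter whose name merely contains `module.` further along the key is left untouched.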

**Must NOT do**:
- Do NOT import anything from `torch.distributed`
- Do NOT subclass `BaseModel`
- Do NOT use `ddp_all_gather` or `get_ddp_module`
- Do NOT modify `sconet.py` or any existing model file

**Recommended Agent Profile**:
- **Category**: `deep`
  - Reason: Requires understanding ScoNet's architecture, checkpoint format, and careful weight loading — high complexity, correctness-critical
- **Skills**: []
- **Skills Evaluated but Omitted**:
  - `explore`: Agent should read referenced files directly, not search broadly

**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 1 (with Tasks 1, 3, 4)
- **Blocks**: Tasks 8, 9
- **Blocked By**: None (can start immediately)

**References**:

**Pattern References**:
- `opengait/modeling/models/sconet.py` — ScoNet model definition. Study `__init__` to see which submodules are built and how `forward()` assembles the pipeline. Lines ~10-54.
- `opengait/modeling/base_model.py` — BaseModel class. Study `__init__` (lines ~30-80) to see how it builds backbone/heads from config. Replicate the build logic WITHOUT DDP calls.
- All-in-One-Gait `BaselineDemo` pattern: extends `nn.Module` directly, uses `torch.load()` + `load_state_dict()` with `training=False`

**API/Type References**:
- `opengait/modeling/backbones/resnet.py` — ResNet9 backbone class. Constructor signature and forward signature.
- `opengait/modeling/modules.py` — `TemporalPool`, `HorizontalPoolingPyramid`, `SeparateFCs`, `SeparateBNNecks` classes. Constructor args come from config YAML.
- `opengait/utils/common.py::config_loader` — Loads YAML config, merges with default.yaml. Returns dict.

**Config References**:
- `configs/sconet/sconet_scoliosis1k.yaml` — ScoNet config specifying backbone, head, loss params. The `model_cfg` section defines architecture hyperparams.
- `configs/default.yaml` — Default config merged by config_loader

**Checkpoint Reference**:
- `./ckpt/ScoNet-20000.pt` — Trained ScoNet checkpoint. Verify format: `torch.load()` and inspect keys.

**Inference Logic Reference**:
- `opengait/evaluation/evaluator.py:evaluate_scoliosis()` (line ~418) — Shows `argmax(logits.mean(-1))` prediction logic and label mapping

**WHY Each Reference Matters**:
- `sconet.py`: Defines the exact forward pass we must replicate — backbone → TP → HPP → FCs → BNNecks
- `base_model.py`: Shows how config is parsed into submodule instantiation — we copy this logic minus DDP
- `modules.py`: Constructor signatures tell us what config keys to extract
- `evaluator.py`: The prediction aggregation (mean over parts, argmax) is the canonical inference logic
- `sconet_scoliosis1k.yaml`: Contains the exact hyperparams (channels, num_parts, etc.) for building layers

**Acceptance Criteria**:
- [x] `opengait/demo/sconet_demo.py` exists with `ScoNetDemo(nn.Module)` class
- [x] No `torch.distributed` imports in the file
- [x] `ScoNetDemo` does not inherit from `BaseModel`
- [x] `uv run python -c "from opengait.demo.sconet_demo import ScoNetDemo; print('OK')"` works

**QA Scenarios:**

```
Scenario: ScoNetDemo loads checkpoint and produces correct output shape
  Tool: Bash
  Preconditions: ./ckpt/ScoNet-20000.pt exists, CUDA available
  Steps:
    1. Run `uv run python -c "`
       ```python
       import torch
       from opengait.demo.sconet_demo import ScoNetDemo
       model = ScoNetDemo('./configs/sconet/sconet_scoliosis1k.yaml', './ckpt/ScoNet-20000.pt', device='cuda:0')
       model.eval()
       dummy = torch.rand(1, 1, 30, 64, 44, device='cuda:0')
       with torch.no_grad():
           result = model(dummy)
       assert result['logits'].shape == (1, 3, 16), f'Bad shape: {result["logits"].shape}'
       label, conf = model.predict(dummy)
       assert label in ('negative', 'neutral', 'positive'), f'Bad label: {label}'
       assert 0.0 <= conf <= 1.0, f'Bad confidence: {conf}'
       print(f'SCONET_OK label={label} conf={conf:.3f}')
       ```
  Expected Result: Prints 'SCONET_OK label=... conf=...' with valid label and confidence
  Failure Indicators: ImportError, state_dict key mismatch, shape error, CUDA error
  Evidence: .sisyphus/evidence/task-2-sconet-forward.txt

Scenario: ScoNetDemo rejects DDP-wrapped usage
  Tool: Bash
  Preconditions: File exists
  Steps:
    1. Run `grep -c 'torch.distributed' opengait/demo/sconet_demo.py`
    2. Run `grep -c 'BaseModel' opengait/demo/sconet_demo.py`
  Expected Result: Both commands output '0'
  Failure Indicators: Any count > 0
  Evidence: .sisyphus/evidence/task-2-no-ddp.txt
```

**Commit**: YES
- Message: `feat(demo): add ScoNetDemo DDP-free inference wrapper`
- Files: `opengait/demo/sconet_demo.py`
- Pre-commit: `uv run python -c "from opengait.demo.sconet_demo import ScoNetDemo"`

- [x] 3. Silhouette Preprocessing Module

**What to do**:
- Create `opengait/demo/preprocess.py`
- All public functions decorated with `@jaxtyped(typechecker=beartype)` for runtime shape checking
- Function `mask_to_silhouette(mask: UInt8[ndarray, 'h w'], bbox: tuple[int,int,int,int]) -> Float[ndarray, '64 44'] | None`:
  - Uses jaxtyping: `from jaxtyping import Float, UInt8, jaxtyped` and `from numpy import ndarray`
  - Input: binary mask (H, W) uint8 from YOLO, bounding box (x1, y1, x2, y2)
  - Crop mask to bbox region
  - Find vertical extent of foreground pixels (top/bottom rows with nonzero)
  - Crop to tight vertical bounding box (remove empty rows above/below)
  - Resize height to 64, maintaining aspect ratio
  - Center-crop or center-pad width to 64
  - Cut 10px from each side → final 64×44
  - Return float32 array [0.0, 1.0] (divide by 255)
  - Return `None` if mask area below `MIN_MASK_AREA` threshold (default: 500 pixels)
- Function `frame_to_person_mask(result, min_area: int = 500) -> tuple[UInt8[ndarray, 'h w'], tuple[int,int,int,int]] | None`:
  - Extract single-person mask + bbox from YOLO result object
  - Uses `result.masks.data` and `result.boxes.xyxy`
  - Returns `None` if no valid detection
- Constants: `SIL_HEIGHT = 64`, `SIL_WIDTH = 44`, `SIL_FULL_WIDTH = 64`, `SIDE_CUT = 10`, `MIN_MASK_AREA = 500`
- Each step must match the preprocessing in `datasets/pretreatment.py` (grayscale → crop → resize → center) and `BaseSilCuttingTransform` (cut sides → /255)
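
The width arithmetic behind the 64×44 output can be worked through without any image library. The sketch below only computes the sizes and offsets for a tightly cropped mask of `crop_h` by `crop_w` pixels; the function name `plan_width` is hypothetical, the aspect ratio is assumed to be width divided by height, and centering follows this plan's description rather than a reading of `pretreatment.py`.

```python
SIL_HEIGHT = 64
SIL_FULL_WIDTH = 64
SIDE_CUT = 10


def plan_width(crop_h, crop_w):
    """Compute resize, crop, pad, and side-cut sizes for one cropped mask."""
    # Resize so the height becomes 64 while keeping the aspect ratio.
    resized_w = max(1, int(SIL_HEIGHT * crop_w / crop_h))
    # Too wide: drop an equal margin on each side. Too narrow: pad instead.
    crop_left = max(0, (resized_w - SIL_FULL_WIDTH) // 2)
    pad_left = max(0, (SIL_FULL_WIDTH - resized_w) // 2)
    pad_right = max(0, SIL_FULL_WIDTH - resized_w - pad_left)
    # Same formula as BaseSilCuttingTransform: 10 px per 64 px of width.
    cutting = int(SIL_FULL_WIDTH // 64) * SIDE_CUT
    return {
        "resized_w": resized_w,
        "crop_left": crop_left,
        "pad_left": pad_left,
        "pad_right": pad_right,
        "cutting": cutting,
        "final_w": SIL_FULL_WIDTH - 2 * cutting,
    }


# A 300 x 150 (height x width) person mask, like the synthetic QA mask.
tall = plan_width(300, 150)
# A mask that is wider than it is tall must be center-cropped.
wide = plan_width(100, 200)
```

For the tall example the resized width is 32, so 16 px of padding goes on each side before the 10 px cut leaves the final width of 44.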

**Must NOT do**:
- Don't import or modify `datasets/pretreatment.py`
- Don't add color/texture features — binary silhouettes only
- Don't resize to arbitrary sizes — must be exactly 64×44 output

**Recommended Agent Profile**:
- **Category**: `deep`
  - Reason: Correctness-critical — must exactly match training data preprocessing. Subtle pixel-level bugs will silently degrade accuracy.
- **Skills**: []

**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 1 (with Tasks 1, 2, 4)
- **Blocks**: Tasks 5, 7, 9
- **Blocked By**: None

**References**:

**Pattern References**:
- `datasets/pretreatment.py:18-96` (function `imgs2pickle`) — The canonical preprocessing pipeline. Study lines 45-80 carefully: `cv2.imread(GRAYSCALE)` → find contours → crop to person bbox → `cv2.resize(img, (int(64 * ratio), 64))` → center-crop width. This is the EXACT sequence to replicate for live masks.
- `opengait/data/transform.py:46-58` (`BaseSilCuttingTransform`) — The runtime transform applied during training/eval. `cutting = int(w // 64) * 10` then slices `[:, :, cutting:-cutting]` then divides by 255.0. For w=64 input, cutting=10, output width=44.

**API/Type References**:
- Ultralytics `Results` object: `result.masks.data` → `Tensor[N, H, W]` binary masks; `result.boxes.xyxy` → `Tensor[N, 4]` bounding boxes; `result.boxes.id` → track IDs (may be None)

**WHY Each Reference Matters**:
- `pretreatment.py`: Defines the ground truth preprocessing. If our live pipeline differs by even 1 pixel of padding, ScoNet accuracy degrades.
- `BaseSilCuttingTransform`: The 10px side cut + /255 is applied at runtime during training. We must apply the SAME transform.
- Ultralytics masks: Need to know exact API to extract binary masks from YOLO output

**Acceptance Criteria**:
- [x] `opengait/demo/preprocess.py` exists
- [x] `mask_to_silhouette()` returns `np.ndarray` of shape `(64, 44)` dtype `float32` with values in `[0, 1]`
- [x] Returns `None` for masks below MIN_MASK_AREA

**QA Scenarios:**

```
Scenario: Preprocessing produces correct output shape and range
  Tool: Bash
  Preconditions: Module importable
  Steps:
    1. Run `uv run python -c "`
       ```python
       import numpy as np
       from opengait.demo.preprocess import mask_to_silhouette
       # Create a synthetic mask: 300x150 (height x width) person-shaped blob
       mask = np.zeros((480, 640), dtype=np.uint8)
       mask[100:400, 250:400] = 255  # person region
       bbox = (250, 100, 400, 400)
       sil = mask_to_silhouette(mask, bbox)
       assert sil is not None, 'Should not be None for valid mask'
       assert sil.shape == (64, 44), f'Bad shape: {sil.shape}'
       assert sil.dtype == np.float32, f'Bad dtype: {sil.dtype}'
       assert 0.0 <= sil.min() and sil.max() <= 1.0, f'Bad range: [{sil.min()}, {sil.max()}]'
       assert sil.max() > 0, 'Should have nonzero pixels'
       print('PREPROCESS_OK')
       ```
  Expected Result: Prints 'PREPROCESS_OK'
  Failure Indicators: Shape mismatch, dtype error, range error
  Evidence: .sisyphus/evidence/task-3-preprocess-shape.txt

Scenario: Small masks are rejected
  Tool: Bash
  Preconditions: Module importable
  Steps:
    1. Run `uv run python -c "`
       ```python
       import numpy as np
       from opengait.demo.preprocess import mask_to_silhouette
       # Tiny mask: only 10x10 = 100 pixels (below MIN_MASK_AREA=500)
       mask = np.zeros((480, 640), dtype=np.uint8)
       mask[100:110, 100:110] = 255
       bbox = (100, 100, 110, 110)
       sil = mask_to_silhouette(mask, bbox)
       assert sil is None, f'Should be None for tiny mask, got {type(sil)}'
       print('SMALL_MASK_REJECTED_OK')
       ```
  Expected Result: Prints 'SMALL_MASK_REJECTED_OK'
  Failure Indicators: Returns non-None for tiny mask
  Evidence: .sisyphus/evidence/task-3-small-mask-reject.txt
```

**Commit**: YES
- Message: `feat(demo): add silhouette preprocessing module`
- Files: `opengait/demo/preprocess.py`
- Pre-commit: `uv run python -c "from opengait.demo.preprocess import mask_to_silhouette"`

- [x] 4. Input Adapters (cv-mmap + OpenCV)

**What to do**:
- Create `opengait/demo/input.py`
- The pipeline contract is simple: it consumes any `Iterable[tuple[np.ndarray, dict]]` — any generator or iterator that yields `(frame_bgr_uint8, metadata_dict)` works
- Type alias: `FrameStream = Iterable[tuple[np.ndarray, dict]]`
- Generator function `opencv_source(path: str | int, max_frames: int | None = None) -> Generator[tuple[np.ndarray, dict], None, None]`:
  - `path` can be video file path or camera index (int)
  - Opens `cv2.VideoCapture(path)`
  - Yields `(frame, {'frame_count': int, 'timestamp_ns': int})` tuples
  - Handles end-of-video gracefully (just returns)
  - Handles camera disconnect (log warning, return)
  - Respects `max_frames` limit
- Generator function `cvmmap_source(name: str, max_frames: int | None = None) -> Generator[tuple[np.ndarray, dict], None, None]`:
  - Wraps `CvMmapClient` from `/home/crosstyan/Code/cv-mmap/client/cvmmap/`
  - Since cv-mmap is async (anyio), this adapter must bridge async→sync:
    - Run anyio event loop in a background thread, drain frames via `queue.Queue`
    - Or use `anyio.from_thread` / `asyncio.run()` with `async for` internally
    - Choose simplest correct approach
  - Yields same `(frame, metadata_dict)` tuple format as opencv_source
  - Handles cv-mmap disconnect/offline events gracefully
  - Import should be conditional (try/except ImportError) so pipeline works without cv-mmap installed
- Factory function `create_source(source: str, max_frames: int | None = None) -> FrameStream`:
  - If source starts with `cvmmap://` → `cvmmap_source(name)`
  - If source is a digit string → `opencv_source(int(source))` (camera index)
  - Otherwise → `opencv_source(source)` (file path)
- The key design point: **any user-written generator that yields `(np.ndarray, dict)` plugs in directly** — no class inheritance needed
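
Two pieces of the adapter design above lend themselves to a short sketch: the `create_source` dispatch rule, and the background-thread bridge from an async iterator to a synchronous generator. Everything here is illustrative and uses only the standard library: `classify_source`, `iterate_sync`, and `fake_stream` are hypothetical names, `asyncio` stands in for anyio, and a fake async stream stands in for `CvMmapClient`. The queue is bounded and applies backpressure; a drop-oldest policy would be an alternative.

```python
import asyncio
import queue
import threading

_DONE = object()  # sentinel marking the end of the async stream


def classify_source(source):
    """Mirror the dispatch rule: cvmmap:// name, camera index, or file path."""
    if source.startswith("cvmmap://"):
        return "cvmmap", source[len("cvmmap://"):]
    if source.isdigit():
        return "camera", int(source)
    return "file", source


def iterate_sync(make_stream, max_buffer=2):
    """Drain an async iterator from synchronous code through a bounded queue."""
    frames = queue.Queue(maxsize=max_buffer)

    async def pump():
        loop = asyncio.get_running_loop()
        try:
            async for item in make_stream():
                # The blocking put runs in a worker thread, so a full queue
                # slows the producer down without freezing the event loop.
                await loop.run_in_executor(None, frames.put, item)
        finally:
            await loop.run_in_executor(None, frames.put, _DONE)

    # Daemon thread: an abandoned generator cannot keep the process alive.
    worker = threading.Thread(target=lambda: asyncio.run(pump()), daemon=True)
    worker.start()
    while True:
        item = frames.get()
        if item is _DONE:
            break
        yield item
    worker.join()


async def fake_stream():
    """Stand-in for an async frame client; yields (frame, metadata) pairs."""
    for index in range(5):
        await asyncio.sleep(0)
        yield f"frame-{index}", {"frame_count": index}


received = list(iterate_sync(fake_stream))
```

If the consumer stops iterating early, the pump stays blocked on the full queue until the process exits; a production adapter would likely add a stop event for clean shutdown.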

**Must NOT do**:
- Don't build GStreamer pipelines
- Don't add async to the main pipeline loop — keep synchronous pull model
- Don't use abstract base classes or heavy OOP — plain generator functions are the interface
- Don't buffer frames internally (no unbounded queue between source and consumer)

**Recommended Agent Profile**:
- **Category**: `unspecified-high`
  - Reason: Integration with external library (cv-mmap) requires careful async→sync bridging
- **Skills**: []

**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 1 (with Tasks 1, 2, 3)
- **Blocks**: Task 9
- **Blocked By**: None

**References**:

**Pattern References**:
- `/home/crosstyan/Code/cv-mmap/client/cvmmap/__init__.py` — `CvMmapClient` class. Async iterator: `async for im, meta in client`. Understand the `__aiter__`/`__anext__` protocol.
- `/home/crosstyan/Code/cv-mmap/client/test_cvmmap.py` — Example consumer pattern using `anyio.run()`
- `/home/crosstyan/Code/cv-mmap/client/cvmmap/msg.py` — `FrameMetadata` and `FrameInfo` dataclasses. Fields: `frame_count`, `timestamp_ns`, `info.width`, `info.height`, `info.pixel_format`

**API/Type References**:
- `cv2.VideoCapture` — OpenCV video capture. `cap.read()` returns `(bool, np.ndarray)`. `cap.get(cv2.CAP_PROP_FRAME_COUNT)` for total frames.

**WHY Each Reference Matters**:
- `CvMmapClient`: The async iterator yields `(numpy_array, FrameMetadata)` — need to know exact types for sync bridging
- `msg.py`: Metadata fields must be mapped to our generic `dict` metadata format
- `test_cvmmap.py`: Shows the canonical consumer pattern we must wrap

**Acceptance Criteria**:
- [x] `opengait/demo/input.py` exists with `opencv_source`, `cvmmap_source`, `create_source` as functions (not classes)
- [x] `create_source('./some/video.mp4')` returns a generator/iterable
- [x] `create_source('cvmmap://default')` returns a generator (or raises if cv-mmap not installed)
- [x] `create_source('0')` returns a generator for camera index 0
- [x] Any custom generator `def my_source(): yield (frame, meta)` can be used directly by the pipeline
|
||
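The dispatch implied by these criteria might look like the sketch below. `classify_source` is a hypothetical helper, and the rules (a `cvmmap://` prefix, an all-digit string as a camera index, anything else as a file path) are assumptions read off the examples above rather than the actual `create_source` implementation.

```python
from typing import Tuple, Union


def classify_source(spec: str) -> Tuple[str, Union[str, int]]:
    """Hypothetical helper: decide which adapter a source string selects."""
    prefix = "cvmmap://"
    if spec.startswith(prefix):
        return "cvmmap", spec[len(prefix):]  # shared-memory stream name
    if spec.isdigit():
        return "camera", int(spec)           # OpenCV camera index
    return "file", spec                      # anything else is treated as a path


for spec in ("./some/video.mp4", "cvmmap://default", "0"):
    print(spec, "->", classify_source(spec))
```

A real `create_source` would then call the matching generator function (`cvmmap_source` or `opencv_source`) with the parsed value.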

**QA Scenarios:**

```
Scenario: opencv_source reads frames from a video file
  Tool: Bash
  Preconditions: Any .mp4 or .avi file available (use sample video or create synthetic one)
  Steps:
    1. Create a short test video if none exists:
       `uv run python -c "import cv2; import numpy as np; w=cv2.VideoWriter('/tmp/test.avi', cv2.VideoWriter_fourcc(*'MJPG'), 15, (640,480)); [w.write(np.random.randint(0,255,(480,640,3),dtype=np.uint8)) for _ in range(30)]; w.release(); print('VIDEO_CREATED')"`
    2. Run `uv run python -c "`
       ```python
       from opengait.demo.input import create_source
       src = create_source('/tmp/test.avi', max_frames=10)
       count = 0
       for frame, meta in src:
           assert frame.shape[2] == 3, f'Not BGR: {frame.shape}'
           assert 'frame_count' in meta
           count += 1
       assert count == 10, f'Expected 10 frames, got {count}'
       print('OPENCV_SOURCE_OK')
       ```
  Expected Result: Prints 'OPENCV_SOURCE_OK'
  Failure Indicators: Shape error, missing metadata, wrong frame count
  Evidence: .sisyphus/evidence/task-2-opencv-source.txt

Scenario: Custom generator works as pipeline input
  Tool: Bash
  Steps:
    1. Run `uv run python -c "`
       ```python
       import numpy as np
       from opengait.demo.input import FrameStream
       import typing
       # Any generator works — no class needed
       def my_source():
           for i in range(5):
               yield np.zeros((480, 640, 3), dtype=np.uint8), {'frame_count': i}
       src = my_source()
       frames = list(src)
       assert len(frames) == 5
       print('CUSTOM_GENERATOR_OK')
       ```
  Expected Result: Prints 'CUSTOM_GENERATOR_OK'
  Failure Indicators: Type error, protocol mismatch
  Evidence: .sisyphus/evidence/task-2-custom-gen.txt
```

**Commit**: YES
- Message: `feat(demo): add generator-based input adapters for cv-mmap and OpenCV`
- Files: `opengait/demo/input.py`
- Pre-commit: `uv run python -c "from opengait.demo.input import create_source"`

- [x] 5. Sliding Window / Ring Buffer Manager

**What to do**:
- Create `opengait/demo/window.py`
- Class `SilhouetteWindow`:
  - Constructor: `__init__(self, window_size: int = 30, stride: int = 1, gap_threshold: int = 15)`
  - Internal storage: `collections.deque(maxlen=window_size)` of `np.ndarray` (64×44 float32)
  - `push(sil: np.ndarray, frame_idx: int, track_id: int) -> None`:
    - If `track_id` differs from current tracked ID → reset buffer, update tracked ID
    - If `frame_idx - last_frame_idx > gap_threshold` → reset buffer (too many missed frames)
    - Append silhouette to deque
    - Increment internal frame counter
  - `is_ready() -> bool`: returns `len(buffer) == window_size`
  - `should_classify() -> bool`: returns `is_ready() and (frames_since_last_classify >= stride)`
  - `get_tensor(device: str = 'cpu') -> torch.Tensor`:
    - Stack buffer into `np.array` shape `[window_size, 64, 44]`
    - Convert to `torch.Tensor` shape `[1, 1, window_size, 64, 44]` on `device`
    - This is the exact input shape for ScoNetDemo
  - `reset() -> None`: clear buffer and counters
  - `mark_classified() -> None`: reset frames_since_last_classify counter
  - Properties: `current_track_id`, `frame_count`, `fill_level` (len/window_size as float)
- **Single-person selection policy** (function or small helper):
  - `select_person(results) -> tuple[np.ndarray, tuple, int] | None`
  - From YOLO results, select the detection with the **largest bounding box area**
  - Return `(mask, bbox, track_id)` or `None` if no valid detection
  - If `result.boxes.id` is None (tracker not yet initialized), skip frame

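The push/reset rules above can be sketched with the standard library alone. `WindowSketch` is a hypothetical, simplified stand-in for `SilhouetteWindow`: it stores arbitrary objects instead of 64×44 arrays and leaves out `get_tensor()`, so it only illustrates the bookkeeping.

```python
from collections import deque
from typing import Any, Deque, Optional


class WindowSketch:
    """Simplified stand-in for SilhouetteWindow (no tensor conversion)."""

    def __init__(self, window_size: int = 30, stride: int = 1, gap_threshold: int = 15) -> None:
        self.window_size = window_size
        self.stride = stride
        self.gap_threshold = gap_threshold
        self.buffer: Deque[Any] = deque(maxlen=window_size)  # bounded by construction
        self.track_id: Optional[int] = None
        self.last_frame_idx: Optional[int] = None
        self.since_classify = 0

    def reset(self) -> None:
        self.buffer.clear()
        self.last_frame_idx = None
        self.since_classify = 0

    def push(self, sil: Any, frame_idx: int, track_id: int) -> None:
        if track_id != self.track_id:
            self.reset()                 # a different person: start over
            self.track_id = track_id
        elif (self.last_frame_idx is not None
              and frame_idx - self.last_frame_idx > self.gap_threshold):
            self.reset()                 # too many missed frames
        self.buffer.append(sil)
        self.last_frame_idx = frame_idx
        self.since_classify += 1

    def is_ready(self) -> bool:
        return len(self.buffer) == self.window_size

    def should_classify(self) -> bool:
        return self.is_ready() and self.since_classify >= self.stride

    def mark_classified(self) -> None:
        self.since_classify = 0


win = WindowSketch(window_size=4, gap_threshold=2)
for i in range(4):
    win.push(f"sil-{i}", frame_idx=i, track_id=1)
print(win.is_ready())                        # True: four pushes filled the window
win.push("late", frame_idx=10, track_id=1)   # gap of 7 > 2, so the buffer restarts
print(len(win.buffer))                       # 1
```

Using `deque(maxlen=...)` means old silhouettes fall off automatically once the window is full, which is what makes the sliding behaviour cheap.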

**Must NOT do**:
- No unbounded buffers — deque with maxlen enforces this
- No multi-person tracking — single person only, select largest bbox
- No time-based windowing — frame-count based only

**Recommended Agent Profile**:
- **Category**: `unspecified-high`
  - Reason: Data structure + policy logic with edge cases (ID changes, gaps, resets)
- **Skills**: []

**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 2 (with Tasks 6, 7, 8)
- **Blocks**: Tasks 9, 10
- **Blocked By**: Task 3 (needs silhouette shape constants from preprocess.py)

**References**:

**Pattern References**:
- `opengait/demo/preprocess.py` (Task 3) — `SIL_HEIGHT`, `SIL_WIDTH` constants. The window stores arrays of this shape.
- `opengait/data/dataset.py` — Shows how OpenGait's DataSet samples fixed-length sequences. The `seqL` parameter controls sequence length (our window_size=30).

**API/Type References**:
- Ultralytics `Results.boxes.id` — Track IDs tensor, may be `None` if tracker hasn't assigned IDs yet
- Ultralytics `Results.boxes.xyxy` — Bounding boxes `[N, 4]` for area calculation
- Ultralytics `Results.masks.data` — Binary masks `[N, H, W]`

**WHY Each Reference Matters**:
- `preprocess.py`: Window must store silhouettes of the exact shape produced by preprocessing
- `dataset.py`: Understanding how training samples sequences helps ensure our window matches
- Ultralytics API: Need to handle `None` track IDs and extract correct tensors

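The largest-box policy can be illustrated on plain Python data. This is only a sketch: `select_largest` is a hypothetical name, and it takes lists of `(x1, y1, x2, y2)` boxes and track IDs rather than Ultralytics tensors. A real `select_person` would presumably use the returned index to pick the matching row of `Results.masks.data`.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def select_largest(
    boxes: Sequence[Box],
    track_ids: Optional[Sequence[int]],
) -> Optional[Tuple[int, Box, int]]:
    """Return (index, bbox, track_id) of the largest box, or None to skip the frame."""
    if track_ids is None or len(boxes) == 0:
        return None  # tracker not initialised yet, or nothing detected
    areas = [max(0.0, x2 - x1) * max(0.0, y2 - y1) for x1, y1, x2, y2 in boxes]
    best = max(range(len(boxes)), key=areas.__getitem__)
    return best, boxes[best], track_ids[best]


picked = select_largest([(0, 0, 10, 10), (5, 5, 50, 80)], track_ids=[7, 3])
print(picked)  # (1, (5, 5, 50, 80), 3)
```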

**Acceptance Criteria**:
- [x] `opengait/demo/window.py` exists with `SilhouetteWindow` class and `select_person` function
- [x] Buffer is bounded (deque with maxlen)
- [x] `get_tensor()` returns shape `[1, 1, 30, 64, 44]` when full
- [x] Track ID change triggers reset
- [x] Gap exceeding threshold triggers reset

**QA Scenarios:**

```
Scenario: Window fills and produces correct tensor shape
  Tool: Bash
  Preconditions: Module importable
  Steps:
    1. Run `uv run python -c "`
       ```python
       import numpy as np
       from opengait.demo.window import SilhouetteWindow
       win = SilhouetteWindow(window_size=30, stride=1)
       for i in range(30):
           sil = np.random.rand(64, 44).astype(np.float32)
           win.push(sil, frame_idx=i, track_id=1)
       assert win.is_ready(), 'Window should be ready after 30 frames'
       t = win.get_tensor()
       assert t.shape == (1, 1, 30, 64, 44), f'Bad shape: {t.shape}'
       assert t.dtype.is_floating_point, f'Bad dtype: {t.dtype}'
       print('WINDOW_FILL_OK')
       ```
  Expected Result: Prints 'WINDOW_FILL_OK'
  Failure Indicators: Shape mismatch, not ready after 30 pushes
  Evidence: .sisyphus/evidence/task-5-window-fill.txt

Scenario: Track ID change resets buffer
  Tool: Bash
  Steps:
    1. Run `uv run python -c "`
       ```python
       import numpy as np
       from opengait.demo.window import SilhouetteWindow
       win = SilhouetteWindow(window_size=30)
       for i in range(20):
           win.push(np.random.rand(64, 44).astype(np.float32), frame_idx=i, track_id=1)
       assert win.frame_count == 20
       # Switch track ID — should reset
       win.push(np.random.rand(64, 44).astype(np.float32), frame_idx=20, track_id=2)
       assert win.frame_count == 1, f'Should reset to 1, got {win.frame_count}'
       assert win.current_track_id == 2
       print('TRACK_RESET_OK')
       ```
  Expected Result: Prints 'TRACK_RESET_OK'
  Failure Indicators: Buffer not reset, wrong track ID
  Evidence: .sisyphus/evidence/task-5-track-reset.txt
```

**Commit**: YES
- Message: `feat(demo): add sliding window manager with single-person selection`
- Files: `opengait/demo/window.py`
- Pre-commit: `uv run python -c "from opengait.demo.window import SilhouetteWindow"`

- [x] 6. NATS JSON Publisher

**What to do**:
- Create `opengait/demo/output.py`
- Class `ResultPublisher(Protocol)` — any object with `publish(result: dict) -> None`
- Function `console_publisher() -> Generator` or simple class `ConsolePublisher`:
  - Prints JSON to stdout (default when `--nats-url` is not provided)
  - Format: one JSON object per line (JSONL)
- Class `NatsPublisher`:
  - Constructor: `__init__(self, url: str = 'nats://127.0.0.1:4222', subject: str = 'scoliosis.result')`
  - Uses `nats-py` async client, bridged to sync `publish()` method
  - Handles connection failures gracefully (log warning, retry with backoff, don't crash pipeline)
  - Handles reconnection automatically (nats-py does this by default)
  - `publish(result: dict) -> None`: serializes to JSON, publishes to subject
  - `close() -> None`: drain and close NATS connection
  - Context manager support (`__enter__`/`__exit__`)
- JSON schema for results:
  ```json
  {
    "frame": 1234,
    "track_id": 1,
    "label": "positive",
    "confidence": 0.82,
    "window": 30,
    "timestamp_ns": 1234567890000
  }
  ```
- Factory: `create_publisher(nats_url: str | None, subject: str = 'scoliosis.result') -> ResultPublisher`
  - If `nats_url` is None → ConsolePublisher
  - Otherwise → NatsPublisher(url, subject)

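As a rough sketch of the console side of this design, the snippet below shows a structural `ResultPublisher` protocol and a JSONL `ConsolePublisher`. The optional `stream` parameter is an assumption added here to make the class easy to test; it is not part of the plan above.

```python
import json
import sys
from typing import Optional, Protocol, TextIO


class ResultPublisher(Protocol):
    """Anything with a publish(result) method satisfies this protocol."""

    def publish(self, result: dict) -> None: ...


class ConsolePublisher:
    """Writes one JSON object per line (JSONL)."""

    def __init__(self, stream: Optional[TextIO] = None) -> None:
        self._stream = stream  # hypothetical hook; None means "current sys.stdout"

    def publish(self, result: dict) -> None:
        out = self._stream if self._stream is not None else sys.stdout
        out.write(json.dumps(result) + "\n")
        out.flush()  # publish immediately, no batching


pub: ResultPublisher = ConsolePublisher()
pub.publish({"frame": 1234, "track_id": 1, "label": "positive",
             "confidence": 0.82, "window": 30, "timestamp_ns": 1234567890000})
```

Because `ResultPublisher` is a `Protocol`, `ConsolePublisher` does not need to inherit from it, which fits the plan's preference for light interfaces.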

**Must NOT do**:
- Don't use JetStream (plain NATS PUB/SUB is sufficient)
- Don't build custom binary protocol
- Don't buffer/batch results — publish immediately

**Recommended Agent Profile**:
- **Category**: `quick`
  - Reason: Straightforward JSON serialization + NATS client wrapper, well-documented library
- **Skills**: []

**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 2 (with Tasks 5, 7, 8)
- **Blocks**: Tasks 9, 13
- **Blocked By**: Task 1 (needs project scaffolding for nats-py dependency)

**References**:

**External References**:
- nats-py docs: `import nats; nc = await nats.connect(); await nc.publish(subject, data)` — async API
- `/home/crosstyan/Code/cv-mmap-gui/` — Uses NATS.c for messaging; our Python publisher sends to the same broker

**WHY Each Reference Matters**:
- nats-py: Need to bridge async NATS client to sync `publish()` call
- cv-mmap-gui: Confirms NATS is the right transport for this ecosystem

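One possible way to bridge an async client to a blocking `publish()` is to run an event loop on a background thread, as sketched below. This is an assumption about the design, not the actual `NatsPublisher`: `FakeAsyncClient` is a hypothetical stand-in for a connected nats-py client, `SyncFacade` is an invented name, and connection setup, retry, and backoff are omitted.

```python
import asyncio
import json
import logging
import threading


class FakeAsyncClient:
    """Hypothetical stand-in for a connected async NATS client."""

    def __init__(self) -> None:
        self.sent = []

    async def publish(self, subject: str, payload: bytes) -> None:
        self.sent.append((subject, payload))


class SyncFacade:
    """Expose a blocking publish() on top of an async client."""

    def __init__(self, client, subject: str = "scoliosis.result") -> None:
        self._client = client
        self._subject = subject
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def publish(self, result: dict) -> None:
        payload = json.dumps(result).encode("utf-8")
        future = asyncio.run_coroutine_threadsafe(
            self._client.publish(self._subject, payload), self._loop)
        try:
            future.result(timeout=1.0)
        except Exception as exc:  # log and carry on; the pipeline must not crash
            logging.warning("publish failed: %s", exc)

    def close(self) -> None:
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()

    def __enter__(self) -> "SyncFacade":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()


client = FakeAsyncClient()
with SyncFacade(client) as pub:
    pub.publish({"frame": 0, "label": "test"})
print(client.sent)
```

The main loop stays synchronous, as required for the pipeline, while the async client keeps a long-lived loop for its own reconnection logic.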

**Acceptance Criteria**:
- [x] `opengait/demo/output.py` exists with `ConsolePublisher`, `NatsPublisher`, `create_publisher`
- [x] ConsolePublisher prints valid JSON to stdout
- [x] NatsPublisher connects and publishes without crashing (when NATS available)
- [x] NatsPublisher logs warning and doesn't crash when NATS unavailable

**QA Scenarios:**

```
Scenario: ConsolePublisher outputs valid JSONL
  Tool: Bash
  Steps:
    1. Run `uv run python -c "`
       ```python
       import json, io, sys
       from opengait.demo.output import create_publisher
       pub = create_publisher(nats_url=None)
       result = {'frame': 100, 'track_id': 1, 'label': 'positive', 'confidence': 0.85, 'window': 30, 'timestamp_ns': 0}
       pub.publish(result)  # should print to stdout
       print('CONSOLE_PUB_OK')
       ```
  Expected Result: Prints a JSON line followed by 'CONSOLE_PUB_OK'
  Failure Indicators: Invalid JSON, missing fields, crash
  Evidence: .sisyphus/evidence/task-6-console-pub.txt

Scenario: NatsPublisher handles missing server gracefully
  Tool: Bash
  Steps:
    1. Run `uv run python -c "`
       ```python
       from opengait.demo.output import create_publisher
       try:
           pub = create_publisher(nats_url='nats://127.0.0.1:14222')  # wrong port, no server
           pub.publish({'frame': 0, 'label': 'test'})
       except SystemExit:
           print('SHOULD_NOT_EXIT')
           raise
       print('NATS_GRACEFUL_OK')
       ```
  Expected Result: Prints 'NATS_GRACEFUL_OK' (warning logged, no crash)
  Failure Indicators: Unhandled exception, SystemExit, hang
  Evidence: .sisyphus/evidence/task-6-nats-graceful.txt
```

**Commit**: YES
- Message: `feat(demo): add NATS JSON publisher and console fallback`
- Files: `opengait/demo/output.py`
- Pre-commit: `uv run python -c "from opengait.demo.output import create_publisher"`

- [x] 7. Unit Tests — Silhouette Preprocessing

**What to do**:
- Create `tests/demo/test_preprocess.py`
- Test `mask_to_silhouette()` with:
  - Valid mask (person-shaped blob) → output shape (64, 44), dtype float32, range [0, 1]
  - Tiny mask below MIN_MASK_AREA → returns None
  - Empty mask (all zeros) → returns None
  - Full-frame mask (all 255) → produces valid output (edge case: very wide person)
  - Tall narrow mask → verify aspect ratio handling (resize h=64, center-crop width)
  - Wide short mask → verify handling (should still produce 64×44)
- Test determinism: same input always produces same output
- Test against a reference `.pkl` sample if available:
  - Load a known `.pkl` file from Scoliosis1K
  - Extract one frame
  - Compare our preprocessing output to the stored frame (should be close/identical)
- Verify jaxtyping annotations are present and beartype checks fire on wrong shapes

**Must NOT do**:
- Don't test YOLO integration here — only test the `mask_to_silhouette` function in isolation
- Don't require GPU — all preprocessing is CPU numpy ops

**Recommended Agent Profile**:
- **Category**: `unspecified-high`
  - Reason: Must verify pixel-level correctness against training data contract, multiple edge cases
- **Skills**: []

**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 2 (with Tasks 5, 6, 8)
- **Blocks**: None (verification task)
- **Blocked By**: Task 3 (preprocess module must exist)

**References**:

**Pattern References**:
- `opengait/demo/preprocess.py` (Task 3) — The module under test
- `datasets/pretreatment.py:18-96` — Reference preprocessing to validate against
- `opengait/data/transform.py:46-58` — `BaseSilCuttingTransform` for expected output contract

**WHY Each Reference Matters**:
- `preprocess.py`: Direct test target
- `pretreatment.py`: Ground truth for what a correct silhouette looks like
- `BaseSilCuttingTransform`: Defines the 64→44 cut + /255 contract we must match

**Acceptance Criteria**:
- [x] `tests/demo/test_preprocess.py` exists with ≥5 test cases
- [x] `uv run pytest tests/demo/test_preprocess.py -q` passes
- [x] Tests cover: valid mask, tiny mask, empty mask, determinism

**QA Scenarios:**

```
Scenario: All preprocessing tests pass
  Tool: Bash
  Preconditions: Task 3 (preprocess.py) is complete
  Steps:
    1. Run `uv run pytest tests/demo/test_preprocess.py -v`
  Expected Result: All tests pass (≥5 tests), exit code 0
  Failure Indicators: Any assertion failure, import error
  Evidence: .sisyphus/evidence/task-7-preprocess-tests.txt

Scenario: Jaxtyping annotation enforcement works
  Tool: Bash
  Steps:
    1. Run `uv run python -c "`
       ```python
       import numpy as np
       from opengait.demo.preprocess import mask_to_silhouette
       # Intentionally wrong type to verify beartype catches it
       try:
           mask_to_silhouette('not_an_array', (0, 0, 10, 10))
           print('BEARTYPE_MISSED')  # should not reach here
       except Exception as e:
           if 'beartype' in type(e).__module__ or 'BeartypeCallHint' in type(e).__name__:
               print('BEARTYPE_OK')
           else:
               print(f'WRONG_ERROR: {type(e).__name__}: {e}')
       ```
  Expected Result: Prints 'BEARTYPE_OK'
  Failure Indicators: Prints 'BEARTYPE_MISSED' or 'WRONG_ERROR'
  Evidence: .sisyphus/evidence/task-7-beartype-check.txt
```

**Commit**: YES (groups with Task 8)
- Message: `test(demo): add preprocessing and model unit tests`
- Files: `tests/demo/test_preprocess.py`
- Pre-commit: `uv run pytest tests/demo/test_preprocess.py -q`

- [x] 8. Unit Tests — ScoNetDemo Forward Pass

**What to do**:
- Create `tests/demo/test_sconet_demo.py`
- Test `ScoNetDemo` construction:
  - Loads config from YAML
  - Loads checkpoint weights
  - Model is in eval mode
- Test `forward()` with dummy tensor:
  - Input: `torch.rand(1, 1, 30, 64, 44)` on available device
  - Output logits shape: `(1, 3, 16)`
  - Output dtype: float32
- Test `predict()` convenience method:
  - Returns `(label_str, confidence_float)`
  - `label_str` is one of `{'negative', 'neutral', 'positive'}`
  - `confidence` is in `[0.0, 1.0]`
- Test with various batch sizes: N=1, N=2
- Test with various sequence lengths if model supports it (should work with 30)
- Verify no `torch.distributed` calls are made (mock `torch.distributed` to raise if called)
- Verify jaxtyping shape annotations on forward/predict signatures

**Must NOT do**:
- Don't test with real video data — dummy tensors only for unit tests
- Don't modify the checkpoint

**Recommended Agent Profile**:
- **Category**: `unspecified-high`
  - Reason: Requires CUDA, checkpoint loading, shape validation — moderate complexity
- **Skills**: []

**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 2 (with Tasks 5, 6, 7)
- **Blocks**: None (verification task)
- **Blocked By**: Task 1 (ScoNetDemo must exist)

**References**:

**Pattern References**:
- `opengait/demo/sconet_demo.py` (Task 1) — The module under test
- `opengait/evaluation/evaluator.py:evaluate_scoliosis()` (line ~418) — Canonical prediction logic to validate against

**Config/Checkpoint References**:
- `configs/sconet/sconet_scoliosis1k.yaml` — Config file to pass to ScoNetDemo
- `./ckpt/ScoNet-20000.pt` — Trained checkpoint

**WHY Each Reference Matters**:
- `sconet_demo.py`: Direct test target
- `evaluator.py`: Defines expected prediction behavior (argmax of mean logits)

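The "argmax of mean logits" rule can be illustrated in plain Python for a single sample shaped `[3 classes][16 parts]`. Two things here are assumptions, not confirmed by the plan: the class order in `LABELS`, and the use of a softmax over the mean logits as the confidence value. `predict_from_logits` is a hypothetical name.

```python
import math
from typing import Sequence, Tuple

LABELS = ("negative", "neutral", "positive")  # assumed class order


def predict_from_logits(logits: Sequence[Sequence[float]]) -> Tuple[str, float]:
    """logits: one sample, shaped [num_classes][num_parts]."""
    means = [sum(parts) / len(parts) for parts in logits]  # mean over the parts axis
    peak = max(means)
    exps = [math.exp(m - peak) for m in means]             # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)   # argmax over classes
    return LABELS[best], probs[best]


sample = [[0.0] * 16, [1.0] * 16, [3.0] * 16]
label, confidence = predict_from_logits(sample)
print(label, round(confidence, 3))
```

Whatever confidence definition the real `predict()` uses, the tests above only require that it lands in `[0.0, 1.0]` and that the label comes from the three-class set.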

**Acceptance Criteria**:
- [x] `tests/demo/test_sconet_demo.py` exists with ≥4 test cases
- [x] `uv run pytest tests/demo/test_sconet_demo.py -q` passes
- [x] Tests cover: construction, forward shape, predict output, no-DDP enforcement

**QA Scenarios:**

```
Scenario: All ScoNetDemo tests pass
  Tool: Bash
  Preconditions: Task 1 (sconet_demo.py) is complete, checkpoint exists, CUDA available
  Steps:
    1. Run `CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_sconet_demo.py -v`
  Expected Result: All tests pass (≥4 tests), exit code 0
  Failure Indicators: state_dict key mismatch, shape error, CUDA OOM
  Evidence: .sisyphus/evidence/task-8-sconet-tests.txt

Scenario: No DDP leakage in ScoNetDemo
  Tool: Bash
  Steps:
    1. Run `grep -rn 'torch.distributed' opengait/demo/sconet_demo.py`
    2. Run `grep -rn 'ddp_all_gather\|get_ddp_module\|get_rank' opengait/demo/sconet_demo.py`
  Expected Result: Both commands produce no output (exit code 1 = no matches)
  Failure Indicators: Any match found
  Evidence: .sisyphus/evidence/task-8-no-ddp.txt
```

**Commit**: YES (groups with Task 7)
- Message: `test(demo): add preprocessing and model unit tests`
- Files: `tests/demo/test_sconet_demo.py`
- Pre-commit: `uv run pytest tests/demo/test_sconet_demo.py -q`

- [x] 9. Main Pipeline Application + CLI

**What to do**:
- Create `opengait/demo/pipeline.py` — the main orchestrator
- Create `opengait/demo/__main__.py` — CLI entry point (replace stub from Task 4)
- Pipeline class `ScoliosisPipeline`:
  - Constructor: `__init__(self, source: FrameStream, model: ScoNetDemo, publisher: ResultPublisher, window: SilhouetteWindow, yolo_model_path: str = 'yolo11n-seg.pt', device: str = 'cuda:0')`
  - Uses jaxtyping annotations for all tensor-bearing methods:
    ```python
    from jaxtyping import Float, UInt8, jaxtyped
    from beartype import beartype
    from torch import Tensor
    import numpy as np
    from numpy import ndarray
    ```
  - `run() -> None` — main loop:
    1. Load YOLO model: `ultralytics.YOLO(yolo_model_path)`
    2. For each `(frame, meta)` from source:
       a. Run `yolo_model.track(frame, persist=True, verbose=False)` → results
       b. `select_person(results)` → `(mask, bbox, track_id)` or None → skip if None
       c. `mask_to_silhouette(mask, bbox)` → `sil` or None → skip if None
       d. `window.push(sil, meta['frame_count'], track_id)`
       e. If `window.should_classify()`:
          - `tensor = window.get_tensor(device=self.device)`
          - `label, confidence = self.model.predict(tensor)`
          - `publisher.publish({...})` with JSON schema fields
          - `window.mark_classified()`
    3. Log FPS every 100 frames
    4. Cleanup on exit (close publisher, release resources)
  - Graceful shutdown on KeyboardInterrupt / SIGTERM
- CLI via `__main__.py` using `click`:
  - `--source` (required): video path, camera index, or `cvmmap://name`
  - `--checkpoint` (required): path to ScoNet checkpoint
  - `--config` (default: `./configs/sconet/sconet_scoliosis1k.yaml`): ScoNet config YAML
  - `--device` (default: `cuda:0`): torch device
  - `--yolo-model` (default: `yolo11n-seg.pt`): YOLO model path (auto-downloads)
  - `--window` (default: 30): sliding window size
  - `--stride` (default: 30): classify every N frames after window is full
  - `--nats-url` (default: None): NATS server URL, None = console output
  - `--nats-subject` (default: `scoliosis.result`): NATS subject
  - `--max-frames` (default: None): stop after N frames
  - `--help`: print usage
- Entrypoint: `uv run python -m opengait.demo ...`

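The control flow of the main loop can be sketched with injected callables in place of YOLO, preprocessing, and ScoNet. Everything below is illustrative: `run_loop` and `TinyWindow` are hypothetical names, the lambdas are fakes, and FPS logging and signal handling are left out.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple

Meta = Dict[str, Any]
Detection = Tuple[Any, Any, int]  # (mask, bbox, track_id)


class TinyWindow:
    """Toy stand-in for SilhouetteWindow; its stride equals the window size."""

    def __init__(self, window_size: int) -> None:
        self.window_size = window_size
        self.items: List[Any] = []
        self.pending = 0

    def push(self, sil: Any, frame_idx: int, track_id: int) -> None:
        self.items = (self.items + [sil])[-self.window_size:]
        self.pending += 1

    def should_classify(self) -> bool:
        return len(self.items) == self.window_size and self.pending >= self.window_size

    def get_tensor(self) -> List[Any]:
        return list(self.items)

    def mark_classified(self) -> None:
        self.pending = 0


def run_loop(
    source: Iterable[Tuple[Any, Meta]],
    detect: Callable[[Any], Optional[Detection]],
    to_silhouette: Callable[[Any, Any], Optional[Any]],
    window: TinyWindow,
    predict: Callable[[Any], Tuple[str, float]],
    publish: Callable[[dict], None],
) -> int:
    """Synchronous pull-process-publish loop; returns the number of frames seen."""
    seen = 0
    for frame, meta in source:
        seen += 1
        picked = detect(frame)
        if picked is None:
            continue  # no valid person in this frame
        mask, bbox, track_id = picked
        sil = to_silhouette(mask, bbox)
        if sil is None:
            continue  # mask rejected by preprocessing
        window.push(sil, meta["frame_count"], track_id)
        if window.should_classify():
            label, confidence = predict(window.get_tensor())
            publish({"frame": meta["frame_count"], "track_id": track_id,
                     "label": label, "confidence": confidence,
                     "window": window.window_size,
                     "timestamp_ns": meta.get("timestamp_ns", 0)})
            window.mark_classified()
    return seen


published: List[dict] = []
frames = ((f"frame-{i}", {"frame_count": i}) for i in range(7))
seen = run_loop(
    frames,
    detect=lambda frame: ("mask", (0, 0, 10, 20), 1),
    to_silhouette=lambda mask, bbox: "sil",
    window=TinyWindow(window_size=3),
    predict=lambda tensor: ("neutral", 0.5),
    publish=published.append,
)
print(seen, [r["frame"] for r in published])  # 7 [2, 5]
```

With a window of 3 and a stride of 3, seven frames produce results at frames 2 and 5, which mirrors how the CLI defaults (`--window 30`, `--stride 30`) would classify once per 30 frames.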

**Must NOT do**:
- No async in the main loop — synchronous pull-process-publish
- No multi-threading for inference — single-threaded pipeline
- No GUI / frame display / cv2.imshow
- No unbounded accumulation — ring buffer handles memory
- No auto-download of ScoNet checkpoint — user must provide path

**Recommended Agent Profile**:
- **Category**: `deep`
  - Reason: Integration of all components, error handling, CLI design, FPS logging — largest single task
- **Skills**: []

**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Wave 3 (sequential — depends on most Wave 1+2 tasks)
- **Blocks**: Tasks 12, 13
- **Blocked By**: Tasks 1, 2, 3, 4, 5, 6 (all components must exist)

**References**:

**Pattern References**:
- `opengait/demo/sconet_demo.py` (Task 1) — `ScoNetDemo` class, `predict()` method
- `opengait/demo/preprocess.py` (Task 3) — `mask_to_silhouette()`, `frame_to_person_mask()`
- `opengait/demo/window.py` (Task 5) — `SilhouetteWindow`, `select_person()`
- `opengait/demo/input.py` (Task 2) — `create_source()`, `FrameStream` type alias
- `opengait/demo/output.py` (Task 6) — `create_publisher()`, `ResultPublisher`

**External References**:
- Ultralytics tracking API: `model.track(frame, persist=True)` — returns `Results` list
- Ultralytics result object: `results[0].masks.data`, `results[0].boxes.xyxy`, `results[0].boxes.id`

**WHY Each Reference Matters**:
- All Task refs: This task composes every component — must know each API surface
- Ultralytics: The YOLO `.track()` call is the only external API used directly in this file

**Acceptance Criteria**:
- [x] `opengait/demo/pipeline.py` exists with `ScoliosisPipeline` class
- [x] `opengait/demo/__main__.py` exists with click CLI
- [x] `uv run python -m opengait.demo --help` prints usage without errors
- [x] All public methods have jaxtyping annotations where tensor/array args are involved

**QA Scenarios:**

```
Scenario: CLI --help works
  Tool: Bash
  Steps:
    1. Run `uv run python -m opengait.demo --help`
  Expected Result: Prints usage showing --source, --checkpoint, --device, --window, --stride, --nats-url, --max-frames
  Failure Indicators: ImportError, missing arguments, crash
  Evidence: .sisyphus/evidence/task-9-help.txt

Scenario: Pipeline runs with sample video (no NATS)
  Tool: Bash
  Preconditions: sample.mp4 exists (from Task 11), checkpoint exists, CUDA available
  Steps:
    1. Run `CUDA_VISIBLE_DEVICES=0 uv run python -m opengait.demo --source ./assets/sample.mp4 --checkpoint ./ckpt/ScoNet-20000.pt --device cuda:0 --max-frames 120 2>&1 | tee /tmp/pipeline_output.txt`
    2. Count prediction lines: `grep -c '"label"' /tmp/pipeline_output.txt`
  Expected Result: Exit code 0, at least 1 prediction line with valid JSON containing "label" field
  Failure Indicators: Crash, no predictions, invalid JSON, CUDA error
  Evidence: .sisyphus/evidence/task-9-pipeline-run.txt

Scenario: Pipeline handles missing video gracefully
  Tool: Bash
  Steps:
    1. Run `uv run python -m opengait.demo --source /nonexistent/video.mp4 --checkpoint ./ckpt/ScoNet-20000.pt 2>&1; echo "EXIT_CODE=$?"`
  Expected Result: Prints error message about missing file, exits with non-zero code (no traceback dump)
  Failure Indicators: Unhandled exception with full traceback, exit code 0
  Evidence: .sisyphus/evidence/task-9-missing-video.txt
```

**Commit**: YES
- Message: `feat(demo): add main pipeline application with CLI entry point`
- Files: `opengait/demo/pipeline.py`, `opengait/demo/__main__.py`
- Pre-commit: `uv run python -m opengait.demo --help`

- [x] 10. Unit Tests — Single-Person Policy + Window Reset

**What to do**:
- Create `tests/demo/test_window.py`
- Test `SilhouetteWindow`:
  - Fill to capacity → `is_ready()` returns True
  - Underfilled → `is_ready()` returns False
  - Track ID change resets buffer
  - Frame gap exceeding threshold resets buffer
  - `get_tensor()` returns correct shape `[1, 1, window_size, 64, 44]`
  - `should_classify()` respects stride
- Test `select_person()`:
  - Single detection → returns it
  - Multiple detections → returns largest bbox area
  - No detections → returns None
  - Detections without track IDs (tracker not initialized) → returns None
- Use mock YOLO results (don't require actual YOLO model)

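Mock results for these tests could be built with `types.SimpleNamespace`, as in the sketch below. `make_mock_results` is a hypothetical helper; it mimics only the three attributes named in this plan (`boxes.xyxy`, `boxes.id`, `masks.data`) and uses plain lists, whereas real tests would likely substitute NumPy arrays or tensors to match what `select_person` expects.

```python
from types import SimpleNamespace
from typing import List, Optional, Sequence


def make_mock_results(
    xyxy: Sequence[Sequence[float]],
    ids: Optional[Sequence[int]],
) -> List[SimpleNamespace]:
    """Build a one-element list mimicking the attributes select_person reads."""
    boxes = SimpleNamespace(
        xyxy=[list(b) for b in xyxy],
        id=None if ids is None else list(ids),  # None models "tracker not ready"
    )
    masks = SimpleNamespace(data=[f"mask-{i}" for i in range(len(xyxy))])
    return [SimpleNamespace(boxes=boxes, masks=masks)]


tracked = make_mock_results([(0, 0, 10, 10), (5, 5, 50, 80)], ids=[7, 3])
untracked = make_mock_results([(0, 0, 10, 10)], ids=None)
print(tracked[0].boxes.id, untracked[0].boxes.id)
```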

**Must NOT do**:
- Don't require GPU — window tests are CPU-only (get_tensor can use cpu device)
- Don't require YOLO model file — mock the results

**Recommended Agent Profile**:
- **Category**: `unspecified-high`
  - Reason: Multiple edge cases to cover, mock YOLO results require understanding ultralytics API
- **Skills**: []

**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 3 (with Tasks 9, 11)
- **Blocks**: None (verification task)
- **Blocked By**: Task 5 (window module must exist)

**References**:

**Pattern References**:
- `opengait/demo/window.py` (Task 5) — Module under test

**WHY Each Reference Matters**:
- Direct test target

**Acceptance Criteria**:
- [x] `tests/demo/test_window.py` exists with ≥6 test cases
- [x] `uv run pytest tests/demo/test_window.py -q` passes

**QA Scenarios:**

```
Scenario: All window and single-person tests pass
  Tool: Bash
  Steps:
    1. Run `uv run pytest tests/demo/test_window.py -v`
  Expected Result: All tests pass (≥6 tests), exit code 0
  Failure Indicators: Assertion failures, import errors
  Evidence: .sisyphus/evidence/task-10-window-tests.txt
```

**Commit**: YES
- Message: `test(demo): add window manager and single-person policy tests`
- Files: `tests/demo/test_window.py`
- Pre-commit: `uv run pytest tests/demo/test_window.py -q`

- [x] 11. Sample Video for Smoke Testing

**What to do**:
- Acquire or create a short sample video for pipeline smoke testing
- Options (in order of preference):
  1. Extract a short clip (5-10 seconds, ~120 frames at 15fps) from an existing Scoliosis1K raw video if accessible
  2. Record a short clip using webcam via `cv2.VideoCapture(0)`
  3. Generate a synthetic video with a person-shaped blob moving across frames
- Save to `./assets/sample.mp4` (or `./assets/sample.avi`)
- Requirements: contains at least one person walking, 720p or lower, ≥60 frames
- If no real video is available, create a synthetic one:
  - 120 frames, 640×480, 15fps
  - White rectangle (simulating person silhouette) moving across dark background
  - This won't test YOLO detection quality but will verify pipeline doesn't crash
- Add `assets/sample.mp4` to `.gitignore` if it's large (>10MB)

**Must NOT do**:
- Don't use any Scoliosis1K dataset files that are symlinked (user constraint)
- Don't commit large video files to git

**Recommended Agent Profile**:
- **Category**: `quick`
  - Reason: Simple file creation/acquisition task
- **Skills**: []

**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 3 (with Tasks 9, 10)
- **Blocks**: Task 12
- **Blocked By**: Task 1 (needs OpenCV dependency from scaffolding)

**References**: None needed — standalone task

**Acceptance Criteria**:
- [x] `./assets/sample.mp4` (or `.avi`) exists
- [x] Video has ≥60 frames
- [x] Playable with `uv run python -c "import cv2; cap=cv2.VideoCapture('./assets/sample.mp4'); print(f'frames={int(cap.get(7))}'); cap.release()"`

**QA Scenarios:**

```
Scenario: Sample video is valid
  Tool: Bash
  Steps:
    1. Run `uv run python -c "`
       ```python
       import cv2
       cap = cv2.VideoCapture('./assets/sample.mp4')
       assert cap.isOpened(), 'Cannot open video'
       n = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
       assert n >= 60, f'Too few frames: {n}'
       ret, frame = cap.read()
       assert ret and frame is not None, 'Cannot read first frame'
       h, w = frame.shape[:2]
       assert h >= 240 and w >= 320, f'Too small: {w}x{h}'
       cap.release()
       print(f'SAMPLE_VIDEO_OK frames={n} size={w}x{h}')
       ```
  Expected Result: Prints 'SAMPLE_VIDEO_OK' with frame count ≥60
  Failure Indicators: Cannot open, too few frames, too small
  Evidence: .sisyphus/evidence/task-11-sample-video.txt
```

**Commit**: YES
- Message: `chore(demo): add sample video for smoke testing`
- Files: `assets/sample.mp4` (or add to .gitignore and document)
- Pre-commit: none

---

- [x] 12. Integration Tests — End-to-End Smoke Test

**What to do**:
- Create `tests/demo/test_pipeline.py`
- Integration test: run the full pipeline with sample video, no NATS
  - Uses `subprocess.run()` to invoke `python -m opengait.demo`
  - Captures stdout, parses JSON predictions
  - Asserts: exit code 0, ≥1 prediction, valid JSON schema
- Test graceful exit on end-of-video
- Test `--max-frames` flag: run with max_frames=60, verify it stops
- Test error handling: invalid source path → non-zero exit, error message
- Test error handling: invalid checkpoint path → non-zero exit, error message
- FPS benchmark (informational, not a hard assertion):
  - Run pipeline on sample video, measure wall time, compute FPS
  - Log FPS to evidence file (target: ≥15 FPS on desktop GPU)

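The "captures stdout, parses JSON predictions" step might be handled by a small filter like the one below. `parse_predictions` is a hypothetical helper; the required keys come from the JSON schema in Task 6, and the assumption that non-JSON log lines may be interleaved on stdout is mine.

```python
import json
from typing import Dict, List

REQUIRED_KEYS = {"frame", "track_id", "label", "confidence", "window", "timestamp_ns"}
VALID_LABELS = {"negative", "neutral", "positive"}


def parse_predictions(stdout: str) -> List[Dict]:
    """Keep only stdout lines that are JSON objects matching the result schema."""
    predictions = []
    for line in stdout.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # log lines, FPS reports, and similar noise
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if (isinstance(record, dict)
                and REQUIRED_KEYS <= record.keys()
                and record["label"] in VALID_LABELS):
            predictions.append(record)
    return predictions


captured = "\n".join([
    "INFO: loading checkpoint",
    '{"frame": 29, "track_id": 1, "label": "neutral", "confidence": 0.6, "window": 30, "timestamp_ns": 0}',
    "FPS: 21.4",
    '{"frame": 59, "label": "positive"}',
])
print(len(parse_predictions(captured)))  # 1: the second JSON line lacks required keys
```

An integration test could then assert `len(parse_predictions(result.stdout)) >= 1` after checking the subprocess exit code.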
**Must NOT do**:
- Don't require NATS server for this test — use console publisher
- Don't hardcode CUDA device — use `--device cuda:0` only if CUDA available, else skip

**Recommended Agent Profile**:
- **Category**: `deep`
  - Reason: Full integration test requiring all components working together, subprocess management, JSON parsing
- **Skills**: []

**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 4 (with Task 13)
- **Blocks**: F1-F4 (Final verification)
- **Blocked By**: Tasks 9 (pipeline), 11 (sample video)

**References**:

**Pattern References**:
- `opengait/demo/__main__.py` (Task 9) — CLI flags to invoke
- `opengait/demo/output.py` (Task 6) — JSON schema to validate

**WHY Each Reference Matters**:
- `__main__.py`: Need exact CLI flag names for subprocess invocation
- `output.py`: Need JSON schema to assert against

**Acceptance Criteria**:
- [x] `tests/demo/test_pipeline.py` exists with ≥4 test cases
- [x] `CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_pipeline.py -q` passes
- [x] Tests cover: happy path, max-frames, invalid source, invalid checkpoint

**QA Scenarios:**

```
Scenario: Full pipeline integration test passes
  Tool: Bash
  Preconditions: All components built, sample video exists, CUDA available
  Steps:
    1. Run `CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_pipeline.py -v --timeout=120`
  Expected Result: All tests pass (≥4), exit code 0
  Failure Indicators: Subprocess crash, JSON parse error, timeout
  Evidence: .sisyphus/evidence/task-12-integration.txt

Scenario: FPS benchmark
  Tool: Bash
  Steps:
    1. Run the following script with `CUDA_VISIBLE_DEVICES=0 uv run python -c "<script>"`:
         import subprocess, time
         import cv2
         # Wall time includes process startup and model loading
         start = time.monotonic()
         result = subprocess.run(
             ['uv', 'run', 'python', '-m', 'opengait.demo',
              '--source', './assets/sample.mp4',
              '--checkpoint', './ckpt/ScoNet-20000.pt',
              '--device', 'cuda:0', '--nats-url', ''],
             capture_output=True, text=True, timeout=120)
         elapsed = time.monotonic() - start
         # A crashed run must not be reported as a fast one
         assert result.returncode == 0, result.stderr
         cap = cv2.VideoCapture('./assets/sample.mp4')
         n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)); cap.release()
         fps = n_frames / elapsed if elapsed > 0 else 0
         print(f'FPS_BENCHMARK frames={n_frames} elapsed={elapsed:.1f}s fps={fps:.1f}')
         assert fps >= 5, f'FPS too low: {fps}'  # conservative threshold
  Expected Result: Prints FPS benchmark, ≥5 FPS (conservative)
  Failure Indicators: Timeout, crash, FPS < 5
  Evidence: .sisyphus/evidence/task-12-fps-benchmark.txt
```

**Commit**: YES
- Message: `test(demo): add integration and end-to-end smoke tests`
- Files: `tests/demo/test_pipeline.py`
- Pre-commit: `CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_pipeline.py -q`

- [x] 13. NATS Integration Test

**What to do**:
- Create `tests/demo/test_nats.py`
- Test requires NATS server (use Docker: `docker run -d --rm --name nats-test -p 4222:4222 nats:2`)
- Mark tests with `@pytest.mark.skipif` if Docker/NATS not available
- Test flow:
  1. Start NATS container
  2. Start a `nats-py` subscriber on `scoliosis.result`
  3. Run pipeline with `--nats-url nats://127.0.0.1:4222 --max-frames 60`
  4. Collect received messages
  5. Assert: ≥1 message received, valid JSON, correct schema
  6. Stop NATS container
- Test NatsPublisher reconnection: start pipeline, stop NATS, restart NATS — pipeline should recover
- JSON schema validation:
  - `frame`: int
  - `track_id`: int
  - `label`: str in {"negative", "neutral", "positive"}
  - `confidence`: float in [0, 1]
  - `window`: int (should equal window_size)
  - `timestamp_ns`: int

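
The schema listed above could be checked with a small hand-written validator like the one below. This is a sketch under stated assumptions: `validate_prediction` and `LABELS` are hypothetical names, the default `window_size=30` is taken from the 30-frame sliding window described in the TL;DR, and `confidence` is allowed to arrive as an int (for example `1`) because JSON does not distinguish it from `1.0`. It returns a list of violations so a test can assert `validate_prediction(msg) == []` and get a readable failure message otherwise.

```python
LABELS = {"negative", "neutral", "positive"}


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, int) and not isinstance(value, bool)


def validate_prediction(msg: dict, window_size: int = 30) -> list[str]:
    """Return a list of schema violations; an empty list means the message is valid."""
    errors = []

    # Integer-typed fields from the schema
    for key in ("frame", "track_id", "window", "timestamp_ns"):
        if not _is_int(msg.get(key)):
            errors.append(f"{key}: expected int, got {msg.get(key)!r}")

    # label must be one of the three class names
    label = msg.get("label")
    if not isinstance(label, str) or label not in LABELS:
        errors.append(f"label: expected one of {sorted(LABELS)}, got {label!r}")

    # confidence must be a real number in [0, 1]
    conf = msg.get("confidence")
    if isinstance(conf, bool) or not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        errors.append(f"confidence: expected number in [0, 1], got {conf!r}")

    # window should equal the configured window size
    if _is_int(msg.get("window")) and msg["window"] != window_size:
        errors.append(f"window: expected {window_size}, got {msg['window']}")

    return errors
```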
**Must NOT do**:
- Don't leave Docker containers running after test
- Don't hardcode NATS port — use a fixture that finds an open port

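
One common way to avoid a hardcoded port, sketched here as an assumption rather than the fixture the tests actually use, is to bind a socket to port 0 so the OS picks a free port, then pass that port to `docker run -p`. The helper names `find_free_port` and `nats_docker_command` are hypothetical. Note the small race: the port is released before Docker binds it, so another process could take it in between; for a local test run this is usually acceptable.

```python
import socket
from contextlib import closing


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def nats_docker_command(port: int, name: str = "nats-test") -> list[str]:
    """Build the `docker run` argv mapping the chosen host port to NATS's 4222."""
    return ["docker", "run", "-d", "--rm", "--name", name,
            "-p", f"{port}:4222", "nats:2"]


# Inside a pytest fixture one would start the container with
# subprocess.run(nats_docker_command(port), check=True), yield
# f"nats://127.0.0.1:{port}", and run `docker stop <name>` on teardown
# so no container is left running.
```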
**Recommended Agent Profile**:
- **Category**: `unspecified-high`
  - Reason: Docker orchestration + async NATS subscriber + subprocess pipeline — moderate complexity
- **Skills**: []

**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 4 (with Task 12)
- **Blocks**: F1-F4 (Final verification)
- **Blocked By**: Tasks 9 (pipeline), 6 (NATS publisher)

**References**:

**Pattern References**:
- `opengait/demo/output.py` (Task 6) — `NatsPublisher` class, JSON schema

**External References**:
- nats-py subscriber: `sub = await nc.subscribe('scoliosis.result'); msg = await sub.next_msg(timeout=10)`
- Docker NATS: `docker run -d --rm --name nats-test -p 4222:4222 nats:2`

**WHY Each Reference Matters**:
- `output.py`: Need to match the exact subject and JSON schema the publisher produces
- nats-py: Need subscriber API to consume and validate messages

**Acceptance Criteria**:
- [x] `tests/demo/test_nats.py` exists with ≥2 test cases
- [x] Tests are skippable when Docker/NATS not available
- [x] `CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_nats.py -q` passes (when Docker available)

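
The skippability criterion above needs some way to detect Docker at collection time. A minimal sketch, assuming a hypothetical helper named `docker_available`: it checks that the `docker` CLI is on `PATH` and that the daemon answers `docker info`. A test module could then set `pytestmark = pytest.mark.skipif(not docker_available(), reason="Docker not available")` so the tests report as SKIPPED rather than FAILED.

```python
import shutil
import subprocess


def docker_available(timeout: float = 5.0) -> bool:
    """True only if the docker CLI exists and the daemon answers `docker info`."""
    if shutil.which("docker") is None:
        return False  # CLI not installed
    try:
        result = subprocess.run(["docker", "info"],
                                capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False  # CLI present but unusable or daemon hanging
    return result.returncode == 0
```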
**QA Scenarios:**

```
Scenario: NATS receives valid prediction JSON
  Tool: Bash
  Preconditions: Docker available, CUDA available, sample video exists
  Steps:
    1. Run `docker run -d --rm --name nats-test -p 4222:4222 nats:2`
    2. Run `CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_nats.py -v --timeout=60`
    3. Run `docker stop nats-test`
  Expected Result: Tests pass, at least one valid JSON message received on scoliosis.result
  Failure Indicators: No messages, invalid JSON, schema mismatch, timeout
  Evidence: .sisyphus/evidence/task-13-nats-integration.txt

Scenario: NATS test is skipped when Docker unavailable
  Tool: Bash
  Preconditions: Docker NOT running or not installed
  Steps:
    1. Run `uv run pytest tests/demo/test_nats.py -v -k 'nats' 2>&1 | head -20`
  Expected Result: Tests show as SKIPPED (not FAILED)
  Failure Indicators: Test fails instead of skipping
  Evidence: .sisyphus/evidence/task-13-nats-skip.txt
```

**Commit**: YES
- Message: `test(demo): add NATS integration tests`
- Files: `tests/demo/test_nats.py`
- Pre-commit: `uv run pytest tests/demo/test_nats.py -q` (skips if no Docker)

---

## Final Verification Wave (MANDATORY — after ALL implementation tasks)

> 4 review agents run in PARALLEL. ALL must APPROVE. Rejection → fix → re-run.

- [x] F1. **Plan Compliance Audit** — `oracle`
  Read the plan end-to-end. For each "Must Have": verify implementation exists (read file, run command). For each "Must NOT Have": search codebase for forbidden patterns (torch.distributed imports in demo/, BaseModel subclassing). Check evidence files exist in .sisyphus/evidence/. Compare deliverables against plan.
  Output: `Must Have [N/N] | Must NOT Have [N/N] | Tasks [N/N] | VERDICT: APPROVE/REJECT`

- [x] F2. **Code Quality Review** — `unspecified-high`
  Run linter + `uv run pytest tests/demo/ -q`. Review all new files in `opengait/demo/` for: `Any` casts/`# type: ignore`, empty `except` blocks, print statements used instead of logging, commented-out code, unused imports. Check AI slop: excessive comments, over-abstraction, generic variable names.
  Output: `Tests [N pass/N fail] | Files [N clean/N issues] | VERDICT`

- [x] F3. **Real Manual QA** — `unspecified-high`
  Start from clean state. Run pipeline with sample video: `uv run python -m opengait.demo --source ./assets/sample.mp4 --checkpoint ./ckpt/ScoNet-20000.pt --max-frames 120`. Verify predictions are printed to console (no `--nats-url` = console output). Run with NATS: start container, run pipeline with `--nats-url nats://127.0.0.1:4222`, subscribe and validate JSON schema. Test edge cases: missing video file (graceful error), no checkpoint (graceful error), --help flag.
  Output: `Scenarios [N/N pass] | Edge Cases [N tested] | VERDICT`

- [x] F4. **Scope Fidelity Check** — `deep`
  For each task: read "What to do", read actual files created. Verify 1:1 — everything in spec was built (no missing), nothing beyond spec was built (no creep). Check "Must NOT do" compliance: no torch.distributed in demo/, no BaseModel subclass, no TensorRT code, no multi-person logic. Flag unaccounted changes.
  Output: `Tasks [N/N compliant] | Scope [CLEAN/N issues] | VERDICT`

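
The forbidden-pattern search in F1 and F4 can be done with `grep`, but an AST-based scan avoids false hits in comments and strings. The sketch below is one possible approach, not part of the plan's deliverables; `find_forbidden_imports` is a hypothetical name. It only detects import statements, so attribute access such as `torch.distributed.init_process_group()` after a plain `import torch` would need a separate check. Calling `find_forbidden_imports("opengait/demo")` should return an empty list for a compliant package.

```python
import ast
from pathlib import Path


def find_forbidden_imports(root: str, forbidden: str = "torch.distributed") -> list[str]:
    """Return 'path:line' entries for every import of `forbidden` under `root`."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        for node in ast.walk(tree):
            names = []
            if isinstance(node, ast.Import):
                # import torch.distributed [as dist]
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module:
                # from torch.distributed import x / from torch import distributed
                names = [node.module] + [f"{node.module}.{a.name}" for a in node.names]
            if any(n == forbidden or n.startswith(forbidden + ".") for n in names):
                hits.append(f"{path}:{node.lineno}")
    return hits
```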
---

## Commit Strategy

- **Wave 1**: `feat(demo): add ScoNetDemo inference wrapper` — sconet_demo.py
- **Wave 1**: `feat(demo): add input adapters and silhouette preprocessing` — input.py, preprocess.py
- **Wave 1**: `chore(demo): scaffold demo package and test infrastructure` — __init__.py, conftest, pyproject.toml
- **Wave 2**: `feat(demo): add sliding window manager and NATS publisher` — window.py, output.py
- **Wave 2**: `test(demo): add preprocessing and model unit tests` — test_preprocess.py, test_sconet_demo.py
- **Wave 3**: `feat(demo): add main pipeline application with CLI` — pipeline.py, __main__.py
- **Wave 3**: `test(demo): add window manager and single-person policy tests` — test_window.py
- **Wave 4**: `test(demo): add integration and NATS tests` — test_pipeline.py, test_nats.py

---

## Success Criteria

### Verification Commands
```bash
# Smoke test (no NATS)
uv run python -m opengait.demo --source ./assets/sample.mp4 --checkpoint ./ckpt/ScoNet-20000.pt --max-frames 120
# Expected: exits 0, prints ≥3 prediction lines with label in {negative, neutral, positive}

# Unit tests
uv run pytest tests/demo/ -q
# Expected: all tests pass

# Help flag
uv run python -m opengait.demo --help
# Expected: prints usage with --source, --checkpoint, --device, --window, --stride, --nats-url, --max-frames
```

### Final Checklist
- [x] All "Must Have" present
- [x] All "Must NOT Have" absent
- [x] All tests pass
- [x] Pipeline runs at ≥15 FPS on desktop GPU
- [x] JSON schema matches spec
- [x] No torch.distributed imports in opengait/demo/