OpenGait/.sisyphus/plans/sconet-pipeline.md
crosstyan 3496a1beb7 docs(sisyphus): record sconet-pipeline plan and verification trail
Persist orchestration artifacts, including plan definition, progress state, decisions, issues, and learnings gathered during delegated execution and QA gates. This preserves implementation rationale and auditability without coupling documentation snapshots to runtime logic commits.
2026-02-27 09:59:26 +08:00


Real-Time Scoliosis Screening Pipeline (ScoNet)

TL;DR

Quick Summary: Build a production-oriented, real-time scoliosis screening pipeline inside the OpenGait repo. The pipeline reads video from cv-mmap shared memory or OpenCV, runs YOLO11n-seg for person detection/segmentation/tracking, extracts 64×44 binary silhouettes, feeds a sliding window of 30 frames into a DDP-free ScoNet wrapper, and publishes classification results (positive/neutral/negative) as JSON over NATS.

Deliverables:

  • ScoNetDemo — standalone nn.Module wrapper for ScoNet inference (no DDP)
  • Silhouette preprocessing module — deterministic mask→64×44 float tensor pipeline
  • Ring buffer / sliding window manager — per-track frame accumulation with reset logic
  • Input adapters — cv-mmap async client + OpenCV VideoCapture fallback
  • NATS publisher — JSON result output
  • Main pipeline application — orchestrates all components
  • pytest test suite — preprocessing, windowing, single-person policy, recovery
  • Sample video for smoke testing

Estimated Effort: Large
Parallel Execution: YES — 4 waves
Critical Path: Task 2 (ScoNetDemo) → Task 3 (Silhouette Preprocess) → Task 5 (Ring Buffer) → Task 9 (Pipeline App) → Task 12 (Integration Tests)


Context

Original Request

Build a camera-to-classification pipeline for scoliosis screening using OpenGait's ScoNet model, targeting Jetson AGX Orin deployment, with cv-mmap shared-memory video framework integration.

Interview Summary

Key Discussions:

  • Input: Both camera (via cv-mmap) and video files (OpenCV fallback). Single person only.
  • CV Stack: YOLO11n-seg with built-in ByteTrack (replaces paper's BYTETracker + PP-HumanSeg)
  • Inference: Sliding window of 30 frames, continuous classification
  • Output: JSON over NATS (decided over binary protocol — simpler, cross-language)
  • DDP Bypass: Create ScoNetDemo(nn.Module) following All-in-One-Gait's BaselineDemo pattern
  • Build Location: Inside repo (opengait lacks __init__.py, config system hardcodes paths)
  • Test Strategy: pytest, tests after implementation
  • Hardware: Dev on i7-14700KF + RTX 5070 Ti/3090; deploy on Jetson AGX Orin

Research Findings:

  • ScoNet input: [N, 1, S, 64, 44] float32 in [0,1]. Output: logits [N, 3, 16]; argmax(mean(-1)) → class index
  • .pkl preprocessing: grayscale → crop vertical → resize h=64 → center-crop w=64 → cut 10px sides → /255.0
  • BaseSilCuttingTransform: cuts int(W // 64) * 10 px each side + divides by 255
  • All-in-One-Gait BaselineDemo: extends nn.Module, uses torch.load() + load_state_dict(), training=False
  • YOLO11n-seg: 6MB, ~50-60 FPS, model.track(frame, persist=True) → bbox + mask + track_id
  • cv-mmap Python client: async for im, meta in CvMmapClient("name") — zero-copy numpy
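
The decision rule from the findings above (mean over the 16 parts, then argmax over the 3 classes, with a softmax confidence) can be sketched in NumPy; the logits here are random stand-ins, and the label mapping follows the plan:

```python
import numpy as np

# Stand-in for ScoNet output: [N, 3 classes, 16 horizontal parts]
logits = np.random.randn(1, 3, 16).astype(np.float32)

part_mean = logits.mean(axis=-1)                  # [N, 3] — average over parts
class_idx = int(part_mean.argmax(axis=-1)[0])     # argmax over classes

# Softmax over the part-averaged logits gives a confidence for the chosen class
exp = np.exp(part_mean - part_mean.max(axis=-1, keepdims=True))
probs = exp / exp.sum(axis=-1, keepdims=True)
confidence = float(probs[0, class_idx])

LABELS = {0: 'negative', 1: 'neutral', 2: 'positive'}
print(LABELS[class_idx], round(confidence, 3))
```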

Metis Review

Identified Gaps (addressed):

  • Single-person policy undefined → Defined: largest-bbox selection, ignore others, reset window on ID change
  • Sliding window stride undefined → Defined: stride=1 (every frame updates buffer), classify every N frames (configurable)
  • No-detection / empty mask handling → Defined: skip frame, don't reset window unless gap exceeds threshold
  • Mask quality / partial body → Defined: minimum mask area threshold to accept frame
  • Track ID reset / re-identification → Defined: reset ring buffer on track ID change
  • YOLO letterboxing → Defined: use result.masks.data in original frame coords, not letterboxed
  • Async/sync impedance → Defined: synchronous pull-process-publish loop (no async queues in MVP)
  • Scope creep lockdown → Explicitly excluded: TensorRT, DeepStream, multi-person, GUI, recording, auto-tuning
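
The largest-bbox single-person policy above reduces to an area argmax over the xyxy boxes; a NumPy sketch (the function name is illustrative, not an Ultralytics API):

```python
import numpy as np

def largest_box_index(xyxy: np.ndarray) -> int:
    """Pick the detection with the largest bbox area from [N, 4] xyxy boxes."""
    areas = (xyxy[:, 2] - xyxy[:, 0]) * (xyxy[:, 3] - xyxy[:, 1])
    return int(areas.argmax())

boxes = np.array([[0, 0, 50, 100], [10, 10, 200, 400]], dtype=np.float32)
assert largest_box_index(boxes) == 1  # second box covers the larger area
```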

Work Objectives

Core Objective

Create a self-contained scoliosis screening pipeline that runs standalone (no DDP, no OpenGait training infrastructure) while reusing ScoNet's trained weights and matching its exact preprocessing contract.

Prerequisites (already present in repo)

  • Checkpoint: ./ckpt/ScoNet-20000.pt — trained ScoNet weights (verified working, 80.88% accuracy on Scoliosis1K eval). Already exists in the repo, no download needed.
  • Config: ./configs/sconet/sconet_scoliosis1k.yaml — ScoNet architecture config. Already exists.

Concrete Deliverables

  • opengait/demo/sconet_demo.py — ScoNetDemo nn.Module wrapper
  • opengait/demo/preprocess.py — Silhouette extraction and normalization
  • opengait/demo/window.py — Sliding window / ring buffer manager
  • opengait/demo/input.py — Input adapters (cv-mmap + OpenCV)
  • opengait/demo/output.py — NATS JSON publisher
  • opengait/demo/pipeline.py — Main pipeline orchestrator
  • opengait/demo/__main__.py — CLI entry point
  • tests/demo/test_preprocess.py — Preprocessing unit tests
  • tests/demo/test_window.py — Ring buffer + single-person policy tests
  • tests/demo/test_pipeline.py — Integration / smoke tests

Definition of Done

  • uv run python -m opengait.demo --source ./assets/sample.mp4 --checkpoint ./ckpt/ScoNet-20000.pt --max-frames 120 exits 0 and prints predictions (no NATS by default when --nats-url not provided)
  • uv run pytest tests/demo/ -q passes all tests
  • Pipeline processes ≥15 FPS on desktop GPU with 720p input
  • JSON schema validated: {"frame": int, "track_id": int, "label": str, "confidence": float, "window": int, "timestamp_ns": int}
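
A result message matching the schema above can be assembled with the stdlib; the field values here are placeholders:

```python
import json
import time

result = {
    "frame": 120,                 # int — frame index within the stream
    "track_id": 1,                # int — YOLO/ByteTrack track ID
    "label": "neutral",           # str — 'positive' | 'neutral' | 'negative'
    "confidence": 0.87,           # float — softmax prob of the chosen class
    "window": 30,                 # int — sliding-window length used
    "timestamp_ns": time.time_ns(),
}
payload = json.dumps(result).encode()  # bytes ready for a NATS publish
decoded = json.loads(payload)
```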

Must Have

  • Deterministic preprocessing matching ScoNet training data exactly (64×44, float32, [0,1])
  • Single-person selection (largest bbox) with consistent tracking
  • Sliding window of 30 frames with reset on track loss/ID change
  • Graceful handling of: no detection, end of video, cv-mmap disconnect
  • CLI with --source, --checkpoint, --device, --window, --stride, --nats-url, --max-frames flags (using click)
  • Works without NATS server when --nats-url is omitted (console output fallback)
  • All tensor/array function signatures annotated with jaxtyping types (e.g., Float[Tensor, 'batch 1 seq 64 44']) and checked at runtime with beartype via @jaxtyped(typechecker=beartype) decorators
  • Generator-based input adapters — any Iterable[tuple[np.ndarray, dict]] works as a source

Must NOT Have (Guardrails)

  • No DDP: The demo must never import or call anything from torch.distributed
  • No BaseModel subclassing: ScoNetDemo extends nn.Module directly
  • No repo restructuring: Don't touch existing opengait training/eval/data code
  • No TensorRT/DeepStream: Jetson acceleration is out of MVP scope
  • No multi-person: Single tracked person only
  • No GUI/visualization: Output is JSON, not rendered frames
  • No dataset recording/auto-labeling: This is inference only
  • No OpenCV GStreamer builds: Use pip-installed OpenCV
  • No magic preprocessing: Every transform step must be explicit and testable
  • No unbounded buffers: Every queue/buffer has a max size and drop policy

Verification Strategy

ZERO HUMAN INTERVENTION — ALL verification is agent-executed. No exceptions.

Test Decision

  • Infrastructure exists: NO (creating with this plan)
  • Automated tests: Tests after implementation (pytest)
  • Framework: pytest (via uv run pytest)
  • Setup: Add pytest to dev dependencies in pyproject.toml

QA Policy

Every task MUST include agent-executed QA scenarios. Evidence saved to .sisyphus/evidence/task-{N}-{scenario-slug}.{ext}.

  • CLI/Pipeline: Use Bash — run pipeline with sample video, validate output
  • Unit Tests: Use Bash — uv run pytest specific test files
  • NATS Integration: Use Bash — start NATS container, run pipeline, subscribe and validate JSON

Execution Strategy

Parallel Execution Waves

Wave 1 (Foundation — all independent, start immediately):
├── Task 1: Project scaffolding (opengait/demo/, tests/demo/, deps) [quick]
├── Task 2: ScoNetDemo nn.Module wrapper [deep]
├── Task 3: Silhouette preprocessing module [deep]
└── Task 4: Input adapters (cv-mmap + OpenCV) [unspecified-high]

Wave 2 (Core logic — depends on Wave 1 foundations):
├── Task 5: Ring buffer / sliding window manager (depends: 3) [unspecified-high]
├── Task 6: NATS JSON publisher (depends: 1) [quick]
├── Task 7: Unit tests — preprocessing (depends: 3) [unspecified-high]
└── Task 8: Unit tests — ScoNetDemo forward pass (depends: 2) [unspecified-high]

Wave 3 (Integration — combines all components):
├── Task 9: Main pipeline application + CLI (depends: 2,3,4,5,6) [deep]
├── Task 10: Single-person policy tests (depends: 5) [unspecified-high]
└── Task 11: Sample video acquisition (depends: 1) [quick]

Wave 4 (Verification — end-to-end):
├── Task 12: Integration tests + smoke test (depends: 9,11) [deep]
└── Task 13: NATS integration test (depends: 9,6) [unspecified-high]

Wave FINAL (Independent review — 4 parallel):
├── Task F1: Plan compliance audit (oracle)
├── Task F2: Code quality review (unspecified-high)
├── Task F3: Real manual QA (unspecified-high)
└── Task F4: Scope fidelity check (deep)

Critical Path: Task 1 → Task 2 → Task 3 → Task 5 → Task 9 → Task 12 → F1-F4
Parallel Speedup: ~60% faster than sequential
Max Concurrent: 4 (Waves 1 & 2)

Dependency Matrix

| Task  | Depends On    | Blocks | Wave  |
|-------|---------------|--------|-------|
| 1     | -             | 6, 11  | 1     |
| 2     | -             | 8, 9   | 1     |
| 3     | -             | 5, 7, 9 | 1    |
| 4     | -             | 9      | 1     |
| 5     | 3             | 9, 10  | 2     |
| 6     | 1             | 9, 13  | 2     |
| 7     | 3             | -      | 2     |
| 8     | 2             | -      | 2     |
| 9     | 2, 3, 4, 5, 6 | 12, 13 | 3     |
| 10    | 5             | -      | 3     |
| 11    | 1             | 12     | 3     |
| 12    | 9, 11         | F1-F4  | 4     |
| 13    | 9, 6          | F1-F4  | 4     |
| F1-F4 | 12, 13        | -      | FINAL |

Agent Dispatch Summary

  • Wave 1: 4 — T1 → quick, T2 → deep, T3 → deep, T4 → unspecified-high
  • Wave 2: 4 — T5 → unspecified-high, T6 → quick, T7 → unspecified-high, T8 → unspecified-high
  • Wave 3: 3 — T9 → deep, T10 → unspecified-high, T11 → quick
  • Wave 4: 2 — T12 → deep, T13 → unspecified-high
  • FINAL: 4 — F1 → oracle, F2 → unspecified-high, F3 → unspecified-high, F4 → deep

TODOs

Implementation + Test = ONE Task. Never separate. EVERY task MUST have: Recommended Agent Profile + Parallelization info + QA Scenarios.


  • 1. Project Scaffolding + Dependencies

    What to do:

    • Create opengait/demo/__init__.py (empty, makes it a package)
    • Create opengait/demo/__main__.py (stub: from .pipeline import main; main())
    • Create tests/demo/__init__.py and tests/__init__.py if missing
    • Create tests/demo/conftest.py with shared fixtures (sample tensor, mock frame)
    • Add dev dependencies to pyproject.toml: pytest, nats-py, ultralytics, jaxtyping, beartype, click
    • Verify: uv sync --extra torch succeeds with new deps
    • Verify: uv run python -c "import ultralytics; import nats; import jaxtyping; import beartype; import click" works

    Must NOT do:

    • Don't modify existing opengait code or imports
    • Don't add runtime deps that aren't needed (no flask, no fastapi, etc.)

    Recommended Agent Profile:

    • Category: quick
      • Reason: Boilerplate file creation and dependency management, no complex logic
    • Skills: []
    • Skills Evaluated but Omitted:
      • explore: Not needed — we know exactly what files to create

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 1 (with Tasks 2, 3, 4)
    • Blocks: Tasks 6, 11
    • Blocked By: None (can start immediately)

    References:

    Pattern References:

    • opengait/modeling/models/__init__.py — Example of package init in this repo
    • pyproject.toml — Current dependency structure; add to [project.optional-dependencies] or [dependency-groups]

    External References:

    • ultralytics pip package: pip install ultralytics (includes YOLO + ByteTrack)
    • nats-py: pip install nats-py (async NATS client)

    WHY Each Reference Matters:

    • pyproject.toml: Must match existing dep management style (uv + groups) to avoid breaking uv sync
    • opengait/modeling/models/__init__.py: Shows the repo's package init convention (dynamic imports vs empty)

    Acceptance Criteria:

    • opengait/demo/__init__.py exists
    • opengait/demo/__main__.py exists with stub entry point
    • tests/demo/conftest.py exists with at least one fixture
    • uv sync succeeds without errors
    • uv run python -c "import ultralytics; import nats; import jaxtyping; import beartype; import click; print('OK')" prints OK

    QA Scenarios:

    Scenario: Dependencies install correctly
      Tool: Bash
      Preconditions: Clean uv environment
      Steps:
        1. Run `uv sync --extra torch`
        2. Run `uv run python -c "import ultralytics; import nats; import pytest; import jaxtyping; import beartype; import click; print('ALL_DEPS_OK')"`
      Expected Result: Both commands exit 0, second prints 'ALL_DEPS_OK'
      Failure Indicators: ImportError, uv sync failure, missing package
      Evidence: .sisyphus/evidence/task-1-deps-install.txt
    
    Scenario: Package structure is importable
      Tool: Bash
      Preconditions: uv sync completed
      Steps:
        1. Run `uv run python -c "from opengait.demo import __main__; print('IMPORT_OK')"`
      Expected Result: Prints 'IMPORT_OK' without errors
      Failure Indicators: ModuleNotFoundError, ImportError
      Evidence: .sisyphus/evidence/task-1-import-check.txt
    

    Commit: YES

    • Message: chore(demo): scaffold demo package and test infrastructure
    • Files: opengait/demo/__init__.py, opengait/demo/__main__.py, tests/demo/conftest.py, tests/demo/__init__.py, tests/__init__.py, pyproject.toml
    • Pre-commit: uv sync && uv run python -c "import ultralytics; import nats; import jaxtyping; import beartype; import click"
  • 2. ScoNetDemo — DDP-Free Inference Wrapper

    What to do:

    • Create opengait/demo/sconet_demo.py
    • Class ScoNetDemo(nn.Module) — NOT a BaseModel subclass
    • Constructor takes cfg_path: str and checkpoint_path: str
    • Use config_loader from opengait/utils/common.py to parse YAML config
    • Build the ScoNet architecture layers directly:
      • Backbone (ResNet9 from opengait/modeling/backbones/resnet.py)
      • TemporalPool (from opengait/modeling/modules.py)
      • HorizontalPoolingPyramid (from opengait/modeling/modules.py)
      • SeparateFCs (from opengait/modeling/modules.py)
      • SeparateBNNecks (from opengait/modeling/modules.py)
    • Load checkpoint: torch.load(checkpoint_path, map_location=device) → extract state_dict → load_state_dict()
    • Handle checkpoint format: may be {'model': state_dict, ...} or plain state_dict
    • Strip module. prefix from DDP-wrapped keys if present
    • All public methods decorated with @jaxtyped(typechecker=beartype) for runtime shape checking
    • forward(sils: Float[Tensor, 'batch 1 seq 64 44']) -> dict where seq=30 (window size)
      • Use jaxtyping: from jaxtyping import Float, Int, jaxtyped
      • Use beartype: from beartype import beartype
    • Returns {'logits': Float[Tensor, 'batch 3 16'], 'label': int, 'confidence': float}
    • predict(sils: Float[Tensor, 'batch 1 seq 64 44']) -> tuple[str, float] convenience method: returns ('positive'|'neutral'|'negative', confidence)
    • Prediction logic: argmax(logits.mean(dim=-1), dim=-1) → index → label string
    • Confidence: softmax(logits.mean(dim=-1)).max() — probability of chosen class
    • Class mapping: {0: 'negative', 1: 'neutral', 2: 'positive'}
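
The checkpoint-format handling above (wrapped state dict plus DDP module. prefixes) is plain dict manipulation; a sketch with hypothetical keys, no torch required:

```python
def extract_state_dict(ckpt: dict) -> dict:
    """Unwrap a {'model': state_dict, ...} container and strip DDP 'module.' prefixes."""
    state = ckpt.get('model', ckpt)  # plain state_dict or wrapped form
    return {k.removeprefix('module.'): v for k, v in state.items()}

# Hypothetical DDP-saved checkpoint structure
ckpt = {'model': {'module.backbone.conv1.weight': 1, 'head.fc.weight': 2}}
state = extract_state_dict(ckpt)
```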

    Must NOT do:

    • Do NOT import anything from torch.distributed
    • Do NOT subclass BaseModel
    • Do NOT use ddp_all_gather or get_ddp_module
    • Do NOT modify sconet.py or any existing model file

    Recommended Agent Profile:

    • Category: deep
      • Reason: Requires understanding ScoNet's architecture, checkpoint format, and careful weight loading — high complexity, correctness-critical
    • Skills: []
    • Skills Evaluated but Omitted:
      • explore: Agent should read referenced files directly, not search broadly

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 1 (with Tasks 1, 3, 4)
    • Blocks: Tasks 8, 9
    • Blocked By: None (can start immediately)

    References:

    Pattern References:

    • opengait/modeling/models/sconet.py — ScoNet model definition. Study __init__ to see which submodules are built and how forward() assembles the pipeline. Lines ~10-54.
    • opengait/modeling/base_model.py — BaseModel class. Study __init__ (lines ~30-80) to see how it builds backbone/heads from config. Replicate the build logic WITHOUT DDP calls.
    • All-in-One-Gait BaselineDemo pattern: extends nn.Module directly, uses torch.load() + load_state_dict() with training=False

    API/Type References:

    • opengait/modeling/backbones/resnet.py — ResNet9 backbone class. Constructor signature and forward signature.
    • opengait/modeling/modules.py — TemporalPool, HorizontalPoolingPyramid, SeparateFCs, SeparateBNNecks classes. Constructor args come from config YAML.
    • opengait/utils/common.py::config_loader — Loads YAML config, merges with default.yaml. Returns dict.

    Config References:

    • configs/sconet/sconet_scoliosis1k.yaml — ScoNet config specifying backbone, head, loss params. The model_cfg section defines architecture hyperparams.
    • configs/default.yaml — Default config merged by config_loader

    Checkpoint Reference:

    • ./ckpt/ScoNet-20000.pt — Trained ScoNet checkpoint. Verify format: torch.load() and inspect keys.

    Inference Logic Reference:

    • opengait/evaluation/evaluator.py:evaluate_scoliosis() (line ~418) — Shows argmax(logits.mean(-1)) prediction logic and label mapping

    WHY Each Reference Matters:

    • sconet.py: Defines the exact forward pass we must replicate — TP → backbone → HPP → FCs → BNNecks
    • base_model.py: Shows how config is parsed into submodule instantiation — we copy this logic minus DDP
    • modules.py: Constructor signatures tell us what config keys to extract
    • evaluator.py: The prediction aggregation (mean over parts, argmax) is the canonical inference logic
    • sconet_scoliosis1k.yaml: Contains the exact hyperparams (channels, num_parts, etc.) for building layers

    Acceptance Criteria:

    • opengait/demo/sconet_demo.py exists with ScoNetDemo(nn.Module) class
    • No torch.distributed imports in the file
    • ScoNetDemo does not inherit from BaseModel
    • uv run python -c "from opengait.demo.sconet_demo import ScoNetDemo; print('OK')" works

    QA Scenarios:

    Scenario: ScoNetDemo loads checkpoint and produces correct output shape
      Tool: Bash
      Preconditions: ./ckpt/ScoNet-20000.pt exists, CUDA available
      Steps:
        1. Run `uv run python -c "`
           ```python
           import torch
           from opengait.demo.sconet_demo import ScoNetDemo
           model = ScoNetDemo('./configs/sconet/sconet_scoliosis1k.yaml', './ckpt/ScoNet-20000.pt', device='cuda:0')
           model.eval()
           dummy = torch.rand(1, 1, 30, 64, 44, device='cuda:0')
           with torch.no_grad():
               result = model(dummy)
           assert result['logits'].shape == (1, 3, 16), f'Bad shape: {result["logits"].shape}'
           label, conf = model.predict(dummy)
           assert label in ('negative', 'neutral', 'positive'), f'Bad label: {label}'
           assert 0.0 <= conf <= 1.0, f'Bad confidence: {conf}'
           print(f'SCONET_OK label={label} conf={conf:.3f}')
           ```
      Expected Result: Prints 'SCONET_OK label=... conf=...' with valid label and confidence
      Failure Indicators: ImportError, state_dict key mismatch, shape error, CUDA error
      Evidence: .sisyphus/evidence/task-2-sconet-forward.txt
    
    Scenario: ScoNetDemo rejects DDP-wrapped usage
      Tool: Bash
      Preconditions: File exists
      Steps:
        1. Run `grep -c 'torch.distributed' opengait/demo/sconet_demo.py`
        2. Run `grep -c 'BaseModel' opengait/demo/sconet_demo.py`
      Expected Result: Both commands output '0'
      Failure Indicators: Any count > 0
      Evidence: .sisyphus/evidence/task-2-no-ddp.txt
    

    Commit: YES

    • Message: feat(demo): add ScoNetDemo DDP-free inference wrapper
    • Files: opengait/demo/sconet_demo.py
    • Pre-commit: uv run python -c "from opengait.demo.sconet_demo import ScoNetDemo"
  • 3. Silhouette Preprocessing Module

    What to do:

    • Create opengait/demo/preprocess.py
    • All public functions decorated with @jaxtyped(typechecker=beartype) for runtime shape checking
    • Function mask_to_silhouette(mask: UInt8[ndarray, 'h w'], bbox: tuple[int,int,int,int]) -> Float[ndarray, '64 44'] | None:
      • Uses jaxtyping: from jaxtyping import Float, UInt8, jaxtyped and from numpy import ndarray
      • Input: binary mask (H, W) uint8 from YOLO, bounding box (x1, y1, x2, y2)
      • Crop mask to bbox region
      • Find vertical extent of foreground pixels (top/bottom rows with nonzero)
      • Crop to tight vertical bounding box (remove empty rows above/below)
      • Resize height to 64, maintaining aspect ratio
      • Center-crop or center-pad width to 64
      • Cut 10px from each side → final 64×44
      • Return float32 array [0.0, 1.0] (divide by 255)
      • Return None if mask area below MIN_MASK_AREA threshold (default: 500 pixels)
    • Function frame_to_person_mask(result, min_area: int = 500) -> tuple[UInt8[ndarray, 'h w'], tuple[int,int,int,int]] | None:
      • Extract single-person mask + bbox from YOLO result object
      • Uses result.masks.data and result.boxes.xyxy
      • Returns None if no valid detection
    • Constants: SIL_HEIGHT = 64, SIL_WIDTH = 44, SIL_FULL_WIDTH = 64, SIDE_CUT = 10, MIN_MASK_AREA = 500
    • Each step must match the preprocessing in datasets/pretreatment.py (grayscale → crop → resize → center) and BaseSilCuttingTransform (cut sides → /255)
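
After the height-64 resize, the width handling above is pure array slicing; a NumPy sketch of the center-crop/pad to width 64 plus the 10 px side cut (mirroring BaseSilCuttingTransform's int(w // 64) * 10 for w=64), with the resize step assumed done:

```python
import numpy as np

SIL_FULL_WIDTH, SIDE_CUT = 64, 10

def fit_width(sil_h64: np.ndarray) -> np.ndarray:
    """Center-crop or center-pad a (64, W) silhouette to (64, 44)."""
    h, w = sil_h64.shape
    if w > SIL_FULL_WIDTH:                       # center-crop wide silhouettes
        left = (w - SIL_FULL_WIDTH) // 2
        sil_h64 = sil_h64[:, left:left + SIL_FULL_WIDTH]
    elif w < SIL_FULL_WIDTH:                     # center-pad narrow ones
        left = (SIL_FULL_WIDTH - w) // 2
        pad = np.zeros((h, SIL_FULL_WIDTH), dtype=sil_h64.dtype)
        pad[:, left:left + w] = sil_h64
        sil_h64 = pad
    return sil_h64[:, SIDE_CUT:-SIDE_CUT]        # 64 - 2*10 = 44 columns

out = fit_width(np.ones((64, 50), dtype=np.float32))
assert out.shape == (64, 44)
```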

    Must NOT do:

    • Don't import or modify datasets/pretreatment.py
    • Don't add color/texture features — binary silhouettes only
    • Don't resize to arbitrary sizes — must be exactly 64×44 output

    Recommended Agent Profile:

    • Category: deep
      • Reason: Correctness-critical — must exactly match training data preprocessing. Subtle pixel-level bugs will silently degrade accuracy.
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 1 (with Tasks 1, 2, 4)
    • Blocks: Tasks 5, 7, 9
    • Blocked By: None

    References:

    Pattern References:

    • datasets/pretreatment.py:18-96 (function imgs2pickle) — The canonical preprocessing pipeline. Study lines 45-80 carefully: cv2.imread(GRAYSCALE) → find contours → crop to person bbox → cv2.resize(img, (int(64 * ratio), 64)) → center-crop width. This is the EXACT sequence to replicate for live masks.
    • opengait/data/transform.py:46-58 (BaseSilCuttingTransform) — The runtime transform applied during training/eval. cutting = int(w // 64) * 10 then slices [:, :, cutting:-cutting] then divides by 255.0. For w=64 input, cutting=10, output width=44.

    API/Type References:

    • Ultralytics Results object: result.masks.data → Tensor[N, H, W] binary masks; result.boxes.xyxy → Tensor[N, 4] bounding boxes; result.boxes.id → track IDs (may be None)

    WHY Each Reference Matters:

    • pretreatment.py: Defines the ground truth preprocessing. If our live pipeline differs by even 1 pixel of padding, ScoNet accuracy degrades.
    • BaseSilCuttingTransform: The 10px side cut + /255 is applied at runtime during training. We must apply the SAME transform.
    • Ultralytics masks: Need to know exact API to extract binary masks from YOLO output

    Acceptance Criteria:

    • opengait/demo/preprocess.py exists
    • mask_to_silhouette() returns np.ndarray of shape (64, 44) dtype float32 with values in [0, 1]
    • Returns None for masks below MIN_MASK_AREA

    QA Scenarios:

    Scenario: Preprocessing produces correct output shape and range
      Tool: Bash
      Preconditions: Module importable
      Steps:
        1. Run `uv run python -c "`
           ```python
           import numpy as np
           from opengait.demo.preprocess import mask_to_silhouette
           # Create a synthetic mask: 200x100 person-shaped blob
           mask = np.zeros((480, 640), dtype=np.uint8)
           mask[100:400, 250:400] = 255  # person region
           bbox = (250, 100, 400, 400)
           sil = mask_to_silhouette(mask, bbox)
           assert sil is not None, 'Should not be None for valid mask'
           assert sil.shape == (64, 44), f'Bad shape: {sil.shape}'
           assert sil.dtype == np.float32, f'Bad dtype: {sil.dtype}'
           assert 0.0 <= sil.min() and sil.max() <= 1.0, f'Bad range: [{sil.min()}, {sil.max()}]'
           assert sil.max() > 0, 'Should have nonzero pixels'
           print('PREPROCESS_OK')
           ```
      Expected Result: Prints 'PREPROCESS_OK'
      Failure Indicators: Shape mismatch, dtype error, range error
      Evidence: .sisyphus/evidence/task-3-preprocess-shape.txt
    
    Scenario: Small masks are rejected
      Tool: Bash
      Preconditions: Module importable
      Steps:
        1. Run `uv run python -c "`
           ```python
           import numpy as np
           from opengait.demo.preprocess import mask_to_silhouette
           # Tiny mask: only 10x10 = 100 pixels (below MIN_MASK_AREA=500)
           mask = np.zeros((480, 640), dtype=np.uint8)
           mask[100:110, 100:110] = 255
           bbox = (100, 100, 110, 110)
           sil = mask_to_silhouette(mask, bbox)
           assert sil is None, f'Should be None for tiny mask, got {type(sil)}'
           print('SMALL_MASK_REJECTED_OK')
           ```
      Expected Result: Prints 'SMALL_MASK_REJECTED_OK'
      Failure Indicators: Returns non-None for tiny mask
      Evidence: .sisyphus/evidence/task-3-small-mask-reject.txt
    

    Commit: YES

    • Message: feat(demo): add silhouette preprocessing module
    • Files: opengait/demo/preprocess.py
    • Pre-commit: uv run python -c "from opengait.demo.preprocess import mask_to_silhouette"
  • 4. Input Adapters (cv-mmap + OpenCV)

    What to do:

    • Create opengait/demo/input.py
    • The pipeline contract is simple: it consumes any Iterable[tuple[np.ndarray, dict]] — any generator or iterator that yields (frame_bgr_uint8, metadata_dict) works
    • Type alias: FrameStream = Iterable[tuple[np.ndarray, dict]]
    • Generator function opencv_source(path: str | int, max_frames: int | None = None) -> Generator[tuple[np.ndarray, dict], None, None]:
      • path can be video file path or camera index (int)
      • Opens cv2.VideoCapture(path)
      • Yields (frame, {'frame_count': int, 'timestamp_ns': int}) tuples
      • Handles end-of-video gracefully (just returns)
      • Handles camera disconnect (log warning, return)
      • Respects max_frames limit
    • Generator function cvmmap_source(name: str, max_frames: int | None = None) -> Generator[tuple[np.ndarray, dict], None, None]:
      • Wraps CvMmapClient from /home/crosstyan/Code/cv-mmap/client/cvmmap/
      • Since cv-mmap is async (anyio), this adapter must bridge async→sync:
        • Run anyio event loop in a background thread, drain frames via queue.Queue
        • Or use anyio.from_thread / asyncio.run() with async for internally
        • Choose simplest correct approach
      • Yields same (frame, metadata_dict) tuple format as opencv_source
      • Handles cv-mmap disconnect/offline events gracefully
      • Import should be conditional (try/except ImportError) so pipeline works without cv-mmap installed
    • Factory function create_source(source: str, max_frames: int | None = None) -> FrameStream:
      • If source starts with cvmmap://cvmmap_source(name)
      • If source is a digit string → opencv_source(int(source)) (camera index)
      • Otherwise → opencv_source(source) (file path)
    • The key design point: any user-written generator that yields (np.ndarray, dict) plugs in directly — no class inheritance needed
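
The async→sync bridge above can be done with a background thread draining an event loop into a bounded queue.Queue; a stdlib-only sketch where dummy_async_source stands in for the CvMmapClient iterator:

```python
import asyncio
import queue
import threading

_SENTINEL = object()

async def dummy_async_source(n):            # stand-in for CvMmapClient's async iterator
    for i in range(n):
        yield f"frame-{i}", {"frame_count": i}

def bridged_source(n, maxsize=8):
    """Sync generator fed by an async iterator running in a daemon thread."""
    q = queue.Queue(maxsize=maxsize)        # bounded: enforces the no-unbounded-buffer rule

    def runner():
        async def drain():
            async for item in dummy_async_source(n):
                q.put(item)                 # blocks when the queue is full (backpressure)
            q.put(_SENTINEL)
        asyncio.run(drain())

    threading.Thread(target=runner, daemon=True).start()
    while (item := q.get()) is not _SENTINEL:
        yield item

frames = list(bridged_source(5))
```

A blocking q.put inside the coroutine stalls that loop's only task, which is acceptable for a single-source bridge like this sketch.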

    Must NOT do:

    • Don't build GStreamer pipelines
    • Don't add async to the main pipeline loop — keep synchronous pull model
    • Don't use abstract base classes or heavy OOP — plain generator functions are the interface
    • Don't buffer frames internally (no unbounded queue between source and consumer)

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Integration with external library (cv-mmap) requires careful async→sync bridging
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 1 (with Tasks 1, 2, 3)
    • Blocks: Task 9
    • Blocked By: None

    References:

    Pattern References:

    • /home/crosstyan/Code/cv-mmap/client/cvmmap/__init__.py — CvMmapClient class. Async iterator: async for im, meta in client. Understand the __aiter__/__anext__ protocol.
    • /home/crosstyan/Code/cv-mmap/client/test_cvmmap.py — Example consumer pattern using anyio.run()
    • /home/crosstyan/Code/cv-mmap/client/cvmmap/msg.py — FrameMetadata and FrameInfo dataclasses. Fields: frame_count, timestamp_ns, info.width, info.height, info.pixel_format

    API/Type References:

    • cv2.VideoCapture — OpenCV video capture. cap.read() returns (bool, np.ndarray). cap.get(cv2.CAP_PROP_FRAME_COUNT) for total frames.

    WHY Each Reference Matters:

    • CvMmapClient: The async iterator yields (numpy_array, FrameMetadata) — need to know exact types for sync bridging
    • msg.py: Metadata fields must be mapped to our generic dict metadata format
    • test_cvmmap.py: Shows the canonical consumer pattern we must wrap

    Acceptance Criteria:

    • opengait/demo/input.py exists with opencv_source, cvmmap_source, create_source as functions (not classes)
    • create_source('./some/video.mp4') returns a generator/iterable
    • create_source('cvmmap://default') returns a generator (or raises if cv-mmap not installed)
    • create_source('0') returns a generator for camera index 0
    • Any custom generator def my_source(): yield (frame, meta) can be used directly by the pipeline

    QA Scenarios:

    Scenario: opencv_source reads frames from a video file
      Tool: Bash
      Preconditions: Any .mp4 or .avi file available (use sample video or create synthetic one)
      Steps:
        1. Create a short test video if none exists:
           `uv run python -c "import cv2; import numpy as np; w=cv2.VideoWriter('/tmp/test.avi', cv2.VideoWriter_fourcc(*'MJPG'), 15, (640,480)); [w.write(np.random.randint(0,255,(480,640,3),dtype=np.uint8)) for _ in range(30)]; w.release(); print('VIDEO_CREATED')"`
        2. Run `uv run python -c "`
           ```python
           from opengait.demo.input import create_source
           src = create_source('/tmp/test.avi', max_frames=10)
           count = 0
           for frame, meta in src:
               assert frame.shape[2] == 3, f'Not BGR: {frame.shape}'
               assert 'frame_count' in meta
               count += 1
           assert count == 10, f'Expected 10 frames, got {count}'
           print('OPENCV_SOURCE_OK')
           ```
      Expected Result: Prints 'OPENCV_SOURCE_OK'
      Failure Indicators: Shape error, missing metadata, wrong frame count
      Evidence: .sisyphus/evidence/task-4-opencv-source.txt
    
    Scenario: Custom generator works as pipeline input
      Tool: Bash
      Steps:
        1. Run `uv run python -c "`
           ```python
           import numpy as np
           from opengait.demo.input import FrameStream
           import typing
           # Any generator works — no class needed
           def my_source():
               for i in range(5):
                   yield np.zeros((480, 640, 3), dtype=np.uint8), {'frame_count': i}
           src = my_source()
           frames = list(src)
           assert len(frames) == 5
           print('CUSTOM_GENERATOR_OK')
           ```
      Expected Result: Prints 'CUSTOM_GENERATOR_OK'
      Failure Indicators: Type error, protocol mismatch
      Evidence: .sisyphus/evidence/task-2-custom-gen.txt
    

    Commit: YES

    • Message: feat(demo): add generator-based input adapters for cv-mmap and OpenCV
    • Files: opengait/demo/input.py
    • Pre-commit: uv run python -c "from opengait.demo.input import create_source"
  • 5. Sliding Window / Ring Buffer Manager

    What to do:

    • Create opengait/demo/window.py
    • Class SilhouetteWindow:
      • Constructor: __init__(self, window_size: int = 30, stride: int = 1, gap_threshold: int = 15)
      • Internal storage: collections.deque(maxlen=window_size) of np.ndarray (64×44 float32)
      • push(sil: np.ndarray, frame_idx: int, track_id: int) -> None:
        • If track_id differs from current tracked ID → reset buffer, update tracked ID
        • If frame_idx - last_frame_idx > gap_threshold → reset buffer (too many missed frames)
        • Append silhouette to deque
        • Increment internal frame counter
      • is_ready() -> bool: returns len(buffer) == window_size
      • should_classify() -> bool: returns is_ready() and (frames_since_last_classify >= stride)
      • get_tensor(device: str = 'cpu') -> torch.Tensor:
        • Stack buffer into np.array shape [window_size, 64, 44]
        • Convert to torch.Tensor shape [1, 1, window_size, 64, 44] on device
        • This is the exact input shape for ScoNetDemo
      • reset() -> None: clear buffer and counters
      • mark_classified() -> None: reset frames_since_last_classify counter
      • Properties: current_track_id, frame_count, fill_level (len/window_size as float)
    • Single-person selection policy (function or small helper):
      • select_person(results) -> tuple[np.ndarray, tuple, int] | None
      • From YOLO results, select the detection with the largest bounding box area
      • Return (mask, bbox, track_id) or None if no valid detection
      • If result.boxes.id is None (tracker not yet initialized), skip frame
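
    The push/reset policy above can be sketched as follows. This is a minimal numpy-only sketch: the torch conversion in `get_tensor()` is reduced to a `get_array()` returning the stacked numpy array (the real method would wrap it with `torch.from_numpy`), and internal field names are illustrative, not prescribed.

    ```python
    from collections import deque
    import numpy as np

    class SilhouetteWindow:
        """Sketch of the fill/reset logic; torch conversion and fill_level omitted."""

        def __init__(self, window_size: int = 30, stride: int = 1, gap_threshold: int = 15):
            self.window_size, self.stride, self.gap_threshold = window_size, stride, gap_threshold
            self.buffer: deque = deque(maxlen=window_size)  # bounded by construction
            self.current_track_id = None
            self.last_frame_idx = None
            self.frames_since_last_classify = 0

        def push(self, sil: np.ndarray, frame_idx: int, track_id: int) -> None:
            # A track switch or a gap of dropped frames invalidates the window.
            if track_id != self.current_track_id:
                self.reset()
                self.current_track_id = track_id
            elif self.last_frame_idx is not None and frame_idx - self.last_frame_idx > self.gap_threshold:
                self.reset()
                self.current_track_id = track_id
            self.buffer.append(sil)
            self.last_frame_idx = frame_idx
            self.frames_since_last_classify += 1

        def is_ready(self) -> bool:
            return len(self.buffer) == self.window_size

        def should_classify(self) -> bool:
            return self.is_ready() and self.frames_since_last_classify >= self.stride

        def mark_classified(self) -> None:
            self.frames_since_last_classify = 0

        def reset(self) -> None:
            self.buffer.clear()
            self.last_frame_idx = None
            self.frames_since_last_classify = 0

        @property
        def frame_count(self) -> int:
            return len(self.buffer)

        def get_array(self) -> np.ndarray:
            # Shape [1, 1, T, 64, 44]; get_tensor() would be torch.from_numpy(...) of this.
            return np.stack(self.buffer)[None, None]
    ```

    Note the `deque(maxlen=...)` choice: once full, appending silently drops the oldest frame, so the "no unbounded buffers" constraint holds without any explicit eviction code.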

    Must NOT do:

    • No unbounded buffers — deque with maxlen enforces this
    • No multi-person tracking — single person only, select largest bbox
    • No time-based windowing — frame-count based only

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Data structure + policy logic with edge cases (ID changes, gaps, resets)
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 2 (with Tasks 6, 7, 8)
    • Blocks: Tasks 9, 10
    • Blocked By: Task 3 (needs silhouette shape constants from preprocess.py)

    References:

    Pattern References:

    • opengait/demo/preprocess.py (Task 3) — SIL_HEIGHT, SIL_WIDTH constants. The window stores arrays of this shape.
    • opengait/data/dataset.py — Shows how OpenGait's DataSet samples fixed-length sequences. The seqL parameter controls sequence length (our window_size=30).

    API/Type References:

    • Ultralytics Results.boxes.id — Track IDs tensor, may be None if tracker hasn't assigned IDs yet
    • Ultralytics Results.boxes.xyxy — Bounding boxes [N, 4] for area calculation
    • Ultralytics Results.masks.data — Binary masks [N, H, W]

    WHY Each Reference Matters:

    • preprocess.py: Window must store silhouettes of the exact shape produced by preprocessing
    • dataset.py: Understanding how training samples sequences helps ensure our window matches
    • Ultralytics API: Need to handle None track IDs and extract correct tensors

    Acceptance Criteria:

    • opengait/demo/window.py exists with SilhouetteWindow class and select_person function
    • Buffer is bounded (deque with maxlen)
    • get_tensor() returns shape [1, 1, 30, 64, 44] when full
    • Track ID change triggers reset
    • Gap exceeding threshold triggers reset

    QA Scenarios:

    Scenario: Window fills and produces correct tensor shape
      Tool: Bash
      Preconditions: Module importable
      Steps:
        1. Run `uv run python -c "`
           ```python
           import numpy as np
           from opengait.demo.window import SilhouetteWindow
           win = SilhouetteWindow(window_size=30, stride=1)
           for i in range(30):
               sil = np.random.rand(64, 44).astype(np.float32)
               win.push(sil, frame_idx=i, track_id=1)
           assert win.is_ready(), 'Window should be ready after 30 frames'
           t = win.get_tensor()
           assert t.shape == (1, 1, 30, 64, 44), f'Bad shape: {t.shape}'
           assert t.dtype.is_floating_point, f'Bad dtype: {t.dtype}'
           print('WINDOW_FILL_OK')
           ```
      Expected Result: Prints 'WINDOW_FILL_OK'
      Failure Indicators: Shape mismatch, not ready after 30 pushes
      Evidence: .sisyphus/evidence/task-5-window-fill.txt
    
    Scenario: Track ID change resets buffer
      Tool: Bash
      Steps:
        1. Run `uv run python -c "`
           ```python
           import numpy as np
           from opengait.demo.window import SilhouetteWindow
           win = SilhouetteWindow(window_size=30)
           for i in range(20):
               win.push(np.random.rand(64, 44).astype(np.float32), frame_idx=i, track_id=1)
           assert win.frame_count == 20
           # Switch track ID — should reset
           win.push(np.random.rand(64, 44).astype(np.float32), frame_idx=20, track_id=2)
           assert win.frame_count == 1, f'Should reset to 1, got {win.frame_count}'
           assert win.current_track_id == 2
           print('TRACK_RESET_OK')
           ```
      Expected Result: Prints 'TRACK_RESET_OK'
      Failure Indicators: Buffer not reset, wrong track ID
      Evidence: .sisyphus/evidence/task-5-track-reset.txt
    

    Commit: YES

    • Message: feat(demo): add sliding window manager with single-person selection
    • Files: opengait/demo/window.py
    • Pre-commit: uv run python -c "from opengait.demo.window import SilhouetteWindow"
  • 6. NATS JSON Publisher

    What to do:

    • Create opengait/demo/output.py
    • Class ResultPublisher(Protocol) — any object with publish(result: dict) -> None
    • Simple class ConsolePublisher (a plain function-based publisher is also acceptable):
      • Prints JSON to stdout (default when --nats-url is not provided)
      • Format: one JSON object per line (JSONL)
    • Class NatsPublisher:
      • Constructor: __init__(self, url: str = 'nats://127.0.0.1:4222', subject: str = 'scoliosis.result')
      • Uses nats-py async client, bridged to sync publish() method
      • Handles connection failures gracefully (log warning, retry with backoff, don't crash pipeline)
      • Handles reconnection automatically (nats-py does this by default)
      • publish(result: dict) -> None: serializes to JSON, publishes to subject
      • close() -> None: drain and close NATS connection
      • Context manager support (__enter__/__exit__)
    • JSON schema for results:
      {
        "frame": 1234,
        "track_id": 1,
        "label": "positive",
        "confidence": 0.82,
        "window": 30,
        "timestamp_ns": 1234567890000
      }
      
    • Factory: create_publisher(nats_url: str | None, subject: str = 'scoliosis.result') -> ResultPublisher
      • If nats_url is None → ConsolePublisher
      • Otherwise → NatsPublisher(url, subject)
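
    The console path and the factory dispatch above can be sketched as follows. Only the console branch is fleshed out here; the NATS branch is a hedged stub, since the real `NatsPublisher` must bridge the async nats-py client.

    ```python
    import json
    import sys

    class ConsolePublisher:
        """Default publisher: one JSON object per line (JSONL) on stdout."""

        def publish(self, result: dict) -> None:
            sys.stdout.write(json.dumps(result) + "\n")
            sys.stdout.flush()

        def close(self) -> None:
            pass  # nothing to release for stdout

    def create_publisher(nats_url, subject: str = "scoliosis.result"):
        # No URL means console output, per the CLI contract.
        if nats_url is None:
            return ConsolePublisher()
        # Stub: the real implementation returns NatsPublisher(nats_url, subject),
        # wrapping the async nats-py client behind a sync publish().
        raise NotImplementedError("NatsPublisher sketch omitted; requires nats-py")
    ```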

    Must NOT do:

    • Don't use JetStream (plain NATS PUB/SUB is sufficient)
    • Don't build custom binary protocol
    • Don't buffer/batch results — publish immediately

    Recommended Agent Profile:

    • Category: quick
      • Reason: Straightforward JSON serialization + NATS client wrapper, well-documented library
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 2 (with Tasks 5, 7, 8)
    • Blocks: Tasks 9, 13
    • Blocked By: Task 1 (needs project scaffolding for nats-py dependency)

    References:

    External References:

    • nats-py docs: import nats; nc = await nats.connect(); await nc.publish(subject, payload) — async API; payload must be bytes (e.g. json.dumps(result).encode())
    • /home/crosstyan/Code/cv-mmap-gui/ — Uses NATS.c for messaging; our Python publisher sends to the same broker

    WHY Each Reference Matters:

    • nats-py: Need to bridge async NATS client to sync publish() call
    • cv-mmap-gui: Confirms NATS is the right transport for this ecosystem

    Acceptance Criteria:

    • opengait/demo/output.py exists with ConsolePublisher, NatsPublisher, create_publisher
    • ConsolePublisher prints valid JSON to stdout
    • NatsPublisher connects and publishes without crashing (when NATS available)
    • NatsPublisher logs warning and doesn't crash when NATS unavailable

    QA Scenarios:

    Scenario: ConsolePublisher outputs valid JSONL
      Tool: Bash
      Steps:
        1. Run `uv run python -c "`
           ```python
           import json, io, sys
           from opengait.demo.output import create_publisher
           pub = create_publisher(nats_url=None)
           result = {'frame': 100, 'track_id': 1, 'label': 'positive', 'confidence': 0.85, 'window': 30, 'timestamp_ns': 0}
           pub.publish(result)  # should print to stdout
           print('CONSOLE_PUB_OK')
           ```
      Expected Result: Prints a JSON line followed by 'CONSOLE_PUB_OK'
      Failure Indicators: Invalid JSON, missing fields, crash
      Evidence: .sisyphus/evidence/task-6-console-pub.txt
    
    Scenario: NatsPublisher handles missing server gracefully
      Tool: Bash
      Steps:
        1. Run `uv run python -c "`
           ```python
           from opengait.demo.output import create_publisher
           try:
               pub = create_publisher(nats_url='nats://127.0.0.1:14222')  # wrong port, no server
               pub.publish({'frame': 0, 'label': 'test'})
           except SystemExit:
               print('SHOULD_NOT_EXIT')
               raise
           print('NATS_GRACEFUL_OK')
           ```
      Expected Result: Prints 'NATS_GRACEFUL_OK' (warning logged, no crash)
      Failure Indicators: Unhandled exception, SystemExit, hang
      Evidence: .sisyphus/evidence/task-6-nats-graceful.txt
    

    Commit: YES

    • Message: feat(demo): add NATS JSON publisher and console fallback
    • Files: opengait/demo/output.py
    • Pre-commit: uv run python -c "from opengait.demo.output import create_publisher"
  • 7. Unit Tests — Silhouette Preprocessing

    What to do:

    • Create tests/demo/test_preprocess.py
    • Test mask_to_silhouette() with:
      • Valid mask (person-shaped blob) → output shape (64, 44), dtype float32, range [0, 1]
      • Tiny mask below MIN_MASK_AREA → returns None
      • Empty mask (all zeros) → returns None
      • Full-frame mask (all 255) → produces valid output (edge case: very wide person)
      • Tall narrow mask → verify aspect ratio handling (resize h=64, center-crop width)
      • Wide short mask → verify handling (should still produce 64×44)
    • Test determinism: same input always produces same output
    • Test against a reference .pkl sample if available:
      • Load a known .pkl file from Scoliosis1K
      • Extract one frame
      • Compare our preprocessing output to the stored frame (should be close/identical)
    • Verify jaxtyping annotations are present and beartype checks fire on wrong shapes

    Must NOT do:

    • Don't test YOLO integration here — only test the mask_to_silhouette function in isolation
    • Don't require GPU — all preprocessing is CPU numpy ops

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Must verify pixel-level correctness against training data contract, multiple edge cases
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 2 (with Tasks 5, 6, 8)
    • Blocks: None (verification task)
    • Blocked By: Task 3 (preprocess module must exist)

    References:

    Pattern References:

    • opengait/demo/preprocess.py (Task 3) — The module under test
    • datasets/pretreatment.py:18-96 — Reference preprocessing to validate against
    • opengait/data/transform.py:46-58 — BaseSilCuttingTransform for expected output contract

    WHY Each Reference Matters:

    • preprocess.py: Direct test target
    • pretreatment.py: Ground truth for what a correct silhouette looks like
    • BaseSilCuttingTransform: Defines the 64→44 cut + /255 contract we must match

    Acceptance Criteria:

    • tests/demo/test_preprocess.py exists with ≥5 test cases
    • uv run pytest tests/demo/test_preprocess.py -q passes
    • Tests cover: valid mask, tiny mask, empty mask, aspect-ratio handling, determinism

    QA Scenarios:

    Scenario: All preprocessing tests pass
      Tool: Bash
      Preconditions: Task 3 (preprocess.py) is complete
      Steps:
        1. Run `uv run pytest tests/demo/test_preprocess.py -v`
      Expected Result: All tests pass (≥5 tests), exit code 0
      Failure Indicators: Any assertion failure, import error
      Evidence: .sisyphus/evidence/task-7-preprocess-tests.txt
    
    Scenario: Jaxtyping annotation enforcement works
      Tool: Bash
      Steps:
        1. Run `uv run python -c "`
           ```python
           import numpy as np
           from opengait.demo.preprocess import mask_to_silhouette
           # Intentionally wrong type to verify beartype catches it
           try:
               mask_to_silhouette('not_an_array', (0, 0, 10, 10))
               print('BEARTYPE_MISSED')  # should not reach here
           except Exception as e:
               if 'beartype' in type(e).__module__ or 'BeartypeCallHint' in type(e).__name__:
                   print('BEARTYPE_OK')
               else:
                   print(f'WRONG_ERROR: {type(e).__name__}: {e}')
           ```
      Expected Result: Prints 'BEARTYPE_OK'
      Failure Indicators: Prints 'BEARTYPE_MISSED' or 'WRONG_ERROR'
      Evidence: .sisyphus/evidence/task-7-beartype-check.txt
    

    Commit: YES (groups with Task 8)

    • Message: test(demo): add preprocessing and model unit tests
    • Files: tests/demo/test_preprocess.py
    • Pre-commit: uv run pytest tests/demo/test_preprocess.py -q
  • 8. Unit Tests — ScoNetDemo Forward Pass

    What to do:

    • Create tests/demo/test_sconet_demo.py
    • Test ScoNetDemo construction:
      • Loads config from YAML
      • Loads checkpoint weights
      • Model is in eval mode
    • Test forward() with dummy tensor:
      • Input: torch.rand(1, 1, 30, 64, 44) on available device
      • Output logits shape: (1, 3, 16)
      • Output dtype: float32
    • Test predict() convenience method:
      • Returns (label_str, confidence_float)
      • label_str is one of {'negative', 'neutral', 'positive'}
      • confidence is in [0.0, 1.0]
    • Test with various batch sizes: N=1, N=2
    • Test with various sequence lengths if model supports it (should work with 30)
    • Verify no torch.distributed calls are made (mock torch.distributed to raise if called)
    • Verify jaxtyping shape annotations on forward/predict signatures

    Must NOT do:

    • Don't test with real video data — dummy tensors only for unit tests
    • Don't modify the checkpoint

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Requires CUDA, checkpoint loading, shape validation — moderate complexity
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 2 (with Tasks 5, 6, 7)
    • Blocks: None (verification task)
    • Blocked By: Task 2 (ScoNetDemo must exist)

    References:

    Pattern References:

    • opengait/demo/sconet_demo.py (Task 1) — The module under test
    • opengait/evaluation/evaluator.py:evaluate_scoliosis() (line ~418) — Canonical prediction logic to validate against

    Config/Checkpoint References:

    • configs/sconet/sconet_scoliosis1k.yaml — Config file to pass to ScoNetDemo
    • ./ckpt/ScoNet-20000.pt — Trained checkpoint

    WHY Each Reference Matters:

    • sconet_demo.py: Direct test target
    • evaluator.py: Defines expected prediction behavior (argmax of mean logits)

    Acceptance Criteria:

    • tests/demo/test_sconet_demo.py exists with ≥4 test cases
    • uv run pytest tests/demo/test_sconet_demo.py -q passes
    • Tests cover: construction, forward shape, predict output, no-DDP enforcement

    QA Scenarios:

    Scenario: All ScoNetDemo tests pass
      Tool: Bash
      Preconditions: Task 1 (sconet_demo.py) is complete, checkpoint exists, CUDA available
      Steps:
        1. Run `CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_sconet_demo.py -v`
      Expected Result: All tests pass (≥4 tests), exit code 0
      Failure Indicators: state_dict key mismatch, shape error, CUDA OOM
      Evidence: .sisyphus/evidence/task-8-sconet-tests.txt
    
    Scenario: No DDP leakage in ScoNetDemo
      Tool: Bash
      Steps:
        1. Run `grep -rn 'torch.distributed' opengait/demo/sconet_demo.py`
        2. Run `grep -rn 'ddp_all_gather\|get_ddp_module\|get_rank' opengait/demo/sconet_demo.py`
      Expected Result: Both commands produce no output (exit code 1 = no matches)
      Failure Indicators: Any match found
      Evidence: .sisyphus/evidence/task-8-no-ddp.txt
    

    Commit: YES (groups with Task 7)

    • Message: test(demo): add preprocessing and model unit tests
    • Files: tests/demo/test_sconet_demo.py
    • Pre-commit: uv run pytest tests/demo/test_sconet_demo.py -q
  • 9. Main Pipeline Application + CLI

    What to do:

    • Create opengait/demo/pipeline.py — the main orchestrator
    • Create opengait/demo/__main__.py — CLI entry point (replace stub from Task 4)
    • Pipeline class ScoliosisPipeline:
      • Constructor: __init__(self, source: FrameStream, model: ScoNetDemo, publisher: ResultPublisher, window: SilhouetteWindow, yolo_model_path: str = 'yolo11n-seg.pt', device: str = 'cuda:0')
      • Uses jaxtyping annotations for all tensor-bearing methods:
        from jaxtyping import Float, UInt8, jaxtyped
        from beartype import beartype
        from torch import Tensor
        from numpy import ndarray
        
      • run() -> None — main loop:
        1. Load YOLO model: ultralytics.YOLO(yolo_model_path)
        2. For each (frame, meta) from source:
           a. Run yolo_model.track(frame, persist=True, verbose=False) → results
           b. select_person(results) → (mask, bbox, track_id) or None → skip if None
           c. mask_to_silhouette(mask, bbox) → sil or None → skip if None
           d. window.push(sil, meta['frame_count'], track_id)
           e. If window.should_classify():
          • tensor = window.get_tensor(device=self.device)
          • label, confidence = self.model.predict(tensor)
          • publisher.publish({...}) with JSON schema fields
          • window.mark_classified()
        3. Log FPS every 100 frames
        4. Cleanup on exit (close publisher, release resources)
      • Graceful shutdown on KeyboardInterrupt / SIGTERM
    • CLI via __main__.py using click:
      • --source (required): video path, camera index, or cvmmap://name
      • --checkpoint (required): path to ScoNet checkpoint
      • --config (default: ./configs/sconet/sconet_scoliosis1k.yaml): ScoNet config YAML
      • --device (default: cuda:0): torch device
      • --yolo-model (default: yolo11n-seg.pt): YOLO model path (auto-downloads)
      • --window (default: 30): sliding window size
      • --stride (default: 30): classify every N frames after window is full
      • --nats-url (default: None): NATS server URL, None = console output
      • --nats-subject (default: scoliosis.result): NATS subject
      • --max-frames (default: None): stop after N frames
      • --help: print usage
    • Entrypoint: uv run python -m opengait.demo ...
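
    The loop shape described above can be sketched with all dependencies injected as plain callables. Everything here is a stand-in: `detect` collapses the YOLO track + select_person + mask_to_silhouette steps, and `model` stands in for `ScoNetDemo.predict(window.get_tensor(device))`; the real class wires in the concrete components.

    ```python
    import time

    def run_loop(source, detect, window, model, publisher, log_every=100):
        """Dependency-injected sketch of ScoliosisPipeline.run()."""
        start, processed = time.monotonic(), 0
        try:
            for frame, meta in source:
                processed += 1
                picked = detect(frame)  # stand-in: YOLO track -> select_person -> silhouette
                if picked is None:
                    continue  # no usable detection this frame
                sil, track_id = picked
                window.push(sil, meta["frame_count"], track_id)
                if window.should_classify():
                    label, confidence = model(window)  # stand-in for ScoNetDemo.predict
                    publisher.publish({
                        "frame": meta["frame_count"],
                        "track_id": track_id,
                        "label": label,
                        "confidence": confidence,
                        "window": window.window_size,
                        "timestamp_ns": time.time_ns(),
                    })
                    window.mark_classified()
                if processed % log_every == 0:
                    print(f"fps={processed / (time.monotonic() - start):.1f}")
        except KeyboardInterrupt:
            pass  # graceful shutdown: fall through to cleanup
        finally:
            publisher.close()
    ```

    Keeping the loop synchronous and single-threaded (per the constraints above) makes the `finally` block sufficient for cleanup; no task cancellation or thread joins are needed.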

    Must NOT do:

    • No async in the main loop — synchronous pull-process-publish
    • No multi-threading for inference — single-threaded pipeline
    • No GUI / frame display / cv2.imshow
    • No unbounded accumulation — ring buffer handles memory
    • No auto-download of ScoNet checkpoint — user must provide path

    Recommended Agent Profile:

    • Category: deep
      • Reason: Integration of all components, error handling, CLI design, FPS logging — largest single task
    • Skills: []

    Parallelization:

    • Can Run In Parallel: NO
    • Parallel Group: Wave 3 (sequential — depends on most Wave 1+2 tasks)
    • Blocks: Tasks 12, 13
    • Blocked By: Tasks 2, 3, 4, 5, 6 (all components must exist)

    References:

    Pattern References:

    • opengait/demo/sconet_demo.py (Task 1) — ScoNetDemo class, predict() method
    • opengait/demo/preprocess.py (Task 3) — mask_to_silhouette(), frame_to_person_mask()
    • opengait/demo/window.py (Task 5) — SilhouetteWindow, select_person()
    • opengait/demo/input.py (Task 2) — create_source(), FrameStream type alias
    • opengait/demo/output.py (Task 6) — create_publisher(), ResultPublisher

    External References:

    • Ultralytics tracking API: model.track(frame, persist=True) — returns Results list
    • Ultralytics result object: results[0].masks.data, results[0].boxes.xyxy, results[0].boxes.id

    WHY Each Reference Matters:

    • All Task refs: This task composes every component — must know each API surface
    • Ultralytics: The YOLO .track() call is the only external API used directly in this file

    Acceptance Criteria:

    • opengait/demo/pipeline.py exists with ScoliosisPipeline class
    • opengait/demo/__main__.py exists with click CLI
    • uv run python -m opengait.demo --help prints usage without errors
    • All public methods have jaxtyping annotations where tensor/array args are involved

    QA Scenarios:

    Scenario: CLI --help works
      Tool: Bash
      Steps:
        1. Run `uv run python -m opengait.demo --help`
      Expected Result: Prints usage showing --source, --checkpoint, --device, --window, --stride, --nats-url, --max-frames
      Failure Indicators: ImportError, missing arguments, crash
      Evidence: .sisyphus/evidence/task-9-help.txt
    
    Scenario: Pipeline runs with sample video (no NATS)
      Tool: Bash
      Preconditions: sample.mp4 exists (from Task 11), checkpoint exists, CUDA available
      Steps:
        1. Run `CUDA_VISIBLE_DEVICES=0 uv run python -m opengait.demo --source ./assets/sample.mp4 --checkpoint ./ckpt/ScoNet-20000.pt --device cuda:0 --max-frames 120 2>&1 | tee /tmp/pipeline_output.txt`
        2. Count prediction lines: `grep -c '"label"' /tmp/pipeline_output.txt`
      Expected Result: Exit code 0, at least 1 prediction line with valid JSON containing "label" field
      Failure Indicators: Crash, no predictions, invalid JSON, CUDA error
      Evidence: .sisyphus/evidence/task-9-pipeline-run.txt
    
    Scenario: Pipeline handles missing video gracefully
      Tool: Bash
      Steps:
        1. Run `uv run python -m opengait.demo --source /nonexistent/video.mp4 --checkpoint ./ckpt/ScoNet-20000.pt 2>&1; echo "EXIT_CODE=$?"`
      Expected Result: Prints error message about missing file, exits with non-zero code (no traceback dump)
      Failure Indicators: Unhandled exception with full traceback, exit code 0
      Evidence: .sisyphus/evidence/task-9-missing-video.txt
    

    Commit: YES

    • Message: feat(demo): add main pipeline application with CLI entry point
    • Files: opengait/demo/pipeline.py, opengait/demo/__main__.py
    • Pre-commit: uv run python -m opengait.demo --help
  • 10. Unit Tests — Single-Person Policy + Window Reset

    What to do:

    • Create tests/demo/test_window.py
    • Test SilhouetteWindow:
      • Fill to capacity → is_ready() returns True
      • Underfilled → is_ready() returns False
      • Track ID change resets buffer
      • Frame gap exceeding threshold resets buffer
      • get_tensor() returns correct shape [1, 1, window_size, 64, 44]
      • should_classify() respects stride
    • Test select_person():
      • Single detection → returns it
      • Multiple detections → returns largest bbox area
      • No detections → returns None
      • Detections without track IDs (tracker not initialized) → returns None
    • Use mock YOLO results (don't require actual YOLO model)
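
    One way to mock YOLO results without ultralytics is to duck-type only the attributes `select_person` reads. The helper and the reference policy below are sketches: `select_person_sketch` mirrors the largest-bbox rule from the Task 5 spec, and all names are illustrative.

    ```python
    from types import SimpleNamespace
    import numpy as np

    def make_mock_results(boxes_xyxy, track_ids, mask_hw=(480, 640)):
        """Duck-typed stand-in for a one-element ultralytics Results list."""
        n = len(boxes_xyxy)
        boxes = SimpleNamespace(
            xyxy=np.asarray(boxes_xyxy, dtype=np.float32),
            id=None if track_ids is None else np.asarray(track_ids),
        )
        masks = SimpleNamespace(data=np.zeros((n, *mask_hw), dtype=np.uint8))
        return [SimpleNamespace(boxes=boxes, masks=masks)]

    def select_person_sketch(results):
        """Reference behaviour: pick the largest bbox; None if no IDs or no boxes."""
        r = results[0]
        if r.boxes is None or r.boxes.id is None or len(r.boxes.xyxy) == 0:
            return None
        xyxy = r.boxes.xyxy
        areas = (xyxy[:, 2] - xyxy[:, 0]) * (xyxy[:, 3] - xyxy[:, 1])
        i = int(areas.argmax())
        return r.masks.data[i], tuple(xyxy[i]), int(r.boxes.id[i])
    ```

    Because the mock only carries `boxes.xyxy`, `boxes.id`, and `masks.data`, the tests stay decoupled from ultralytics internals and run without the model file.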

    Must NOT do:

    • Don't require GPU — window tests are CPU-only (get_tensor can use cpu device)
    • Don't require YOLO model file — mock the results

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Multiple edge cases to cover, mock YOLO results require understanding ultralytics API
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 3 (with Tasks 9, 11)
    • Blocks: None (verification task)
    • Blocked By: Task 5 (window module must exist)

    References:

    Pattern References:

    • opengait/demo/window.py (Task 5) — Module under test

    WHY Each Reference Matters:

    • Direct test target

    Acceptance Criteria:

    • tests/demo/test_window.py exists with ≥6 test cases
    • uv run pytest tests/demo/test_window.py -q passes

    QA Scenarios:

    Scenario: All window and single-person tests pass
      Tool: Bash
      Steps:
        1. Run `uv run pytest tests/demo/test_window.py -v`
      Expected Result: All tests pass (≥6 tests), exit code 0
      Failure Indicators: Assertion failures, import errors
      Evidence: .sisyphus/evidence/task-10-window-tests.txt
    

    Commit: YES

    • Message: test(demo): add window manager and single-person policy tests
    • Files: tests/demo/test_window.py
    • Pre-commit: uv run pytest tests/demo/test_window.py -q
  • 11. Sample Video for Smoke Testing

    What to do:

    • Acquire or create a short sample video for pipeline smoke testing
    • Options (in order of preference):
      1. Extract a short clip (5-10 seconds, ~120 frames at 15fps) from an existing Scoliosis1K raw video if accessible
      2. Record a short clip using webcam via cv2.VideoCapture(0)
      3. Generate a synthetic video with a person-shaped blob moving across frames
    • Save to ./assets/sample.mp4 (or ./assets/sample.avi)
    • Requirements: contains at least one person walking, 720p or lower, ≥60 frames
    • If no real video is available, create a synthetic one:
      • 120 frames, 640×480, 15fps
      • White rectangle (simulating person silhouette) moving across dark background
      • This won't test YOLO detection quality but will verify pipeline doesn't crash
    • Add assets/sample.mp4 to .gitignore if it's large (>10MB)
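
    The synthetic fallback (option 3) can be generated with pure numpy, which also keeps it unit-testable without a codec. This is a sketch; sizes and the "person" rectangle are illustrative.

    ```python
    import numpy as np

    def synthetic_person_frames(n_frames=120, size=(480, 640), box=(60, 200)):
        """Yield BGR uint8 frames with a white rectangle sliding left to right."""
        h, w = size
        bw, bh = box  # rectangle width/height, a crude person stand-in
        for i in range(n_frames):
            frame = np.zeros((h, w, 3), dtype=np.uint8)
            x = int((w - bw) * i / max(n_frames - 1, 1))  # sweep across the frame
            y = (h - bh) // 2
            frame[y:y + bh, x:x + bw] = 255
            yield frame
    ```

    Writing these frames out with cv2.VideoWriter (MJPG fourcc, 15 fps, as in the Task 2 scenario) produces the smoke-test clip; note that VideoWriter takes (width, height) while each frame array is (height, width, 3).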

    Must NOT do:

    • Don't use any Scoliosis1K dataset files that are symlinked (user constraint)
    • Don't commit large video files to git

    Recommended Agent Profile:

    • Category: quick
      • Reason: Simple file creation/acquisition task
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 3 (with Tasks 9, 10)
    • Blocks: Task 12
    • Blocked By: Task 1 (needs OpenCV dependency from scaffolding)

    References: None needed — standalone task

    Acceptance Criteria:

    • ./assets/sample.mp4 (or .avi) exists
    • Video has ≥60 frames
    • Playable with uv run python -c "import cv2; cap=cv2.VideoCapture('./assets/sample.mp4'); print(f'frames={int(cap.get(cv2.CAP_PROP_FRAME_COUNT))}'); cap.release()"

    QA Scenarios:

    Scenario: Sample video is valid
      Tool: Bash
      Steps:
        1. Run `uv run python -c "`
           ```python
           import cv2
           cap = cv2.VideoCapture('./assets/sample.mp4')
           assert cap.isOpened(), 'Cannot open video'
           n = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
           assert n >= 60, f'Too few frames: {n}'
           ret, frame = cap.read()
           assert ret and frame is not None, 'Cannot read first frame'
           h, w = frame.shape[:2]
           assert h >= 240 and w >= 320, f'Too small: {w}x{h}'
           cap.release()
           print(f'SAMPLE_VIDEO_OK frames={n} size={w}x{h}')
           ```
      Expected Result: Prints 'SAMPLE_VIDEO_OK' with frame count ≥60
      Failure Indicators: Cannot open, too few frames, too small
      Evidence: .sisyphus/evidence/task-11-sample-video.txt
    

    Commit: YES

    • Message: chore(demo): add sample video for smoke testing
    • Files: assets/sample.mp4 (or add to .gitignore and document)
    • Pre-commit: none

  • 12. Integration Tests — End-to-End Smoke Test

    What to do:

    • Create tests/demo/test_pipeline.py
    • Integration test: run the full pipeline with sample video, no NATS
      • Uses subprocess.run() to invoke python -m opengait.demo
      • Captures stdout, parses JSON predictions
      • Asserts: exit code 0, ≥1 prediction, valid JSON schema
    • Test graceful exit on end-of-video
    • Test --max-frames flag: run with max_frames=60, verify it stops
    • Test error handling: invalid source path → non-zero exit, error message
    • Test error handling: invalid checkpoint path → non-zero exit, error message
    • FPS benchmark (informational, not a hard assertion):
      • Run pipeline on sample video, measure wall time, compute FPS
      • Log FPS to evidence file (target: ≥15 FPS on desktop GPU)
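
    Parsing predictions out of the captured stdout is the fiddly part of this test, since prediction JSONL is interleaved with log lines. A small helper like the sketch below (name illustrative) keeps the assertions clean.

    ```python
    import json

    def parse_predictions(stdout: str) -> list[dict]:
        """Extract JSON prediction objects from mixed pipeline stdout; ignore log lines."""
        preds = []
        for line in stdout.splitlines():
            line = line.strip()
            if not line.startswith("{"):
                continue  # FPS logs, banners, etc.
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                continue  # brace-leading noise that isn't JSON
            if "label" in obj:
                preds.append(obj)
        return preds
    ```

    The integration test then becomes: run the CLI via subprocess, call `parse_predictions(result.stdout)`, and assert the list is non-empty with the expected schema fields.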

    Must NOT do:

    • Don't require NATS server for this test — use console publisher
    • Don't hardcode CUDA device — use --device cuda:0 only if CUDA available, else skip

    Recommended Agent Profile:

    • Category: deep
      • Reason: Full integration test requiring all components working together, subprocess management, JSON parsing
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 4 (with Task 13)
    • Blocks: F1-F4 (Final verification)
    • Blocked By: Tasks 9 (pipeline), 11 (sample video)

    References:

    Pattern References:

    • opengait/demo/__main__.py (Task 9) — CLI flags to invoke
    • opengait/demo/output.py (Task 6) — JSON schema to validate

    WHY Each Reference Matters:

    • __main__.py: Need exact CLI flag names for subprocess invocation
    • output.py: Need JSON schema to assert against

    Acceptance Criteria:

    • tests/demo/test_pipeline.py exists with ≥4 test cases
    • CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_pipeline.py -q passes
    • Tests cover: happy path, max-frames, invalid source, invalid checkpoint

    QA Scenarios:

    Scenario: Full pipeline integration test passes
      Tool: Bash
      Preconditions: All components built, sample video exists, CUDA available
      Steps:
        1. Run `CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_pipeline.py -v --timeout=120`
      Expected Result: All tests pass (≥4), exit code 0
      Failure Indicators: Subprocess crash, JSON parse error, timeout
      Evidence: .sisyphus/evidence/task-12-integration.txt
    
    Scenario: FPS benchmark
      Tool: Bash
      Steps:
        1. Run `CUDA_VISIBLE_DEVICES=0 uv run python -c "`
           ```python
           import subprocess, time
           start = time.monotonic()
           result = subprocess.run(
               ['uv', 'run', 'python', '-m', 'opengait.demo',
                '--source', './assets/sample.mp4',
                '--checkpoint', './ckpt/ScoNet-20000.pt',
                 '--device', 'cuda:0'],
               capture_output=True, text=True, timeout=120)
           elapsed = time.monotonic() - start
           import cv2
           cap = cv2.VideoCapture('./assets/sample.mp4')
            n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)); cap.release()
           fps = n_frames / elapsed if elapsed > 0 else 0
           print(f'FPS_BENCHMARK frames={n_frames} elapsed={elapsed:.1f}s fps={fps:.1f}')
           assert fps >= 5, f'FPS too low: {fps}'  # conservative threshold
           ```
      Expected Result: Prints FPS benchmark, ≥5 FPS (conservative)
      Failure Indicators: Timeout, crash, FPS < 5
      Evidence: .sisyphus/evidence/task-12-fps-benchmark.txt
    

    Commit: YES

    • Message: test(demo): add integration and end-to-end smoke tests
    • Files: tests/demo/test_pipeline.py
    • Pre-commit: CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_pipeline.py -q
  • 13. NATS Integration Test

    What to do:

    • Create tests/demo/test_nats.py
    • Test requires NATS server (use Docker: docker run -d --rm --name nats-test -p 4222:4222 nats:2)
    • Mark tests with @pytest.mark.skipif if Docker/NATS not available
    • Test flow:
      1. Start NATS container
      2. Start a nats-py subscriber on scoliosis.result
      3. Run pipeline with --nats-url nats://127.0.0.1:4222 --max-frames 60
      4. Collect received messages
      5. Assert: ≥1 message received, valid JSON, correct schema
      6. Stop NATS container
    • Test NatsPublisher reconnection: start pipeline, stop NATS, restart NATS — pipeline should recover
    • JSON schema validation:
      • frame: int
      • track_id: int
      • label: str in {"negative", "neutral", "positive"}
      • confidence: float in [0, 1]
      • window: int (should equal window_size)
      • timestamp_ns: int
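The schema checks above can be collected into one helper that the test calls on every received payload. A minimal sketch, assuming the publisher emits one JSON object per NATS message with exactly the fields listed (the function name and sample values are illustrative):

```python
import json

ALLOWED_LABELS = {"negative", "neutral", "positive"}

def validate_result(payload: bytes, window_size: int = 30) -> dict:
    """Parse one NATS message payload and assert it matches the result schema."""
    msg = json.loads(payload)
    assert isinstance(msg["frame"], int)
    assert isinstance(msg["track_id"], int)
    assert msg["label"] in ALLOWED_LABELS
    assert isinstance(msg["confidence"], float)
    assert 0.0 <= msg["confidence"] <= 1.0
    assert msg["window"] == window_size
    assert isinstance(msg["timestamp_ns"], int)
    return msg

# Illustrative payload matching the schema above:
sample = json.dumps({
    "frame": 120, "track_id": 1, "label": "neutral",
    "confidence": 0.82, "window": 30, "timestamp_ns": 1_700_000_000_000_000_000,
}).encode()
validate_result(sample)
```

A schema mismatch then fails the test with a pinpointed assertion rather than a generic KeyError deep in the test body.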

    Must NOT do:

    • Don't leave Docker containers running after test
    • Don't hardcode the NATS port in the tests; use a fixture that finds an open port (the `docker run` examples in this plan use the default 4222 for illustration only)
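One way to satisfy the open-port rule, sketched as plain helpers that a pytest fixture can wrap (assumes the `docker` CLI is on PATH; the function names are illustrative, not part of the spec):

```python
import socket
import subprocess

def find_free_port() -> int:
    """Ask the OS for an ephemeral TCP port that is currently free."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

def start_nats_container():
    """Start a throwaway NATS server on a free host port.

    Returns (url, stop), where stop() removes the container. Callers should
    skip the test when Docker is unavailable rather than letting this fail.
    """
    port = find_free_port()
    cid = subprocess.run(
        ["docker", "run", "-d", "--rm", "-p", f"{port}:4222", "nats:2"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

    def stop():
        subprocess.run(["docker", "stop", cid], capture_output=True)

    return f"nats://127.0.0.1:{port}", stop
```

In the test module this would sit behind a `@pytest.fixture` that yields the URL and calls `stop()` in a `finally` block, with `pytest.skip` when `docker info` fails, which also guarantees no container outlives the test.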

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Docker orchestration + async NATS subscriber + subprocess pipeline — moderate complexity
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 4 (with Task 12)
    • Blocks: F1-F4 (Final verification)
    • Blocked By: Tasks 9 (pipeline), 6 (NATS publisher)

    References:

    Pattern References:

    • opengait/demo/output.py (Task 6) — NatsPublisher class, JSON schema

    External References:

    • nats-py subscriber: `sub = await nc.subscribe('scoliosis.result')`, then `msg = await sub.next_msg(timeout=10)`
    • Docker NATS: `docker run -d --rm --name nats-test -p 4222:4222 nats:2`

    WHY Each Reference Matters:

    • output.py: Need to match the exact subject and JSON schema the publisher produces
    • nats-py: Need subscriber API to consume and validate messages

    Acceptance Criteria:

    • tests/demo/test_nats.py exists with ≥2 test cases
    • Tests are skippable when Docker/NATS not available
    • CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_nats.py -q passes (when Docker available)

    QA Scenarios:

    Scenario: NATS receives valid prediction JSON
      Tool: Bash
      Preconditions: Docker available, CUDA available, sample video exists
      Steps:
        1. Run `docker run -d --rm --name nats-test -p 4222:4222 nats:2`
        2. Run `CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_nats.py -v --timeout=60`
        3. Run `docker stop nats-test`
      Expected Result: Tests pass, at least one valid JSON message received on scoliosis.result
      Failure Indicators: No messages, invalid JSON, schema mismatch, timeout
      Evidence: .sisyphus/evidence/task-13-nats-integration.txt
    
    Scenario: NATS test is skipped when Docker unavailable
      Tool: Bash
      Preconditions: Docker NOT running or not installed
      Steps:
        1. Run `uv run pytest tests/demo/test_nats.py -v -k 'nats' 2>&1 | head -20`
      Expected Result: Tests show as SKIPPED (not FAILED)
      Failure Indicators: Test fails instead of skipping
      Evidence: .sisyphus/evidence/task-13-nats-skip.txt
    

    Commit: YES

    • Message: test(demo): add NATS integration tests
    • Files: tests/demo/test_nats.py
    • Pre-commit: uv run pytest tests/demo/test_nats.py -q (skips if no Docker)

Final Verification Wave (MANDATORY — after ALL implementation tasks)

4 review agents run in PARALLEL. ALL must APPROVE. Rejection → fix → re-run.

  • F1. Plan Compliance Audit (oracle): Read the plan end-to-end. For each "Must Have": verify the implementation exists (read the file, run the command). For each "Must NOT Have": search the codebase for forbidden patterns (`torch.distributed` imports in demo/, `BaseModel` subclassing). Check that evidence files exist in `.sisyphus/evidence/`. Compare deliverables against the plan. Output: Must Have [N/N] | Must NOT Have [N/N] | Tasks [N/N] | VERDICT: APPROVE/REJECT

  • F2. Code Quality Review (unspecified-high): Run the linter plus `uv run pytest tests/demo/ -q`. Review all new files in `opengait/demo/` for: `as any` / `type: ignore` escapes, empty catches, `print` statements used instead of logging, commented-out code, unused imports. Check for AI slop: excessive comments, over-abstraction, generic variable names. Output: Tests [N pass/N fail] | Files [N clean/N issues] | VERDICT

  • F3. Real Manual QA (unspecified-high): Start from a clean state. Run the pipeline with the sample video: `uv run python -m opengait.demo --source ./assets/sample.mp4 --checkpoint ./ckpt/ScoNet-20000.pt --max-frames 120`. Verify predictions are printed to the console (no `--nats-url` means console output). Run with NATS: start the container, run the pipeline with `--nats-url nats://127.0.0.1:4222`, subscribe and validate the JSON schema. Test edge cases: missing video file (graceful error), missing checkpoint (graceful error), `--help` flag. Output: Scenarios [N/N pass] | Edge Cases [N tested] | VERDICT

  • F4. Scope Fidelity Check (deep): For each task: read "What to do", then read the files actually created. Verify a 1:1 match: everything in the spec was built (nothing missing), nothing beyond the spec was built (no creep). Check "Must NOT do" compliance: no `torch.distributed` in demo/, no `BaseModel` subclass, no TensorRT code, no multi-person logic. Flag unaccounted changes. Output: Tasks [N/N compliant] | Scope [CLEAN/N issues] | VERDICT
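The forbidden-pattern searches used by F1 and F4 can be scripted rather than done by eye. A crude substring scan, sketched below (the pattern list mirrors this plan's "Must NOT" items; the function name and target path are illustrative):

```python
from pathlib import Path

FORBIDDEN = ("torch.distributed", "BaseModel", "tensorrt")

def scan_forbidden(root, patterns=FORBIDDEN):
    """Return (file, line_no, pattern) hits for forbidden substrings in *.py files.

    Hits still need a human look (e.g. a comment that merely mentions
    BaseModel is not a violation), but an empty result is strong evidence
    of compliance.
    """
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        lines = path.read_text(errors="ignore").splitlines()
        for line_no, line in enumerate(lines, 1):
            for pat in patterns:
                if pat in line:
                    hits.append((str(path), line_no, pat))
    return hits
```

For approval, `scan_forbidden("opengait/demo")` should come back empty; any hit is reviewed manually before rejecting.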


Commit Strategy

  • Wave 1: feat(demo): add ScoNetDemo inference wrapper — sconet_demo.py
  • Wave 1: feat(demo): add input adapters and silhouette preprocessing — input.py, preprocess.py
  • Wave 1: chore(demo): scaffold demo package and test infrastructure — __init__.py, conftest, pyproject.toml
  • Wave 2: feat(demo): add sliding window manager and NATS publisher — window.py, output.py
  • Wave 2: test(demo): add preprocessing and model unit tests — test_preprocess.py, test_sconet_demo.py
  • Wave 3: feat(demo): add main pipeline application with CLI — pipeline.py, __main__.py
  • Wave 3: test(demo): add window manager and single-person policy tests — test_window.py
  • Wave 4: test(demo): add integration and NATS tests — test_pipeline.py, test_nats.py

Success Criteria

Verification Commands

# Smoke test (no NATS)
uv run python -m opengait.demo --source ./assets/sample.mp4 --checkpoint ./ckpt/ScoNet-20000.pt --max-frames 120
# Expected: exits 0, prints ≥3 prediction lines with label in {negative, neutral, positive}

# Unit tests
uv run pytest tests/demo/ -q
# Expected: all tests pass

# Help flag
uv run python -m opengait.demo --help
# Expected: prints usage with --source, --checkpoint, --device, --window, --stride, --nats-url, --max-frames

Final Checklist

  • All "Must Have" present
  • All "Must NOT Have" absent
  • All tests pass
  • Pipeline runs at ≥15 FPS on a desktop GPU (the automated benchmark gates at a conservative ≥5 FPS)
  • JSON schema matches spec
  • No torch.distributed imports in opengait/demo/