OpenGait/.sisyphus/plans/sconet-pipeline.md
crosstyan 3496a1beb7 docs(sisyphus): record sconet-pipeline plan and verification trail
Persist orchestration artifacts, including plan definition, progress state, decisions, issues, and learnings gathered during delegated execution and QA gates. This preserves implementation rationale and auditability without coupling documentation snapshots to runtime logic commits.
2026-02-27 09:59:26 +08:00


Real-Time Scoliosis Screening Pipeline (ScoNet)

TL;DR

Quick Summary: Build a production-oriented, real-time scoliosis screening pipeline inside the OpenGait repo. The pipeline reads video from cv-mmap shared memory or OpenCV, runs YOLO11n-seg for person detection/segmentation/tracking, extracts 64×44 binary silhouettes, feeds a sliding window of 30 frames into a DDP-free ScoNet wrapper, and publishes classification results (positive/neutral/negative) as JSON over NATS.

Deliverables:

  • ScoNetDemo — standalone nn.Module wrapper for ScoNet inference (no DDP)
  • Silhouette preprocessing module — deterministic mask→64×44 float tensor pipeline
  • Ring buffer / sliding window manager — per-track frame accumulation with reset logic
  • Input adapters — cv-mmap async client + OpenCV VideoCapture fallback
  • NATS publisher — JSON result output
  • Main pipeline application — orchestrates all components
  • pytest test suite — preprocessing, windowing, single-person policy, recovery
  • Sample video for smoke testing

Estimated Effort: Large
Parallel Execution: YES — 4 waves
Critical Path: Task 2 (ScoNetDemo) → Task 3 (Silhouette Preprocess) → Task 5 (Ring Buffer) → Task 9 (Pipeline App) → Task 12 (Integration Tests)


Context

Original Request

Build a camera-to-classification pipeline for scoliosis screening using OpenGait's ScoNet model, targeting Jetson AGX Orin deployment, with cv-mmap shared-memory video framework integration.

Interview Summary

Key Discussions:

  • Input: Both camera (via cv-mmap) and video files (OpenCV fallback). Single person only.
  • CV Stack: YOLO11n-seg with built-in ByteTrack (replaces paper's BYTETracker + PP-HumanSeg)
  • Inference: Sliding window of 30 frames, continuous classification
  • Output: JSON over NATS (decided over binary protocol — simpler, cross-language)
  • DDP Bypass: Create ScoNetDemo(nn.Module) following All-in-One-Gait's BaselineDemo pattern
  • Build Location: Inside repo (opengait lacks __init__.py, config system hardcodes paths)
  • Test Strategy: pytest, tests after implementation
  • Hardware: Dev on i7-14700KF + RTX 5070 Ti/3090; deploy on Jetson AGX Orin

Research Findings:

  • ScoNet input: [N, 1, S, 64, 44] float32 in [0,1]. Output: logits [N, 3, 16]; argmax(mean(-1)) → class index
  • .pkl preprocessing: grayscale → crop vertical → resize h=64 → center-crop w=64 → cut 10px sides → /255.0
  • BaseSilCuttingTransform: cuts int(W // 64) * 10 px each side + divides by 255
  • All-in-One-Gait BaselineDemo: extends nn.Module, uses torch.load() + load_state_dict(), training=False
  • YOLO11n-seg: 6MB, ~50-60 FPS, model.track(frame, persist=True) → bbox + mask + track_id
  • cv-mmap Python client: async for im, meta in CvMmapClient("name") — zero-copy numpy
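
The decision rule from the findings above (mean over the 16 parts, then argmax over the 3 classes, with a softmax confidence) can be sketched in NumPy; the logits here are random stand-ins, and the label mapping follows the plan:

```python
import numpy as np

# Stand-in for ScoNet output: [N, 3 classes, 16 horizontal parts]
logits = np.random.randn(1, 3, 16).astype(np.float32)

part_mean = logits.mean(axis=-1)                  # [N, 3] — average over parts
class_idx = int(part_mean.argmax(axis=-1)[0])     # argmax over classes

# Softmax over the part-averaged logits gives a confidence for the chosen class
exp = np.exp(part_mean - part_mean.max(axis=-1, keepdims=True))
probs = exp / exp.sum(axis=-1, keepdims=True)
confidence = float(probs[0, class_idx])

LABELS = {0: 'negative', 1: 'neutral', 2: 'positive'}
print(LABELS[class_idx], round(confidence, 3))
```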

Metis Review

Identified Gaps (addressed):

  • Single-person policy undefined → Defined: largest-bbox selection, ignore others, reset window on ID change
  • Sliding window stride undefined → Defined: stride=1 (every frame updates buffer), classify every N frames (configurable)
  • No-detection / empty mask handling → Defined: skip frame, don't reset window unless gap exceeds threshold
  • Mask quality / partial body → Defined: minimum mask area threshold to accept frame
  • Track ID reset / re-identification → Defined: reset ring buffer on track ID change
  • YOLO letterboxing → Defined: use result.masks.data in original frame coords, not letterboxed
  • Async/sync impedance → Defined: synchronous pull-process-publish loop (no async queues in MVP)
  • Scope creep lockdown → Explicitly excluded: TensorRT, DeepStream, multi-person, GUI, recording, auto-tuning
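
The largest-bbox single-person policy above reduces to an area argmax over the xyxy boxes; a NumPy sketch (the function name is illustrative, not an Ultralytics API):

```python
import numpy as np

def largest_box_index(xyxy: np.ndarray) -> int:
    """Pick the detection with the largest bbox area from [N, 4] xyxy boxes."""
    areas = (xyxy[:, 2] - xyxy[:, 0]) * (xyxy[:, 3] - xyxy[:, 1])
    return int(areas.argmax())

boxes = np.array([[0, 0, 50, 100], [10, 10, 200, 400]], dtype=np.float32)
assert largest_box_index(boxes) == 1  # second box covers the larger area
```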

Work Objectives

Core Objective

Create a self-contained scoliosis screening pipeline that runs standalone (no DDP, no OpenGait training infrastructure) while reusing ScoNet's trained weights and matching its exact preprocessing contract.

Prerequisites (already present in repo)

  • Checkpoint: ./ckpt/ScoNet-20000.pt — trained ScoNet weights (verified working, 80.88% accuracy on Scoliosis1K eval). Already exists in the repo, no download needed.
  • Config: ./configs/sconet/sconet_scoliosis1k.yaml — ScoNet architecture config. Already exists.

Concrete Deliverables

  • opengait/demo/sconet_demo.py — ScoNetDemo nn.Module wrapper
  • opengait/demo/preprocess.py — Silhouette extraction and normalization
  • opengait/demo/window.py — Sliding window / ring buffer manager
  • opengait/demo/input.py — Input adapters (cv-mmap + OpenCV)
  • opengait/demo/output.py — NATS JSON publisher
  • opengait/demo/pipeline.py — Main pipeline orchestrator
  • opengait/demo/__main__.py — CLI entry point
  • tests/demo/test_preprocess.py — Preprocessing unit tests
  • tests/demo/test_window.py — Ring buffer + single-person policy tests
  • tests/demo/test_pipeline.py — Integration / smoke tests

Definition of Done

  • uv run python -m opengait.demo --source ./assets/sample.mp4 --checkpoint ./ckpt/ScoNet-20000.pt --max-frames 120 exits 0 and prints predictions (no NATS by default when --nats-url not provided)
  • uv run pytest tests/demo/ -q passes all tests
  • Pipeline processes ≥15 FPS on desktop GPU with 720p input
  • JSON schema validated: {"frame": int, "track_id": int, "label": str, "confidence": float, "window": int, "timestamp_ns": int}
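
A result message matching the schema above can be assembled with the stdlib; the field values here are placeholders:

```python
import json
import time

result = {
    "frame": 120,                 # int — frame index within the stream
    "track_id": 1,                # int — YOLO/ByteTrack track ID
    "label": "neutral",           # str — 'positive' | 'neutral' | 'negative'
    "confidence": 0.87,           # float — softmax prob of the chosen class
    "window": 30,                 # int — sliding-window length used
    "timestamp_ns": time.time_ns(),
}
payload = json.dumps(result).encode()  # bytes ready for a NATS publish
decoded = json.loads(payload)
```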

Must Have

  • Deterministic preprocessing matching ScoNet training data exactly (64×44, float32, [0,1])
  • Single-person selection (largest bbox) with consistent tracking
  • Sliding window of 30 frames with reset on track loss/ID change
  • Graceful handling of: no detection, end of video, cv-mmap disconnect
  • CLI with --source, --checkpoint, --device, --window, --stride, --nats-url, --max-frames flags (using click)
  • Works without NATS server when --nats-url is omitted (console output fallback)
  • All tensor/array function signatures annotated with jaxtyping types (e.g., Float[Tensor, 'batch 1 seq 64 44']) and checked at runtime with beartype via @jaxtyped(typechecker=beartype) decorators
  • Generator-based input adapters — any Iterable[tuple[np.ndarray, dict]] works as a source

Must NOT Have (Guardrails)

  • No DDP: The demo must never import or call anything from torch.distributed
  • No BaseModel subclassing: ScoNetDemo extends nn.Module directly
  • No repo restructuring: Don't touch existing opengait training/eval/data code
  • No TensorRT/DeepStream: Jetson acceleration is out of MVP scope
  • No multi-person: Single tracked person only
  • No GUI/visualization: Output is JSON, not rendered frames
  • No dataset recording/auto-labeling: This is inference only
  • No OpenCV GStreamer builds: Use pip-installed OpenCV
  • No magic preprocessing: Every transform step must be explicit and testable
  • No unbounded buffers: Every queue/buffer has a max size and drop policy

Verification Strategy

ZERO HUMAN INTERVENTION — ALL verification is agent-executed. No exceptions.

Test Decision

  • Infrastructure exists: NO (creating with this plan)
  • Automated tests: Tests after implementation (pytest)
  • Framework: pytest (via uv run pytest)
  • Setup: Add pytest to dev dependencies in pyproject.toml

QA Policy

Every task MUST include agent-executed QA scenarios. Evidence saved to .sisyphus/evidence/task-{N}-{scenario-slug}.{ext}.

  • CLI/Pipeline: Use Bash — run pipeline with sample video, validate output
  • Unit Tests: Use Bash — uv run pytest specific test files
  • NATS Integration: Use Bash — start NATS container, run pipeline, subscribe and validate JSON

Execution Strategy

Parallel Execution Waves

Wave 1 (Foundation — all independent, start immediately):
├── Task 1: Project scaffolding (opengait/demo/, tests/demo/, deps) [quick]
├── Task 2: ScoNetDemo nn.Module wrapper [deep]
├── Task 3: Silhouette preprocessing module [deep]
└── Task 4: Input adapters (cv-mmap + OpenCV) [unspecified-high]

Wave 2 (Core logic — depends on Wave 1 foundations):
├── Task 5: Ring buffer / sliding window manager (depends: 3) [unspecified-high]
├── Task 6: NATS JSON publisher (depends: 1) [quick]
├── Task 7: Unit tests — preprocessing (depends: 3) [unspecified-high]
└── Task 8: Unit tests — ScoNetDemo forward pass (depends: 2) [unspecified-high]

Wave 3 (Integration — combines all components):
├── Task 9: Main pipeline application + CLI (depends: 2,3,4,5,6) [deep]
├── Task 10: Single-person policy tests (depends: 5) [unspecified-high]
└── Task 11: Sample video acquisition (depends: 1) [quick]

Wave 4 (Verification — end-to-end):
├── Task 12: Integration tests + smoke test (depends: 9,11) [deep]
└── Task 13: NATS integration test (depends: 9,6) [unspecified-high]

Wave FINAL (Independent review — 4 parallel):
├── Task F1: Plan compliance audit (oracle)
├── Task F2: Code quality review (unspecified-high)
├── Task F3: Real manual QA (unspecified-high)
└── Task F4: Scope fidelity check (deep)

Critical Path: Task 1 → Task 2 → Task 3 → Task 5 → Task 9 → Task 12 → F1-F4
Parallel Speedup: ~60% faster than sequential
Max Concurrent: 4 (Waves 1 & 2)

Dependency Matrix

| Task  | Depends On    | Blocks | Wave  |
|-------|---------------|--------|-------|
| 1     | -             | 6, 11  | 1     |
| 2     | -             | 8, 9   | 1     |
| 3     | -             | 5, 7, 9 | 1    |
| 4     | -             | 9      | 1     |
| 5     | 3             | 9, 10  | 2     |
| 6     | 1             | 9, 13  | 2     |
| 7     | 3             | -      | 2     |
| 8     | 2             | -      | 2     |
| 9     | 2, 3, 4, 5, 6 | 12, 13 | 3     |
| 10    | 5             | -      | 3     |
| 11    | 1             | 12     | 3     |
| 12    | 9, 11         | F1-F4  | 4     |
| 13    | 9, 6          | F1-F4  | 4     |
| F1-F4 | 12, 13        | -      | FINAL |

Agent Dispatch Summary

  • Wave 1: 4 — T1 → quick, T2 → deep, T3 → deep, T4 → unspecified-high
  • Wave 2: 4 — T5 → unspecified-high, T6 → quick, T7 → unspecified-high, T8 → unspecified-high
  • Wave 3: 3 — T9 → deep, T10 → unspecified-high, T11 → quick
  • Wave 4: 2 — T12 → deep, T13 → unspecified-high
  • FINAL: 4 — F1 → oracle, F2 → unspecified-high, F3 → unspecified-high, F4 → deep

TODOs

Implementation + Test = ONE Task. Never separate. EVERY task MUST have: Recommended Agent Profile + Parallelization info + QA Scenarios.


  • 1. Project Scaffolding + Dependencies

    What to do:

    • Create opengait/demo/__init__.py (empty, makes it a package)
    • Create opengait/demo/__main__.py (stub: from .pipeline import main; main())
    • Create tests/demo/__init__.py and tests/__init__.py if missing
    • Create tests/demo/conftest.py with shared fixtures (sample tensor, mock frame)
    • Add dev dependencies to pyproject.toml: pytest, nats-py, ultralytics, jaxtyping, beartype, click
    • Verify: uv sync --extra torch succeeds with new deps
    • Verify: uv run python -c "import ultralytics; import nats; import jaxtyping; import beartype; import click" works

    Must NOT do:

    • Don't modify existing opengait code or imports
    • Don't add runtime deps that aren't needed (no flask, no fastapi, etc.)

    Recommended Agent Profile:

    • Category: quick
      • Reason: Boilerplate file creation and dependency management, no complex logic
    • Skills: []
    • Skills Evaluated but Omitted:
      • explore: Not needed — we know exactly what files to create

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 1 (with Tasks 2, 3, 4)
    • Blocks: Tasks 6, 11
    • Blocked By: None (can start immediately)

    References:

    Pattern References:

    • opengait/modeling/models/__init__.py — Example of package init in this repo
    • pyproject.toml — Current dependency structure; add to [project.optional-dependencies] or [dependency-groups]

    External References:

    • ultralytics pip package: pip install ultralytics (includes YOLO + ByteTrack)
    • nats-py: pip install nats-py (async NATS client)

    WHY Each Reference Matters:

    • pyproject.toml: Must match existing dep management style (uv + groups) to avoid breaking uv sync
    • opengait/modeling/models/__init__.py: Shows the repo's package init convention (dynamic imports vs empty)

    Acceptance Criteria:

    • opengait/demo/__init__.py exists
    • opengait/demo/__main__.py exists with stub entry point
    • tests/demo/conftest.py exists with at least one fixture
    • uv sync succeeds without errors
    • uv run python -c "import ultralytics; import nats; import jaxtyping; import beartype; import click; print('OK')" prints OK

    QA Scenarios:

    Scenario: Dependencies install correctly
      Tool: Bash
      Preconditions: Clean uv environment
      Steps:
        1. Run `uv sync --extra torch`
        2. Run `uv run python -c "import ultralytics; import nats; import pytest; import jaxtyping; import beartype; import click; print('ALL_DEPS_OK')"`
      Expected Result: Both commands exit 0, second prints 'ALL_DEPS_OK'
      Failure Indicators: ImportError, uv sync failure, missing package
      Evidence: .sisyphus/evidence/task-1-deps-install.txt
    
    Scenario: Package structure is importable
      Tool: Bash
      Preconditions: uv sync completed
      Steps:
        1. Run `uv run python -c "from opengait.demo import __main__; print('IMPORT_OK')"`
      Expected Result: Prints 'IMPORT_OK' without errors
      Failure Indicators: ModuleNotFoundError, ImportError
      Evidence: .sisyphus/evidence/task-1-import-check.txt
    

    Commit: YES

    • Message: chore(demo): scaffold demo package and test infrastructure
    • Files: opengait/demo/__init__.py, opengait/demo/__main__.py, tests/demo/conftest.py, tests/demo/__init__.py, tests/__init__.py, pyproject.toml
    • Pre-commit: uv sync && uv run python -c "import ultralytics; import nats; import jaxtyping; import beartype; import click"
  • 2. ScoNetDemo — DDP-Free Inference Wrapper

    What to do:

    • Create opengait/demo/sconet_demo.py
    • Class ScoNetDemo(nn.Module) — NOT a BaseModel subclass
    • Constructor takes cfg_path: str and checkpoint_path: str
    • Use config_loader from opengait/utils/common.py to parse YAML config
    • Build the ScoNet architecture layers directly:
      • Backbone (ResNet9 from opengait/modeling/backbones/resnet.py)
      • TemporalPool (from opengait/modeling/modules.py)
      • HorizontalPoolingPyramid (from opengait/modeling/modules.py)
      • SeparateFCs (from opengait/modeling/modules.py)
      • SeparateBNNecks (from opengait/modeling/modules.py)
    • Load checkpoint: torch.load(checkpoint_path, map_location=device) → extract state_dict → load_state_dict()
    • Handle checkpoint format: may be {'model': state_dict, ...} or plain state_dict
    • Strip module. prefix from DDP-wrapped keys if present
    • All public methods decorated with @jaxtyped(typechecker=beartype) for runtime shape checking
    • forward(sils: Float[Tensor, 'batch 1 seq 64 44']) -> dict where seq=30 (window size)
      • Use jaxtyping: from jaxtyping import Float, Int, jaxtyped
      • Use beartype: from beartype import beartype
    • Returns {'logits': Float[Tensor, 'batch 3 16'], 'label': int, 'confidence': float}
    • predict(sils: Float[Tensor, 'batch 1 seq 64 44']) -> tuple[str, float] convenience method: returns ('positive'|'neutral'|'negative', confidence)
    • Prediction logic: argmax(logits.mean(dim=-1), dim=-1) → index → label string
    • Confidence: softmax(logits.mean(dim=-1)).max() — probability of chosen class
    • Class mapping: {0: 'negative', 1: 'neutral', 2: 'positive'}
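
The checkpoint-format handling above (wrapped state dict plus DDP module. prefixes) is plain dict manipulation; a sketch with hypothetical keys, no torch required:

```python
def extract_state_dict(ckpt: dict) -> dict:
    """Unwrap a {'model': state_dict, ...} container and strip DDP 'module.' prefixes."""
    state = ckpt.get('model', ckpt)  # plain state_dict or wrapped form
    return {k.removeprefix('module.'): v for k, v in state.items()}

# Hypothetical DDP-saved checkpoint structure
ckpt = {'model': {'module.backbone.conv1.weight': 1, 'head.fc.weight': 2}}
state = extract_state_dict(ckpt)
```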

    Must NOT do:

    • Do NOT import anything from torch.distributed
    • Do NOT subclass BaseModel
    • Do NOT use ddp_all_gather or get_ddp_module
    • Do NOT modify sconet.py or any existing model file

    Recommended Agent Profile:

    • Category: deep
      • Reason: Requires understanding ScoNet's architecture, checkpoint format, and careful weight loading — high complexity, correctness-critical
    • Skills: []
    • Skills Evaluated but Omitted:
      • explore: Agent should read referenced files directly, not search broadly

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 1 (with Tasks 1, 3, 4)
    • Blocks: Tasks 8, 9
    • Blocked By: None (can start immediately)

    References:

    Pattern References:

    • opengait/modeling/models/sconet.py — ScoNet model definition. Study __init__ to see which submodules are built and how forward() assembles the pipeline. Lines ~10-54.
    • opengait/modeling/base_model.py — BaseModel class. Study __init__ (lines ~30-80) to see how it builds backbone/heads from config. Replicate the build logic WITHOUT DDP calls.
    • All-in-One-Gait BaselineDemo pattern: extends nn.Module directly, uses torch.load() + load_state_dict() with training=False

    API/Type References:

    • opengait/modeling/backbones/resnet.py — ResNet9 backbone class. Constructor signature and forward signature.
    • opengait/modeling/modules.py — TemporalPool, HorizontalPoolingPyramid, SeparateFCs, SeparateBNNecks classes. Constructor args come from config YAML.
    • opengait/utils/common.py::config_loader — Loads YAML config, merges with default.yaml. Returns dict.

    Config References:

    • configs/sconet/sconet_scoliosis1k.yaml — ScoNet config specifying backbone, head, loss params. The model_cfg section defines architecture hyperparams.
    • configs/default.yaml — Default config merged by config_loader

    Checkpoint Reference:

    • ./ckpt/ScoNet-20000.pt — Trained ScoNet checkpoint. Verify format: torch.load() and inspect keys.

    Inference Logic Reference:

    • opengait/evaluation/evaluator.py:evaluate_scoliosis() (line ~418) — Shows argmax(logits.mean(-1)) prediction logic and label mapping

    WHY Each Reference Matters:

    • sconet.py: Defines the exact forward pass we must replicate — TP → backbone → HPP → FCs → BNNecks
    • base_model.py: Shows how config is parsed into submodule instantiation — we copy this logic minus DDP
    • modules.py: Constructor signatures tell us what config keys to extract
    • evaluator.py: The prediction aggregation (mean over parts, argmax) is the canonical inference logic
    • sconet_scoliosis1k.yaml: Contains the exact hyperparams (channels, num_parts, etc.) for building layers

    Acceptance Criteria:

    • opengait/demo/sconet_demo.py exists with ScoNetDemo(nn.Module) class
    • No torch.distributed imports in the file
    • ScoNetDemo does not inherit from BaseModel
    • uv run python -c "from opengait.demo.sconet_demo import ScoNetDemo; print('OK')" works

    QA Scenarios:

    Scenario: ScoNetDemo loads checkpoint and produces correct output shape
      Tool: Bash
      Preconditions: ./ckpt/ScoNet-20000.pt exists, CUDA available
      Steps:
        1. Run `uv run python -c "`
           ```python
           import torch
           from opengait.demo.sconet_demo import ScoNetDemo
           model = ScoNetDemo('./configs/sconet/sconet_scoliosis1k.yaml', './ckpt/ScoNet-20000.pt', device='cuda:0')
           model.eval()
           dummy = torch.rand(1, 1, 30, 64, 44, device='cuda:0')
           with torch.no_grad():
               result = model(dummy)
           assert result['logits'].shape == (1, 3, 16), f'Bad shape: {result["logits"].shape}'
           label, conf = model.predict(dummy)
           assert label in ('negative', 'neutral', 'positive'), f'Bad label: {label}'
           assert 0.0 <= conf <= 1.0, f'Bad confidence: {conf}'
           print(f'SCONET_OK label={label} conf={conf:.3f}')
           ```
      Expected Result: Prints 'SCONET_OK label=... conf=...' with valid label and confidence
      Failure Indicators: ImportError, state_dict key mismatch, shape error, CUDA error
      Evidence: .sisyphus/evidence/task-2-sconet-forward.txt
    
    Scenario: ScoNetDemo rejects DDP-wrapped usage
      Tool: Bash
      Preconditions: File exists
      Steps:
        1. Run `grep -c 'torch.distributed' opengait/demo/sconet_demo.py`
        2. Run `grep -c 'BaseModel' opengait/demo/sconet_demo.py`
      Expected Result: Both commands output '0'
      Failure Indicators: Any count > 0
      Evidence: .sisyphus/evidence/task-2-no-ddp.txt
    

    Commit: YES

    • Message: feat(demo): add ScoNetDemo DDP-free inference wrapper
    • Files: opengait/demo/sconet_demo.py
    • Pre-commit: uv run python -c "from opengait.demo.sconet_demo import ScoNetDemo"
  • 3. Silhouette Preprocessing Module

    What to do:

    • Create opengait/demo/preprocess.py
    • All public functions decorated with @jaxtyped(typechecker=beartype) for runtime shape checking
    • Function mask_to_silhouette(mask: UInt8[ndarray, 'h w'], bbox: tuple[int,int,int,int]) -> Float[ndarray, '64 44'] | None:
      • Uses jaxtyping: from jaxtyping import Float, UInt8, jaxtyped and from numpy import ndarray
      • Input: binary mask (H, W) uint8 from YOLO, bounding box (x1, y1, x2, y2)
      • Crop mask to bbox region
      • Find vertical extent of foreground pixels (top/bottom rows with nonzero)
      • Crop to tight vertical bounding box (remove empty rows above/below)
      • Resize height to 64, maintaining aspect ratio
      • Center-crop or center-pad width to 64
      • Cut 10px from each side → final 64×44
      • Return float32 array [0.0, 1.0] (divide by 255)
      • Return None if mask area below MIN_MASK_AREA threshold (default: 500 pixels)
    • Function frame_to_person_mask(result, min_area: int = 500) -> tuple[UInt8[ndarray, 'h w'], tuple[int,int,int,int]] | None:
      • Extract single-person mask + bbox from YOLO result object
      • Uses result.masks.data and result.boxes.xyxy
      • Returns None if no valid detection
    • Constants: SIL_HEIGHT = 64, SIL_WIDTH = 44, SIL_FULL_WIDTH = 64, SIDE_CUT = 10, MIN_MASK_AREA = 500
    • Each step must match the preprocessing in datasets/pretreatment.py (grayscale → crop → resize → center) and BaseSilCuttingTransform (cut sides → /255)
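
After the height-64 resize, the width handling above is pure array slicing; a NumPy sketch of the center-crop/pad to width 64 plus the 10 px side cut (mirroring BaseSilCuttingTransform's int(w // 64) * 10 for w=64), with the resize step assumed done:

```python
import numpy as np

SIL_FULL_WIDTH, SIDE_CUT = 64, 10

def fit_width(sil_h64: np.ndarray) -> np.ndarray:
    """Center-crop or center-pad a (64, W) silhouette to (64, 44)."""
    h, w = sil_h64.shape
    if w > SIL_FULL_WIDTH:                       # center-crop wide silhouettes
        left = (w - SIL_FULL_WIDTH) // 2
        sil_h64 = sil_h64[:, left:left + SIL_FULL_WIDTH]
    elif w < SIL_FULL_WIDTH:                     # center-pad narrow ones
        left = (SIL_FULL_WIDTH - w) // 2
        pad = np.zeros((h, SIL_FULL_WIDTH), dtype=sil_h64.dtype)
        pad[:, left:left + w] = sil_h64
        sil_h64 = pad
    return sil_h64[:, SIDE_CUT:-SIDE_CUT]        # 64 - 2*10 = 44 columns

out = fit_width(np.ones((64, 50), dtype=np.float32))
assert out.shape == (64, 44)
```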

    Must NOT do:

    • Don't import or modify datasets/pretreatment.py
    • Don't add color/texture features — binary silhouettes only
    • Don't resize to arbitrary sizes — must be exactly 64×44 output

    Recommended Agent Profile:

    • Category: deep
      • Reason: Correctness-critical — must exactly match training data preprocessing. Subtle pixel-level bugs will silently degrade accuracy.
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 1 (with Tasks 1, 2, 4)
    • Blocks: Tasks 5, 7, 9
    • Blocked By: None

    References:

    Pattern References:

    • datasets/pretreatment.py:18-96 (function imgs2pickle) — The canonical preprocessing pipeline. Study lines 45-80 carefully: cv2.imread(GRAYSCALE) → find contours → crop to person bbox → cv2.resize(img, (int(64 * ratio), 64)) → center-crop width. This is the EXACT sequence to replicate for live masks.
    • opengait/data/transform.py:46-58 (BaseSilCuttingTransform) — The runtime transform applied during training/eval. cutting = int(w // 64) * 10 then slices [:, :, cutting:-cutting] then divides by 255.0. For w=64 input, cutting=10, output width=44.

    API/Type References:

    • Ultralytics Results object: result.masks.data → Tensor[N, H, W] binary masks; result.boxes.xyxy → Tensor[N, 4] bounding boxes; result.boxes.id → track IDs (may be None)

    WHY Each Reference Matters:

    • pretreatment.py: Defines the ground truth preprocessing. If our live pipeline differs by even 1 pixel of padding, ScoNet accuracy degrades.
    • BaseSilCuttingTransform: The 10px side cut + /255 is applied at runtime during training. We must apply the SAME transform.
    • Ultralytics masks: Need to know exact API to extract binary masks from YOLO output

    Acceptance Criteria:

    • opengait/demo/preprocess.py exists
    • mask_to_silhouette() returns np.ndarray of shape (64, 44) dtype float32 with values in [0, 1]
    • Returns None for masks below MIN_MASK_AREA

    QA Scenarios:

    Scenario: Preprocessing produces correct output shape and range
      Tool: Bash
      Preconditions: Module importable
      Steps:
        1. Run `uv run python -c "`
           ```python
           import numpy as np
           from opengait.demo.preprocess import mask_to_silhouette
           # Create a synthetic mask: 200x100 person-shaped blob
           mask = np.zeros((480, 640), dtype=np.uint8)
           mask[100:400, 250:400] = 255  # person region
           bbox = (250, 100, 400, 400)
           sil = mask_to_silhouette(mask, bbox)
           assert sil is not None, 'Should not be None for valid mask'
           assert sil.shape == (64, 44), f'Bad shape: {sil.shape}'
           assert sil.dtype == np.float32, f'Bad dtype: {sil.dtype}'
           assert 0.0 <= sil.min() and sil.max() <= 1.0, f'Bad range: [{sil.min()}, {sil.max()}]'
           assert sil.max() > 0, 'Should have nonzero pixels'
           print('PREPROCESS_OK')
           ```
      Expected Result: Prints 'PREPROCESS_OK'
      Failure Indicators: Shape mismatch, dtype error, range error
      Evidence: .sisyphus/evidence/task-3-preprocess-shape.txt
    
    Scenario: Small masks are rejected
      Tool: Bash
      Preconditions: Module importable
      Steps:
        1. Run `uv run python -c "`
           ```python
           import numpy as np
           from opengait.demo.preprocess import mask_to_silhouette
           # Tiny mask: only 10x10 = 100 pixels (below MIN_MASK_AREA=500)
           mask = np.zeros((480, 640), dtype=np.uint8)
           mask[100:110, 100:110] = 255
           bbox = (100, 100, 110, 110)
           sil = mask_to_silhouette(mask, bbox)
           assert sil is None, f'Should be None for tiny mask, got {type(sil)}'
           print('SMALL_MASK_REJECTED_OK')
           ```
      Expected Result: Prints 'SMALL_MASK_REJECTED_OK'
      Failure Indicators: Returns non-None for tiny mask
      Evidence: .sisyphus/evidence/task-3-small-mask-reject.txt
    

    Commit: YES

    • Message: feat(demo): add silhouette preprocessing module
    • Files: opengait/demo/preprocess.py
    • Pre-commit: uv run python -c "from opengait.demo.preprocess import mask_to_silhouette"
  • 4. Input Adapters (cv-mmap + OpenCV)

    What to do:

    • Create opengait/demo/input.py
    • The pipeline contract is simple: it consumes any Iterable[tuple[np.ndarray, dict]] — any generator or iterator that yields (frame_bgr_uint8, metadata_dict) works
    • Type alias: FrameStream = Iterable[tuple[np.ndarray, dict]]
    • Generator function opencv_source(path: str | int, max_frames: int | None = None) -> Generator[tuple[np.ndarray, dict], None, None]:
      • path can be video file path or camera index (int)
      • Opens cv2.VideoCapture(path)
      • Yields (frame, {'frame_count': int, 'timestamp_ns': int}) tuples
      • Handles end-of-video gracefully (just returns)
      • Handles camera disconnect (log warning, return)
      • Respects max_frames limit
    • Generator function cvmmap_source(name: str, max_frames: int | None = None) -> Generator[tuple[np.ndarray, dict], None, None]:
      • Wraps CvMmapClient from /home/crosstyan/Code/cv-mmap/client/cvmmap/
      • Since cv-mmap is async (anyio), this adapter must bridge async→sync:
        • Run anyio event loop in a background thread, drain frames via queue.Queue
        • Or use anyio.from_thread / asyncio.run() with async for internally
        • Choose simplest correct approach
      • Yields same (frame, metadata_dict) tuple format as opencv_source
      • Handles cv-mmap disconnect/offline events gracefully
      • Import should be conditional (try/except ImportError) so pipeline works without cv-mmap installed
    • Factory function create_source(source: str, max_frames: int | None = None) -> FrameStream:
      • If source starts with cvmmap://cvmmap_source(name)
      • If source is a digit string → opencv_source(int(source)) (camera index)
      • Otherwise → opencv_source(source) (file path)
    • The key design point: any user-written generator that yields (np.ndarray, dict) plugs in directly — no class inheritance needed
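
The async→sync bridge above can be done with a background thread draining an event loop into a bounded queue.Queue; a stdlib-only sketch where dummy_async_source stands in for the CvMmapClient iterator:

```python
import asyncio
import queue
import threading

_SENTINEL = object()

async def dummy_async_source(n):            # stand-in for CvMmapClient's async iterator
    for i in range(n):
        yield f"frame-{i}", {"frame_count": i}

def bridged_source(n, maxsize=8):
    """Sync generator fed by an async iterator running in a daemon thread."""
    q = queue.Queue(maxsize=maxsize)        # bounded: enforces the no-unbounded-buffer rule

    def runner():
        async def drain():
            async for item in dummy_async_source(n):
                q.put(item)                 # blocks when the queue is full (backpressure)
            q.put(_SENTINEL)
        asyncio.run(drain())

    threading.Thread(target=runner, daemon=True).start()
    while (item := q.get()) is not _SENTINEL:
        yield item

frames = list(bridged_source(5))
```

A blocking q.put inside the coroutine stalls that loop's only task, which is acceptable for a single-source bridge like this sketch.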

    Must NOT do:

    • Don't build GStreamer pipelines
    • Don't add async to the main pipeline loop — keep synchronous pull model
    • Don't use abstract base classes or heavy OOP — plain generator functions are the interface
    • Don't buffer frames internally (no unbounded queue between source and consumer)

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Integration with external library (cv-mmap) requires careful async→sync bridging
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 1 (with Tasks 1, 2, 3)
    • Blocks: Task 9
    • Blocked By: None

    References:

    Pattern References:

    • /home/crosstyan/Code/cv-mmap/client/cvmmap/__init__.py — CvMmapClient class. Async iterator: async for im, meta in client. Understand the __aiter__/__anext__ protocol.
    • /home/crosstyan/Code/cv-mmap/client/test_cvmmap.py — Example consumer pattern using anyio.run()
    • /home/crosstyan/Code/cv-mmap/client/cvmmap/msg.py — FrameMetadata and FrameInfo dataclasses. Fields: frame_count, timestamp_ns, info.width, info.height, info.pixel_format

    API/Type References:

    • cv2.VideoCapture — OpenCV video capture. cap.read() returns (bool, np.ndarray). cap.get(cv2.CAP_PROP_FRAME_COUNT) for total frames.

    WHY Each Reference Matters:

    • CvMmapClient: The async iterator yields (numpy_array, FrameMetadata) — need to know exact types for sync bridging
    • msg.py: Metadata fields must be mapped to our generic dict metadata format
    • test_cvmmap.py: Shows the canonical consumer pattern we must wrap

    Acceptance Criteria:

    • opengait/demo/input.py exists with opencv_source, cvmmap_source, create_source as functions (not classes)
    • create_source('./some/video.mp4') returns a generator/iterable
    • create_source('cvmmap://default') returns a generator (or raises if cv-mmap not installed)
    • create_source('0') returns a generator for camera index 0
    • Any custom generator def my_source(): yield (frame, meta) can be used directly by the pipeline

    QA Scenarios:

    Scenario: opencv_source reads frames from a video file
      Tool: Bash
      Preconditions: Any .mp4 or .avi file available (use sample video or create synthetic one)
      Steps:
        1. Create a short test video if none exists:
           `uv run python -c "import cv2; import numpy as np; w=cv2.VideoWriter('/tmp/test.avi', cv2.VideoWriter_fourcc(*'MJPG'), 15, (640,480)); [w.write(np.random.randint(0,255,(480,640,3),dtype=np.uint8)) for _ in range(30)]; w.release(); print('VIDEO_CREATED')"`
        2. Run `uv run python -c "`
           ```python
           from opengait.demo.input import create_source
           src = create_source('/tmp/test.avi', max_frames=10)
           count = 0
           for frame, meta in src:
               assert frame.shape[2] == 3, f'Not BGR: {frame.shape}'
               assert 'frame_count' in meta
               count += 1
           assert count == 10, f'Expected 10 frames, got {count}'
           print('OPENCV_SOURCE_OK')
           ```
      Expected Result: Prints 'OPENCV_SOURCE_OK'
      Failure Indicators: Shape error, missing metadata, wrong frame count
      Evidence: .sisyphus/evidence/task-4-opencv-source.txt
    
    Scenario: Custom generator works as pipeline input
      Tool: Bash
      Steps:
        1. Run `uv run python -c "`
           ```python
           import numpy as np
           from opengait.demo.input import FrameStream
           import typing
           # Any generator works — no class needed
           def my_source():
               for i in range(5):
                   yield np.zeros((480, 640, 3), dtype=np.uint8), {'frame_count': i}
           src = my_source()
           frames = list(src)
           assert len(frames) == 5
           print('CUSTOM_GENERATOR_OK')
           ```
      Expected Result: Prints 'CUSTOM_GENERATOR_OK'
      Failure Indicators: Type error, protocol mismatch
      Evidence: .sisyphus/evidence/task-2-custom-gen.txt
    

    Commit: YES

    • Message: feat(demo): add generator-based input adapters for cv-mmap and OpenCV
    • Files: opengait/demo/input.py
    • Pre-commit: uv run python -c "from opengait.demo.input import create_source"
  • 5. Sliding Window / Ring Buffer Manager

    What to do:

    • Create opengait/demo/window.py
    • Class SilhouetteWindow:
      • Constructor: __init__(self, window_size: int = 30, stride: int = 1, gap_threshold: int = 15)
      • Internal storage: collections.deque(maxlen=window_size) of np.ndarray (64×44 float32)
      • push(sil: np.ndarray, frame_idx: int, track_id: int) -> None:
        • If track_id differs from current tracked ID → reset buffer, update tracked ID
        • If frame_idx - last_frame_idx > gap_threshold → reset buffer (too many missed frames)
        • Append silhouette to deque
        • Increment internal frame counter
      • is_ready() -> bool: returns len(buffer) == window_size
      • should_classify() -> bool: returns is_ready() and (frames_since_last_classify >= stride)
      • get_tensor(device: str = 'cpu') -> torch.Tensor:
        • Stack buffer into np.array shape [window_size, 64, 44]
        • Convert to torch.Tensor shape [1, 1, window_size, 64, 44] on device
        • This is the exact input shape for ScoNetDemo
      • reset() -> None: clear buffer and counters
      • mark_classified() -> None: reset frames_since_last_classify counter
      • Properties: current_track_id, frame_count, fill_level (len/window_size as float)
    • Single-person selection policy (function or small helper):
      • select_person(results) -> tuple[np.ndarray, tuple, int] | None
      • From YOLO results, select the detection with the largest bounding box area
      • Return (mask, bbox, track_id) or None if no valid detection
      • If result.boxes.id is None (tracker not yet initialized), skip frame
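
    The push/reset policy above can be sketched as follows. This is a minimal numpy-only sketch: the torch conversion in `get_tensor()` is reduced to a `get_array()` returning the stacked numpy array (the real method would wrap it with `torch.from_numpy`), and internal field names are illustrative, not prescribed.

    ```python
    from collections import deque
    import numpy as np

    class SilhouetteWindow:
        """Sketch of the fill/reset logic; torch conversion and fill_level omitted."""

        def __init__(self, window_size: int = 30, stride: int = 1, gap_threshold: int = 15):
            self.window_size, self.stride, self.gap_threshold = window_size, stride, gap_threshold
            self.buffer: deque = deque(maxlen=window_size)  # bounded by construction
            self.current_track_id = None
            self.last_frame_idx = None
            self.frames_since_last_classify = 0

        def push(self, sil: np.ndarray, frame_idx: int, track_id: int) -> None:
            # A track switch or a gap of dropped frames invalidates the window.
            if track_id != self.current_track_id:
                self.reset()
                self.current_track_id = track_id
            elif self.last_frame_idx is not None and frame_idx - self.last_frame_idx > self.gap_threshold:
                self.reset()
                self.current_track_id = track_id
            self.buffer.append(sil)
            self.last_frame_idx = frame_idx
            self.frames_since_last_classify += 1

        def is_ready(self) -> bool:
            return len(self.buffer) == self.window_size

        def should_classify(self) -> bool:
            return self.is_ready() and self.frames_since_last_classify >= self.stride

        def mark_classified(self) -> None:
            self.frames_since_last_classify = 0

        def reset(self) -> None:
            self.buffer.clear()
            self.last_frame_idx = None
            self.frames_since_last_classify = 0

        @property
        def frame_count(self) -> int:
            return len(self.buffer)

        def get_array(self) -> np.ndarray:
            # Shape [1, 1, T, 64, 44]; get_tensor() would be torch.from_numpy(...) of this.
            return np.stack(self.buffer)[None, None]
    ```

    Note the `deque(maxlen=...)` choice: once full, appending silently drops the oldest frame, so the "no unbounded buffers" constraint holds without any explicit eviction code.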

    Must NOT do:

    • No unbounded buffers — deque with maxlen enforces this
    • No multi-person tracking — single person only, select largest bbox
    • No time-based windowing — frame-count based only

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Data structure + policy logic with edge cases (ID changes, gaps, resets)
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 2 (with Tasks 6, 7, 8)
    • Blocks: Tasks 9, 10
    • Blocked By: Task 3 (needs silhouette shape constants from preprocess.py)

    References:

    Pattern References:

    • opengait/demo/preprocess.py (Task 3) — SIL_HEIGHT, SIL_WIDTH constants. The window stores arrays of this shape.
    • opengait/data/dataset.py — Shows how OpenGait's DataSet samples fixed-length sequences. The seqL parameter controls sequence length (our window_size=30).

    API/Type References:

    • Ultralytics Results.boxes.id — Track IDs tensor, may be None if tracker hasn't assigned IDs yet
    • Ultralytics Results.boxes.xyxy — Bounding boxes [N, 4] for area calculation
    • Ultralytics Results.masks.data — Binary masks [N, H, W]

    WHY Each Reference Matters:

    • preprocess.py: Window must store silhouettes of the exact shape produced by preprocessing
    • dataset.py: Understanding how training samples sequences helps ensure our window matches
    • Ultralytics API: Need to handle None track IDs and extract correct tensors

    Acceptance Criteria:

    • opengait/demo/window.py exists with SilhouetteWindow class and select_person function
    • Buffer is bounded (deque with maxlen)
    • get_tensor() returns shape [1, 1, 30, 64, 44] when full
    • Track ID change triggers reset
    • Gap exceeding threshold triggers reset

    QA Scenarios:

    Scenario: Window fills and produces correct tensor shape
      Tool: Bash
      Preconditions: Module importable
      Steps:
        1. Run `uv run python -c "`
           ```python
           import numpy as np
           from opengait.demo.window import SilhouetteWindow
           win = SilhouetteWindow(window_size=30, stride=1)
           for i in range(30):
               sil = np.random.rand(64, 44).astype(np.float32)
               win.push(sil, frame_idx=i, track_id=1)
           assert win.is_ready(), 'Window should be ready after 30 frames'
           t = win.get_tensor()
           assert t.shape == (1, 1, 30, 64, 44), f'Bad shape: {t.shape}'
           assert t.dtype.is_floating_point, f'Bad dtype: {t.dtype}'
           print('WINDOW_FILL_OK')
           ```
      Expected Result: Prints 'WINDOW_FILL_OK'
      Failure Indicators: Shape mismatch, not ready after 30 pushes
      Evidence: .sisyphus/evidence/task-5-window-fill.txt
    
    Scenario: Track ID change resets buffer
      Tool: Bash
      Steps:
        1. Run `uv run python -c "`
           ```python
           import numpy as np
           from opengait.demo.window import SilhouetteWindow
           win = SilhouetteWindow(window_size=30)
           for i in range(20):
               win.push(np.random.rand(64, 44).astype(np.float32), frame_idx=i, track_id=1)
           assert win.frame_count == 20
           # Switch track ID — should reset
           win.push(np.random.rand(64, 44).astype(np.float32), frame_idx=20, track_id=2)
           assert win.frame_count == 1, f'Should reset to 1, got {win.frame_count}'
           assert win.current_track_id == 2
           print('TRACK_RESET_OK')
           ```
      Expected Result: Prints 'TRACK_RESET_OK'
      Failure Indicators: Buffer not reset, wrong track ID
      Evidence: .sisyphus/evidence/task-5-track-reset.txt
    

    Commit: YES

    • Message: feat(demo): add sliding window manager with single-person selection
    • Files: opengait/demo/window.py
    • Pre-commit: uv run python -c "from opengait.demo.window import SilhouetteWindow"
  • 6. NATS JSON Publisher

    What to do:

    • Create opengait/demo/output.py
    • Class ResultPublisher(Protocol) — any object with publish(result: dict) -> None
    • Simple class ConsolePublisher (a plain function-based publisher is also acceptable):
      • Prints JSON to stdout (default when --nats-url is not provided)
      • Format: one JSON object per line (JSONL)
    • Class NatsPublisher:
      • Constructor: __init__(self, url: str = 'nats://127.0.0.1:4222', subject: str = 'scoliosis.result')
      • Uses nats-py async client, bridged to sync publish() method
      • Handles connection failures gracefully (log warning, retry with backoff, don't crash pipeline)
      • Handles reconnection automatically (nats-py does this by default)
      • publish(result: dict) -> None: serializes to JSON, publishes to subject
      • close() -> None: drain and close NATS connection
      • Context manager support (__enter__/__exit__)
    • JSON schema for results:
      {
        "frame": 1234,
        "track_id": 1,
        "label": "positive",
        "confidence": 0.82,
        "window": 30,
        "timestamp_ns": 1234567890000
      }
      
    • Factory: create_publisher(nats_url: str | None, subject: str = 'scoliosis.result') -> ResultPublisher
      • If nats_url is None → ConsolePublisher
      • Otherwise → NatsPublisher(url, subject)
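
    The console path and the factory dispatch above can be sketched as follows. Only the console branch is fleshed out here; the NATS branch is a hedged stub, since the real `NatsPublisher` must bridge the async nats-py client.

    ```python
    import json
    import sys

    class ConsolePublisher:
        """Default publisher: one JSON object per line (JSONL) on stdout."""

        def publish(self, result: dict) -> None:
            sys.stdout.write(json.dumps(result) + "\n")
            sys.stdout.flush()

        def close(self) -> None:
            pass  # nothing to release for stdout

    def create_publisher(nats_url, subject: str = "scoliosis.result"):
        # No URL means console output, per the CLI contract.
        if nats_url is None:
            return ConsolePublisher()
        # Stub: the real implementation returns NatsPublisher(nats_url, subject),
        # wrapping the async nats-py client behind a sync publish().
        raise NotImplementedError("NatsPublisher sketch omitted; requires nats-py")
    ```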

    Must NOT do:

    • Don't use JetStream (plain NATS PUB/SUB is sufficient)
    • Don't build custom binary protocol
    • Don't buffer/batch results — publish immediately

    Recommended Agent Profile:

    • Category: quick
      • Reason: Straightforward JSON serialization + NATS client wrapper, well-documented library
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 2 (with Tasks 5, 7, 8)
    • Blocks: Tasks 9, 13
    • Blocked By: Task 1 (needs project scaffolding for nats-py dependency)

    References:

    External References:

    • nats-py docs: import nats; nc = await nats.connect(); await nc.publish(subject, payload) — async API; payload must be bytes (e.g. json.dumps(result).encode())
    • /home/crosstyan/Code/cv-mmap-gui/ — Uses NATS.c for messaging; our Python publisher sends to the same broker

    WHY Each Reference Matters:

    • nats-py: Need to bridge async NATS client to sync publish() call
    • cv-mmap-gui: Confirms NATS is the right transport for this ecosystem

    Acceptance Criteria:

    • opengait/demo/output.py exists with ConsolePublisher, NatsPublisher, create_publisher
    • ConsolePublisher prints valid JSON to stdout
    • NatsPublisher connects and publishes without crashing (when NATS available)
    • NatsPublisher logs warning and doesn't crash when NATS unavailable

    QA Scenarios:

    Scenario: ConsolePublisher outputs valid JSONL
      Tool: Bash
      Steps:
        1. Run `uv run python -c "`
           ```python
           import json, io, sys
           from opengait.demo.output import create_publisher
           pub = create_publisher(nats_url=None)
           result = {'frame': 100, 'track_id': 1, 'label': 'positive', 'confidence': 0.85, 'window': 30, 'timestamp_ns': 0}
           pub.publish(result)  # should print to stdout
           print('CONSOLE_PUB_OK')
           ```
      Expected Result: Prints a JSON line followed by 'CONSOLE_PUB_OK'
      Failure Indicators: Invalid JSON, missing fields, crash
      Evidence: .sisyphus/evidence/task-6-console-pub.txt
    
    Scenario: NatsPublisher handles missing server gracefully
      Tool: Bash
      Steps:
        1. Run `uv run python -c "`
           ```python
           from opengait.demo.output import create_publisher
           try:
               pub = create_publisher(nats_url='nats://127.0.0.1:14222')  # wrong port, no server
               pub.publish({'frame': 0, 'label': 'test'})
           except SystemExit:
               print('SHOULD_NOT_EXIT')
               raise
           print('NATS_GRACEFUL_OK')
           ```
      Expected Result: Prints 'NATS_GRACEFUL_OK' (warning logged, no crash)
      Failure Indicators: Unhandled exception, SystemExit, hang
      Evidence: .sisyphus/evidence/task-6-nats-graceful.txt
    

    Commit: YES

    • Message: feat(demo): add NATS JSON publisher and console fallback
    • Files: opengait/demo/output.py
    • Pre-commit: uv run python -c "from opengait.demo.output import create_publisher"
  • 7. Unit Tests — Silhouette Preprocessing

    What to do:

    • Create tests/demo/test_preprocess.py
    • Test mask_to_silhouette() with:
      • Valid mask (person-shaped blob) → output shape (64, 44), dtype float32, range [0, 1]
      • Tiny mask below MIN_MASK_AREA → returns None
      • Empty mask (all zeros) → returns None
      • Full-frame mask (all 255) → produces valid output (edge case: very wide person)
      • Tall narrow mask → verify aspect ratio handling (resize h=64, center-crop width)
      • Wide short mask → verify handling (should still produce 64×44)
    • Test determinism: same input always produces same output
    • Test against a reference .pkl sample if available:
      • Load a known .pkl file from Scoliosis1K
      • Extract one frame
      • Compare our preprocessing output to the stored frame (should be close/identical)
    • Verify jaxtyping annotations are present and beartype checks fire on wrong shapes

    Must NOT do:

    • Don't test YOLO integration here — only test the mask_to_silhouette function in isolation
    • Don't require GPU — all preprocessing is CPU numpy ops

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Must verify pixel-level correctness against training data contract, multiple edge cases
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 2 (with Tasks 5, 6, 8)
    • Blocks: None (verification task)
    • Blocked By: Task 3 (preprocess module must exist)

    References:

    Pattern References:

    • opengait/demo/preprocess.py (Task 3) — The module under test
    • datasets/pretreatment.py:18-96 — Reference preprocessing to validate against
    • opengait/data/transform.py:46-58 — BaseSilCuttingTransform for expected output contract

    WHY Each Reference Matters:

    • preprocess.py: Direct test target
    • pretreatment.py: Ground truth for what a correct silhouette looks like
    • BaseSilCuttingTransform: Defines the 64→44 cut + /255 contract we must match

    Acceptance Criteria:

    • tests/demo/test_preprocess.py exists with ≥5 test cases
    • uv run pytest tests/demo/test_preprocess.py -q passes
    • Tests cover: valid mask, tiny mask, empty mask, aspect-ratio handling, determinism

    QA Scenarios:

    Scenario: All preprocessing tests pass
      Tool: Bash
      Preconditions: Task 3 (preprocess.py) is complete
      Steps:
        1. Run `uv run pytest tests/demo/test_preprocess.py -v`
      Expected Result: All tests pass (≥5 tests), exit code 0
      Failure Indicators: Any assertion failure, import error
      Evidence: .sisyphus/evidence/task-7-preprocess-tests.txt
    
    Scenario: Jaxtyping annotation enforcement works
      Tool: Bash
      Steps:
        1. Run `uv run python -c "`
           ```python
           import numpy as np
           from opengait.demo.preprocess import mask_to_silhouette
           # Intentionally wrong type to verify beartype catches it
           try:
               mask_to_silhouette('not_an_array', (0, 0, 10, 10))
               print('BEARTYPE_MISSED')  # should not reach here
           except Exception as e:
               if 'beartype' in type(e).__module__ or 'BeartypeCallHint' in type(e).__name__:
                   print('BEARTYPE_OK')
               else:
                   print(f'WRONG_ERROR: {type(e).__name__}: {e}')
           ```
      Expected Result: Prints 'BEARTYPE_OK'
      Failure Indicators: Prints 'BEARTYPE_MISSED' or 'WRONG_ERROR'
      Evidence: .sisyphus/evidence/task-7-beartype-check.txt
    

    Commit: YES (groups with Task 8)

    • Message: test(demo): add preprocessing and model unit tests
    • Files: tests/demo/test_preprocess.py
    • Pre-commit: uv run pytest tests/demo/test_preprocess.py -q
  • 8. Unit Tests — ScoNetDemo Forward Pass

    What to do:

    • Create tests/demo/test_sconet_demo.py
    • Test ScoNetDemo construction:
      • Loads config from YAML
      • Loads checkpoint weights
      • Model is in eval mode
    • Test forward() with dummy tensor:
      • Input: torch.rand(1, 1, 30, 64, 44) on available device
      • Output logits shape: (1, 3, 16)
      • Output dtype: float32
    • Test predict() convenience method:
      • Returns (label_str, confidence_float)
      • label_str is one of {'negative', 'neutral', 'positive'}
      • confidence is in [0.0, 1.0]
    • Test with various batch sizes: N=1, N=2
    • Test with various sequence lengths if model supports it (should work with 30)
    • Verify no torch.distributed calls are made (mock torch.distributed to raise if called)
    • Verify jaxtyping shape annotations on forward/predict signatures

    Must NOT do:

    • Don't test with real video data — dummy tensors only for unit tests
    • Don't modify the checkpoint

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Requires CUDA, checkpoint loading, shape validation — moderate complexity
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 2 (with Tasks 5, 6, 7)
    • Blocks: None (verification task)
    • Blocked By: Task 2 (ScoNetDemo must exist)

    References:

    Pattern References:

    • opengait/demo/sconet_demo.py (Task 1) — The module under test
    • opengait/evaluation/evaluator.py:evaluate_scoliosis() (line ~418) — Canonical prediction logic to validate against

    Config/Checkpoint References:

    • configs/sconet/sconet_scoliosis1k.yaml — Config file to pass to ScoNetDemo
    • ./ckpt/ScoNet-20000.pt — Trained checkpoint

    WHY Each Reference Matters:

    • sconet_demo.py: Direct test target
    • evaluator.py: Defines expected prediction behavior (argmax of mean logits)

    Acceptance Criteria:

    • tests/demo/test_sconet_demo.py exists with ≥4 test cases
    • uv run pytest tests/demo/test_sconet_demo.py -q passes
    • Tests cover: construction, forward shape, predict output, no-DDP enforcement

    QA Scenarios:

    Scenario: All ScoNetDemo tests pass
      Tool: Bash
      Preconditions: Task 1 (sconet_demo.py) is complete, checkpoint exists, CUDA available
      Steps:
        1. Run `CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_sconet_demo.py -v`
      Expected Result: All tests pass (≥4 tests), exit code 0
      Failure Indicators: state_dict key mismatch, shape error, CUDA OOM
      Evidence: .sisyphus/evidence/task-8-sconet-tests.txt
    
    Scenario: No DDP leakage in ScoNetDemo
      Tool: Bash
      Steps:
        1. Run `grep -rn 'torch.distributed' opengait/demo/sconet_demo.py`
        2. Run `grep -rn 'ddp_all_gather\|get_ddp_module\|get_rank' opengait/demo/sconet_demo.py`
      Expected Result: Both commands produce no output (exit code 1 = no matches)
      Failure Indicators: Any match found
      Evidence: .sisyphus/evidence/task-8-no-ddp.txt
    

    Commit: YES (groups with Task 7)

    • Message: test(demo): add preprocessing and model unit tests
    • Files: tests/demo/test_sconet_demo.py
    • Pre-commit: uv run pytest tests/demo/test_sconet_demo.py -q
  • 9. Main Pipeline Application + CLI

    What to do:

    • Create opengait/demo/pipeline.py — the main orchestrator
    • Create opengait/demo/__main__.py — CLI entry point (replace stub from Task 4)
    • Pipeline class ScoliosisPipeline:
      • Constructor: __init__(self, source: FrameStream, model: ScoNetDemo, publisher: ResultPublisher, window: SilhouetteWindow, yolo_model_path: str = 'yolo11n-seg.pt', device: str = 'cuda:0')
      • Uses jaxtyping annotations for all tensor-bearing methods:
        from jaxtyping import Float, UInt8, jaxtyped
        from beartype import beartype
        from torch import Tensor
        from numpy import ndarray
        
      • run() -> None — main loop:
        1. Load YOLO model: ultralytics.YOLO(yolo_model_path)
        2. For each (frame, meta) from source:
           a. Run yolo_model.track(frame, persist=True, verbose=False) → results
           b. select_person(results) → (mask, bbox, track_id) or None → skip if None
           c. mask_to_silhouette(mask, bbox) → sil or None → skip if None
           d. window.push(sil, meta['frame_count'], track_id)
           e. If window.should_classify():
          • tensor = window.get_tensor(device=self.device)
          • label, confidence = self.model.predict(tensor)
          • publisher.publish({...}) with JSON schema fields
          • window.mark_classified()
        3. Log FPS every 100 frames
        4. Cleanup on exit (close publisher, release resources)
      • Graceful shutdown on KeyboardInterrupt / SIGTERM
    • CLI via __main__.py using click:
      • --source (required): video path, camera index, or cvmmap://name
      • --checkpoint (required): path to ScoNet checkpoint
      • --config (default: ./configs/sconet/sconet_scoliosis1k.yaml): ScoNet config YAML
      • --device (default: cuda:0): torch device
      • --yolo-model (default: yolo11n-seg.pt): YOLO model path (auto-downloads)
      • --window (default: 30): sliding window size
      • --stride (default: 30): classify every N frames after window is full
      • --nats-url (default: None): NATS server URL, None = console output
      • --nats-subject (default: scoliosis.result): NATS subject
      • --max-frames (default: None): stop after N frames
      • --help: print usage
    • Entrypoint: uv run python -m opengait.demo ...
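
    The loop shape described above can be sketched with all dependencies injected as plain callables. Everything here is a stand-in: `detect` collapses the YOLO track + select_person + mask_to_silhouette steps, and `model` stands in for `ScoNetDemo.predict(window.get_tensor(device))`; the real class wires in the concrete components.

    ```python
    import time

    def run_loop(source, detect, window, model, publisher, log_every=100):
        """Dependency-injected sketch of ScoliosisPipeline.run()."""
        start, processed = time.monotonic(), 0
        try:
            for frame, meta in source:
                processed += 1
                picked = detect(frame)  # stand-in: YOLO track -> select_person -> silhouette
                if picked is None:
                    continue  # no usable detection this frame
                sil, track_id = picked
                window.push(sil, meta["frame_count"], track_id)
                if window.should_classify():
                    label, confidence = model(window)  # stand-in for ScoNetDemo.predict
                    publisher.publish({
                        "frame": meta["frame_count"],
                        "track_id": track_id,
                        "label": label,
                        "confidence": confidence,
                        "window": window.window_size,
                        "timestamp_ns": time.time_ns(),
                    })
                    window.mark_classified()
                if processed % log_every == 0:
                    print(f"fps={processed / (time.monotonic() - start):.1f}")
        except KeyboardInterrupt:
            pass  # graceful shutdown: fall through to cleanup
        finally:
            publisher.close()
    ```

    Keeping the loop synchronous and single-threaded (per the constraints above) makes the `finally` block sufficient for cleanup; no task cancellation or thread joins are needed.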

    Must NOT do:

    • No async in the main loop — synchronous pull-process-publish
    • No multi-threading for inference — single-threaded pipeline
    • No GUI / frame display / cv2.imshow
    • No unbounded accumulation — ring buffer handles memory
    • No auto-download of ScoNet checkpoint — user must provide path

    Recommended Agent Profile:

    • Category: deep
      • Reason: Integration of all components, error handling, CLI design, FPS logging — largest single task
    • Skills: []

    Parallelization:

    • Can Run In Parallel: NO
    • Parallel Group: Wave 3 (sequential — depends on most Wave 1+2 tasks)
    • Blocks: Tasks 12, 13
    • Blocked By: Tasks 2, 3, 4, 5, 6 (all components must exist)

    References:

    Pattern References:

    • opengait/demo/sconet_demo.py (Task 1) — ScoNetDemo class, predict() method
    • opengait/demo/preprocess.py (Task 3) — mask_to_silhouette(), frame_to_person_mask()
    • opengait/demo/window.py (Task 5) — SilhouetteWindow, select_person()
    • opengait/demo/input.py (Task 2) — create_source(), FrameStream type alias
    • opengait/demo/output.py (Task 6) — create_publisher(), ResultPublisher

    External References:

    • Ultralytics tracking API: model.track(frame, persist=True) — returns Results list
    • Ultralytics result object: results[0].masks.data, results[0].boxes.xyxy, results[0].boxes.id

    WHY Each Reference Matters:

    • All Task refs: This task composes every component — must know each API surface
    • Ultralytics: The YOLO .track() call is the only external API used directly in this file

    Acceptance Criteria:

    • opengait/demo/pipeline.py exists with ScoliosisPipeline class
    • opengait/demo/__main__.py exists with click CLI
    • uv run python -m opengait.demo --help prints usage without errors
    • All public methods have jaxtyping annotations where tensor/array args are involved

    QA Scenarios:

    Scenario: CLI --help works
      Tool: Bash
      Steps:
        1. Run `uv run python -m opengait.demo --help`
      Expected Result: Prints usage showing --source, --checkpoint, --device, --window, --stride, --nats-url, --max-frames
      Failure Indicators: ImportError, missing arguments, crash
      Evidence: .sisyphus/evidence/task-9-help.txt
    
    Scenario: Pipeline runs with sample video (no NATS)
      Tool: Bash
      Preconditions: sample.mp4 exists (from Task 11), checkpoint exists, CUDA available
      Steps:
        1. Run `CUDA_VISIBLE_DEVICES=0 uv run python -m opengait.demo --source ./assets/sample.mp4 --checkpoint ./ckpt/ScoNet-20000.pt --device cuda:0 --max-frames 120 2>&1 | tee /tmp/pipeline_output.txt`
        2. Count prediction lines: `grep -c '"label"' /tmp/pipeline_output.txt`
      Expected Result: Exit code 0, at least 1 prediction line with valid JSON containing "label" field
      Failure Indicators: Crash, no predictions, invalid JSON, CUDA error
      Evidence: .sisyphus/evidence/task-9-pipeline-run.txt
    
    Scenario: Pipeline handles missing video gracefully
      Tool: Bash
      Steps:
        1. Run `uv run python -m opengait.demo --source /nonexistent/video.mp4 --checkpoint ./ckpt/ScoNet-20000.pt 2>&1; echo "EXIT_CODE=$?"`
      Expected Result: Prints error message about missing file, exits with non-zero code (no traceback dump)
      Failure Indicators: Unhandled exception with full traceback, exit code 0
      Evidence: .sisyphus/evidence/task-9-missing-video.txt
    

    Commit: YES

    • Message: feat(demo): add main pipeline application with CLI entry point
    • Files: opengait/demo/pipeline.py, opengait/demo/__main__.py
    • Pre-commit: uv run python -m opengait.demo --help
  • 10. Unit Tests — Single-Person Policy + Window Reset

    What to do:

    • Create tests/demo/test_window.py
    • Test SilhouetteWindow:
      • Fill to capacity → is_ready() returns True
      • Underfilled → is_ready() returns False
      • Track ID change resets buffer
      • Frame gap exceeding threshold resets buffer
      • get_tensor() returns correct shape [1, 1, window_size, 64, 44]
      • should_classify() respects stride
    • Test select_person():
      • Single detection → returns it
      • Multiple detections → returns largest bbox area
      • No detections → returns None
      • Detections without track IDs (tracker not initialized) → returns None
    • Use mock YOLO results (don't require actual YOLO model)
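
    One way to mock YOLO results without ultralytics is to duck-type only the attributes `select_person` reads. The helper and the reference policy below are sketches: `select_person_sketch` mirrors the largest-bbox rule from the Task 5 spec, and all names are illustrative.

    ```python
    from types import SimpleNamespace
    import numpy as np

    def make_mock_results(boxes_xyxy, track_ids, mask_hw=(480, 640)):
        """Duck-typed stand-in for a one-element ultralytics Results list."""
        n = len(boxes_xyxy)
        boxes = SimpleNamespace(
            xyxy=np.asarray(boxes_xyxy, dtype=np.float32),
            id=None if track_ids is None else np.asarray(track_ids),
        )
        masks = SimpleNamespace(data=np.zeros((n, *mask_hw), dtype=np.uint8))
        return [SimpleNamespace(boxes=boxes, masks=masks)]

    def select_person_sketch(results):
        """Reference behaviour: pick the largest bbox; None if no IDs or no boxes."""
        r = results[0]
        if r.boxes is None or r.boxes.id is None or len(r.boxes.xyxy) == 0:
            return None
        xyxy = r.boxes.xyxy
        areas = (xyxy[:, 2] - xyxy[:, 0]) * (xyxy[:, 3] - xyxy[:, 1])
        i = int(areas.argmax())
        return r.masks.data[i], tuple(xyxy[i]), int(r.boxes.id[i])
    ```

    Because the mock only carries `boxes.xyxy`, `boxes.id`, and `masks.data`, the tests stay decoupled from ultralytics internals and run without the model file.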

    Must NOT do:

    • Don't require GPU — window tests are CPU-only (get_tensor can use cpu device)
    • Don't require YOLO model file — mock the results

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Multiple edge cases to cover, mock YOLO results require understanding ultralytics API
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 3 (with Tasks 9, 11)
    • Blocks: None (verification task)
    • Blocked By: Task 5 (window module must exist)

    References:

    Pattern References:

    • opengait/demo/window.py (Task 5) — Module under test

    WHY Each Reference Matters:

    • Direct test target

    Acceptance Criteria:

    • tests/demo/test_window.py exists with ≥6 test cases
    • uv run pytest tests/demo/test_window.py -q passes

    QA Scenarios:

    Scenario: All window and single-person tests pass
      Tool: Bash
      Steps:
        1. Run `uv run pytest tests/demo/test_window.py -v`
      Expected Result: All tests pass (≥6 tests), exit code 0
      Failure Indicators: Assertion failures, import errors
      Evidence: .sisyphus/evidence/task-10-window-tests.txt
    

    Commit: YES

    • Message: test(demo): add window manager and single-person policy tests
    • Files: tests/demo/test_window.py
    • Pre-commit: uv run pytest tests/demo/test_window.py -q
  • 11. Sample Video for Smoke Testing

    What to do:

    • Acquire or create a short sample video for pipeline smoke testing
    • Options (in order of preference):
      1. Extract a short clip (5-10 seconds, ~120 frames at 15fps) from an existing Scoliosis1K raw video if accessible
      2. Record a short clip using webcam via cv2.VideoCapture(0)
      3. Generate a synthetic video with a person-shaped blob moving across frames
    • Save to ./assets/sample.mp4 (or ./assets/sample.avi)
    • Requirements: contains at least one person walking, 720p or lower, ≥60 frames
    • If no real video is available, create a synthetic one:
      • 120 frames, 640×480, 15fps
      • White rectangle (simulating person silhouette) moving across dark background
      • This won't test YOLO detection quality but will verify pipeline doesn't crash
    • Add assets/sample.mp4 to .gitignore if it's large (>10MB)
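
    The synthetic fallback (option 3) can be generated with pure numpy, which also keeps it unit-testable without a codec. This is a sketch; sizes and the "person" rectangle are illustrative.

    ```python
    import numpy as np

    def synthetic_person_frames(n_frames=120, size=(480, 640), box=(60, 200)):
        """Yield BGR uint8 frames with a white rectangle sliding left to right."""
        h, w = size
        bw, bh = box  # rectangle width/height, a crude person stand-in
        for i in range(n_frames):
            frame = np.zeros((h, w, 3), dtype=np.uint8)
            x = int((w - bw) * i / max(n_frames - 1, 1))  # sweep across the frame
            y = (h - bh) // 2
            frame[y:y + bh, x:x + bw] = 255
            yield frame
    ```

    Writing these frames out with cv2.VideoWriter (MJPG fourcc, 15 fps, as in the Task 2 scenario) produces the smoke-test clip; note that VideoWriter takes (width, height) while each frame array is (height, width, 3).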

    Must NOT do:

    • Don't use any Scoliosis1K dataset files that are symlinked (user constraint)
    • Don't commit large video files to git

    Recommended Agent Profile:

    • Category: quick
      • Reason: Simple file creation/acquisition task
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 3 (with Tasks 9, 10)
    • Blocks: Task 12
    • Blocked By: Task 1 (needs OpenCV dependency from scaffolding)

    References: None needed — standalone task

    Acceptance Criteria:

    • ./assets/sample.mp4 (or .avi) exists
    • Video has ≥60 frames
    • Playable with uv run python -c "import cv2; cap=cv2.VideoCapture('./assets/sample.mp4'); print(f'frames={int(cap.get(cv2.CAP_PROP_FRAME_COUNT))}'); cap.release()"

    QA Scenarios:

    Scenario: Sample video is valid
      Tool: Bash
      Steps:
        1. Run `uv run python -c "`
           ```python
           import cv2
           cap = cv2.VideoCapture('./assets/sample.mp4')
           assert cap.isOpened(), 'Cannot open video'
           n = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
           assert n >= 60, f'Too few frames: {n}'
           ret, frame = cap.read()
           assert ret and frame is not None, 'Cannot read first frame'
           h, w = frame.shape[:2]
           assert h >= 240 and w >= 320, f'Too small: {w}x{h}'
           cap.release()
           print(f'SAMPLE_VIDEO_OK frames={n} size={w}x{h}')
           ```
      Expected Result: Prints 'SAMPLE_VIDEO_OK' with frame count ≥60
      Failure Indicators: Cannot open, too few frames, too small
      Evidence: .sisyphus/evidence/task-11-sample-video.txt
    

    Commit: YES

    • Message: chore(demo): add sample video for smoke testing
    • Files: assets/sample.mp4 (or add to .gitignore and document)
    • Pre-commit: none

  • 12. Integration Tests — End-to-End Smoke Test

    What to do:

    • Create tests/demo/test_pipeline.py
    • Integration test: run the full pipeline with sample video, no NATS
      • Uses subprocess.run() to invoke python -m opengait.demo
      • Captures stdout, parses JSON predictions
      • Asserts: exit code 0, ≥1 prediction, valid JSON schema
    • Test graceful exit on end-of-video
    • Test --max-frames flag: run with max_frames=60, verify it stops
    • Test error handling: invalid source path → non-zero exit, error message
    • Test error handling: invalid checkpoint path → non-zero exit, error message
    • FPS benchmark (informational, not a hard assertion):
      • Run pipeline on sample video, measure wall time, compute FPS
      • Log FPS to evidence file (target: ≥15 FPS on desktop GPU)
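
    Parsing predictions out of the captured stdout is the fiddly part of this test, since prediction JSONL is interleaved with log lines. A small helper like the sketch below (name illustrative) keeps the assertions clean.

    ```python
    import json

    def parse_predictions(stdout: str) -> list[dict]:
        """Extract JSON prediction objects from mixed pipeline stdout; ignore log lines."""
        preds = []
        for line in stdout.splitlines():
            line = line.strip()
            if not line.startswith("{"):
                continue  # FPS logs, banners, etc.
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                continue  # brace-leading noise that isn't JSON
            if "label" in obj:
                preds.append(obj)
        return preds
    ```

    The integration test then becomes: run the CLI via subprocess, call `parse_predictions(result.stdout)`, and assert the list is non-empty with the expected schema fields.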

    Must NOT do:

    • Don't require NATS server for this test — use console publisher
    • Don't hardcode CUDA device — use --device cuda:0 only if CUDA available, else skip

    Recommended Agent Profile:

    • Category: deep
      • Reason: Full integration test requiring all components working together, subprocess management, JSON parsing
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 4 (with Task 13)
    • Blocks: F1-F4 (Final verification)
    • Blocked By: Tasks 9 (pipeline), 11 (sample video)

    References:

    Pattern References:

    • opengait/demo/__main__.py (Task 9) — CLI flags to invoke
    • opengait/demo/output.py (Task 6) — JSON schema to validate

    WHY Each Reference Matters:

    • __main__.py: Need exact CLI flag names for subprocess invocation
    • output.py: Need JSON schema to assert against

    Acceptance Criteria:

    • tests/demo/test_pipeline.py exists with ≥4 test cases
    • CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_pipeline.py -q passes
    • Tests cover: happy path, max-frames, invalid source, invalid checkpoint

    QA Scenarios:

    Scenario: Full pipeline integration test passes
      Tool: Bash
      Preconditions: All components built, sample video exists, CUDA available
      Steps:
        1. Run `CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_pipeline.py -v --timeout=120`
      Expected Result: All tests pass (≥4), exit code 0
      Failure Indicators: Subprocess crash, JSON parse error, timeout
      Evidence: .sisyphus/evidence/task-12-integration.txt
    
    Scenario: FPS benchmark
      Tool: Bash
      Steps:
        1. Run `CUDA_VISIBLE_DEVICES=0 uv run python -c "`
           ```python
           import subprocess, time
           start = time.monotonic()
           result = subprocess.run(
               ['uv', 'run', 'python', '-m', 'opengait.demo',
                '--source', './assets/sample.mp4',
                '--checkpoint', './ckpt/ScoNet-20000.pt',
                 '--device', 'cuda:0'],
               capture_output=True, text=True, timeout=120)
           elapsed = time.monotonic() - start
           import cv2
           cap = cv2.VideoCapture('./assets/sample.mp4')
            n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)); cap.release()
           fps = n_frames / elapsed if elapsed > 0 else 0
           print(f'FPS_BENCHMARK frames={n_frames} elapsed={elapsed:.1f}s fps={fps:.1f}')
           assert fps >= 5, f'FPS too low: {fps}'  # conservative threshold
           ```
      Expected Result: Prints FPS benchmark, ≥5 FPS (conservative)
      Failure Indicators: Timeout, crash, FPS < 5
      Evidence: .sisyphus/evidence/task-12-fps-benchmark.txt
    

    Commit: YES

    • Message: test(demo): add integration and end-to-end smoke tests
    • Files: tests/demo/test_pipeline.py
    • Pre-commit: CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_pipeline.py -q
  • 13. NATS Integration Test

    What to do:

    • Create tests/demo/test_nats.py
    • Test requires NATS server (use Docker: docker run -d --rm --name nats-test -p 4222:4222 nats:2)
    • Mark tests with @pytest.mark.skipif if Docker/NATS not available
    • Test flow:
      1. Start NATS container
      2. Start a nats-py subscriber on scoliosis.result
      3. Run pipeline with --nats-url nats://127.0.0.1:4222 --max-frames 60
      4. Collect received messages
      5. Assert: ≥1 message received, valid JSON, correct schema
      6. Stop NATS container
    • Test NatsPublisher reconnection: start pipeline, stop NATS, restart NATS — pipeline should recover
    • JSON schema validation:
      • frame: int
      • track_id: int
      • label: str in {"negative", "neutral", "positive"}
      • confidence: float in [0, 1]
      • window: int (should equal window_size)
      • timestamp_ns: int
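The schema checks above can be collected into one helper that the test calls on every received payload. A minimal sketch, assuming the publisher emits one JSON object per NATS message with exactly the fields listed (the function name and sample values are illustrative):

```python
import json

ALLOWED_LABELS = {"negative", "neutral", "positive"}

def validate_result(payload: bytes, window_size: int = 30) -> dict:
    """Parse one NATS message payload and assert it matches the result schema."""
    msg = json.loads(payload)
    assert isinstance(msg["frame"], int)
    assert isinstance(msg["track_id"], int)
    assert msg["label"] in ALLOWED_LABELS
    assert isinstance(msg["confidence"], float)
    assert 0.0 <= msg["confidence"] <= 1.0
    assert msg["window"] == window_size
    assert isinstance(msg["timestamp_ns"], int)
    return msg

# Illustrative payload matching the schema above:
sample = json.dumps({
    "frame": 120, "track_id": 1, "label": "neutral",
    "confidence": 0.82, "window": 30, "timestamp_ns": 1_700_000_000_000_000_000,
}).encode()
validate_result(sample)
```

A schema mismatch then fails the test with a pinpointed assertion rather than a generic KeyError deep in the test body.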

    Must NOT do:

    • Don't leave Docker containers running after test
    • Don't hardcode the NATS port in the tests; use a fixture that finds an open port (the `docker run` examples in this plan use the default 4222 for illustration only)
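One way to satisfy the open-port rule, sketched as plain helpers that a pytest fixture can wrap (assumes the `docker` CLI is on PATH; the function names are illustrative, not part of the spec):

```python
import socket
import subprocess

def find_free_port() -> int:
    """Ask the OS for an ephemeral TCP port that is currently free."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

def start_nats_container():
    """Start a throwaway NATS server on a free host port.

    Returns (url, stop), where stop() removes the container. Callers should
    skip the test when Docker is unavailable rather than letting this fail.
    """
    port = find_free_port()
    cid = subprocess.run(
        ["docker", "run", "-d", "--rm", "-p", f"{port}:4222", "nats:2"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

    def stop():
        subprocess.run(["docker", "stop", cid], capture_output=True)

    return f"nats://127.0.0.1:{port}", stop
```

In the test module this would sit behind a `@pytest.fixture` that yields the URL and calls `stop()` in a `finally` block, with `pytest.skip` when `docker info` fails, which also guarantees no container outlives the test.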

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Docker orchestration + async NATS subscriber + subprocess pipeline — moderate complexity
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 4 (with Task 12)
    • Blocks: F1-F4 (Final verification)
    • Blocked By: Tasks 9 (pipeline), 6 (NATS publisher)

    References:

    Pattern References:

    • opengait/demo/output.py (Task 6) — NatsPublisher class, JSON schema

    External References:

    • nats-py subscriber: `sub = await nc.subscribe('scoliosis.result')`, then `msg = await sub.next_msg(timeout=10)`
    • Docker NATS: `docker run -d --rm --name nats-test -p 4222:4222 nats:2`

    WHY Each Reference Matters:

    • output.py: Need to match the exact subject and JSON schema the publisher produces
    • nats-py: Need subscriber API to consume and validate messages

    Acceptance Criteria:

    • tests/demo/test_nats.py exists with ≥2 test cases
    • Tests are skippable when Docker/NATS not available
    • CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_nats.py -q passes (when Docker available)

    QA Scenarios:

    Scenario: NATS receives valid prediction JSON
      Tool: Bash
      Preconditions: Docker available, CUDA available, sample video exists
      Steps:
        1. Run `docker run -d --rm --name nats-test -p 4222:4222 nats:2`
        2. Run `CUDA_VISIBLE_DEVICES=0 uv run pytest tests/demo/test_nats.py -v --timeout=60`
        3. Run `docker stop nats-test`
      Expected Result: Tests pass, at least one valid JSON message received on scoliosis.result
      Failure Indicators: No messages, invalid JSON, schema mismatch, timeout
      Evidence: .sisyphus/evidence/task-13-nats-integration.txt
    
    Scenario: NATS test is skipped when Docker unavailable
      Tool: Bash
      Preconditions: Docker NOT running or not installed
      Steps:
        1. Run `uv run pytest tests/demo/test_nats.py -v -k 'nats' 2>&1 | head -20`
      Expected Result: Tests show as SKIPPED (not FAILED)
      Failure Indicators: Test fails instead of skipping
      Evidence: .sisyphus/evidence/task-13-nats-skip.txt
    

    Commit: YES

    • Message: test(demo): add NATS integration tests
    • Files: tests/demo/test_nats.py
    • Pre-commit: uv run pytest tests/demo/test_nats.py -q (skips if no Docker)

Final Verification Wave (MANDATORY — after ALL implementation tasks)

4 review agents run in PARALLEL. ALL must APPROVE. Rejection → fix → re-run.

  • F1. Plan Compliance Audit (oracle): Read the plan end-to-end. For each "Must Have": verify the implementation exists (read the file, run the command). For each "Must NOT Have": search the codebase for forbidden patterns (`torch.distributed` imports in demo/, `BaseModel` subclassing). Check that evidence files exist in `.sisyphus/evidence/`. Compare deliverables against the plan. Output: Must Have [N/N] | Must NOT Have [N/N] | Tasks [N/N] | VERDICT: APPROVE/REJECT

  • F2. Code Quality Review (unspecified-high): Run the linter plus `uv run pytest tests/demo/ -q`. Review all new files in `opengait/demo/` for: `as any` / `type: ignore` escapes, empty catches, `print` statements used instead of logging, commented-out code, unused imports. Check for AI slop: excessive comments, over-abstraction, generic variable names. Output: Tests [N pass/N fail] | Files [N clean/N issues] | VERDICT

  • F3. Real Manual QA (unspecified-high): Start from a clean state. Run the pipeline with the sample video: `uv run python -m opengait.demo --source ./assets/sample.mp4 --checkpoint ./ckpt/ScoNet-20000.pt --max-frames 120`. Verify predictions are printed to the console (no `--nats-url` means console output). Run with NATS: start the container, run the pipeline with `--nats-url nats://127.0.0.1:4222`, subscribe and validate the JSON schema. Test edge cases: missing video file (graceful error), missing checkpoint (graceful error), `--help` flag. Output: Scenarios [N/N pass] | Edge Cases [N tested] | VERDICT

  • F4. Scope Fidelity Check (deep): For each task: read "What to do", then read the files actually created. Verify a 1:1 match: everything in the spec was built (nothing missing), nothing beyond the spec was built (no creep). Check "Must NOT do" compliance: no `torch.distributed` in demo/, no `BaseModel` subclass, no TensorRT code, no multi-person logic. Flag unaccounted changes. Output: Tasks [N/N compliant] | Scope [CLEAN/N issues] | VERDICT
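The forbidden-pattern searches used by F1 and F4 can be scripted rather than done by eye. A crude substring scan, sketched below (the pattern list mirrors this plan's "Must NOT" items; the function name and target path are illustrative):

```python
from pathlib import Path

FORBIDDEN = ("torch.distributed", "BaseModel", "tensorrt")

def scan_forbidden(root, patterns=FORBIDDEN):
    """Return (file, line_no, pattern) hits for forbidden substrings in *.py files.

    Hits still need a human look (e.g. a comment that merely mentions
    BaseModel is not a violation), but an empty result is strong evidence
    of compliance.
    """
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        lines = path.read_text(errors="ignore").splitlines()
        for line_no, line in enumerate(lines, 1):
            for pat in patterns:
                if pat in line:
                    hits.append((str(path), line_no, pat))
    return hits
```

For approval, `scan_forbidden("opengait/demo")` should come back empty; any hit is reviewed manually before rejecting.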


Commit Strategy

  • Wave 1: feat(demo): add ScoNetDemo inference wrapper — sconet_demo.py
  • Wave 1: feat(demo): add input adapters and silhouette preprocessing — input.py, preprocess.py
  • Wave 1: chore(demo): scaffold demo package and test infrastructure — __init__.py, conftest, pyproject.toml
  • Wave 2: feat(demo): add sliding window manager and NATS publisher — window.py, output.py
  • Wave 2: test(demo): add preprocessing and model unit tests — test_preprocess.py, test_sconet_demo.py
  • Wave 3: feat(demo): add main pipeline application with CLI — pipeline.py, __main__.py
  • Wave 3: test(demo): add window manager and single-person policy tests — test_window.py
  • Wave 4: test(demo): add integration and NATS tests — test_pipeline.py, test_nats.py

Success Criteria

Verification Commands

# Smoke test (no NATS)
uv run python -m opengait.demo --source ./assets/sample.mp4 --checkpoint ./ckpt/ScoNet-20000.pt --max-frames 120
# Expected: exits 0, prints ≥3 prediction lines with label in {negative, neutral, positive}

# Unit tests
uv run pytest tests/demo/ -q
# Expected: all tests pass

# Help flag
uv run python -m opengait.demo --help
# Expected: prints usage with --source, --checkpoint, --device, --window, --stride, --nats-url, --max-frames

Final Checklist

  • All "Must Have" present
  • All "Must NOT Have" absent
  • All tests pass
  • Pipeline runs at ≥15 FPS on a desktop GPU (the automated benchmark gates at a conservative ≥5 FPS)
  • JSON schema matches spec
  • No torch.distributed imports in opengait/demo/