# Demo Pipeline Schema and Contracts

## Overview

This document describes the input/output schema, flags/arguments, and positive detection indicators for the OpenGait demo pipeline (ScoliosisPipeline).

## Source Files

- **Pipeline**: `/home/crosstyan/Code/OpenGait/opengait/demo/pipeline.py`
- **Input adapters**: `/home/crosstyan/Code/OpenGait/opengait/demo/input.py`
- **Output publishers**: `/home/crosstyan/Code/OpenGait/opengait/demo/output.py`
- **Window management**: `/home/crosstyan/Code/OpenGait/opengait/demo/window.py`
- **Classifier**: `/home/crosstyan/Code/OpenGait/opengait/demo/sconet_demo.py`

## Input Schema

### Video Source (`--source`)

The `source` parameter accepts three formats (validated in `validate_runtime_inputs()`):

1. **Camera index**: a single-digit string (e.g., `"0"`, `"1"`) - uses OpenCV VideoCapture
2. **cv-mmap shared memory**: a `cvmmap://` URL (e.g., `cvmmap://default`) - uses a shared-memory stream
3. **Video file path**: any other string, treated as a file path (e.g., `/path/to/video.mp4`)

**Source validation** (lines 251-264 in pipeline.py):

- Camera indices and cv-mmap URLs pass without a file check
- File paths must exist (`Path.is_file()`)

### FrameStream Contract (input.py)

```python
FrameStream = Iterable[tuple[np.ndarray, dict[str, object]]]
```

Each iteration yields:

- **frame**: `np.ndarray` - raw frame array (H, W, C) in uint8
- **metadata**: `dict[str, object]` containing:
  - `frame_count`: int - frame index (0-based)
  - `timestamp_ns`: int - monotonic timestamp in nanoseconds
  - `source`: str - the source path/identifier

## Windowing Parameters

### SilhouetteWindow Class (window.py)

Manages a sliding window of silhouettes for classification.

**Constructor parameters**:

- `window_size`: int (default: 30) - maximum buffer size (number of frames)
- `stride`: int (default: 1) - frames between classifications
- `gap_threshold`: int (default: 15) - maximum frame gap before reset

**CLI flags**:

- `--window`: int, min=1, default=30 - sets `window_size`
- `--stride`: int, min=1, default=30 - sets the classification stride

**Behavior**:

- The window is "ready" when the buffer holds `window_size` frames
- Classification triggers when `should_classify()` returns True (respects the stride)
- A track ID change or a frame gap > `gap_threshold` resets the buffer
- Silhouette shape must be `(64, 44)` float32

**Output tensor shape**: `[1, 1, window_size, 64, 44]` (batch, channel, seq, height, width)

## Required Flags/Arguments

### CLI Arguments (pipeline.py lines 267-287)

| Flag | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| `--source` | str | **Yes** | - | Video source (file, camera index, or cvmmap://) |
| `--checkpoint` | str | **Yes** | - | Model checkpoint path (.pt file) |
| `--config` | str | No | `configs/sconet/sconet_scoliosis1k.yaml` | Model config YAML |
| `--device` | str | No | `cuda:0` | Device for inference |
| `--yolo-model` | str | No | `yolo11n-seg.pt` | YOLO segmentation model |
| `--window` | int | No | 30 | Window size (frames) |
| `--stride` | int | No | 30 | Classification stride |
| `--nats-url` | str | No | None | NATS server URL (e.g., `nats://localhost:4222`) |
| `--nats-subject` | str | No | `scoliosis.result` | NATS subject for publishing |
| `--max-frames` | int | No | None | Maximum frames to process |

### Validation

- Source must exist (file) or be a valid camera index/cv-mmap URL
- Checkpoint file must exist
- Config file must exist

## Output Schema

### Result Format (output.py `create_result()`)

```python
{
    "frame": int,          # Frame number where classification occurred
    "track_id": int,       # Person/track identifier
    "label": str,          # Classification label
    "confidence": float,   # Confidence score [0.0, 1.0]
    "window": int,         # End frame of window (or window size)
    "timestamp_ns": int    # Timestamp in nanoseconds
}
```

### Publishers (output.py)

1. **ConsolePublisher**: Outputs JSON Lines to stdout
2. **NatsPublisher**: Publishes to a NATS message broker (async, background thread)

### Label Values (sconet_demo.py line 60)

```python
LABEL_MAP = {0: "negative", 1: "neutral", 2: "positive"}
```

## Positive Detection Indicator

**Positive detection** is indicated when:

```python
result["label"] == "positive"
```

The `confidence` field indicates the model's confidence in the prediction (0.0 to 1.0).

### Test Validation (test_pipeline.py lines 89-106)

```python
def _assert_prediction_schema(prediction: dict[str, object]) -> None:
    assert isinstance(prediction["frame"], int)
    assert isinstance(prediction["track_id"], int)
    label = prediction["label"]
    assert isinstance(label, str)
    assert label in {"negative", "neutral", "positive"}  # Valid labels
    confidence = prediction["confidence"]
    assert isinstance(confidence, (int, float))
    assert 0.0 <= float(confidence) <= 1.0
    window_obj = prediction["window"]
    assert isinstance(window_obj, int)
    assert window_obj >= 0
    assert isinstance(prediction["timestamp_ns"], int)
```

## Test References

### Pipeline Tests (`/home/crosstyan/Code/OpenGait/tests/demo/test_pipeline.py`)

- `test_pipeline_cli_happy_path_outputs_json_predictions`: Validates that the full pipeline outputs JSON predictions
- `test_pipeline_cli_fps_benchmark_smoke`: FPS benchmark with predictions
- `test_pipeline_cli_max_frames_caps_output_frames`: Validates max-frames behavior
- `test_pipeline_cli_invalid_source_path_returns_user_error`: Source validation
- `test_pipeline_cli_invalid_checkpoint_path_returns_user_error`: Checkpoint validation

### Window Tests (`/home/crosstyan/Code/OpenGait/tests/demo/test_window.py`)

- `test_window_fill_and_ready_behavior`: Window readiness logic
- `test_track_id_change_resets_buffer`: Track change handling
- `test_frame_gap_reset_behavior`: Gap threshold behavior
- `test_get_tensor_shape`: Output tensor shape validation
- `test_should_classify_stride_behavior`: Stride logic
- `test_push_invalid_shape_raises`: Silhouette shape validation

### ScoNetDemo Tests (`/home/crosstyan/Code/OpenGait/tests/demo/test_sconet_demo.py`)

- `test_predict_returns_tuple_with_valid_types`: Predict output validation
- `test_predict_confidence_range`: Confidence range [0, 1]
- `test_label_map_has_three_classes`: Label map validation
- `test_forward_label_range`: Label indices {0, 1, 2}

## Processing Flow

1. **Input**: Video source → FrameStream (frame, metadata)
2. **Detection**: YOLO track() → Detection results with boxes, masks, track IDs
3. **Selection**: `select_person()` → Largest-bbox person or fallback
4. **Preprocessing**: Mask → Silhouette (64, 44) float32
5. **Windowing**: `SilhouetteWindow.push()` → Buffer management
6. **Classification**: When `should_classify()` is True → ScoNetDemo.predict()
7. **Output**: `create_result()` → Publisher (Console or NATS)

## Error Handling

- Invalid source: Exit code 2, "Error: Video source not found"
- Invalid checkpoint: Exit code 2, "Error: Checkpoint not found"
- Runtime errors: Exit code 1, "Runtime error: ..."
- Frame processing errors: Logged as a warning, frame skipped
- NATS unavailable: Graceful degradation (logs at debug level, continues)
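The interplay of window fill and stride in the flow above can be modeled with a small sketch. Everything here (`classify`, `run`, the silhouette placeholders) is a hypothetical stand-in chosen for illustration under the documented contract (window ready at `window_size` frames, one classification every `stride` frames); it is not the actual `pipeline.py` or `window.py` implementation and omits track-ID and gap resets.

```python
from collections import deque

# Defaults from the CLI flags table; both are assumptions carried over from the doc.
WINDOW_SIZE, STRIDE = 30, 30

def classify(buffer):
    """Stub for ScoNetDemo.predict(): returns (label, confidence)."""
    return "positive", 0.9

def run(frames):
    """Fill a sliding window of silhouettes; once the window holds
    WINDOW_SIZE frames, classify every STRIDE frames."""
    buffer = deque(maxlen=WINDOW_SIZE)  # oldest frame drops off automatically
    since_last = 0                      # frames pushed since the last classification
    results = []
    for frame_idx, silhouette in frames:
        buffer.append(silhouette)
        since_last += 1
        if len(buffer) == WINDOW_SIZE and since_last >= STRIDE:
            label, conf = classify(buffer)
            results.append({"frame": frame_idx, "label": label, "confidence": conf})
            since_last = 0
    return results

# 90 frames with window=30, stride=30 yields classifications at frames 29, 59, 89.
preds = run((i, None) for i in range(90))
```

With the default `stride=1` from the constructor, the same loop classifies on every frame once the window is full, which is why the CLI's `--stride 30` default matters for throughput.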