# Demo Pipeline Schema and Contracts

## Overview

This document describes the input/output schema, flags/arguments, and positive detection indicators for the OpenGait demo pipeline (ScoliosisPipeline).

## Source Files

- **Pipeline**: `/home/crosstyan/Code/OpenGait/opengait/demo/pipeline.py`
- **Input adapters**: `/home/crosstyan/Code/OpenGait/opengait/demo/input.py`
- **Output publishers**: `/home/crosstyan/Code/OpenGait/opengait/demo/output.py`
- **Window management**: `/home/crosstyan/Code/OpenGait/opengait/demo/window.py`
- **Classifier**: `/home/crosstyan/Code/OpenGait/opengait/demo/sconet_demo.py`

## Input Schema

### Video Source (`--source`)

The `source` parameter accepts three formats (validated in `validate_runtime_inputs()`):

1. **Camera index**: Single-digit string (e.g., `"0"`, `"1"`) - opened with OpenCV `VideoCapture`
2. **cv-mmap shared memory**: `cvmmap://<name>` (e.g., `cvmmap://default`) - read from a shared-memory stream
3. **Video file path**: Any other string is treated as a file path (e.g., `/path/to/video.mp4`)

**Source validation** (lines 251-264 in pipeline.py):
- Camera indices and cv-mmap URLs pass without a file check
- File paths must exist (`Path.is_file()`)

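The three-way dispatch described above can be sketched as follows. This is a minimal illustration, not the code in `validate_runtime_inputs()`: the helper names `classify_source` and `source_is_valid` are hypothetical, and the sketch reads "single-digit string" literally, so `"10"` falls through to the file-path branch.

```python
from pathlib import Path


def classify_source(source: str) -> str:
    """Return 'camera', 'cvmmap', or 'file' for a --source value (hypothetical helper)."""
    # Assumption: a camera index is exactly one ASCII digit, as the text states.
    if len(source) == 1 and source.isascii() and source.isdigit():
        return "camera"
    if source.startswith("cvmmap://"):
        return "cvmmap"
    # Anything else is treated as a file path.
    return "file"


def source_is_valid(source: str) -> bool:
    """Camera indices and cv-mmap URLs pass without a file check; files must exist."""
    if classify_source(source) in ("camera", "cvmmap"):
        return True
    return Path(source).is_file()
```

Keeping classification separate from the existence check makes it possible to test the routing rules without touching the filesystem.
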
### FrameStream Contract (input.py)
```python
FrameStream = Iterable[tuple[np.ndarray, dict[str, object]]]
```

Each iteration yields:
- **frame**: `np.ndarray` - Raw frame array (H, W, C) in uint8
- **metadata**: `dict[str, object]` containing:
  - `frame_count`: int - Frame index (0-based)
  - `timestamp_ns`: int - Monotonic timestamp in nanoseconds
  - `source`: str - The source path/identifier

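A consumer that wants to guard against a malformed adapter could check the metadata half of each tuple along these lines. The function `metadata_problems` is a hypothetical helper written for this document; it only encodes the three keys and types listed above and says nothing about how `input.py` builds the dict.

```python
import time

# Keys and types taken from the metadata contract above.
EXPECTED_METADATA = {"frame_count": int, "timestamp_ns": int, "source": str}


def metadata_problems(metadata: dict[str, object]) -> list[str]:
    """Return a list of human-readable contract violations (empty if none)."""
    problems: list[str] = []
    for key, expected_type in EXPECTED_METADATA.items():
        if key not in metadata:
            problems.append(f"missing key: {key}")
        elif not isinstance(metadata[key], expected_type):
            problems.append(f"{key} should be {expected_type.__name__}")
    frame_count = metadata.get("frame_count")
    if isinstance(frame_count, int) and frame_count < 0:
        problems.append("frame_count should be 0-based and non-negative")
    return problems


# Illustrative metadata for the first frame of camera "0".
example_metadata = {
    "frame_count": 0,
    "timestamp_ns": time.monotonic_ns(),  # monotonic clock, nanoseconds
    "source": "0",
}
```
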
## Windowing Parameters

### SilhouetteWindow Class (window.py)
Manages a sliding window of silhouettes for classification:

**Constructor parameters**:
- `window_size`: int (default: 30) - Maximum buffer size (number of frames)
- `stride`: int (default: 1) - Frames between classifications
- `gap_threshold`: int (default: 15) - Max frame gap before reset

**CLI flags**:
- `--window`: int, min=1, default=30 - Sets `window_size`
- `--stride`: int, min=1, default=30 - Sets classification stride (note that this CLI default differs from the constructor default of 1)

**Behavior**:
- Window is "ready" when the buffer holds `window_size` frames
- Classification triggers when `should_classify()` returns True (respects stride)
- A track ID change or a frame gap > `gap_threshold` resets the buffer
- Silhouette shape must be `(64, 44)` float32

**Output tensor shape**: `[1, 1, window_size, 64, 44]` (batch, channel, seq, height, width)

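The buffering rules above can be illustrated with a small stand-in class. This is a sketch under stated assumptions, not the `SilhouetteWindow` implementation: the class name, the `mark_classified()` method, and the way the stride is counted (number of pushes since the last classification) are all hypothetical, and silhouettes are treated as opaque objects rather than `(64, 44)` arrays.

```python
from collections import deque
from typing import Any, Optional


class SlidingWindowSketch:
    """Illustrative sliding window; not the real SilhouetteWindow."""

    def __init__(self, window_size: int = 30, stride: int = 1, gap_threshold: int = 15) -> None:
        self.window_size = window_size
        self.stride = stride
        self.gap_threshold = gap_threshold
        self._buffer: deque = deque(maxlen=window_size)  # oldest items drop off
        self._track_id: Optional[int] = None
        self._last_frame: Optional[int] = None
        self._since_classify = 0  # assumption: stride counts pushes

    def __len__(self) -> int:
        return len(self._buffer)

    def push(self, silhouette: Any, frame_idx: int, track_id: int) -> None:
        track_changed = self._track_id is not None and track_id != self._track_id
        gap_too_large = (
            self._last_frame is not None
            and frame_idx - self._last_frame > self.gap_threshold
        )
        if track_changed or gap_too_large:
            # Start over: old frames belong to another person or another moment.
            self._buffer.clear()
            self._since_classify = 0
        self._buffer.append(silhouette)
        self._track_id = track_id
        self._last_frame = frame_idx
        self._since_classify += 1

    @property
    def ready(self) -> bool:
        return len(self._buffer) == self.window_size

    def should_classify(self) -> bool:
        return self.ready and self._since_classify >= self.stride

    def mark_classified(self) -> None:
        self._since_classify = 0
```

With `window_size=30` and `stride=30`, a window like this would classify roughly once per 30 pushed frames, while `stride=1` would classify on every frame once the buffer is full.
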
## Required Flags/Arguments

### CLI Arguments (pipeline.py lines 267-287)

| Flag | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| `--source` | str | **Yes** | - | Video source (file, camera index, or cvmmap://) |
| `--checkpoint` | str | **Yes** | - | Model checkpoint path (.pt file) |
| `--config` | str | No | `configs/sconet/sconet_scoliosis1k.yaml` | Model config YAML |
| `--device` | str | No | `cuda:0` | Device for inference |
| `--yolo-model` | str | No | `yolo11n-seg.pt` | YOLO segmentation model |
| `--window` | int | No | 30 | Window size (frames) |
| `--stride` | int | No | 30 | Classification stride |
| `--nats-url` | str | No | None | NATS server URL (e.g., `nats://localhost:4222`) |
| `--nats-subject` | str | No | `scoliosis.result` | NATS subject for publishing |
| `--max-frames` | int | No | None | Maximum frames to process |

### Validation
- Source must exist (file) or be valid camera index/cv-mmap URL
- Checkpoint file must exist
- Config file must exist

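For readers who want to mirror this interface in a wrapper script or test harness, the table translates to roughly the following parser. This is a sketch only: the document does not say which CLI library `pipeline.py` uses, so `argparse` is an assumption, and `build_parser` and `positive_int` are hypothetical names. Flags, defaults, and the `min=1` constraint come from the table and the windowing section.

```python
import argparse


def positive_int(value: str) -> int:
    """Parse an integer and enforce the min=1 constraint on --window/--stride."""
    number = int(value)
    if number < 1:
        raise argparse.ArgumentTypeError(f"expected an integer >= 1, got {value!r}")
    return number


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Demo pipeline CLI (sketch)")
    parser.add_argument("--source", required=True,
                        help="Video file, camera index, or cvmmap:// URL")
    parser.add_argument("--checkpoint", required=True,
                        help="Model checkpoint path (.pt file)")
    parser.add_argument("--config", default="configs/sconet/sconet_scoliosis1k.yaml",
                        help="Model config YAML")
    parser.add_argument("--device", default="cuda:0", help="Device for inference")
    parser.add_argument("--yolo-model", default="yolo11n-seg.pt",
                        help="YOLO segmentation model")
    parser.add_argument("--window", type=positive_int, default=30,
                        help="Window size (frames)")
    parser.add_argument("--stride", type=positive_int, default=30,
                        help="Classification stride")
    parser.add_argument("--nats-url", default=None, help="NATS server URL")
    parser.add_argument("--nats-subject", default="scoliosis.result",
                        help="NATS subject for publishing")
    parser.add_argument("--max-frames", type=int, default=None,
                        help="Maximum frames to process")
    return parser


# Only the two required flags are supplied; everything else uses its default.
args = build_parser().parse_args(["--source", "0", "--checkpoint", "model.pt"])
```
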
## Output Schema

### Result Format (output.py `create_result()`)
```python
{
    "frame": int,           # Frame number where classification occurred
    "track_id": int,        # Person/track identifier
    "label": str,           # Classification label
    "confidence": float,    # Confidence score [0.0, 1.0]
    "window": int,          # End frame of window (or window size)
    "timestamp_ns": int     # Timestamp in nanoseconds
}
```

### Publishers (output.py)
1. **ConsolePublisher**: Outputs JSON Lines to stdout
2. **NatsPublisher**: Publishes to NATS message broker (async, background thread)

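JSON Lines means one complete JSON object per line, so a downstream consumer can parse stdout line by line. The snippet below shows what a single record might look like on the wire; the field values are made up for illustration, and `to_json_line` is a hypothetical helper rather than part of `output.py`.

```python
import json


def to_json_line(result: dict[str, object]) -> str:
    """Serialize one result as a single line of JSON (no embedded newlines)."""
    return json.dumps(result, separators=(",", ":"))


# Illustrative values only; they follow the result schema above.
example_result = {
    "frame": 59,
    "track_id": 1,
    "label": "positive",
    "confidence": 0.87,
    "window": 59,
    "timestamp_ns": 1_000_000_000,
}

line = to_json_line(example_result)
decoded = json.loads(line)  # a consumer recovers the same dict
```
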
### Label Values (sconet_demo.py line 60)
```python
LABEL_MAP = {0: "negative", 1: "neutral", 2: "positive"}
```

## Positive Detection Indicator

**Positive detection** is indicated when:
```python
result["label"] == "positive"
```

The `confidence` field indicates the model's confidence in the prediction (0.0 to 1.0).

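A consumer may want to combine the label check with a confidence floor. The helper below is a sketch of that idea; `is_positive` and its `min_confidence` parameter are hypothetical and not part of the pipeline, which reports the label and confidence and leaves any thresholding to the caller.

```python
from typing import Any, Mapping


def is_positive(result: Mapping[str, Any], min_confidence: float = 0.0) -> bool:
    """True when the label is 'positive' and confidence meets the (assumed) floor."""
    if result.get("label") != "positive":
        return False
    return float(result.get("confidence", 0.0)) >= min_confidence
```

With the default `min_confidence=0.0` this reduces to the plain `result["label"] == "positive"` check.
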
### Test Validation (test_pipeline.py lines 89-106)
```python
def _assert_prediction_schema(prediction: dict[str, object]) -> None:
    assert isinstance(prediction["frame"], int)
    assert isinstance(prediction["track_id"], int)

    label = prediction["label"]
    assert isinstance(label, str)
    assert label in {"negative", "neutral", "positive"}  # Valid labels

    confidence = prediction["confidence"]
    assert isinstance(confidence, (int, float))
    assert 0.0 <= float(confidence) <= 1.0

    window_obj = prediction["window"]
    assert isinstance(window_obj, int)
    assert window_obj >= 0

    assert isinstance(prediction["timestamp_ns"], int)
```

## Test References

### Pipeline Tests (`/home/crosstyan/Code/OpenGait/tests/demo/test_pipeline.py`)
- `test_pipeline_cli_happy_path_outputs_json_predictions`: Validates full pipeline outputs JSON predictions
- `test_pipeline_cli_fps_benchmark_smoke`: FPS benchmark with predictions
- `test_pipeline_cli_max_frames_caps_output_frames`: Validates max-frames behavior
- `test_pipeline_cli_invalid_source_path_returns_user_error`: Source validation
- `test_pipeline_cli_invalid_checkpoint_path_returns_user_error`: Checkpoint validation

### Window Tests (`/home/crosstyan/Code/OpenGait/tests/demo/test_window.py`)
- `test_window_fill_and_ready_behavior`: Window readiness logic
- `test_track_id_change_resets_buffer`: Track change handling
- `test_frame_gap_reset_behavior`: Gap threshold behavior
- `test_get_tensor_shape`: Output tensor shape validation
- `test_should_classify_stride_behavior`: Stride logic
- `test_push_invalid_shape_raises`: Silhouette shape validation

### ScoNetDemo Tests (`/home/crosstyan/Code/OpenGait/tests/demo/test_sconet_demo.py`)
- `test_predict_returns_tuple_with_valid_types`: Predict output validation
- `test_predict_confidence_range`: Confidence range [0, 1]
- `test_label_map_has_three_classes`: Label map validation
- `test_forward_label_range`: Label indices {0, 1, 2}

## Processing Flow

1. **Input**: Video source → FrameStream (frame, metadata)
2. **Detection**: YOLO `track()` → Detection results with boxes, masks, track IDs
3. **Selection**: `select_person()` → Person with the largest bbox, or fallback
4. **Preprocessing**: Mask → Silhouette (64, 44) float32
5. **Windowing**: `SilhouetteWindow.push()` → Buffer management
6. **Classification**: When `should_classify()` returns True → `ScoNetDemo.predict()`
7. **Output**: `create_result()` → Publisher (Console or NATS)

## Error Handling

- Invalid source: Exit code 2, "Error: Video source not found"
- Invalid checkpoint: Exit code 2, "Error: Checkpoint not found"
- Runtime errors: Exit code 1, "Runtime error: ..."
- Frame processing errors: Logged as warning, frame skipped
- NATS unavailable: Graceful degradation (logs debug, continues)

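The exit-code convention above (2 for bad user input, 1 for runtime failures) can be sketched as a thin wrapper around the pipeline entry point. Everything here is illustrative: `UserError`, `run_with_exit_code`, and the two example jobs are hypothetical names, and returning 0 on success is an assumption the document does not state explicitly.

```python
import sys
from typing import Callable


class UserError(Exception):
    """Hypothetical exception for invalid user input (missing source or checkpoint)."""


def run_with_exit_code(job: Callable[[], None]) -> int:
    """Run a job and map its outcome to the exit codes listed above."""
    try:
        job()
    except UserError as exc:
        print(f"Error: {exc}", file=sys.stderr)
        return 2
    except Exception as exc:  # any other failure counts as a runtime error
        print(f"Runtime error: {exc}", file=sys.stderr)
        return 1
    return 0  # assumption: success exits with 0


def missing_source_job() -> None:
    raise UserError("Video source not found")


def crashing_job() -> None:
    raise RuntimeError("inference failed")


user_error_code = run_with_exit_code(missing_source_job)
runtime_error_code = run_with_exit_code(crashing_job)
success_code = run_with_exit_code(lambda: None)
```

A script would typically finish with `sys.exit(run_with_exit_code(main))` so that shells and test harnesses can tell user errors apart from crashes.
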