chore: cleanup

2026-02-27 17:47:55 +08:00
parent f501119d43
commit 846549498c
5 changed files with 219 additions and 463 deletions
# Demo Pipeline Schema and Contracts
## Overview
This document describes the input/output schema, flags/arguments, and positive detection indicators for the OpenGait demo pipeline (ScoliosisPipeline).
## Source Files
- **Pipeline**: `/home/crosstyan/Code/OpenGait/opengait/demo/pipeline.py`
- **Input adapters**: `/home/crosstyan/Code/OpenGait/opengait/demo/input.py`
- **Output publishers**: `/home/crosstyan/Code/OpenGait/opengait/demo/output.py`
- **Window management**: `/home/crosstyan/Code/OpenGait/opengait/demo/window.py`
- **Classifier**: `/home/crosstyan/Code/OpenGait/opengait/demo/sconet_demo.py`
## Input Schema
### Video Source (`--source`)
The `source` parameter accepts three formats (validated in `validate_runtime_inputs()`):
1. **Camera index**: a single-digit string (e.g., `"0"`, `"1"`), opened via OpenCV `VideoCapture`
2. **cv-mmap shared memory**: `cvmmap://<name>` (e.g., `cvmmap://default`), read via a shared-memory stream
3. **Video file path**: any other string, treated as a file path (e.g., `/path/to/video.mp4`)
**Source validation** (lines 251-264 in pipeline.py):
- Camera indices and cv-mmap URLs pass without file check
- File paths must exist (`Path.is_file()`)
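The routing between these three formats can be sketched as a small helper. This is a hedged illustration of the validation logic described above, not the actual `validate_runtime_inputs()`; `classify_source` is a hypothetical name:

```python
from pathlib import Path

def classify_source(source: str) -> str:
    """Route a --source value to one of the three accepted formats.

    Hypothetical helper mirroring the validation rules above; the real
    validate_runtime_inputs() in pipeline.py may differ in detail.
    """
    if source.isdigit():
        return "camera"          # OpenCV VideoCapture index, e.g. "0"
    if source.startswith("cvmmap://"):
        return "cvmmap"          # shared-memory stream, e.g. cvmmap://default
    if Path(source).is_file():
        return "file"            # existing video file path
    raise FileNotFoundError(f"Video source not found: {source}")
```

Note that camera indices and cv-mmap URLs return without touching the filesystem; only the file-path branch requires `Path.is_file()`.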
### FrameStream Contract (input.py)
```python
FrameStream = Iterable[tuple[np.ndarray, dict[str, object]]]
```
Each iteration yields:
- **frame**: `np.ndarray` - Raw frame array (H, W, C) in uint8
- **metadata**: `dict[str, object]` containing:
- `frame_count`: int - Frame index (0-based)
- `timestamp_ns`: int - Monotonic timestamp in nanoseconds
- `source`: str - The source path/identifier
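A minimal adapter satisfying this contract might look like the following. This is a sketch, not the actual `input.py` implementation; `frames_from_arrays` is a hypothetical name:

```python
import time
from typing import Iterable
import numpy as np

FrameStream = Iterable[tuple[np.ndarray, dict[str, object]]]

def frames_from_arrays(frames: list[np.ndarray], source: str) -> FrameStream:
    """Yield (frame, metadata) pairs following the FrameStream contract."""
    for frame_count, frame in enumerate(frames):
        metadata: dict[str, object] = {
            "frame_count": frame_count,            # 0-based frame index
            "timestamp_ns": time.monotonic_ns(),   # monotonic nanoseconds
            "source": source,                      # source path/identifier
        }
        yield frame, metadata
```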
## Windowing Parameters
### SilhouetteWindow Class (window.py)
Manages a sliding window of silhouettes for classification:
**Constructor parameters**:
- `window_size`: int (default: 30) - Maximum buffer size (number of frames)
- `stride`: int (default: 1) - Frames between classifications
- `gap_threshold`: int (default: 15) - Max frame gap before reset
**CLI flags**:
- `--window`: int, min=1, default=30 - Sets `window_size`
- `--stride`: int, min=1, default=30 - Sets classification stride
**Behavior**:
- Window is "ready" when buffer has `window_size` frames
- Classification triggers when `should_classify()` returns True (respects stride)
- Track ID change or frame gap > `gap_threshold` resets the buffer
- Silhouette shape must be `(64, 44)` float32
**Output tensor shape**: `[1, 1, window_size, 64, 44]` (batch, channel, seq, height, width)
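Stacking a full buffer into the classifier input can be sketched with NumPy. This is an assumed layout that matches the shape and dtype constraints stated above, not the actual `window.py` code:

```python
import numpy as np

def window_to_tensor(silhouettes: list[np.ndarray]) -> np.ndarray:
    """Stack (64, 44) float32 silhouettes into [1, 1, seq, 64, 44]."""
    for s in silhouettes:
        if s.shape != (64, 44) or s.dtype != np.float32:
            raise ValueError("each silhouette must be (64, 44) float32")
    seq = np.stack(silhouettes, axis=0)         # [seq, 64, 44]
    return seq[np.newaxis, np.newaxis, ...]     # [1, 1, seq, 64, 44]
```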
## Required Flags/Arguments
### CLI Arguments (pipeline.py lines 267-287)
| Flag | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| `--source` | str | **Yes** | - | Video source (file, camera index, or cvmmap://) |
| `--checkpoint` | str | **Yes** | - | Model checkpoint path (.pt file) |
| `--config` | str | No | `configs/sconet/sconet_scoliosis1k.yaml` | Model config YAML |
| `--device` | str | No | `cuda:0` | Device for inference |
| `--yolo-model` | str | No | `yolo11n-seg.pt` | YOLO segmentation model |
| `--window` | int | No | 30 | Window size (frames) |
| `--stride` | int | No | 30 | Classification stride |
| `--nats-url` | str | No | None | NATS server URL (e.g., `nats://localhost:4222`) |
| `--nats-subject` | str | No | `scoliosis.result` | NATS subject for publishing |
| `--max-frames` | int | No | None | Maximum frames to process |
### Validation
- File-path sources must exist (`Path.is_file()`); camera indices and cv-mmap URLs skip the file check
- Checkpoint file must exist
- Config file must exist
## Output Schema
### Result Format (output.py `create_result()`)
```python
{
    "frame": int,          # Frame number where classification occurred
    "track_id": int,       # Person/track identifier
    "label": str,          # Classification label
    "confidence": float,   # Confidence score [0.0, 1.0]
    "window": int,         # End frame of window (or window size)
    "timestamp_ns": int,   # Timestamp in nanoseconds
}
```
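Producing and serializing one record in this schema can be sketched as follows. The real `create_result()` signature may differ; `make_result` is a hypothetical stand-in, and `json.dumps` illustrates the ConsolePublisher's JSON Lines output:

```python
import json
import time

def make_result(frame: int, track_id: int, label: str,
                confidence: float, window: int) -> dict[str, object]:
    """Build a result dict in the schema above (hypothetical helper)."""
    return {
        "frame": frame,
        "track_id": track_id,
        "label": label,
        "confidence": confidence,
        "window": window,
        "timestamp_ns": time.monotonic_ns(),
    }

# ConsolePublisher-style JSON Lines output: one JSON object per line.
line = json.dumps(make_result(120, 3, "positive", 0.91, 120))
```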
### Publishers (output.py)
1. **ConsolePublisher**: Outputs JSON Lines to stdout
2. **NatsPublisher**: Publishes to NATS message broker (async, background thread)
### Label Values (sconet_demo.py line 60)
```python
LABEL_MAP = {0: "negative", 1: "neutral", 2: "positive"}
```
## Positive Detection Indicator
**Positive detection** is indicated when:
```python
result["label"] == "positive"
```
The `confidence` field indicates the model's confidence in the prediction (0.0 to 1.0).
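Combining `LABEL_MAP` with the result schema, a consumer-side check might look like this. It is a sketch: the optional confidence threshold is an added assumption for downstream consumers, not part of the pipeline itself:

```python
LABEL_MAP = {0: "negative", 1: "neutral", 2: "positive"}

def is_positive(result: dict[str, object], min_confidence: float = 0.0) -> bool:
    """True when the result reports a positive detection.

    min_confidence is a hypothetical consumer-side threshold; the
    pipeline itself publishes every classification unconditionally.
    """
    return (result["label"] == "positive"
            and float(result["confidence"]) >= min_confidence)
```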
### Test Validation (test_pipeline.py lines 89-106)
```python
def _assert_prediction_schema(prediction: dict[str, object]) -> None:
    assert isinstance(prediction["frame"], int)
    assert isinstance(prediction["track_id"], int)
    label = prediction["label"]
    assert isinstance(label, str)
    assert label in {"negative", "neutral", "positive"}  # Valid labels
    confidence = prediction["confidence"]
    assert isinstance(confidence, (int, float))
    assert 0.0 <= float(confidence) <= 1.0
    window_obj = prediction["window"]
    assert isinstance(window_obj, int)
    assert window_obj >= 0
    assert isinstance(prediction["timestamp_ns"], int)
```
## Test References
### Pipeline Tests (`/home/crosstyan/Code/OpenGait/tests/demo/test_pipeline.py`)
- `test_pipeline_cli_happy_path_outputs_json_predictions`: Validates full pipeline outputs JSON predictions
- `test_pipeline_cli_fps_benchmark_smoke`: FPS benchmark with predictions
- `test_pipeline_cli_max_frames_caps_output_frames`: Validates max-frames behavior
- `test_pipeline_cli_invalid_source_path_returns_user_error`: Source validation
- `test_pipeline_cli_invalid_checkpoint_path_returns_user_error`: Checkpoint validation
### Window Tests (`/home/crosstyan/Code/OpenGait/tests/demo/test_window.py`)
- `test_window_fill_and_ready_behavior`: Window readiness logic
- `test_track_id_change_resets_buffer`: Track change handling
- `test_frame_gap_reset_behavior`: Gap threshold behavior
- `test_get_tensor_shape`: Output tensor shape validation
- `test_should_classify_stride_behavior`: Stride logic
- `test_push_invalid_shape_raises`: Silhouette shape validation
### ScoNetDemo Tests (`/home/crosstyan/Code/OpenGait/tests/demo/test_sconet_demo.py`)
- `test_predict_returns_tuple_with_valid_types`: Predict output validation
- `test_predict_confidence_range`: Confidence range [0, 1]
- `test_label_map_has_three_classes`: Label map validation
- `test_forward_label_range`: Label indices {0, 1, 2}
## Processing Flow
1. **Input**: Video source → FrameStream (frame, metadata)
2. **Detection**: YOLO track() → Detection results with boxes, masks, track IDs
3. **Selection**: `select_person()` → Largest bbox person or fallback
4. **Preprocessing**: Mask → Silhouette (64, 44) float32
5. **Windowing**: `SilhouetteWindow.push()` → Buffer management
6. **Classification**: When `should_classify()` True → ScoNetDemo.predict()
7. **Output**: `create_result()` → Publisher (Console or NATS)
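The seven steps above can be sketched as a single loop. This shows only the control flow described in this document; every callable is an injected hypothetical stand-in for the real YOLO/ScoNet components, and the method names on `window` and `model` are assumptions:

```python
def run_pipeline(stream, detect, select_person, to_silhouette,
                 window, model, publish):
    """Control-flow sketch of the processing flow (steps 1-7)."""
    for frame, meta in stream:                          # 1. input
        detections = detect(frame)                      # 2. detection
        person = select_person(detections)              # 3. selection
        if person is None:
            continue                                    # no person this frame
        sil = to_silhouette(person)                     # 4. preprocessing
        window.push(person["track_id"],                 # 5. windowing
                    meta["frame_count"], sil)
        if window.should_classify():                    # 6. classification
            label, confidence = model.predict(window.tensor())
            publish({"frame": meta["frame_count"],      # 7. output
                     "track_id": person["track_id"],
                     "label": label,
                     "confidence": confidence})
```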
## Error Handling
- Invalid source: Exit code 2, "Error: Video source not found"
- Invalid checkpoint: Exit code 2, "Error: Checkpoint not found"
- Runtime errors: Exit code 1, "Runtime error: ..."
- Frame processing errors: Logged as warning, frame skipped
- NATS unavailable: Graceful degradation (logs debug, continues)
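The exit-code conventions above can be sketched as a CLI wrapper. This is a hypothetical structure, not the actual `pipeline.py` entry point; only the exit codes and message prefixes come from the table above:

```python
import sys

def main(run) -> int:
    """Map the error classes above to exit codes (sketch)."""
    try:
        run()
    except FileNotFoundError as exc:
        print(f"Error: {exc}", file=sys.stderr)
        return 2          # user error: bad source/checkpoint/config
    except Exception as exc:
        print(f"Runtime error: {exc}", file=sys.stderr)
        return 1          # unexpected runtime failure
    return 0              # clean exit
```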