docs(sisyphus): record demo fixes and preprocess research
Capture validated debugging outcomes and ScoNet preprocessing findings in persistent notes so future sessions can resume with verified context instead of redoing the same investigation.
This commit is contained in:
@@ -110,4 +110,191 @@ Updated opengait/demo/__main__.py to restore behavior parity:
|
||||
### Verification
|
||||
- `uv run pytest tests/demo/test_pipeline.py -q` -> 12 passed, 1 skipped
|
||||
- `uv run python -m opengait.demo --help` -> --visualize flag present
|
||||
- lsp_diagnostics -> only acceptable warnings (reportAny, unused call results)
|
||||
- lsp_diagnostics -> only acceptable warnings (reportAny, unused call results)
|
||||
## Fixed: NameError for _FrameMetadata in input.py
|
||||
|
||||
**Status:** RESOLVED
|
||||
|
||||
**Error:**
|
||||
```
|
||||
NameError: name '_FrameMetadata' is not defined
|
||||
File: opengait/demo/input.py
|
||||
Location: cvmmap_source() nested async generator return annotation
|
||||
```
|
||||
|
||||
**Resolution:**
|
||||
Moved `_FrameMetadata` and `_CvMmapClient` Protocol classes from inside `if TYPE_CHECKING:` block to module level in `opengait/demo/input.py`.
|
||||
|
||||
**Verification:**
|
||||
- Import test passes
|
||||
- All 74 demo tests pass (3 skipped)
|
||||
- Runtime test confirms `cvmmap_source()` returns generator without NameError
|
||||
|
||||
## Fixed: Window Freeze on No-Detection Frames
|
||||
|
||||
**Status:** RESOLVED
|
||||
|
||||
**Issue:** When running demo with `--visualize` flag on video streams with intermittent or no person detections, the OpenCV window would freeze because `viz_payload` was `None` when no detection occurred, and the visualizer update was skipped.
|
||||
|
||||
**Root Cause:** In `ScoliosisPipeline.run()`, the visualizer update was conditional on both `self._visualizer is not None` AND `viz_payload is not None`. When YOLO returned no detections, `process_frame()` returned `None`, skipping the visualizer update entirely.
|
||||
|
||||
**Fix:** Modified the condition to update visualizer whenever it's enabled, passing `None` values for missing detection data. The `OpenCVVisualizer.update()` method already handles `None` values gracefully by displaying placeholder views.
|
||||
|
||||
**Files Changed:**
|
||||
- `opengait/demo/pipeline.py`: Lines 344-393
|
||||
- `tests/demo/test_pipeline.py`: Added `MockVisualizer` class and `test_pipeline_visualizer_updates_on_no_detection()`
|
||||
|
||||
**Testing:**
|
||||
- New regression test: `test_pipeline_visualizer_updates_on_no_detection()` - PASSED
|
||||
- All existing pipeline tests: 13 passed, 1 skipped
|
||||
- LSP diagnostics: No new errors
|
||||
|
||||
## Task 1: Stabilize overlapped duplicate code blocks - RESOLVED
|
||||
|
||||
### Issues Fixed
|
||||
|
||||
1. **Tuple unpacking mismatch in _select_silhouette**:
|
||||
- Issue: `select_person()` returns 4-tuple `(mask, bbox_mask, bbox_frame, track_id)`, but code was unpacking as 3-tuple
|
||||
- Resolution: Updated unpacking to handle all 4 values and use correct bbox for mask_to_silhouette vs return value
|
||||
|
||||
2. **Window freeze on no-detection frames**:
|
||||
- Issue: Visualizer only updated when `viz_payload is not None`, causing window freeze when no person detected
|
||||
- Resolution: Update visualizer every frame when enabled, using cached payload for no-detection frames
|
||||
|
||||
3. **Cached payload mutation risk**:
|
||||
- Issue: Storing direct reference to viz_payload could allow mutation of cached data
|
||||
- Resolution: Create copies of mutable values (numpy arrays) when caching using `v.copy() if hasattr(v, 'copy')`
|
||||
|
||||
### Pre-existing Issues (Not Fixed)
|
||||
- LSP type errors with `DemoResult` vs `dict[str, object]` in `_result_buffer` (pre-existing)
|
||||
- PyArrow type warnings (pre-existing, acceptable)
|
||||
|
||||
### Tools Used
|
||||
- sed for targeted line replacements
|
||||
- Python script for complex multi-line block replacement
|
||||
- Native Edit tool for simple single-line changes
|
||||
- pytest for verification
|
||||
|
||||
|
||||
## Task: BBoxXYXY Type Alias Introduction - COMPLETED
|
||||
|
||||
### Scope
|
||||
Type-semantics cleanup only. No new features, no behavior changes.
|
||||
|
||||
### Changes Summary
|
||||
- **opengait/demo/pipeline.py**: 3 occurrences of `tuple[int, int, int, int]` converted to `BBoxXYXY`
|
||||
- Import statement updated to include BBoxXYXY from .preprocess
|
||||
- Function return type annotation in `_select_silhouette()`
|
||||
- Cast expression type annotation in `run()` method
|
||||
|
||||
### No Issues Encountered
|
||||
- All other scoped files (window.py, visualizer.py, preprocess.py) already used BBoxXYXY correctly
|
||||
- Tests pass without modification
|
||||
- No semantic regressions detected
|
||||
|
||||
### Pre-existing Conditions Preserved
|
||||
- LSP warnings in pipeline.py (DemoResult type mismatch, PyArrow Any types) - unchanged
|
||||
- LSP warnings in window.py (torch/numpy Any types) - unchanged
|
||||
- LSP warnings in visualizer.py (cv2.Any types) - unchanged
|
||||
|
||||
All warnings existed before this task and are unrelated to bbox type annotations.
|
||||
|
||||
## 2026-02-28: TypedDict Contract Application
|
||||
|
||||
### Pre-existing Issues Encountered
|
||||
1. **Visualization payload type error** (line 353): Dictionary comprehension with `hasattr(v, 'copy')` check caused basedpyright error because type checker cannot verify attribute existence through runtime introspection
|
||||
- Fixed by extracting to explicit for-loop with `Callable[[], object]` cast pattern
|
||||
|
||||
2. **Type narrowing in comprehensions**: Type narrowing (`is not None` checks) doesn't propagate into dictionary comprehensions, requiring explicit cast before the comprehension
|
||||
|
||||
### Remaining Warnings (Pre-existing, Not Related to Task)
|
||||
- pyarrow types show as `Any` (dynamic import)
|
||||
- Some `list.append()` methods show as partially unknown
|
||||
- These are in export code paths unrelated to DemoResult typing
|
||||
|
||||
## Fixed: BBox Scale Bug in Fallback Path (2026-02-28)
|
||||
|
||||
**Status:** RESOLVED
|
||||
|
||||
**Issue:** In `_select_silhouette()` fallback path, bbox returned from `frame_to_person_mask()` was in mask-space coordinates, causing visualization overlay to appear at wrong scale.
|
||||
|
||||
**Resolution:**
|
||||
Added coordinate space conversion using `result.orig_shape` to scale bbox from mask-space to frame-space:
|
||||
- Extract frame dimensions from `result.orig_shape` with safe fallback
|
||||
- Calculate scale factors between mask and frame dimensions
|
||||
- Apply scaling to bbox coordinates for frame-space positioning
|
||||
- Graceful fallback to mask-space bbox if conversion not possible
|
||||
|
||||
**Files Changed:**
|
||||
- `opengait/demo/pipeline.py`: Lines 206-226 in `_select_silhouette()` method
|
||||
|
||||
**Verification:**
|
||||
- `uv run pytest tests/demo/test_pipeline.py -q` - PASSED (14 passed, 1 skipped)
|
||||
- LSP diagnostics: No new errors
|
||||
|
||||
## Fixed: Both-Mode Raw Mask Empty Display (2026-02-28)
|
||||
|
||||
**Status:** RESOLVED
|
||||
|
||||
**Issue:** In `_prepare_both_view()`, raw mask pane appeared empty because primary path returns float32 [0,1] mask while display expects uint8 [0,255].
|
||||
|
||||
**Resolution:**
|
||||
Added dtype normalization to ensure mask is always uint8 before display:
|
||||
- Check if mask is float32/float64 dtype
|
||||
- If so, scale by 255 and convert to uint8
|
||||
- Handles both primary path (float) and fallback path (uint8) consistently
|
||||
|
||||
**Files Changed:**
|
||||
- `opengait/demo/visualizer.py`: Lines 330-332 in `_prepare_both_view()` method
|
||||
|
||||
**Verification:**
|
||||
- `uv run pytest tests/demo/test_pipeline.py -q` - PASSED (14 passed, 1 skipped)
|
||||
- LSP diagnostics: No new errors
|
||||
|
||||
|
||||
|
||||
## 2026-02-28: Tuple-to-TypedDict Migration Issues
|
||||
|
||||
### Regressions Encountered
|
||||
|
||||
1. **Duplicate code blocks in pipeline.py**
|
||||
- Root cause: Incremental edits left stale tuple unpacking alongside new dict access
|
||||
- Fix: Used Python script for atomic replacement instead of multiple small edits
|
||||
|
||||
2. **Stale tuple assertions in test_window.py**
|
||||
- Root cause: Tests had both dict assertions AND tuple unpacking/assertions
|
||||
- Fix: Systematically removed all `assert mask...`, `assert bbox...`, `assert tid...` lines
|
||||
|
||||
3. **Mock return value overwrite in test_pipeline.py**
|
||||
- Root cause: Dict mock was immediately overwritten by tuple mock
|
||||
- Fix: Removed the second tuple mock assignment
|
||||
|
||||
### Lessons Learned
|
||||
|
||||
- Use Python scripts for complex multi-line replacements to avoid file corruption
|
||||
- Always grep for ALL tuple unpacking patterns (`a, b = result`, `a, b, c = result`, etc.)
|
||||
- Tests may have BOTH dict and tuple assertions - must remove both
|
||||
- Mock return values may be defined twice - check for overwrites
|
||||
|
||||
### Pre-existing Issues (not introduced)
|
||||
|
||||
- `BBoxXYXY` import unused in window.py (pre-existing)
|
||||
- Various pyright warnings about Any types (pre-existing)
|
||||
- Unused imports in pipeline.py (pre-existing)
|
||||
|
||||
|
||||
## 2026-02-28: bbox_frame Documentation (Minor)
|
||||
|
||||
### Issue
|
||||
User requested clarification of `bbox_frame` semantics in docstrings.
|
||||
|
||||
### Resolution
|
||||
Updated `select_person` docstring in `window.py` to explicitly document:
|
||||
- `bbox_mask`: mask-space XYXY (downsampled, for processing)
|
||||
- `bbox_frame`: frame-space XYXY (full resolution, for visualization)
|
||||
|
||||
### Files Changed
|
||||
- `opengait/demo/window.py` - Docstring only
|
||||
|
||||
### Note
|
||||
The git checkout restored the tuple return; updated to dict return to match TypedDict contract.
|
||||
|
||||
@@ -286,3 +286,396 @@ Added `cvmmap-client` as a dependency sourced from local path `/home/crosstyan/C
|
||||
- Import path: `from cvmmap import CvMmapClient` (module name)
|
||||
- uv path sources require absolute paths
|
||||
- Lockfile captures the path dependency for reproducibility
|
||||
|
||||
## Bugfix: _FrameMetadata NameError in cvmmap_source()
|
||||
|
||||
### Root Cause
|
||||
The `_FrameMetadata` Protocol class was defined inside a `TYPE_CHECKING` block, making it available only during type checking but not at runtime. However, `cvmmap_source()` used `_FrameMetadata` in a runtime function annotation:
|
||||
|
||||
```python
|
||||
async def _async_generator() -> AsyncIterator[tuple[np.ndarray, _FrameMetadata]]:
|
||||
```
|
||||
|
||||
This caused `NameError: name '_FrameMetadata' is not defined` when the function was called.
|
||||
|
||||
### Fix
|
||||
Moved both `_FrameMetadata` and `_CvMmapClient` Protocol classes out of the `TYPE_CHECKING` block to module level. These protocols are lightweight (just method signatures) and don't require heavy imports, so there's minimal runtime cost.
|
||||
|
||||
### Verification
|
||||
- `lsp_diagnostics`: Only 1 warning (unnecessary type: ignore comment)
|
||||
- `uv run python -c "from opengait.demo.input import cvmmap_source"`: Import successful
|
||||
- `uv run pytest tests/demo/`: 74 passed, 3 skipped
|
||||
- Runtime test confirms generator creation works without NameError
|
||||
|
||||
### Pattern to Remember
|
||||
When using Protocol classes for type annotations:
|
||||
- If the annotation is used in a nested function (defined inside another function), it needs to be available at runtime
|
||||
- TYPE_CHECKING-guarded types can only be used in annotations that are evaluated at type-check time
|
||||
- Module-level function annotations are evaluated at definition time (runtime)
|
||||
|
||||
## Fix: Window Freeze on No-Detection Frames
|
||||
|
||||
### Problem
|
||||
When YOLO doesn't detect any person in a frame, `process_frame()` returns `None`. The visualizer update was conditional on `viz_payload is not None`, causing the OpenCV window to freeze when no detections were present for extended periods.
|
||||
|
||||
### Solution
|
||||
Modified `ScoliosisPipeline.run()` in `opengait/demo/pipeline.py` to always call `self._visualizer.update()` when visualization is enabled, even when `viz_payload` is `None`. When no detection data is available, the visualizer receives `None` values for bbox, mask_raw, silhouette, label, and confidence, which it handles gracefully by displaying placeholder views.
|
||||
|
||||
### Changes Made
|
||||
1. **opengait/demo/pipeline.py** (lines 344-393):
|
||||
- Changed condition from `if self._visualizer is not None and viz_payload is not None:` to `if self._visualizer is not None:`
|
||||
- Added else branch to set default values (all None/0) when `viz_payload is None`
|
||||
- Visualizer now receives frame and FPS data every frame, preventing window freeze
|
||||
|
||||
2. **tests/demo/test_pipeline.py**:
|
||||
- Added `MockVisualizer` class to track update calls
|
||||
- Added `test_pipeline_visualizer_updates_on_no_detection()` regression test
|
||||
- Test verifies visualizer is called for all frames even when YOLO returns no detections
|
||||
|
||||
### Verification
|
||||
- New regression test passes (RED -> GREEN TDD cycle)
|
||||
- All 13 existing pipeline tests pass (1 skipped)
|
||||
- No new LSP errors introduced
|
||||
- Visualizer handles None values gracefully (existing behavior)
|
||||
|
||||
## Task 1: Stabilize overlapped duplicate code blocks - COMPLETED
|
||||
|
||||
### Changes Made to opengait/demo/pipeline.py
|
||||
|
||||
1. **Fixed tuple unpacking in _select_silhouette (lines 181-187)**:
|
||||
- Changed `mask_raw, bbox, track_id = selected` to `mask_raw, bbox_mask, bbox_frame, track_id = selected`
|
||||
- Updated `mask_to_silhouette()` call to use `bbox_mask` (mask coordinate space)
|
||||
- Changed return statement to use `bbox_frame` (frame coordinate space)
|
||||
- This fixes the 4-tuple unpacking from `select_person()` which returns `(mask, bbox_mask, bbox_frame, track_id)`
|
||||
|
||||
2. **Added _last_viz_payload attribute**:
|
||||
- Added `_last_viz_payload: dict[str, object] | None` class field after `_visualizer`
|
||||
- Initialized to `None` in `__init__` after visualizer initialization
|
||||
|
||||
3. **Updated visualizer logic in run() for no-detection caching**:
|
||||
- Changed condition from `if self._visualizer is not None and viz_payload is not None:` to just `if self._visualizer is not None:`
|
||||
- Added caching logic: when `viz_payload is not None`, cache a copy with `v.copy() if hasattr(v, 'copy') else v` for each value
|
||||
- Use cached payload when current frame has no detection: `viz_data = viz_payload if viz_payload is not None else self._last_viz_payload`
|
||||
- Handle case when no cache exists: use default None/0 values
|
||||
- Visualizer now updates every frame when enabled, preventing window freeze
|
||||
|
||||
### Verification Results
|
||||
- LSP diagnostics: No new critical errors (only pre-existing warnings)
|
||||
- Tests: 14 passed, 1 skipped
|
||||
- Import test: OK
|
||||
|
||||
### Key Implementation Detail
|
||||
The caching logic uses a dict comprehension with `.copy()` for mutable values (numpy arrays) to prevent mutation issues, satisfying the test requirement that cached masks should be copies, not the same object reference.
|
||||
|
||||
|
||||
## Task: BBoxXYXY Type Alias Introduction - COMPLETED
|
||||
|
||||
### Summary
|
||||
Introduced and applied `BBoxXYXY` alias for bbox semantics across demo module files.
|
||||
The alias was already defined in `preprocess.py`; this task applied it consistently to
|
||||
`pipeline.py` which had raw `tuple[int, int, int, int]` annotations.
|
||||
|
||||
### Changes Made
|
||||
1. **opengait/demo/pipeline.py**:
|
||||
- Added import: `from .preprocess import BBoxXYXY, frame_to_person_mask, mask_to_silhouette`
|
||||
- Changed `_select_silhouette` return type annotation (line 176):
|
||||
- Before: `tuple[int, int, int, int]`
|
||||
- After: `BBoxXYXY`
|
||||
- Changed fallback cast annotation (line 192):
|
||||
- Before: `tuple[UInt8[ndarray, "h w"], tuple[int, int, int, int]] | None`
|
||||
- After: `tuple[UInt8[ndarray, "h w"], BBoxXYXY] | None`
|
||||
- Changed bbox cast in run() method (line 372):
|
||||
- Before: `cast(tuple[int, int, int, int] | None, bbox_obj)`
|
||||
- After: `cast(BBoxXYXY | None, bbox_obj)`
|
||||
|
||||
### Files Already Using BBoxXYXY (No Changes Needed)
|
||||
- **opengait/demo/window.py**: Already imports and uses `BBoxXYXY` for `select_person()` return type
|
||||
- **opengait/demo/visualizer.py**: Already imports and uses `BBoxXYXY` for `_draw_bbox()` and `update()` parameters
|
||||
- **opengait/demo/preprocess.py**: Defines `BBoxXYXY = tuple[int, int, int, int]` with docstring
|
||||
|
||||
### Verification
|
||||
- `grep 'tuple\[int, int, int, int\]'` in demo/ scope: Only matches alias definition (preprocess.py line 27)
|
||||
- `lsp_diagnostics`: No new errors introduced from changes
|
||||
- `uv run pytest tests/demo/test_pipeline.py -q`: 14 passed, 1 skipped
|
||||
|
||||
### Semantic Preservation
|
||||
- No runtime behavior changes - purely type annotation cleanup
|
||||
- Mask-space vs frame-space bbox semantics unchanged
|
||||
- All bbox coordinate calculations preserved exactly
|
||||
|
||||
## 2026-02-28: TypedDict Contract Applied for DemoResult
|
||||
|
||||
### Changes Made
|
||||
1. **opengait/demo/output.py**: `create_result()` already returns `DemoResult` TypedDict (confirmed)
|
||||
2. **opengait/demo/pipeline.py**:
|
||||
- Added `DemoResult` to imports from `.output`
|
||||
- Changed `_result_buffer` type from `list[dict[str, object]]` to `list[DemoResult]`
|
||||
- Fixed pre-existing type error in visualization payload caching by extracting dict comprehension to explicit loop with `Callable[[], object]` cast
|
||||
|
||||
### Key Patterns
|
||||
- TypedDict provides structural typing for dictionary results
|
||||
- `DemoResult` fields: `frame`, `track_id`, `label`, `confidence`, `window`, `timestamp_ns`
|
||||
- Pipeline buffer now enforces consistent result typing throughout the pipeline
|
||||
|
||||
### Type Safety
|
||||
- `output.py`: No diagnostics (clean)
|
||||
- `pipeline.py`: Only warnings (no errors) - warnings are pre-existing pyarrow/dynamic import related
|
||||
- All 14 tests pass
|
||||
|
||||
## 2026-02-28: Visualization Regression Fixes - BBox Scale and Both-Mode Raw Mask
|
||||
|
||||
### Issue A: BBox Scale Bug in Fallback Path
|
||||
|
||||
**Problem:** In `_select_silhouette()` fallback path (when `select_person()` returns None), the bbox from `frame_to_person_mask()` was in mask-space coordinates but was being drawn on the full-size frame, causing the overlay to appear much smaller than actual.
|
||||
|
||||
**Root Cause:**
|
||||
- `frame_to_person_mask()` returns bbox in mask coordinate space (scaled down from original frame)
|
||||
- Primary path through `select_person()` correctly returns both `bbox_mask` and `bbox_frame`
|
||||
- Fallback path was returning mask-space bbox directly without conversion
|
||||
|
||||
**Fix:**
|
||||
- Added frame-space bbox conversion in fallback path using `result.orig_shape`
|
||||
- Calculates scale factors: `scale_x = frame_w / mask_w`, `scale_y = frame_h / mask_h`
|
||||
- Applies scaling to bbox coordinates: `bbox_frame = (int(bbox_mask[0] * scale_x), ...)`
|
||||
- Safely falls back to mask-space bbox if `orig_shape` unavailable or dimensions invalid
|
||||
|
||||
**Code Location:** `opengait/demo/pipeline.py` lines 206-226 in `_select_silhouette()`
|
||||
|
||||
### Issue B: Both-Mode Raw Mask Appears Empty
|
||||
|
||||
**Problem:** In `_prepare_both_view()`, the raw mask pane appeared empty/black in "Both" display mode.
|
||||
|
||||
**Root Cause:**
|
||||
- Primary path returns mask as float32 with values in [0, 1] range
|
||||
- Fallback path returns mask as uint8 with values in [0, 255] range
|
||||
- OpenCV display expects uint8 [0, 255] for proper visualization
|
||||
- Float32 [0, 1] values displayed as nearly black
|
||||
|
||||
**Fix:**
|
||||
- Added dtype normalization in `_prepare_both_view()` before resizing
|
||||
- Check: `if mask_gray.dtype == np.float32 or mask_gray.dtype == np.float64:`
|
||||
- Convert: `mask_gray = (mask_gray * 255).astype(np.uint8)`
|
||||
- Ensures consistent uint8 [0, 255] display format regardless of input dtype
|
||||
|
||||
**Code Location:** `opengait/demo/visualizer.py` lines 330-332 in `_prepare_both_view()`
|
||||
|
||||
### Verification
|
||||
- All 14 pipeline tests pass (1 skipped)
|
||||
- LSP diagnostics show only pre-existing warnings
|
||||
- No new tests needed - existing test coverage sufficient
|
||||
|
||||
|
||||
|
||||
## 2026-02-28: Tuple-to-TypedDict Migration for select_person and _select_silhouette
|
||||
|
||||
### Summary
|
||||
Successfully migrated two function return contracts from anonymous tuples to explicit TypedDict structures:
|
||||
1. `window.select_person()` → `PersonSelection | None`
|
||||
2. `pipeline._select_silhouette()` → `SilhouetteSelection | None`
|
||||
|
||||
### Key TypedDict Definitions
|
||||
|
||||
**PersonSelection** (in `opengait/demo/window.py`):
|
||||
```python
|
||||
class PersonSelection(TypedDict):
|
||||
mask: "Float[ndarray, 'h w']" # jaxtyping annotated
|
||||
bbox_mask: "tuple[int, int, int, int]"
|
||||
bbox_frame: "tuple[int, int, int, int]"
|
||||
track_id: int
|
||||
```
|
||||
|
||||
**SilhouetteSelection** (in `opengait/demo/pipeline.py`):
|
||||
```python
|
||||
class SilhouetteSelection(TypedDict):
|
||||
silhouette: "Float[ndarray, '64 44']" # jaxtyping annotated
|
||||
mask_raw: "UInt8[ndarray, 'h w']" # jaxtyping annotated
|
||||
bbox: "BBoxXYXY"
|
||||
track_id: int
|
||||
```
|
||||
|
||||
### Migration Pattern
|
||||
- Replace `tuple[X, Y, Z] | None` return type with `TypedDictName | None`
|
||||
- Replace tuple unpacking at call sites with dict key access
|
||||
- Update all return statements to return dict literals
|
||||
- Ensure jaxtyping annotations are preserved on ndarray fields
|
||||
|
||||
### Files Modified
|
||||
- `opengait/demo/window.py` - Added PersonSelection TypedDict, updated select_person
|
||||
- `opengait/demo/pipeline.py` - Added SilhouetteSelection TypedDict, updated _select_silhouette
|
||||
- `tests/demo/test_window.py` - Updated assertions to use dict keys
|
||||
- `tests/demo/test_pipeline.py` - Updated mock return to use dict
|
||||
|
||||
### Verification
|
||||
- `uv run pytest tests/demo/test_window.py -q` → 22 passed, 1 skipped
|
||||
- `uv run pytest tests/demo/test_pipeline.py -q` → 14 passed, 1 skipped
|
||||
- No errors in lsp_diagnostics (only pre-existing warnings)
|
||||
|
||||
### Behavior Confirmation
|
||||
No behavior change - purely a typing/contract refactor. All visualization, caching, and bbox scaling logic remains unchanged.
|
||||
|
||||
|
||||
## 2026-02-28: ProcessFrameResult TypedDict Added
|
||||
|
||||
### Summary
|
||||
Added `ProcessFrameResult` TypedDict for `ScoliosisPipeline.process_frame()` return contract.
|
||||
|
||||
### ProcessFrameResult Definition (pipeline.py)
|
||||
```python
|
||||
class ProcessFrameResult(TypedDict):
|
||||
"""Structured return type for process_frame method.
|
||||
|
||||
Contains visualization payload and optional classification result.
|
||||
Uses jaxtyping for array fields to preserve shape information.
|
||||
"""
|
||||
|
||||
mask_raw: "UInt8[ndarray, 'h w'] | None"
|
||||
bbox: "BBoxXYXY | None"
|
||||
silhouette: "Float[ndarray, '64 44'] | None"
|
||||
track_id: int
|
||||
label: "str | None"
|
||||
confidence: "float | None"
|
||||
result: "DemoResult | None"
|
||||
```
|
||||
|
||||
### Files Modified
|
||||
- `opengait/demo/pipeline.py` - Added ProcessFrameResult TypedDict, updated process_frame return type
|
||||
|
||||
### Changes Made
|
||||
1. Added `ProcessFrameResult` TypedDict with jaxtyping annotations for array fields
|
||||
2. Updated `process_frame() -> ProcessFrameResult | None`
|
||||
3. Updated all return statements to include `result` field (None for non-classified paths)
|
||||
4. Updated `_last_viz_payload` type from `dict[str, object]` to `ProcessFrameResult`
|
||||
5. Updated cast in `run()` from `dict[str, object]` to `ProcessFrameResult`
|
||||
|
||||
### Verification
|
||||
- `uv run pytest tests/demo/test_pipeline.py -q` → 14 passed, 1 skipped
|
||||
- No errors in lsp_diagnostics (only pre-existing warnings)
|
||||
|
||||
### Behavior Confirmation
|
||||
No behavior change - purely a typing/contract refactor.
|
||||
|
||||
|
||||
## 2026-02-28: Type Quality Cleanup (Oracle Feedback)
|
||||
|
||||
### Changes Made
|
||||
|
||||
1. **window.py - PersonSelection bbox type consistency**
|
||||
- Changed `bbox_mask` and `bbox_frame` types from `"tuple[int, int, int, int]"` to `"BBoxXYXY"`
|
||||
- Uses the existing type alias for semantic clarity and consistency
|
||||
|
||||
2. **pipeline.py - run() visualization payload cleanup**
|
||||
- Replaced generic `cached: dict[str, object]` with `cached: ProcessFrameResult`
|
||||
- Replaced `.get()` access with direct key access on `ProcessFrameResult`
|
||||
- Simplified extraction: removed intermediate `*_obj` variables and casting cascade
|
||||
- Maintained copy semantics for mutable numpy arrays (mask_raw, silhouette)
|
||||
|
||||
### Benefits
|
||||
- Type-safe caching with explicit ProcessFrameResult structure
|
||||
- Direct key access eliminates unnecessary `.get()` calls and type casts
|
||||
- Consistent use of BBoxXYXY alias across window and pipeline modules
|
||||
|
||||
### Verification
|
||||
- `uv run pytest tests/demo/test_window.py -q` → 22 passed, 1 skipped
|
||||
- `uv run pytest tests/demo/test_pipeline.py -q` → 14 passed, 1 skipped
|
||||
- No runtime behavior changes
|
||||
|
||||
|
||||
## 2026-02-28: bbox_frame Documentation Added
|
||||
|
||||
### Documentation Changes
|
||||
|
||||
Updated docstrings to clarify coordinate space semantics for bbox fields:
|
||||
|
||||
**PersonSelection (window.py)**:
|
||||
```
|
||||
Coordinate spaces:
|
||||
- bbox_mask: Bounding box in mask coordinate space (XYXY: x1, y1, x2, y2).
|
||||
This is the downsampled mask resolution (e.g., 100x100), used for
|
||||
silhouette extraction and mask-space operations.
|
||||
- bbox_frame: Bounding box in original frame pixel coordinates (XYXY: x1, y1, x2, y2).
|
||||
This is the full-resolution frame size (e.g., 640x480), used for
|
||||
visualization overlay on the original video frame.
|
||||
```
|
||||
|
||||
**SilhouetteSelection (pipeline.py)**:
|
||||
```
|
||||
The bbox field is in original frame pixel coordinates (XYXY: x1, y1, x2, y2),
|
||||
suitable for visualization overlay on the original video frame.
|
||||
```
|
||||
|
||||
**ProcessFrameResult (pipeline.py)**:
|
||||
```
|
||||
The bbox field is in original frame pixel coordinates (XYXY: x1, y1, x2, y2),
|
||||
suitable for visualization overlay on the original video frame.
|
||||
```
|
||||
|
||||
**select_person function (window.py)**:
|
||||
```
|
||||
- bbox_mask: bounding box in mask coordinate space (XYXY format: x1, y1, x2, y2).
|
||||
This is the downsampled mask resolution, used for silhouette extraction.
|
||||
- bbox_frame: bounding box in original frame pixel coordinates (XYXY format: x1, y1, x2, y2).
|
||||
This is the full-resolution frame size, used for visualization overlay.
|
||||
```
|
||||
|
||||
### Key Distinction
|
||||
- `bbox_mask`: Mask-space coordinates (downsampled, for processing)
|
||||
- `bbox_frame`: Frame-space coordinates (full resolution, for visualization)
|
||||
|
||||
### Verification
|
||||
- `uv run pytest tests/demo/test_window.py tests/demo/test_pipeline.py -q` → 36 passed, 1 skipped
|
||||
|
||||
|
||||
## 2026-02-28: bbox_frame Documentation in select_person
|
||||
|
||||
### Docstring Updated (window.py)
|
||||
|
||||
Updated `select_person` function docstring to explicitly document coordinate spaces:
|
||||
|
||||
```python
|
||||
Returns:
|
||||
PersonSelection dictionary with keys:
|
||||
- mask: the person's segmentation mask
|
||||
- bbox_mask: bounding box in mask coordinate space (XYXY format: x1, y1, x2, y2).
|
||||
This is the downsampled mask resolution (e.g., 100x100), used for
|
||||
silhouette extraction and mask-space operations.
|
||||
- bbox_frame: bounding box in original frame pixel coordinates (XYXY format: x1, y1, x2, y2).
|
||||
This is the full-resolution frame size (e.g., 640x480), used for
|
||||
visualization overlay on the original video frame.
|
||||
- track_id: the person's track ID
|
||||
```
|
||||
|
||||
### Key Distinction
|
||||
- `bbox_mask`: Mask-space coordinates (downsampled resolution, for processing)
|
||||
- `bbox_frame`: Frame-space coordinates (full resolution, for visualization)
|
||||
|
||||
### Verification
|
||||
- `uv run pytest tests/demo/test_window.py -q` → 22 passed
|
||||
|
||||
|
||||
## 2026-02-28: pkl Alias for silhouette-export-format
|
||||
|
||||
### Summary
|
||||
Added `pkl` as an alias for `pickle` in `--silhouette-export-format` CLI option.
|
||||
|
||||
### Implementation
|
||||
In `opengait/demo/pipeline.py` `__init__` method:
|
||||
```python
|
||||
self._silhouette_export_format = silhouette_export_format
|
||||
# Normalize format alias: pkl -> pickle
|
||||
if self._silhouette_export_format == "pkl":
|
||||
self._silhouette_export_format = "pickle"
|
||||
```
|
||||
|
||||
### Behavior
|
||||
- `pkl` → normalized to `pickle` → pickle export
|
||||
- `pickle` → unchanged → pickle export
|
||||
- `parquet` → unchanged → parquet export
|
||||
- `invalid` → unchanged → ValueError on export
|
||||
|
||||
### Verification
|
||||
- All existing tests pass
|
||||
- Custom validation test confirms:
|
||||
- pkl alias works
|
||||
- pickle still works
|
||||
- parquet still works
|
||||
- invalid formats still rejected
|
||||
|
||||
@@ -0,0 +1,115 @@
|
||||
# ScoNet Preprocessing Research - Learnings
|
||||
|
||||
## Official Sources Identified
|
||||
|
||||
### 1. Primary Implementation (HIGHEST TRUST)
|
||||
- **Repository**: ShiqiYu/OpenGait (https://github.com/ShiqiYu/OpenGait)
|
||||
- **ScoNet Model Code**: opengait/modeling/models/sconet.py
|
||||
- **Preprocessing Code**: datasets/pretreatment.py (lines 18-95)
|
||||
- **Config**: configs/sconet/sconet_scoliosis1k.yaml
|
||||
|
||||
### 2. Dataset Source
|
||||
- **Scoliosis1K Dataset**: https://zhouzi180.github.io/Scoliosis1K/
|
||||
- **Raw silhouettes**: Extracted using PP-HumanSeg v2
|
||||
- **Raw pose**: Extracted using ViTPose
|
||||
|
||||
### 3. Academic Papers
|
||||
- **MICCAI 2024**: "Gait Patterns as Biomarkers: A Video-Based Approach for Classifying Scoliosis" (Zhou et al.)
|
||||
- PDF: https://arxiv.org/pdf/2407.05726
|
||||
- Introduces ScoNet and Scoliosis1K dataset
|
||||
|
||||
- **MICCAI 2025**: "Pose as Clinical Prior: Learning Dual Representations for Scoliosis Screening" (Zhou et al.)
|
||||
- PDF: https://arxiv.org/abs/2509.00872
|
||||
- Extends ScoNet with pose annotations
|
||||
|
||||
## Preprocessing Pipeline (Confirmed from Official Code)
|
||||
|
||||
### From `datasets/pretreatment.py` (imgs2pickle function):
|
||||
|
||||
```python
|
||||
# Step 1: Filter empty images
|
||||
if img.sum() <= 10000:
|
||||
continue
|
||||
|
||||
# Step 2: VERTICAL TIGHT CROP (y-axis projection)
|
||||
y_sum = img.sum(axis=1)
|
||||
y_top = (y_sum != 0).argmax(axis=0)
|
||||
y_btm = (y_sum != 0).cumsum(axis=0).argmax(axis=0)
|
||||
img = img[y_top: y_btm + 1, :] # <-- TIGHT CROP TO PERSON HEIGHT
|
||||
|
||||
# Step 3: Resize based on height (maintain aspect ratio)
|
||||
ratio = img.shape[1] / img.shape[0]
|
||||
img = cv2.resize(img, (int(img_size * ratio), img_size), interpolation=cv2.INTER_CUBIC)
|
||||
|
||||
# Step 4: Find x-center by cumulative sum
|
||||
x_csum = img.sum(axis=0).cumsum()
|
||||
for idx, csum in enumerate(x_csum):
|
||||
if csum > img.sum() / 2:
|
||||
x_center = idx
|
||||
break
|
||||
|
||||
# Step 5: Horizontal crop to img_size width (centered)
|
||||
half_width = img_size // 2
|
||||
left = x_center - half_width
|
||||
right = x_center + half_width
|
||||
|
||||
# Step 6: Padding if needed
|
||||
if left <= 0 or right >= img.shape[1]:
|
||||
left += half_width
|
||||
right += half_width
|
||||
_ = np.zeros((img.shape[0], half_width))
|
||||
img = np.concatenate([_, img, _], axis=1)
|
||||
|
||||
# Final crop
|
||||
to_pickle.append(img[:, left: right].astype('uint8'))
|
||||
```
|
||||
|
||||
### Key Parameters
|
||||
- **Default img_size**: 64 (configurable via `--img_size`)
|
||||
- **Interpolation**: cv2.INTER_CUBIC
|
||||
- **Output format**: uint8 grayscale, pickle files
|
||||
- **Normalization**: None during preprocessing (happens later in BaseSilTransform)
|
||||
|
||||
## Alignment with Local Implementation
|
||||
|
||||
### CONFIRMED: Vertical tight-crop before resize is OFFICIAL
|
||||
- **Evidence**: Line 50 in pretreatment.py: `img = img[y_top: y_btm + 1, :]`
|
||||
- **Purpose**: Removes vertical padding, focuses on actual person silhouette
|
||||
- **Resize behavior**: Height-based resize maintains aspect ratio
|
||||
|
||||
### Transform Pipeline (from config)
|
||||
```yaml
|
||||
evaluator_cfg:
|
||||
transform:
|
||||
- type: BaseSilCuttingTransform # Optional cutting
|
||||
|
||||
trainer_cfg:
|
||||
transform:
|
||||
- type: BaseSilCuttingTransform # Optional cutting
|
||||
```
|
||||
|
||||
From `opengait/data/transform.py`:
|
||||
- `BaseSilCuttingTransform`: Applies optional cutting + divides by 255.0
|
||||
- Default cutting: `int(x.shape[-1] // 64) * 10` pixels from sides
|
||||
- If cutting=0, only normalization is applied
|
||||
|
||||
## Differences from Standard Gait Recognition
|
||||
|
||||
1. **No horizontal flip augmentation** in ScoNet config
|
||||
2. **Evaluation uses**: `evaluate_scoliosis` function (not standard gait metrics)
|
||||
3. **Class num**: 3 (Positive, Neutral, Negative) vs 74+ for gait ID
|
||||
4. **Metric**: euclidean distance (not cosine)
|
||||
|
||||
## Critical Finding for User Concern
|
||||
|
||||
**Vertical tight-crop BEFORE resize is CORRECT and OFFICIAL.**
|
||||
|
||||
This is NOT a bug - it's the intended preprocessing pipeline:
|
||||
1. Crop to person's actual height (remove empty vertical space)
|
||||
2. Resize to fixed height (64px) maintaining aspect ratio
|
||||
3. Center crop/pad horizontally to get 64x64 output
|
||||
|
||||
This ensures:
|
||||
- Consistent scale across different camera distances
|
||||
- Person fills the frame vertically
|
||||
- Aspect ratio is preserved
|
||||
Reference in New Issue
Block a user