OpenGait/.sisyphus/notepads/demo-visualizer/decisions.md
crosstyan 06a6cd1ccf chore: add local cvmmap source and persist sisyphus state
Wire cvmmap-client to the local development path and record ongoing orchestration artifacts for reproducible local workflow context.
2026-02-28 11:17:06 +08:00


Task 1: CLI Flag Addition

  • Decision: Used argparse instead of click to match the explicit task requirement
  • Decision: Preserved all existing CLI options with same defaults as pipeline.py
  • Decision: Module entry point maintained: python -m opengait.demo still works

Task 1 Fix (Retry)

  • Decision: Use inspect.signature for forward compatibility
  • Decision: Conditionally pass visualize kwarg only if constructor accepts it
  • This allows Task 3 to add the visualize parameter without breaking Task 1
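
The conditional-kwarg approach can be sketched roughly as follows. The two classes and the `build` helper are hypothetical stand-ins, not names from the repository; only `inspect.signature` and the `visualize` keyword come from the notes above.

```python
import inspect


class OldPipeline:
    """Hypothetical constructor that does not yet accept visualize."""

    def __init__(self, source: str) -> None:
        self.source = source
        self.visualize = False


class NewPipeline:
    """Hypothetical constructor after the visualize parameter is added."""

    def __init__(self, source: str, visualize: bool = False) -> None:
        self.source = source
        self.visualize = visualize


def build(cls, source: str, visualize: bool):
    """Pass visualize only when the constructor declares that parameter."""
    kwargs = {}
    params = inspect.signature(cls.__init__).parameters
    if "visualize" in params:
        kwargs["visualize"] = visualize
    return cls(source, **kwargs)
```

With this shape, the same call site works against both constructors: the flag is silently dropped for the old one and forwarded to the new one.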

Task 2: OpenCVVisualizer Design Decisions

Architecture

  • Class-based design encapsulates state (mask_mode, windows_created)
  • Lazy window creation via _ensure_windows() - windows created on first update()
  • In-place drawing methods (_draw_bbox, _draw_text_overlay) avoid unnecessary copies

Display Choices

  • DISPLAY_HEIGHT=256, DISPLAY_WIDTH=176 (4x upscale from 64x44 silhouette)
  • INTER_NEAREST interpolation preserves pixelated look of silhouette
  • Side-by-side view (mode 0) converts to grayscale then back to BGR for consistency
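
The display size follows from the 4x factor: 64 * 4 = 256 and 44 * 4 = 176. The sketch below uses pure-Python pixel repetition as a stand-in for `cv2.resize` with `INTER_NEAREST`, to show why an integer-factor nearest-neighbor upscale keeps hard pixel edges. The helper name and the nested-list image representation are assumptions.

```python
SIL_HEIGHT, SIL_WIDTH = 64, 44
SCALE = 4
DISPLAY_HEIGHT = SIL_HEIGHT * SCALE  # 256
DISPLAY_WIDTH = SIL_WIDTH * SCALE  # 176


def upscale_nearest(image, scale):
    """Repeat every pixel `scale` times along both axes."""
    out = []
    for row in image:
        # Widen the row first, then emit `scale` independent copies of it.
        wide = [px for px in row for _ in range(scale)]
        out.extend(list(wide) for _ in range(scale))
    return out
```

No new intensity values are introduced, so a binary silhouette stays binary after scaling.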

Error Handling

  • Graceful handling of None inputs with placeholder images
  • Type coercion for silhouette (float32 [0,1] -> uint8 [0,255])
  • Frame format auto-detection (grayscale, BGR, BGRA)
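
One way the format detection and type coercion could look, as a sketch. It works on shape tuples and scalar values so that it runs without NumPy; the function names, the returned labels, and the clamping behavior are assumptions rather than the visualizer's actual code.

```python
def detect_frame_format(shape):
    """Classify a frame by its array shape (labels are illustrative)."""
    if len(shape) == 2:
        return "gray"
    if len(shape) == 3 and shape[2] == 3:
        return "bgr"
    if len(shape) == 3 and shape[2] == 4:
        return "bgra"
    raise ValueError(f"unsupported frame shape: {shape}")


def to_uint8(value):
    """Map a float in [0, 1] to an int in [0, 255], clamping stray values."""
    clamped = min(1.0, max(0.0, float(value)))
    return int(round(clamped * 255))
```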

Keyboard Interface

  • cv2.waitKey(1) for non-blocking input
  • m key cycles: 0 (Both) -> 1 (Raw) -> 2 (Normalized) -> 0
  • q key returns False to signal application should exit
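
The key handling above can be sketched as a pure function over the value that `cv2.waitKey(1)` returns (-1 when no key is pressed). The function name and the returned tuple shape are assumptions; the mode numbering and key bindings are taken from the list above.

```python
MODE_NAMES = {0: "Both", 1: "Raw", 2: "Normalized"}


def handle_key(key, mask_mode):
    """Return (keep_running, new_mask_mode) for one polled key code."""
    # Mask to the low byte, a common idiom for waitKey results;
    # keep -1 as the "no key pressed" sentinel.
    code = key & 0xFF if key >= 0 else -1
    if code == ord("q"):
        return False, mask_mode
    if code == ord("m"):
        return True, (mask_mode + 1) % len(MODE_NAMES)
    return True, mask_mode
```

Keeping this logic free of OpenCV calls makes the mode cycle easy to unit test without opening a window.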

Task 3 Decisions

Type Annotation Choice

Used object | None for the _visualizer attribute rather than importing the OpenCVVisualizer type. This avoids potential circular imports and keeps the module structure clean. The close() method is resolved at runtime via getattr.

EMA FPS Parameters

Selected alpha=0.1 for EMA smoothing as it provides a good balance between:

  • Responsiveness to FPS changes (not too sluggish)
  • Noise reduction (smooths out frame-to-frame variations)
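
For reference, the update rule is fps_ema = alpha * fps_instant + (1 - alpha) * fps_ema_previous. A small sketch follows; the class name and the choice to seed the average with the first sample are assumptions.

```python
class FpsEma:
    """Exponential moving average of instantaneous FPS."""

    def __init__(self, alpha: float = 0.1) -> None:
        self.alpha = alpha
        self.value = None  # no estimate until the first sample arrives

    def update(self, frame_seconds: float) -> float:
        """Fold one frame duration (in seconds) into the running estimate."""
        instant = 1.0 / frame_seconds
        if self.value is None:
            self.value = instant
        else:
            self.value = self.alpha * instant + (1.0 - self.alpha) * self.value
        return self.value
```

With alpha=0.1, a jump from 30 FPS to 60 FPS moves the estimate to 33 after one frame, so a single outlier frame shifts the displayed value by only a tenth of the difference.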

Visualization Payload Structure

The payload dict structure was designed to match the OpenCVVisualizer.update() signature:

{
    "mask_raw": UInt8[ndarray, "h w"] | None,
    "bbox": tuple[int, int, int, int] | None,
    "silhouette": Float[ndarray, "64 44"] | None,
    "track_id": int,
    "label": str | None,
    "confidence": float | None,
}

Error Handling in run()

Frame processing errors are caught and logged (not raised) to ensure the visualizer loop continues even if individual frames fail. This maintains the real-time display even during transient errors.
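
A sketch of the log-and-continue loop, under the assumption that per-frame work is wrapped in a single callable. The `run` signature, the logger name, and the returned count are illustrative only.

```python
import logging

logger = logging.getLogger("demo")


def run(frames, process_frame):
    """Process frames one by one; log failures and keep going."""
    processed = 0
    for index, frame in enumerate(frames):
        try:
            process_frame(frame)
        except Exception:
            # logger.exception records the traceback without re-raising.
            logger.exception("frame %d failed", index)
            continue
        processed += 1
    return processed
```

Catching `Exception` rather than using a bare `except` leaves `KeyboardInterrupt` and `SystemExit` free to stop the loop.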

Task 5 Decisions: YOLO Model Path Relocation

Decision: Model Path Structure

Decision: Move yolo11n-seg.pt to ckpt/yolo11n-seg.pt
Rationale: The ckpt/ directory already exists and contains the ScoNet checkpoint

Decision: Path Reference Style

Decision: Use relative path ckpt/yolo11n-seg.pt for CLI defaults
Rationale: CLI tools run from repo root; test uses absolute path via REPO_ROOT
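
The two reference styles can be contrasted in a short sketch. The constant and helper names are hypothetical, and how REPO_ROOT is computed in the test suite is not shown in these notes, so the root is passed in as a parameter here.

```python
from pathlib import Path

# Relative default: resolved against the current working directory,
# which is expected to be the repo root for CLI use.
DEFAULT_YOLO_MODEL = "ckpt/yolo11n-seg.pt"


def absolute_model_path(repo_root: Path) -> Path:
    """Anchor the default at an explicit root, as a test might do."""
    return repo_root / DEFAULT_YOLO_MODEL
```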

Decision: Preserve CLI Semantics

Decision: Keep existing CLI option names and only change default value
Rationale: No breaking changes to existing scripts or user workflows
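
As an illustration of changing only a default, the argparse sketch below keeps the option name fixed. The `--yolo-model` option name is a hypothetical placeholder, since the real option names are not listed in these notes; only the default path is taken from them.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser with one option whose default points into ckpt/."""
    parser = argparse.ArgumentParser(prog="opengait.demo")
    # Hypothetical option name; an explicit value still overrides the default.
    parser.add_argument("--yolo-model", default="ckpt/yolo11n-seg.pt")
    return parser
```

Scripts that already pass the option explicitly are unaffected; only invocations relying on the default see the new location.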