zed-playground/py_workspace/.sisyphus/plans/ground-plane-refinement.md

Ground Plane Refinement & Depth Map Persistence

TL;DR

Quick Summary: Fix inter-camera ground plane disagreement by adding depth-based floor detection and per-camera extrinsic correction as a standalone post-processing tool. Also add HDF5 depth map persistence so SVO re-reading is not needed for iterative refinement.

Deliverables:

  • --save-depth flag in calibrate_extrinsics.py → HDF5 depth persistence
  • New aruco/depth_save.py module for HDF5 read/write
  • New aruco/ground_plane.py module for floor detection + consensus alignment
  • New refine_ground_plane.py standalone CLI tool
  • Plotly diagnostic visualization (before/after floor alignment)
  • Full TDD test suite for all new modules
  • New dependencies: open3d, h5py

Estimated Effort: Large
Parallel Execution: YES — 3 waves
Critical Path: Task 1 (deps) → Task 2 (depth save module) → Task 4 (CLI integration) → Task 5 (ground plane module) → Task 7 (CLI tool) → Task 8 (visualization)


Context

Original Request

User's calibrate_extrinsics.py produces extrinsics where the ground plane is not level — specifically, different cameras disagree about where the ground is when overlaying world-coordinate point clouds. The error is small (1-3° tilt, <2cm offset) across a 2-4 camera ZED setup. User wants:

  1. A way to refine the calibration using actual floor depth observations
  2. Saved pooled depth maps to avoid re-reading SVOs for iterative refinement

Interview Summary

Key Discussions:

  • Core problem: Inter-camera disagreement, not just global tilt. Point clouds from different cameras don't align on the floor surface.
  • Integration approach: Post-processing tool (standalone CLI), not integrated into existing pipeline.
  • Library choice: Open3D for point cloud operations (user wants it available for future work). h5py for HDF5 persistence.
  • Refinement granularity: Per-camera correction (each camera gets its own correction based on its floor observations).
  • Depth saving: Opt-in via --save-depth <dir> flag. Save pooled + raw best frames per camera.
  • Save format: HDF5 via h5py with versioned schema.
  • Visualization: Plotly HTML diagnostic (floor points per camera, consensus plane, before/after).
  • Test strategy: TDD with pytest, following existing test patterns.

Research Findings:

  • alignment.py has rotation_align_vectors() for aligning normals — reusable for floor alignment
  • depth_pool.py does median pooling but never persists results
  • depth_refine.py has scipy.optimize.least_squares infrastructure for pose optimization
  • compare_pose_sets.py has Kabsch rigid_transform_3d() for rigid alignment
  • depth_verify.py has project_point_to_pixel() and depth residual computation
  • Current pipeline: ArUco → PnP → RANSAC averaging → depth refinement (sparse, marker corners only) → alignment (marker normals only)
  • Open3D provides segment_plane() for RANSAC plane fitting on point clouds

Metis Review

Identified Gaps (addressed):

  • Correction DOF: Must constrain to pitch/roll + vertical translation only (no yaw drift, no lateral drift). Addressed via bounded optimization.
  • RANSAC plane robustness: Must constrain plane normal to near-vertical and height to expected range, plus ROI masking. Addressed via configurable constraints.
  • HDF5 schema versioning: Must include /meta/schema_version, units, intrinsics, coordinate frame. Addressed in schema design.
  • Failure mode for missing floor: If plane detection fails for one camera, skip that camera and warn (don't fail entire run). Addressed in error handling design.
  • Reproducibility: RANSAC seed control for deterministic tests. Addressed via seed parameter.
  • Per-camera correction risk: May break inter-camera rigidity. Addressed via correction bounds + pre/post metrics reporting.
  • Consensus plane definition: Use merged inlier points from all cameras, weighted by inlier count. Addressed in algorithm design.

Work Objectives

Core Objective

Enable depth-based ground plane refinement that corrects per-camera extrinsic errors (1-3° tilt, <2cm vertical offset) by detecting the actual physical floor surface from ZED depth maps and aligning all cameras to a consensus ground plane.

Concrete Deliverables

  • aruco/depth_save.py: HDF5 read/write module for depth maps + metadata
  • aruco/ground_plane.py: Floor detection (RANSAC), consensus plane fitting, per-camera correction
  • refine_ground_plane.py: Standalone Click CLI tool
  • --save-depth flag added to calibrate_extrinsics.py
  • tests/test_depth_save.py: TDD tests for depth persistence
  • tests/test_ground_plane.py: TDD tests for floor detection + alignment
  • tests/test_refine_ground_cli.py: TDD tests for CLI tool
  • Plotly diagnostic HTML output

Definition of Done

  • uv run pytest tests/test_depth_save.py → all tests pass
  • uv run pytest tests/test_ground_plane.py → all tests pass
  • uv run pytest tests/test_refine_ground_cli.py → all tests pass
  • uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py → no errors
  • uv run python calibrate_extrinsics.py --help shows --save-depth flag
  • uv run python refine_ground_plane.py --help shows expected options
  • End-to-end: calibrate → save depth → refine ground → produces valid extrinsics JSON

Must Have

  • Per-camera RANSAC floor plane detection from depth maps
  • Consensus plane fitting from merged floor points
  • Constrained per-camera correction (pitch/roll + vertical translation, no yaw/lateral)
  • Correction bounds with configurable limits (default: max 5° rotation, max 5cm translation)
  • "No-op if not confident" threshold — skip correction if RANSAC inlier ratio is too low
  • HDF5 schema with versioning and full metadata (intrinsics, units, resolution, frame indices)
  • Diagnostic metrics: per-camera plane normal angles, consensus disagreement before/after, correction magnitudes
  • Plotly visualization of floor points + consensus plane + before/after camera poses

Must NOT Have (Guardrails)

  • NO changes to ArUco detection, PnP, or RANSAC pose averaging logic
  • NO changes to existing depth_refine.py or depth_verify.py behavior
  • NO non-flat floor handling (ramps, stairs, multi-level)
  • NO dense multi-view reconstruction beyond the floor plane
  • NO automatic scene segmentation or ML-based floor detection
  • NO global bundle adjustment across all cameras
  • NO saving of every frame's depth data — only pooled + curated best subset
  • NO GUI requirements — visualization is optional Plotly HTML output
  • NO modification of the extrinsics JSON schema (output format matches existing convention)

Verification Strategy (MANDATORY)

UNIVERSAL RULE: ZERO HUMAN INTERVENTION

ALL tasks in this plan MUST be verifiable WITHOUT any human action.

Test Decision

  • Infrastructure exists: YES (pytest configured in pyproject.toml)
  • Automated tests: TDD (tests first)
  • Framework: pytest (existing)

If TDD Enabled

Each TODO follows RED-GREEN-REFACTOR:

Task Structure:

  1. RED: Write failing test first
    • Test file: tests/test_<module>.py
    • Test command: uv run pytest tests/test_<module>.py
    • Expected: FAIL (test exists, implementation doesn't)
  2. GREEN: Implement minimum code to pass
    • Command: uv run pytest tests/test_<module>.py
    • Expected: PASS
  3. REFACTOR: Clean up while keeping green
    • Command: uv run pytest tests/test_<module>.py
    • Expected: PASS (still)

Agent-Executed QA Scenarios (MANDATORY — ALL tasks)

Verification Tool by Deliverable Type:

| Type          | Tool                                 | How Agent Verifies                                                 |
|---------------|--------------------------------------|--------------------------------------------------------------------|
| Python module | Bash (pytest)                        | Run tests, assert pass count, zero failures                        |
| CLI tool      | Bash (click --help + invocation)     | Check help output, run with test data, verify exit code and output |
| HDF5 file     | Bash (python -c "import h5py; ...")  | Open file, check schema, validate data shapes                      |
| Type checking | Bash (basedpyright)                  | Run type checker, verify zero errors                               |
| Plotly output | Bash (file existence + python parse) | Check file exists, contains valid HTML, has expected traces        |

Execution Strategy

Parallel Execution Waves

Wave 1 (Start Immediately):
├── Task 1: Add open3d + h5py dependencies
├── Task 2: TDD depth save module (aruco/depth_save.py) [after Task 1]
└── Task 3: TDD ground plane core module (aruco/ground_plane.py) [after Task 1]

Wave 2 (After Wave 1):
├── Task 4: Integrate --save-depth into calibrate_extrinsics.py [depends: 1, 2]
└── Task 5: Ground plane consensus + per-camera correction [depends: 1, 3]

Wave 3 (After Wave 2):
├── Task 6: Plotly diagnostic visualization module [depends: 5]
├── Task 7: refine_ground_plane.py CLI tool [depends: 2, 5, 6]
└── Task 8: Integration tests + basedpyright pass [depends: all]

Critical Path: Task 1 → Task 2 → Task 4 (depth save path)
              Task 1 → Task 3 → Task 5 → Task 7 (ground plane path)

Dependency Matrix

| Task | Depends On | Blocks | Can Parallelize With |
|------|------------|--------|----------------------|
| 1    | None       | 2, 3   | None (must be first) |
| 2    | 1          | 4, 7   | 3                    |
| 3    | 1          | 5      | 2                    |
| 4    | 1, 2       | 7, 8   | 5                    |
| 5    | 1, 3       | 6, 7   | 4                    |
| 6    | 5          | 7      | 4                    |
| 7    | 2, 5, 6    | 8      | None                 |
| 8    | All        | None   | None (final)         |

Agent Dispatch Summary

| Wave | Tasks | Recommended Agents                                             |
|------|-------|----------------------------------------------------------------|
| 1    | 1     | task(category="quick", ...)                                    |
| 1→2  | 2, 3  | task(category="unspecified-high", ...) — parallel after Task 1 |
| 2    | 4, 5  | task(category="unspecified-high", ...) — parallel              |
| 3    | 6     | task(category="unspecified-low", ...)                          |
| 3    | 7     | task(category="unspecified-high", ...)                         |
| 3    | 8     | task(category="unspecified-low", ...)                          |

TODOs

  • 1. Add open3d and h5py dependencies to pyproject.toml

    What to do:

    • Add open3d and h5py to the [project] dependencies list in pyproject.toml
    • Run uv sync to install
    • Verify imports work: uv run python -c "import open3d; import h5py; print('ok')"

    Must NOT do:

    • Do not add unnecessary deps (no trimesh, no probreg, no pycpd)
    • Do not modify any other pyproject.toml sections

    Recommended Agent Profile:

    • Category: quick
      • Reason: Single file edit + one command
    • Skills: []
      • No special skills needed for a dependency addition

    Parallelization:

    • Can Run In Parallel: NO
    • Parallel Group: Wave 1 (solo — must complete before 2, 3)
    • Blocks: Tasks 2, 3, 4, 5, 6, 7, 8
    • Blocked By: None

    References:

    Pattern References:

    • pyproject.toml:7-27 — Existing dependency list format and conventions (e.g., "scipy>=1.17.0")

    Acceptance Criteria:

    • pyproject.toml contains open3d and h5py in dependencies
    • uv sync completes without error
    • uv run python -c "import open3d; import h5py; print('ok')" prints ok and exits 0

    Agent-Executed QA Scenarios:

    Scenario: Dependencies install and import correctly
      Tool: Bash
      Preconditions: pyproject.toml edited
      Steps:
        1. uv sync
        2. uv run python -c "import open3d; print(open3d.__version__)"
        3. Assert: exit code 0, version string printed
        4. uv run python -c "import h5py; print(h5py.__version__)"
        5. Assert: exit code 0, version string printed
      Expected Result: Both libraries installed and importable
      Evidence: Command output captured
    

    Commit: YES

    • Message: build(deps): add open3d and h5py for ground plane refinement
    • Files: pyproject.toml, uv.lock
    • Pre-commit: uv run python -c "import open3d; import h5py"

  • 2. TDD: Create aruco/depth_save.py — HDF5 depth map persistence module

    What to do:

    RED phase — Write tests/test_depth_save.py first with tests for:

    • save_depth_data(): saves pooled depth + confidence + raw frames + intrinsics + metadata to HDF5
    • load_depth_data(): loads HDF5 back into structured dict
    • Round-trip test: save → load → compare arrays with np.testing.assert_allclose
    • Schema validation: check /meta/schema_version, /meta/units, /meta/coordinate_frame
    • Per-camera groups: /<serial>/pooled_depth, /<serial>/pooled_confidence, /<serial>/raw_frames/<idx>/depth, /<serial>/intrinsics
    • Edge cases: single camera, no confidence map, no raw frames
    • Error handling: invalid path, empty data

    GREEN phase — Implement aruco/depth_save.py:

    • save_depth_data(path, camera_data, schema_version=1) — writes HDF5
    • load_depth_data(path) — reads HDF5 back to dict
    • Schema version 1 layout:
      /meta/
        schema_version: int = 1
        units: str = "meters"
        coordinate_frame: str = "world_from_cam"
        created_at: str (ISO 8601)
      /<serial>/
        intrinsics: (3, 3) float64  — camera matrix K
        resolution: (2,) int — [width, height]
        pooled_depth: (H, W) float32
        pooled_confidence: (H, W) float32  [optional]
        pool_metadata: JSON string (same dict currently in results)
        raw_frames/
          0/depth: (H, W) float32
          0/confidence: (H, W) float32  [optional]
          0/frame_index: int
          0/score: float
          1/depth: ...
      
    • Use h5py compression: compression="gzip", compression_opts=4
    • Type annotations on all public functions

    REFACTOR phase — Clean up, add docstrings, run basedpyright.
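The schema v1 layout above could be written with h5py roughly as follows. This is a minimal sketch of the planned save_depth_data (raw-frame groups omitted for brevity); the shape of the camera_data dict is an assumption, not the final API, and meta fields are stored as scalar datasets so that paths like /meta/schema_version resolve as described.

```python
import datetime
import json

import h5py
import numpy as np


def save_depth_data(path, camera_data, schema_version=1):
    """Sketch: write pooled depth maps per camera to HDF5 (schema v1).

    camera_data maps serial -> dict with keys: intrinsics (3x3), resolution,
    pooled_depth (H, W), optional pooled_confidence, pool_metadata dict.
    """
    with h5py.File(path, "w") as f:
        meta = f.create_group("meta")
        meta.create_dataset("schema_version", data=schema_version)
        meta.create_dataset("units", data="meters")
        meta.create_dataset("coordinate_frame", data="world_from_cam")
        meta.create_dataset(
            "created_at",
            data=datetime.datetime.now(datetime.timezone.utc).isoformat())
        for serial, cam in camera_data.items():
            g = f.create_group(str(serial))
            g.create_dataset("intrinsics",
                             data=np.asarray(cam["intrinsics"], dtype=np.float64))
            g.create_dataset("resolution",
                             data=np.asarray(cam["resolution"], dtype=np.int64))
            # Depth arrays are always gzip-compressed per the plan.
            g.create_dataset("pooled_depth",
                             data=np.asarray(cam["pooled_depth"], dtype=np.float32),
                             compression="gzip", compression_opts=4)
            if cam.get("pooled_confidence") is not None:
                g.create_dataset(
                    "pooled_confidence",
                    data=np.asarray(cam["pooled_confidence"], dtype=np.float32),
                    compression="gzip", compression_opts=4)
            g.attrs["pool_metadata"] = json.dumps(cam.get("pool_metadata", {}))
```

Storing pool_metadata as a JSON string attribute keeps the schema flat while preserving the existing results dict verbatim.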

    Must NOT do:

    • Do not modify existing depth_pool.py or depth_verify.py
    • Do not add ZED SDK dependency to this module (pure numpy/h5py)
    • Do not save uncompressed data

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: New module with TDD workflow, HDF5 schema design, comprehensive tests
    • Skills: []
      • No special skills needed — standard Python + h5py

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 1-2 (with Task 3, after Task 1)
    • Blocks: Tasks 4, 7
    • Blocked By: Task 1

    References:

    Pattern References:

    • aruco/depth_pool.py:1-90 — Data format conventions: depth maps are (H, W) float in meters, confidence maps are (H, W) float with ZED semantics (lower = more confident)
    • calibrate_extrinsics.py:143-305 — How depth maps and confidence maps are collected per camera, how pool_metadata dict is structured
    • calibrate_extrinsics.py:120-131 — Function signature of apply_depth_verify_refine_postprocess showing the verification_frames data structure

    API/Type References:

    • aruco/depth_verify.py:18-24 — project_point_to_pixel(P_cam, K) shows intrinsics matrix K format (3x3, fx/fy/cx/cy)

    Test References:

    • tests/test_depth_pool.py — Follow this test structure: parametric, synthetic data, edge cases with pytest.raises
    • tests/conftest.py — sys.path setup for imports

    Documentation References:

    • calibrate_extrinsics.py:338 — results[str(serial)]["depth_pool"] shows pool_metadata dict structure

    WHY Each Reference Matters:

    • depth_pool.py defines the array contracts (shape, dtype, units) the save module must preserve
    • calibrate_extrinsics.py:143-305 shows exactly where/how depth data is produced — the save module must capture this data
    • Test patterns in test_depth_pool.py establish the project's testing conventions

    Acceptance Criteria:

    TDD:

    • Test file created: tests/test_depth_save.py
    • Tests cover: save, load, round-trip, schema validation, edge cases, error handling
    • uv run pytest tests/test_depth_save.py -v → PASS (all tests, 0 failures)

    Agent-Executed QA Scenarios:

    Scenario: Round-trip save and load preserves data
      Tool: Bash (pytest)
      Preconditions: aruco/depth_save.py implemented
      Steps:
        1. uv run pytest tests/test_depth_save.py -v -k "round_trip"
        2. Assert: exit code 0
        3. Assert: output contains "PASSED"
      Expected Result: Saved HDF5 loads back with identical data
      Evidence: pytest output captured
    
    Scenario: HDF5 schema has required metadata
      Tool: Bash (pytest)
      Preconditions: aruco/depth_save.py implemented
      Steps:
        1. uv run pytest tests/test_depth_save.py -v -k "schema"
        2. Assert: exit code 0
        3. Assert: tests verify /meta/schema_version, /meta/units, /meta/coordinate_frame
      Expected Result: Schema metadata present and correct
      Evidence: pytest output captured
    
    Scenario: Module passes type checking
      Tool: Bash (basedpyright)
      Preconditions: Module implemented with type annotations
      Steps:
        1. uv run basedpyright aruco/depth_save.py
        2. Assert: exit code 0 or only non-error diagnostics
      Expected Result: No type errors
      Evidence: basedpyright output captured
    

    Commit: YES

    • Message: feat(aruco): add HDF5 depth map persistence module
    • Files: aruco/depth_save.py, tests/test_depth_save.py
    • Pre-commit: uv run pytest tests/test_depth_save.py

  • 3. TDD: Create aruco/ground_plane.py — floor detection & consensus alignment core

    What to do:

    RED phase — Write tests/test_ground_plane.py first with tests for:

    A. unproject_depth_to_points(depth_map, K, T_world_cam, stride=4):

    • Takes depth map + intrinsics + extrinsics → returns (N, 3) world-coordinate point cloud
    • Test: synthetic depth of a flat plane at known height → verify recovered 3D points match expected positions
    • Test: NaN/zero/negative depth values are excluded
    • Test: stride parameter reduces output point count proportionally
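The unprojection in A is standard pinhole back-projection. A numpy-only sketch (the function name and argument order follow the plan; everything else, including the homogeneous-transform convention, is an assumption to be confirmed against pose_math.py):

```python
import numpy as np


def unproject_depth_to_points(depth_map, K, T_world_cam, stride=4):
    """Back-project a (H, W) metric depth map to an (N, 3) world point cloud.

    Invalid depths (NaN, zero, negative) are dropped; stride subsamples pixels.
    """
    H, W = depth_map.shape
    vs, us = np.mgrid[0:H:stride, 0:W:stride]
    z = depth_map[vs, us]
    valid = np.isfinite(z) & (z > 0)
    u, v, z = us[valid], vs[valid], z[valid]
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    # Pinhole model: camera-frame coordinates from pixel + depth.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    P_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)  # (N, 4) homogeneous
    P_world = (T_world_cam @ P_cam.T).T
    return P_world[:, :3]
```

With an identity extrinsic and a constant-depth map, every returned point lies on the plane z = depth, which is exactly the first synthetic test above.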

    B. detect_floor_plane(points, normal_constraint, height_range, min_inlier_ratio, seed):

    • Uses Open3D RANSAC segment_plane() on the point cloud
    • Returns FloorPlaneResult(normal, offset, inliers, inlier_ratio, plane_model)
    • Test: synthetic flat floor + random noise → recovers correct plane within tolerance
    • Test: synthetic floor + wall points → RANSAC ignores wall, finds floor (normal_constraint filters)
    • Test: normal_constraint rejects planes whose normal isn't near-vertical (e.g., a wall plane)
    • Test: height_range rejects planes too far from expected floor height
    • Test: too few inliers → returns None (below min_inlier_ratio)
    • Test: seed parameter produces deterministic results
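The plan delegates the plane fit itself to Open3D's segment_plane; purely to illustrate the RANSAC-plus-constraints flow B describes, here is a numpy-only stand-in. Everything below is an assumed sketch: the real module would call Open3D and return the FloorPlaneResult dataclass rather than a tuple, and the default thresholds are placeholders.

```python
import numpy as np


def detect_floor_plane(points, normal_constraint=(0.0, -1.0, 0.0),
                       max_normal_angle_deg=20.0, height_range=(-0.1, 0.1),
                       min_inlier_ratio=0.3, distance_threshold=0.01,
                       num_iterations=200, seed=0):
    """Sketch: RANSAC plane fit with floor constraints.

    Returns (normal, d, inlier_mask) for the plane n.x + d = 0, or None if
    no sufficiently confident floor-like plane is found.
    """
    rng = np.random.default_rng(seed)  # seeded for deterministic tests
    n_pts = len(points)
    best = None
    for _ in range(num_iterations):
        p0, p1, p2 = points[rng.choice(n_pts, 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-12:
            continue  # degenerate (collinear) sample
        n = n / norm
        d = -n @ p0
        inliers = np.abs(points @ n + d) < distance_threshold
        if best is None or inliers.sum() > best[2].sum():
            best = (n, d, inliers)
    if best is None:
        return None
    n, d, inliers = best
    up = np.asarray(normal_constraint, dtype=float)
    up /= np.linalg.norm(up)
    if n @ up < 0:  # orient the normal toward the expected "up"
        n, d = -n, -d
    angle = np.degrees(np.arccos(np.clip(n @ up, -1.0, 1.0)))
    height = -d  # signed plane offset along its normal
    if angle > max_normal_angle_deg:
        return None  # wall-like plane rejected by the normal constraint
    if not (height_range[0] <= height <= height_range[1]):
        return None  # plane outside the expected floor height band
    if inliers.sum() / n_pts < min_inlier_ratio:
        return None  # below confidence threshold: no-op per the guardrails
    return n, d, inliers
```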

    C. compute_consensus_plane(floor_results, camera_weights=None):

    • Takes per-camera FloorPlaneResult list → fits a single consensus plane
    • Method: concatenate all inlier points, weighted by inlier count, fit plane via SVD
    • Test: two cameras seeing same plane → consensus matches individual planes
    • Test: two cameras with slight disagreement → consensus is between them
    • Test: camera weights affect result appropriately
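The merged-inlier SVD fit in C can be sketched as follows. Concatenating each camera's inliers already weights cameras by inlier count implicitly; the optional camera_weights argument is an assumed interpretation of the per-camera weighting, and this version takes raw point arrays rather than FloorPlaneResult objects.

```python
import numpy as np


def compute_consensus_plane(floor_points_per_camera, camera_weights=None):
    """Fit one plane to all cameras' floor inliers via weighted SVD.

    floor_points_per_camera: list of (Ni, 3) world-frame inlier arrays.
    Returns (normal, d) with the plane written as normal.x + d = 0.
    """
    if camera_weights is None:
        camera_weights = [1.0] * len(floor_points_per_camera)
    pts = np.concatenate(floor_points_per_camera, axis=0)
    w = np.concatenate([np.full(len(p), wi)
                        for p, wi in zip(floor_points_per_camera, camera_weights)])
    centroid = (w[:, None] * pts).sum(axis=0) / w.sum()
    centered = (pts - centroid) * np.sqrt(w)[:, None]
    # The plane normal is the right singular vector of the smallest
    # singular value (the direction of least variance).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]
    d = -normal @ centroid
    return normal, d
```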

    D. compute_floor_correction(T_world_cam, floor_result, consensus_plane, max_rotation_deg=5.0, max_translation_m=0.05):

    • Computes constrained correction for a single camera
    • Allowed DOF: pitch/roll + vertical translation ONLY (no yaw, no lateral)
    • Uses scipy.optimize.least_squares with bounds
    • Returns CorrectionResult(T_corrected, delta_rotation_deg, delta_translation_m, applied)
    • Test: camera with 2° tilt from consensus → correction brings it within 0.1°
    • Test: correction respects max_rotation_deg bound
    • Test: correction respects max_translation_m bound
    • Test: yaw component is preserved (no yaw drift)
    • Test: lateral translation is preserved (no X/Z drift)
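The constrained correction in D can be sketched with scipy's bounded least_squares. Because only [pitch, roll, dy] are parameters, yaw and lateral translation cannot drift by construction. This is an assumed formulation, not the final API: it takes the camera's floor inliers and a consensus plane directly, and rotating about the camera position (so the camera tilts in place) is a design choice the real module would need to confirm.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation


def compute_floor_correction(T_world_cam, floor_points_world, consensus_normal,
                             consensus_d, max_rotation_deg=5.0,
                             max_translation_m=0.05):
    """Sketch: bounded pitch/roll + vertical-offset correction for one camera.

    Returns (T_corrected, params) where params = [pitch_deg, roll_deg, dy_m].
    """
    cam_pos = T_world_cam[:3, 3]

    def build_correction(params):
        pitch, roll, dy = params
        # Two-axis Euler rotation only: no yaw component exists.
        R = Rotation.from_euler("xz", [pitch, roll], degrees=True).as_matrix()
        T = np.eye(4)
        T[:3, :3] = R
        # Rotate about the camera position, then shift vertically only.
        T[:3, 3] = cam_pos - R @ cam_pos + np.array([0.0, dy, 0.0])
        return T

    def residuals(params):
        T = build_correction(params)
        pts = (T[:3, :3] @ floor_points_world.T).T + T[:3, 3]
        return pts @ consensus_normal + consensus_d  # signed plane distances

    r, t = max_rotation_deg, max_translation_m
    sol = least_squares(residuals, x0=np.zeros(3),
                        bounds=([-r, -r, -t], [r, r, t]))
    return build_correction(sol.x) @ T_world_cam, sol.x
```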

    GREEN phase — Implement aruco/ground_plane.py:

    • Import open3d for RANSAC plane segmentation
    • Import scipy.optimize.least_squares for constrained correction
    • Reuse aruco.alignment.rotation_align_vectors where appropriate
    • Reuse aruco.pose_math.invert_transform and matrix_to_rvec_tvec
    • Use dataclasses for FloorPlaneResult and CorrectionResult
    • All functions are pure (no side effects, no file I/O)

    REFACTOR phase — Docstrings, type annotations, basedpyright.

    Must NOT do:

    • No ML/segmentation — RANSAC + geometric constraints only
    • No global bundle adjustment
    • No modification to existing alignment.py
    • No dense reconstruction beyond floor plane extraction

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Core algorithmic module with 4 major functions, each with multiple test cases. Requires understanding of SE3 geometry, RANSAC, and constrained optimization.
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 1-2 (with Task 2, after Task 1)
    • Blocks: Task 5
    • Blocked By: Task 1

    References:

    Pattern References:

    • aruco/alignment.py:54-114 — rotation_align_vectors(from_vec, to_vec) — reuse for aligning floor normal to target up vector
    • aruco/alignment.py:117-137 — apply_alignment_to_pose(T, R_align) — pattern for applying global rotation to extrinsics
    • aruco/alignment.py:140-202 — estimate_up_vector_from_cameras() — existing camera-based "up" estimation, useful as initial guess for floor normal
    • aruco/depth_refine.py:12-20 — extrinsics_to_params() / params_to_extrinsics() — 6-DOF parameterization for optimization
    • aruco/depth_refine.py:71-180 — refine_extrinsics_with_depth() — pattern for bounded least_squares optimization of camera pose
    • aruco/depth_verify.py:18-24 — project_point_to_pixel(P_cam, K) — projection math
    • aruco/pose_math.py:22-28 — invert_transform(T) — efficient SE3 inversion

    API/Type References:

    • aruco/alignment.py:7-16 — Type aliases: Vec3, Mat33, Mat44, CornersNC
    • aruco/depth_verify.py:8-15 — DepthVerificationResult dataclass pattern

    Test References:

    • tests/test_alignment.py — Testing convention for geometric functions (synthetic inputs, tolerance assertions)
    • tests/test_depth_refine.py — Testing convention for optimization functions (before/after metrics)

    External References:

    • Open3D docs: segment_plane(distance_threshold, ransac_n, num_iterations) — returns [a, b, c, d] plane model + inlier indices

    WHY Each Reference Matters:

    • alignment.py provides the exact rotation-alignment primitives we need — no need to reimplement
    • depth_refine.py establishes the bounded least-squares pattern with regularization — correction should follow the same style
    • test_alignment.py shows how geometric tests are structured in this project (synthetic data, assert_allclose)

    Acceptance Criteria:

    TDD:

    • Test file created: tests/test_ground_plane.py
    • Tests cover: unproject, floor detection (happy + noise + wall + failure), consensus, correction (tilt + bounds + yaw preservation)
    • uv run pytest tests/test_ground_plane.py -v → PASS (all tests, 0 failures)

    Agent-Executed QA Scenarios:

    Scenario: Floor detection on synthetic flat plane
      Tool: Bash (pytest)
      Preconditions: aruco/ground_plane.py implemented
      Steps:
        1. uv run pytest tests/test_ground_plane.py -v -k "detect_floor and synthetic_flat"
        2. Assert: exit code 0
        3. Assert: recovered normal within 1° of [0, -1, 0]
      Expected Result: RANSAC correctly identifies flat floor
      Evidence: pytest output captured
    
    Scenario: Per-camera correction preserves yaw
      Tool: Bash (pytest)
      Preconditions: aruco/ground_plane.py implemented
      Steps:
        1. uv run pytest tests/test_ground_plane.py -v -k "correction and yaw"
        2. Assert: exit code 0
        3. Assert: yaw angle before == yaw angle after (within 0.01°)
      Expected Result: Correction only affects pitch/roll + vertical translation
      Evidence: pytest output captured
    
    Scenario: Module passes type checking
      Tool: Bash (basedpyright)
      Preconditions: Module implemented with type annotations
      Steps:
        1. uv run basedpyright aruco/ground_plane.py
        2. Assert: exit code 0 or only non-error diagnostics
      Expected Result: No type errors
      Evidence: basedpyright output captured
    

    Commit: YES

    • Message: feat(aruco): add ground plane detection and per-camera correction module
    • Files: aruco/ground_plane.py, tests/test_ground_plane.py
    • Pre-commit: uv run pytest tests/test_ground_plane.py

  • 4. Integrate --save-depth flag into calibrate_extrinsics.py

    What to do:

    • Add --save-depth Click option (type click.Path(), default None)
    • When provided, after depth pooling/selection in apply_depth_verify_refine_postprocess, call save_depth_data() to persist:
      • Pooled depth + confidence per camera
      • Raw best-scored frames (depth + confidence + frame index + score)
      • Camera intrinsics matrix K
      • Pool metadata dict
    • Log the output path and file size
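The flag itself is a single Click option. A minimal sketch of the wiring (the option declaration follows the plan; the command body and help text are placeholders, since the real flag sits among the existing calibrate_extrinsics.py options and triggers save_depth_data after pooling):

```python
import click


@click.command()
@click.option("--save-depth", type=click.Path(), default=None,
              help="Directory in which to persist pooled + raw depth maps as HDF5.")
def main(save_depth):
    # Sketch only: the real CLI calls save_depth_data() here when the
    # flag is provided and depth verification/refinement is enabled.
    if save_depth is None:
        click.echo("depth saving disabled")
    else:
        click.echo(f"would save depth to {save_depth}")
```

Because the option defaults to None, existing invocations are unaffected, satisfying the "only when provided" guardrail below.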

    Must NOT do:

    • Do not change existing depth processing behavior
    • Do not make saving mandatory (only when --save-depth is provided)
    • Do not save if depth verification/refinement is not enabled (warn and skip)

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Integration into existing CLI with complex data flow, needs careful threading of data through the function
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 2 (with Task 5)
    • Blocks: Tasks 7, 8
    • Blocked By: Tasks 1, 2

    References:

    Pattern References:

    • calibrate_extrinsics.py:562-678 — Click option definitions and main() function signature — follow exact same patterns for the new flag
    • calibrate_extrinsics.py:606-611 — --depth-pool-size option as example of depth-related flag
    • calibrate_extrinsics.py:120-305 — apply_depth_verify_refine_postprocess() — this is where depth data is available and where save should be triggered
    • calibrate_extrinsics.py:143-165 — Where depth_maps and confidence_maps lists are built per camera — data to capture for raw frames
    • calibrate_extrinsics.py:267-270 — Where final_depth and pool_metadata are determined — data to capture for pooled result

    API/Type References:

    • aruco/depth_save.py (Task 2 output) — save_depth_data(path, camera_data, schema_version=1) function signature

    Test References:

    • tests/test_depth_cli_postprocess.py — Existing test patterns for calibrate_extrinsics CLI post-processing behavior
    • tests/test_depth_pool_integration.py — Integration test patterns with mocked depth data

    WHY Each Reference Matters:

    • calibrate_extrinsics.py:562-678 is the exact location where the new flag must be added, following identical Click patterns
    • apply_depth_verify_refine_postprocess is the function that has access to all depth data — save must be called from here or just after it
    • Integration tests show how to mock ZED data for testing the full flow

    Acceptance Criteria:

    TDD:

    • Test file updated or created: tests/test_depth_save_integration.py
    • Tests cover: flag appears in help, save is called when flag provided, save is NOT called without flag
    • uv run pytest tests/test_depth_save_integration.py -v → PASS

    Agent-Executed QA Scenarios:

    Scenario: --save-depth flag appears in CLI help
      Tool: Bash
      Preconditions: calibrate_extrinsics.py updated
      Steps:
        1. uv run python calibrate_extrinsics.py --help
        2. Assert: output contains "--save-depth"
        3. Assert: output contains "HDF5" or "depth" in the help text for the flag
      Expected Result: Flag is documented in help
      Evidence: Help output captured
    
    Scenario: Existing tests still pass after integration
      Tool: Bash (pytest)
      Preconditions: calibrate_extrinsics.py updated
      Steps:
        1. uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py -v
        2. Assert: exit code 0, no regressions
      Expected Result: No existing behavior broken
      Evidence: pytest output captured
    

    Commit: YES

    • Message: feat(calibrate): add --save-depth flag for HDF5 depth persistence
    • Files: calibrate_extrinsics.py, tests/test_depth_save_integration.py
    • Pre-commit: uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py

  • 5. Extend aruco/ground_plane.py with multi-camera workflow orchestration

    What to do:

    Add high-level orchestration functions that compose the primitives from Task 3:

    A. refine_ground_from_depth(camera_data, extrinsics, config):

    • Main entry point: takes per-camera depth data + current extrinsics → returns corrected extrinsics + metrics
    • Flow:
      1. Per camera: unproject_depth_to_pointsdetect_floor_plane
      2. compute_consensus_plane from all successful detections
      3. Per camera: compute_floor_correction relative to consensus
      4. Return corrected extrinsics dict + RefinementMetrics
    • Config dataclass with: max_rotation_deg, max_translation_m, ransac_distance_threshold, min_inlier_ratio, height_range, normal_constraint_deg, stride, seed
    • Metrics dataclass with: per-camera floor angles (before/after), consensus plane model, correction magnitudes, skipped cameras + reasons

    B. Error handling:

    • If floor detection fails for a camera → skip it, log warning, include in metrics
    • If fewer than 2 cameras have valid floor → abort, return original extrinsics + error reason
    • If correction exceeds bounds → cap at bounds, mark as clamped in metrics

    Must NOT do:

    • No file I/O in this module — pure computation
    • No visualization — that's Task 6

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Orchestration logic with error handling, config management, metrics collection
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 2 (with Task 4)
    • Blocks: Tasks 6, 7
    • Blocked By: Tasks 1, 3

    References:

    Pattern References:

    • calibrate_extrinsics.py:120-131 — apply_depth_verify_refine_postprocess() signature — pattern for multi-camera orchestration function
    • aruco/depth_refine.py:71-227 — refine_extrinsics_with_depth() return value pattern: (result, stats_dict)
    • aruco/depth_verify.py:8-15 — DepthVerificationResult dataclass — pattern for structured results

    API/Type References:

    • aruco/ground_plane.py (Task 3 output) — All primitive functions: unproject_depth_to_points, detect_floor_plane, compute_consensus_plane, compute_floor_correction

    Test References:

    • tests/test_ground_plane.py (Task 3 output) — Unit test patterns to follow for orchestration tests

    WHY Each Reference Matters:

    • apply_depth_verify_refine_postprocess shows how multi-camera iteration with fallback is done in this codebase
    • depth_refine.py shows the (result, stats) return pattern that should be followed

    Acceptance Criteria:

    TDD:

    • Tests added to tests/test_ground_plane.py for orchestration functions
    • Tests cover: full pipeline with 2-camera synthetic data, single-camera skip, all-cameras-fail abort, config bounds
    • uv run pytest tests/test_ground_plane.py -v → PASS (all tests, 0 failures)

    Agent-Executed QA Scenarios:

    Scenario: Two-camera synthetic refinement produces level ground
      Tool: Bash (pytest)
      Preconditions: Orchestration functions implemented
      Steps:
        1. uv run pytest tests/test_ground_plane.py -v -k "refine_ground_from_depth and two_camera"
        2. Assert: exit code 0
        3. Assert: after correction, floor angle disagreement < 0.5°
      Expected Result: Per-camera corrections level the ground
      Evidence: pytest output captured
    
    Scenario: Graceful fallback when floor detection fails for one camera
      Tool: Bash (pytest)
      Preconditions: Orchestration functions implemented
      Steps:
        1. uv run pytest tests/test_ground_plane.py -v -k "skip_camera"
        2. Assert: exit code 0
        3. Assert: skipped camera's extrinsics unchanged, other cameras corrected
      Expected Result: Partial failure handled gracefully
      Evidence: pytest output captured
    

    Commit: YES

    • Message: feat(aruco): add multi-camera ground plane refinement orchestration
    • Files: aruco/ground_plane.py, tests/test_ground_plane.py
    • Pre-commit: uv run pytest tests/test_ground_plane.py

  • 6. Create Plotly diagnostic visualization for ground plane refinement

    What to do:

    • Add a function create_ground_diagnostic_plot(metrics, camera_data, extrinsics_before, extrinsics_after) → returns plotly.graph_objects.Figure
    • Add a function save_diagnostic_plot(fig, path) → writes HTML file
    • Visualization contents:
      • 3D scatter: floor inlier points per camera (color-coded by camera serial)
      • Surface: consensus plane (semi-transparent)
      • Camera frustums: before (dashed/faded) and after (solid) positions
      • Annotation: per-camera correction magnitude (degrees + cm)
      • Title: summary metrics (total disagreement before/after)
    • Follow existing Plotly patterns from visualize_extrinsics.py and compare_pose_sets.py

    Must NOT do:

    • No interactive server or GUI — static HTML file only
    • No Open3D visualization (use Plotly only, already a dep)
    • No complex camera frustum rendering — simple cone or pyramid is fine

    Recommended Agent Profile:

    • Category: unspecified-low
      • Reason: Visualization code following existing Plotly patterns, no complex algorithm
    • Skills: []

    Parallelization:

    • Can Run In Parallel: NO
    • Parallel Group: Wave 3 (sequential after Task 5)
    • Blocks: Task 7
    • Blocked By: Task 5

    References:

    Pattern References:

    • compare_pose_sets.py:145-200 add_camera_trace() — Plotly camera visualization pattern (frustum + axes + labels)
    • visualize_extrinsics.py — Full Plotly 3D scene setup with layout, ground plane, axis labels (check head of file for imports and patterns)

    Test References:

    • No heavy test required — visualization is a "nice to have". A smoke test that the function returns a go.Figure with expected trace count is sufficient.

    WHY Each Reference Matters:

    • compare_pose_sets.py already has Plotly camera rendering code that can be adapted
    • visualize_extrinsics.py shows the full 3D scene pattern including ground plane rendering

    Acceptance Criteria:

    • Function create_ground_diagnostic_plot returns a plotly.graph_objects.Figure
    • Figure contains traces for: floor points per camera, consensus plane surface, camera markers
    • Smoke test: uv run pytest tests/test_ground_plane.py -v -k "diagnostic_plot" → PASS

    Agent-Executed QA Scenarios:

    Scenario: Diagnostic plot generates valid HTML
      Tool: Bash (pytest)
      Preconditions: Visualization function implemented
      Steps:
        1. uv run pytest tests/test_ground_plane.py -v -k "diagnostic_plot"
        2. Assert: exit code 0
        3. Assert: test verifies Figure has correct number of traces
      Expected Result: Plot function produces valid Plotly figure
      Evidence: pytest output captured
    

    Commit: YES (groups with Task 7)

    • Message: feat(aruco): add Plotly diagnostic visualization for ground plane
    • Files: aruco/ground_plane.py (viz function added), tests/test_ground_plane.py
    • Pre-commit: uv run pytest tests/test_ground_plane.py

  • 7. Create refine_ground_plane.py — standalone CLI tool

    What to do:

    • Click CLI tool with the following options:
      • --input-depth / -d: Path to HDF5 depth file (from --save-depth)
      • --input-extrinsics / -i: Path to extrinsics JSON (from calibrate_extrinsics.py)
      • --output-extrinsics / -o: Path for corrected extrinsics JSON
      • --metrics-json: Optional path for machine-readable metrics output
      • --plot / --no-plot: Generate Plotly diagnostic (default: --plot)
      • --plot-output: Path for diagnostic HTML (default: <output_dir>/ground_diagnostic.html)
      • --max-rotation-deg: Max correction rotation (default: 5.0)
      • --max-translation-m: Max correction translation (default: 0.05)
      • --ransac-threshold: RANSAC distance threshold in meters (default: 0.02)
      • --min-inlier-ratio: Minimum inlier ratio to accept floor detection (default: 0.3)
      • --height-range: Expected floor height range as "min,max" (default: auto from data)
      • --stride: Depth map downsampling stride (default: 4)
      • --seed: Random seed for reproducibility (default: 42)
      • --debug / --no-debug: Verbose logging
    • Flow:
      1. Load extrinsics JSON (reuse compare_pose_sets.py:load_poses_from_json)
      2. Load depth data from HDF5 (use depth_save.load_depth_data)
      3. Call refine_ground_from_depth() orchestration function
      4. Save corrected extrinsics (same JSON format as input, with _meta.ground_refined: true)
      5. Print summary metrics to stdout
      6. Optionally save metrics JSON
      7. Optionally generate diagnostic Plotly HTML
    • Output extrinsics format: identical to calibrate_extrinsics.py output, with added _meta.ground_refined flag
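The flow above could be wired roughly like this — a skeleton only, with the depth loading and refinement calls elided and the option list truncated to a few representatives (nothing here is the final implementation):

```python
import json
from pathlib import Path

import click

@click.command()
@click.option("--input-depth", "-d", type=click.Path(exists=True), required=True,
              help="HDF5 depth file from --save-depth.")
@click.option("--input-extrinsics", "-i", type=click.Path(exists=True), required=True)
@click.option("--output-extrinsics", "-o", type=click.Path(), required=True)
@click.option("--max-rotation-deg", default=5.0, show_default=True)
@click.option("--seed", default=42, show_default=True)
def main(input_depth, input_extrinsics, output_extrinsics, max_rotation_deg, seed):
    """Refine per-camera extrinsics from saved floor depth observations."""
    extrinsics = json.loads(Path(input_extrinsics).read_text())
    # ... load depth via depth_save.load_depth_data(input_depth),
    #     call refine_ground_from_depth(), collect metrics ...
    # (max_rotation_deg and seed are unused in this skeleton)
    extrinsics.setdefault("_meta", {})["ground_refined"] = True
    Path(output_extrinsics).write_text(json.dumps(extrinsics, indent=2))
    click.echo(f"wrote {output_extrinsics}")

if __name__ == "__main__":
    main()
```

Note the input files are only read, never written, and the `_meta.ground_refined` flag is added on the way out — both "Must NOT do" constraints fall out naturally from this shape.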

    Must NOT do:

    • No ZED SDK dependency — works entirely from saved files
    • No modification of input files
    • No interactive prompts

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Full CLI tool composing multiple modules, end-to-end data flow, error handling, multiple output formats
    • Skills: []

    Parallelization:

    • Can Run In Parallel: NO
    • Parallel Group: Wave 3 (depends on 2, 5, 6)
    • Blocks: Task 8
    • Blocked By: Tasks 2, 5, 6

    References:

    Pattern References:

    • calibrate_extrinsics.py:562-678 — Click CLI pattern with extensive options, logging, error handling
    • compare_pose_sets.py:52-88 load_poses_from_json() — JSON extrinsics loading pattern
    • compare_pose_sets.py:91-92 serialize_pose() — JSON extrinsics saving pattern
    • visualize_extrinsics.py — CLI tool that loads extrinsics + generates Plotly output

    API/Type References:

    • aruco/depth_save.py (Task 2) — load_depth_data(path) return type
    • aruco/ground_plane.py (Tasks 3, 5) — refine_ground_from_depth() signature and return type
    • aruco/ground_plane.py (Task 6) — create_ground_diagnostic_plot() signature

    WHY Each Reference Matters:

    • calibrate_extrinsics.py CLI is the canonical pattern for Click tools in this project
    • compare_pose_sets.py shows how to load/save the extrinsics JSON format correctly
    • The ground_plane module provides all computation — CLI just wires I/O to computation

    Acceptance Criteria:

    TDD:

    • Test file created: tests/test_refine_ground_cli.py
    • Tests cover: help output, valid invocation with synthetic data, missing input error, output file creation, metrics JSON format
    • uv run pytest tests/test_refine_ground_cli.py -v → PASS

    Agent-Executed QA Scenarios:

    Scenario: CLI help shows all expected options
      Tool: Bash
      Preconditions: refine_ground_plane.py created
      Steps:
        1. uv run python refine_ground_plane.py --help
        2. Assert: output contains "--input-depth", "--input-extrinsics", "--output-extrinsics"
        3. Assert: output contains "--max-rotation-deg", "--ransac-threshold", "--seed"
        4. Assert: exit code 0
      Expected Result: All options documented
      Evidence: Help output captured
    
    Scenario: Tool produces valid extrinsics JSON
      Tool: Bash
      Preconditions: Synthetic HDF5 and extrinsics JSON created by test fixtures
      Steps:
        1. uv run pytest tests/test_refine_ground_cli.py -v -k "produces_valid_json"
        2. Assert: exit code 0
        3. Assert: output JSON is valid, contains all camera serials, has _meta.ground_refined
      Expected Result: Output matches extrinsics JSON schema
      Evidence: pytest output captured
    
    Scenario: Metrics JSON contains before/after comparison
      Tool: Bash
      Preconditions: Test creates and runs CLI with --metrics-json
      Steps:
        1. uv run pytest tests/test_refine_ground_cli.py -v -k "metrics_json"
        2. Assert: exit code 0
        3. Assert: metrics has 'floor.angle_disagreement_deg_before' and 'floor.angle_disagreement_deg_after'
      Expected Result: Machine-readable improvement metrics produced
      Evidence: pytest output captured
    
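The disagreement metric asserted in the scenario above could be defined as the maximum pairwise angle between per-camera floor normals — a hedged sketch (the authoritative metric definition belongs to the Task 5 orchestration, and `angle_disagreement_deg` is an assumed name):

```python
import itertools

import numpy as np

def angle_disagreement_deg(normals):
    """Max pairwise angle (degrees) between per-camera floor normals.

    Uses the absolute dot product so n and -n describe the same plane.
    """
    worst = 0.0
    for a, b in itertools.combinations(normals, 2):
        c = abs(float(np.dot(a, b))) / (np.linalg.norm(a) * np.linalg.norm(b))
        worst = max(worst, float(np.degrees(np.arccos(np.clip(c, 0.0, 1.0)))))
    return worst
```

Computing this once on the raw per-camera planes and once after applying the corrections gives the `_before` / `_after` pair the metrics JSON is expected to carry.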

    Commit: YES

    • Message: feat: add refine_ground_plane.py standalone CLI tool
    • Files: refine_ground_plane.py, tests/test_refine_ground_cli.py
    • Pre-commit: uv run pytest tests/test_refine_ground_cli.py

  • 8. Final integration: full test suite pass + basedpyright + README update

    What to do:

    • Run the FULL test suite: uv run pytest -x -vv
    • Run basedpyright on all new files: uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py
    • Fix any regressions or type errors
    • Add usage example to README.md showing the depth-save → ground-refine workflow:
      # Step 1: Calibrate with depth saving
      uv run python calibrate_extrinsics.py ... --refine-depth --save-depth output/depth_data.h5
      
      # Step 2: Refine ground plane
      uv run python refine_ground_plane.py \
          --input-depth output/depth_data.h5 \
          --input-extrinsics output/extrinsics.json \
          --output-extrinsics output/extrinsics_ground_refined.json \
          --plot-output output/ground_diagnostic.html
      

    Must NOT do:

    • Do not modify any test behavior — only fix genuine regressions
    • Do not add features — this is stabilization only

    Recommended Agent Profile:

    • Category: unspecified-low
      • Reason: Verification and minor fixups, no new features
    • Skills: []

    Parallelization:

    • Can Run In Parallel: NO
    • Parallel Group: Wave 3 (final, sequential)
    • Blocks: None (terminal)
    • Blocked By: All previous tasks

    References:

    Pattern References:

    • README.md — Existing usage examples for calibrate_extrinsics.py and visualize_extrinsics.py
    • pyproject.toml:39-41 — pytest configuration (testpaths, norecursedirs)

    WHY Each Reference Matters:

    • README has existing command examples that the new workflow should follow in format/style
    • pyproject.toml pytest config ensures all test directories are covered

    Acceptance Criteria:

    • uv run pytest -x -vv → ALL tests pass, 0 failures, 0 errors
    • uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py → 0 errors
    • README.md contains usage example for the new ground refinement workflow

    Agent-Executed QA Scenarios:

    Scenario: Full test suite passes
      Tool: Bash (pytest)
      Preconditions: All previous tasks completed
      Steps:
        1. uv run pytest -x -vv
        2. Assert: exit code 0
        3. Assert: all tests pass, 0 failures
      Expected Result: No regressions introduced
      Evidence: Full pytest output captured
    
    Scenario: Type checking passes
      Tool: Bash (basedpyright)
      Preconditions: All new modules written
      Steps:
        1. uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py
        2. Assert: no error-level diagnostics
      Expected Result: Type-safe code
      Evidence: basedpyright output captured
    

    Commit: YES

    • Message: chore: final integration pass — tests, types, README for ground plane refinement
    • Files: README.md, any fixup files
    • Pre-commit: uv run pytest -x -vv

Commit Strategy

| After Task | Message | Files | Verification |
| --- | --- | --- | --- |
| 1 | build(deps): add open3d and h5py for ground plane refinement | pyproject.toml, uv.lock | uv run python -c "import open3d; import h5py" |
| 2 | feat(aruco): add HDF5 depth map persistence module | aruco/depth_save.py, tests/test_depth_save.py | uv run pytest tests/test_depth_save.py |
| 3 | feat(aruco): add ground plane detection and per-camera correction module | aruco/ground_plane.py, tests/test_ground_plane.py | uv run pytest tests/test_ground_plane.py |
| 4 | feat(calibrate): add --save-depth flag for HDF5 depth persistence | calibrate_extrinsics.py, tests/test_depth_save_integration.py | uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py |
| 5 | feat(aruco): add multi-camera ground plane refinement orchestration | aruco/ground_plane.py, tests/test_ground_plane.py | uv run pytest tests/test_ground_plane.py |
| 6 | feat(aruco): add Plotly diagnostic visualization for ground plane | aruco/ground_plane.py, tests/test_ground_plane.py | uv run pytest tests/test_ground_plane.py |
| 7 | feat: add refine_ground_plane.py standalone CLI tool | refine_ground_plane.py, tests/test_refine_ground_cli.py | uv run pytest tests/test_refine_ground_cli.py |
| 8 | chore: final integration pass — tests, types, README for ground plane refinement | README.md, fixups | uv run pytest -x -vv |

Success Criteria

Verification Commands

# All tests pass
uv run pytest -x -vv  # Expected: 0 failures

# Type checking passes
uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py  # Expected: 0 errors

# CLI tools have correct help
uv run python calibrate_extrinsics.py --help | grep "save-depth"  # Expected: --save-depth appears
uv run python refine_ground_plane.py --help  # Expected: all options listed

# Dependencies installed
uv run python -c "import open3d; import h5py; print('ok')"  # Expected: ok
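The save → load round-trip named in the checklist below can be smoke-tested with a sketch like this. The group/dataset layout (one dataset per camera serial under `/depth`) is an assumption for illustration; the real schema is whatever `aruco/depth_save.py` defines in Task 2:

```python
import h5py
import numpy as np

# Assumed layout: /depth/<serial> holds that camera's pooled depth map.
def save_depth_data(path, depth_by_serial):
    with h5py.File(path, "w") as f:
        grp = f.create_group("depth")
        for serial, depth in depth_by_serial.items():
            grp.create_dataset(serial, data=depth, compression="gzip")

def load_depth_data(path):
    with h5py.File(path, "r") as f:
        # ds[()] reads the full dataset into a NumPy array
        return {serial: ds[()] for serial, ds in f["depth"].items()}
```

A round-trip is then `load_depth_data(p)` after `save_depth_data(p, data)`, asserting each array comes back bit-identical per serial.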

Final Checklist

  • All "Must Have" requirements present
  • All "Must NOT Have" exclusions absent (no core pipeline changes, no ML, no non-flat floors)
  • All tests pass (uv run pytest -x -vv)
  • Type checking passes (uv run basedpyright)
  • HDF5 depth saving works end-to-end (save → load round-trip)
  • Ground plane refinement produces measurably improved floor alignment
  • Output extrinsics JSON matches existing format (compatible with visualize_extrinsics.py)
  • Diagnostic Plotly HTML generated successfully
  • README updated with usage workflow