# Ground Plane Refinement & Depth Map Persistence

## TL;DR

> **Quick Summary**: Fix inter-camera ground plane disagreement by adding depth-based floor detection and per-camera extrinsic correction as a standalone post-processing tool. Also add HDF5 depth map persistence so SVO re-reading is not needed for iterative refinement.
>
> **Deliverables**:
> - `--save-depth` flag in `calibrate_extrinsics.py` → HDF5 depth persistence
> - New `aruco/depth_save.py` module for HDF5 read/write
> - New `aruco/ground_plane.py` module for floor detection + consensus alignment
> - New `refine_ground_plane.py` standalone CLI tool
> - Plotly diagnostic visualization (before/after floor alignment)
> - Full TDD test suite for all new modules
> - New dependencies: `open3d`, `h5py`
>
> **Estimated Effort**: Large
> **Parallel Execution**: YES — 3 waves
> **Critical Path**: Task 1 (deps) → Task 3 (ground plane core) → Task 5 (consensus + correction) → Task 6 (visualization) → Task 7 (CLI tool) → Task 8 (integration tests)

---

## Context

### Original Request

User's `calibrate_extrinsics.py` produces extrinsics where the ground plane is not level — specifically, different cameras disagree about where the ground is when overlaying world-coordinate point clouds. The error is small (1-3° tilt, <2cm offset) across a 2-4 camera ZED setup. User wants:
1. A way to refine the calibration using actual floor depth observations
2. Saved pooled depth maps to avoid re-reading SVOs for iterative refinement

### Interview Summary

**Key Discussions**:
- **Core problem**: Inter-camera disagreement, not just global tilt. Point clouds from different cameras don't align on the floor surface.
- **Integration approach**: Post-processing tool (standalone CLI), not integrated into existing pipeline.
- **Library choice**: Open3D for point cloud operations (user wants it available for future work). h5py for HDF5 persistence.
- **Refinement granularity**: Per-camera correction (each camera gets its own correction based on its floor observations).
- **Depth saving**: Opt-in via `--save-depth ` flag. Save pooled + raw best frames per camera.
- **Save format**: HDF5 via h5py with versioned schema.
- **Visualization**: Plotly HTML diagnostic (floor points per camera, consensus plane, before/after).
- **Test strategy**: TDD with pytest, following existing test patterns.

**Research Findings**:
- `alignment.py` has `rotation_align_vectors()` for aligning normals — reusable for floor alignment
- `depth_pool.py` does median pooling but never persists results
- `depth_refine.py` has `scipy.optimize.least_squares` infrastructure for pose optimization
- `compare_pose_sets.py` has Kabsch `rigid_transform_3d()` for rigid alignment
- `depth_verify.py` has `project_point_to_pixel()` and depth residual computation
- Current pipeline: ArUco → PnP → RANSAC averaging → depth refinement (sparse, marker corners only) → alignment (marker normals only)
- Open3D provides `segment_plane()` for RANSAC plane fitting on point clouds

### Metis Review

**Identified Gaps** (addressed):
- **Correction DOF**: Must constrain to pitch/roll + vertical translation only (no yaw drift, no lateral drift). Addressed via bounded optimization.
- **RANSAC plane robustness**: Must constrain plane normal to near-vertical and height to expected range, plus ROI masking. Addressed via configurable constraints.
- **HDF5 schema versioning**: Must include `/meta/schema_version`, units, intrinsics, coordinate frame. Addressed in schema design.
- **Failure mode for missing floor**: If plane detection fails for one camera, skip that camera and warn (don't fail entire run). Addressed in error handling design.
- **Reproducibility**: RANSAC seed control for deterministic tests. Addressed via `seed` parameter.
- **Per-camera correction risk**: May break inter-camera rigidity. Addressed via correction bounds + pre/post metrics reporting.
- **Consensus plane definition**: Use merged inlier points from all cameras, weighted by inlier count. Addressed in algorithm design.

---

## Work Objectives

### Core Objective

Enable depth-based ground plane refinement that corrects per-camera extrinsic errors (1-3° tilt, <2cm vertical offset) by detecting the actual physical floor surface from ZED depth maps and aligning all cameras to a consensus ground plane.

### Concrete Deliverables

- `aruco/depth_save.py`: HDF5 read/write module for depth maps + metadata
- `aruco/ground_plane.py`: Floor detection (RANSAC), consensus plane fitting, per-camera correction
- `refine_ground_plane.py`: Standalone Click CLI tool
- `--save-depth` flag added to `calibrate_extrinsics.py`
- `tests/test_depth_save.py`: TDD tests for depth persistence
- `tests/test_ground_plane.py`: TDD tests for floor detection + alignment
- `tests/test_refine_ground_cli.py`: TDD tests for CLI tool
- Plotly diagnostic HTML output

### Definition of Done

- [x] `uv run pytest tests/test_depth_save.py` → all tests pass
- [x] `uv run pytest tests/test_ground_plane.py` → all tests pass
- [x] `uv run pytest tests/test_refine_ground_cli.py` → all tests pass
- [x] `uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py` → no errors
- [x] `uv run python calibrate_extrinsics.py --help` shows `--save-depth` flag
- [x] `uv run python refine_ground_plane.py --help` shows expected options
- [x] End-to-end: calibrate → save depth → refine ground → produces valid extrinsics JSON

### Must Have

- Per-camera RANSAC floor plane detection from depth maps
- Consensus plane fitting from merged floor points
- Constrained per-camera correction (pitch/roll + vertical translation, no yaw/lateral)
- Correction bounds with configurable limits (default: max 5° rotation, max 5cm translation)
- "No-op if not confident" threshold — skip correction if RANSAC inlier ratio is too low
- HDF5 schema with versioning and full metadata (intrinsics, units,
  resolution, frame indices)
- Diagnostic metrics: per-camera plane normal angles, consensus disagreement before/after, correction magnitudes
- Plotly visualization of floor points + consensus plane + before/after camera poses

### Must NOT Have (Guardrails)

- NO changes to ArUco detection, PnP, or RANSAC pose averaging logic
- NO changes to existing `depth_refine.py` or `depth_verify.py` behavior
- NO non-flat floor handling (ramps, stairs, multi-level)
- NO dense multi-view reconstruction beyond the floor plane
- NO automatic scene segmentation or ML-based floor detection
- NO global bundle adjustment across all cameras
- NO saving of every frame's depth data — only pooled + curated best subset
- NO GUI requirements — visualization is optional Plotly HTML output
- NO modification of the extrinsics JSON schema (output format matches existing convention)

---

## Verification Strategy (MANDATORY)

> **UNIVERSAL RULE: ZERO HUMAN INTERVENTION**
>
> ALL tasks in this plan MUST be verifiable WITHOUT any human action.

### Test Decision

- **Infrastructure exists**: YES (`pytest` configured in `pyproject.toml`)
- **Automated tests**: TDD (tests first)
- **Framework**: `pytest` (existing)

### If TDD Enabled

Each TODO follows RED-GREEN-REFACTOR:

**Task Structure:**
1. **RED**: Write failing test first
   - Test file: `tests/test_.py`
   - Test command: `uv run pytest tests/test_.py`
   - Expected: FAIL (test exists, implementation doesn't)
2. **GREEN**: Implement minimum code to pass
   - Command: `uv run pytest tests/test_.py`
   - Expected: PASS
3.
   **REFACTOR**: Clean up while keeping green
   - Command: `uv run pytest tests/test_.py`
   - Expected: PASS (still)

### Agent-Executed QA Scenarios (MANDATORY — ALL tasks)

**Verification Tool by Deliverable Type:**

| Type | Tool | How Agent Verifies |
|------|------|-------------------|
| **Python module** | Bash (pytest) | Run tests, assert pass count, zero failures |
| **CLI tool** | Bash (click --help + invocation) | Check help output, run with test data, verify exit code and output |
| **HDF5 file** | Bash (python -c "import h5py; ...") | Open file, check schema, validate data shapes |
| **Type checking** | Bash (basedpyright) | Run type checker, verify zero errors |
| **Plotly output** | Bash (file existence + python parse) | Check file exists, contains valid HTML, has expected traces |

---

## Execution Strategy

### Parallel Execution Waves

```
Wave 1 (Start Immediately):
├── Task 1: Add open3d + h5py dependencies
├── Task 2: TDD depth save module (aruco/depth_save.py) [after Task 1]
└── Task 3: TDD ground plane core module (aruco/ground_plane.py) [after Task 1]

Wave 2 (After Wave 1):
├── Task 4: Integrate --save-depth into calibrate_extrinsics.py [depends: 1, 2]
└── Task 5: Ground plane consensus + per-camera correction [depends: 1, 3]

Wave 3 (After Wave 2):
├── Task 6: Plotly diagnostic visualization module [depends: 5]
├── Task 7: refine_ground_plane.py CLI tool [depends: 2, 5, 6]
└── Task 8: Integration tests + basedpyright pass [depends: all]

Critical Path: Task 1 → Task 2 → Task 4 (depth save path)
               Task 1 → Task 3 → Task 5 → Task 7 (ground plane path)
```

### Dependency Matrix

| Task | Depends On | Blocks | Can Parallelize With |
|------|------------|--------|---------------------|
| 1 | None | 2, 3 | None (must be first) |
| 2 | 1 | 4, 7 | 3 |
| 3 | 1 | 5 | 2 |
| 4 | 1, 2 | 7, 8 | 5 |
| 5 | 1, 3 | 6, 7 | 4 |
| 6 | 5 | 7 | 4 |
| 7 | 2, 5, 6 | 8 | None |
| 8 | All | None | None (final) |

### Agent Dispatch Summary

| Wave | Tasks | Recommended Agents |
|------|-------|-------------------|
| 1 | 1 | task(category="quick", ...) |
| 1→2 | 2, 3 | task(category="unspecified-high", ...) — parallel after Task 1 |
| 2 | 4, 5 | task(category="unspecified-high", ...) — parallel |
| 3 | 6 | task(category="unspecified-low", ...) |
| 3 | 7 | task(category="unspecified-high", ...) |
| 3 | 8 | task(category="unspecified-low", ...) |

---

## TODOs

- [x] 1. Add `open3d` and `h5py` dependencies to `pyproject.toml`

  **What to do**:
  - Add `open3d` and `h5py` to the `[project] dependencies` list in `pyproject.toml`
  - Run `uv sync` to install
  - Verify imports work: `uv run python -c "import open3d; import h5py; print('ok')"`

  **Must NOT do**:
  - Do not add unnecessary deps (no trimesh, no probreg, no pycpd)
  - Do not modify any other pyproject.toml sections

  **Recommended Agent Profile**:
  - **Category**: `quick`
    - Reason: Single file edit + one command
  - **Skills**: []
    - No special skills needed for a dependency addition

  **Parallelization**:
  - **Can Run In Parallel**: NO
  - **Parallel Group**: Wave 1 (solo — must complete before 2, 3)
  - **Blocks**: Tasks 2, 3, 4, 5, 6, 7, 8
  - **Blocked By**: None

  **References**:

  **Pattern References**:
  - `pyproject.toml:7-27` — Existing dependency list format and conventions (e.g., `"scipy>=1.17.0"`)

  **Acceptance Criteria**:
  - [ ] `pyproject.toml` contains `open3d` and `h5py` in dependencies
  - [ ] `uv sync` completes without error
  - [ ] `uv run python -c "import open3d; import h5py; print('ok')"` prints `ok` and exits 0

  **Agent-Executed QA Scenarios:**

  ```
  Scenario: Dependencies install and import correctly
    Tool: Bash
    Preconditions: pyproject.toml edited
    Steps:
      1. uv sync
      2. uv run python -c "import open3d; print(open3d.__version__)"
      3. Assert: exit code 0, version string printed
      4. uv run python -c "import h5py; print(h5py.__version__)"
      5.
         Assert: exit code 0, version string printed
    Expected Result: Both libraries installed and importable
    Evidence: Command output captured
  ```

  **Commit**: YES
  - Message: `build(deps): add open3d and h5py for ground plane refinement`
  - Files: `pyproject.toml`, `uv.lock`
  - Pre-commit: `uv run python -c "import open3d; import h5py"`

---

- [x] 2. TDD: Create `aruco/depth_save.py` — HDF5 depth map persistence module

  **What to do**:

  **RED phase** — Write `tests/test_depth_save.py` first with tests for:
  - `save_depth_data()`: saves pooled depth + confidence + raw frames + intrinsics + metadata to HDF5
  - `load_depth_data()`: loads HDF5 back into structured dict
  - Round-trip test: save → load → compare arrays with `np.testing.assert_allclose`
  - Schema validation: check `/meta/schema_version`, `/meta/units`, `/meta/coordinate_frame`
  - Per-camera groups: `//pooled_depth`, `//pooled_confidence`, `//raw_frames//depth`, `//intrinsics`
  - Edge cases: single camera, no confidence map, no raw frames
  - Error handling: invalid path, empty data

  **GREEN phase** — Implement `aruco/depth_save.py`:
  - `save_depth_data(path, camera_data, schema_version=1)` — writes HDF5
  - `load_depth_data(path)` — reads HDF5 back to dict
  - Schema version 1 layout:
    ```
    /meta/
      schema_version: int = 1
      units: str = "meters"
      coordinate_frame: str = "world_from_cam"
      created_at: str (ISO 8601)
    //
      intrinsics: (3, 3) float64 — camera matrix K
      resolution: (2,) int — [width, height]
      pooled_depth: (H, W) float32
      pooled_confidence: (H, W) float32 [optional]
      pool_metadata: JSON string (same dict currently in results)
      raw_frames/
        0/depth: (H, W) float32
        0/confidence: (H, W) float32 [optional]
        0/frame_index: int
        0/score: float
        1/depth: ...
    ```
  - Use `h5py` compression: `compression="gzip"`, `compression_opts=4`
  - Type annotations on all public functions

  **REFACTOR phase** — Clean up, add docstrings, run basedpyright.
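As a rough illustration of the schema version 1 round trip described above, the sketch below writes and reads the `/meta` group and the pooled per-camera datasets. Several choices here are assumptions rather than part of the plan: metadata is stored as scalar datasets (not attributes), each camera group is keyed by `str(serial)`, the loader returns a `{"meta": ..., "cameras": ...}` dict, and `raw_frames/` is left out for brevity.

```python
import json
from datetime import datetime, timezone

import h5py
import numpy as np

# Compression settings named in the plan; applied to array datasets only.
GZIP = {"compression": "gzip", "compression_opts": 4}


def _as_str(value):
    """h5py may return scalar strings as bytes; normalize to str."""
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def save_depth_data(path, camera_data, schema_version=1):
    """Write /meta plus one group per camera (raw_frames omitted in this sketch)."""
    if not camera_data:
        raise ValueError("camera_data is empty")
    with h5py.File(path, "w") as f:
        meta = f.create_group("meta")
        meta.create_dataset("schema_version", data=schema_version)
        meta.create_dataset("units", data="meters")
        meta.create_dataset("coordinate_frame", data="world_from_cam")
        meta.create_dataset("created_at", data=datetime.now(timezone.utc).isoformat())
        for serial, cam in camera_data.items():
            g = f.create_group(str(serial))  # assumption: group name is the serial
            g.create_dataset("intrinsics", data=np.asarray(cam["intrinsics"], dtype=np.float64))
            g.create_dataset("resolution", data=np.asarray(cam["resolution"], dtype=np.int64))
            g.create_dataset(
                "pooled_depth", data=np.asarray(cam["pooled_depth"], dtype=np.float32), **GZIP
            )
            if cam.get("pooled_confidence") is not None:  # optional dataset
                g.create_dataset(
                    "pooled_confidence",
                    data=np.asarray(cam["pooled_confidence"], dtype=np.float32),
                    **GZIP,
                )
            g.create_dataset("pool_metadata", data=json.dumps(cam.get("pool_metadata", {})))


def load_depth_data(path):
    """Read the file back into plain dicts and NumPy arrays."""
    out = {"meta": {}, "cameras": {}}
    with h5py.File(path, "r") as f:
        meta = f["meta"]
        out["meta"]["schema_version"] = int(meta["schema_version"][()])
        for key in ("units", "coordinate_frame", "created_at"):
            out["meta"][key] = _as_str(meta[key][()])
        for name in f:
            if name == "meta":
                continue
            g = f[name]
            cam = {
                "intrinsics": g["intrinsics"][()],
                "resolution": g["resolution"][()],
                "pooled_depth": g["pooled_depth"][()],
                "pool_metadata": json.loads(_as_str(g["pool_metadata"][()])),
            }
            if "pooled_confidence" in g:
                cam["pooled_confidence"] = g["pooled_confidence"][()]
            out["cameras"][name] = cam
    return out
```

Keeping the loader's output as plain dicts and arrays means the round-trip test can compare results directly with `np.testing.assert_allclose`, without any ZED SDK types involved.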
  **Must NOT do**:
  - Do not modify existing `depth_pool.py` or `depth_verify.py`
  - Do not add ZED SDK dependency to this module (pure numpy/h5py)
  - Do not save uncompressed data

  **Recommended Agent Profile**:
  - **Category**: `unspecified-high`
    - Reason: New module with TDD workflow, HDF5 schema design, comprehensive tests
  - **Skills**: []
    - No special skills needed — standard Python + h5py

  **Parallelization**:
  - **Can Run In Parallel**: YES
  - **Parallel Group**: Wave 1-2 (with Task 3, after Task 1)
  - **Blocks**: Tasks 4, 7
  - **Blocked By**: Task 1

  **References**:

  **Pattern References**:
  - `aruco/depth_pool.py:1-90` — Data format conventions: depth maps are `(H, W) float` in meters, confidence maps are `(H, W) float` with ZED semantics (lower = more confident)
  - `calibrate_extrinsics.py:143-305` — How depth maps and confidence maps are collected per camera, how pool_metadata dict is structured
  - `calibrate_extrinsics.py:120-131` — Function signature of `apply_depth_verify_refine_postprocess` showing the `verification_frames` data structure

  **API/Type References**:
  - `aruco/depth_verify.py:18-24` — `project_point_to_pixel(P_cam, K)` shows intrinsics matrix K format (3x3, fx/fy/cx/cy)

  **Test References**:
  - `tests/test_depth_pool.py` — Follow this test structure: parametric, synthetic data, edge cases with `pytest.raises`
  - `tests/conftest.py` — sys.path setup for imports

  **Documentation References**:
  - `calibrate_extrinsics.py:338` — `results[str(serial)]["depth_pool"]` shows pool_metadata dict structure

  **WHY Each Reference Matters**:
  - `depth_pool.py` defines the array contracts (shape, dtype, units) the save module must preserve
  - `calibrate_extrinsics.py:143-305` shows exactly where/how depth data is produced — the save module must capture this data
  - Test patterns in `test_depth_pool.py` establish the project's testing conventions

  **Acceptance Criteria**:

  **TDD:**
  - [ ] Test file created: `tests/test_depth_save.py`
  - [ ] Tests cover: save, load, round-trip, schema
    validation, edge cases, error handling
  - [ ] `uv run pytest tests/test_depth_save.py -v` → PASS (all tests, 0 failures)

  **Agent-Executed QA Scenarios:**

  ```
  Scenario: Round-trip save and load preserves data
    Tool: Bash (pytest)
    Preconditions: aruco/depth_save.py implemented
    Steps:
      1. uv run pytest tests/test_depth_save.py -v -k "round_trip"
      2. Assert: exit code 0
      3. Assert: output contains "PASSED"
    Expected Result: Saved HDF5 loads back with identical data
    Evidence: pytest output captured

  Scenario: HDF5 schema has required metadata
    Tool: Bash (pytest)
    Preconditions: aruco/depth_save.py implemented
    Steps:
      1. uv run pytest tests/test_depth_save.py -v -k "schema"
      2. Assert: exit code 0
      3. Assert: tests verify /meta/schema_version, /meta/units, /meta/coordinate_frame
    Expected Result: Schema metadata present and correct
    Evidence: pytest output captured

  Scenario: Module passes type checking
    Tool: Bash (basedpyright)
    Preconditions: Module implemented with type annotations
    Steps:
      1. uv run basedpyright aruco/depth_save.py
      2. Assert: exit code 0 or only non-error diagnostics
    Expected Result: No type errors
    Evidence: basedpyright output captured
  ```

  **Commit**: YES
  - Message: `feat(aruco): add HDF5 depth map persistence module`
  - Files: `aruco/depth_save.py`, `tests/test_depth_save.py`
  - Pre-commit: `uv run pytest tests/test_depth_save.py`

---

- [x] 3. TDD: Create `aruco/ground_plane.py` — floor detection & consensus alignment core

  **What to do**:

  **RED phase** — Write `tests/test_ground_plane.py` first with tests for:

  A. `unproject_depth_to_points(depth_map, K, T_world_cam, stride=4)`:
  - Takes depth map + intrinsics + extrinsics → returns (N, 3) world-coordinate point cloud
  - Test: synthetic depth of a flat plane at known height → verify recovered 3D points match expected positions
  - Test: NaN/zero/negative depth values are excluded
  - Test: stride parameter reduces output point count proportionally

  B.
  `detect_floor_plane(points, normal_constraint, height_range, min_inlier_ratio, seed)`:
  - Uses Open3D RANSAC `segment_plane()` on the point cloud
  - Returns `FloorPlaneResult(normal, offset, inliers, inlier_ratio, plane_model)`
  - Test: synthetic flat floor + random noise → recovers correct plane within tolerance
  - Test: synthetic floor + wall points → RANSAC ignores wall, finds floor (normal_constraint filters)
  - Test: normal_constraint rejects planes whose normal isn't near-vertical (e.g., a wall plane, whose normal is horizontal)
  - Test: height_range rejects planes too far from expected floor height
  - Test: too few inliers → returns None (below min_inlier_ratio)
  - Test: seed parameter produces deterministic results

  C. `compute_consensus_plane(floor_results, camera_weights=None)`:
  - Takes per-camera FloorPlaneResult list → fits a single consensus plane
  - Method: concatenate all inlier points, weighted by inlier count, fit plane via SVD
  - Test: two cameras seeing same plane → consensus matches individual planes
  - Test: two cameras with slight disagreement → consensus is between them
  - Test: camera weights affect result appropriately

  D.
  `compute_floor_correction(T_world_cam, floor_result, consensus_plane, max_rotation_deg=5.0, max_translation_m=0.05)`:
  - Computes constrained correction for a single camera
  - Allowed DOF: pitch/roll + vertical translation ONLY (no yaw, no lateral)
  - Uses `scipy.optimize.least_squares` with bounds
  - Returns `CorrectionResult(T_corrected, delta_rotation_deg, delta_translation_m, applied)`
  - Test: camera with 2° tilt from consensus → correction brings it within 0.1°
  - Test: correction respects max_rotation_deg bound
  - Test: correction respects max_translation_m bound
  - Test: yaw component is preserved (no yaw drift)
  - Test: lateral translation is preserved (no X/Z drift)

  **GREEN phase** — Implement `aruco/ground_plane.py`:
  - Import `open3d` for RANSAC plane segmentation
  - Import `scipy.optimize.least_squares` for constrained correction
  - Reuse `aruco.alignment.rotation_align_vectors` where appropriate
  - Reuse `aruco.pose_math.invert_transform` and `matrix_to_rvec_tvec`
  - Use dataclasses for `FloorPlaneResult` and `CorrectionResult`
  - All functions are pure (no side effects, no file I/O)

  **REFACTOR phase** — Docstrings, type annotations, basedpyright.

  **Must NOT do**:
  - No ML/segmentation — RANSAC + geometric constraints only
  - No global bundle adjustment
  - No modification to existing alignment.py
  - No dense reconstruction beyond floor plane extraction

  **Recommended Agent Profile**:
  - **Category**: `unspecified-high`
    - Reason: Core algorithmic module with 4 major functions, each with multiple test cases. Requires understanding of SE3 geometry, RANSAC, and constrained optimization.
  - **Skills**: []

  **Parallelization**:
  - **Can Run In Parallel**: YES
  - **Parallel Group**: Wave 1-2 (with Task 2, after Task 1)
  - **Blocks**: Task 5
  - **Blocked By**: Task 1

  **References**:

  **Pattern References**:
  - `aruco/alignment.py:54-114` — `rotation_align_vectors(from_vec, to_vec)` — reuse for aligning floor normal to target up vector
  - `aruco/alignment.py:117-137` — `apply_alignment_to_pose(T, R_align)` — pattern for applying global rotation to extrinsics
  - `aruco/alignment.py:140-202` — `estimate_up_vector_from_cameras()` — existing camera-based "up" estimation, useful as initial guess for floor normal
  - `aruco/depth_refine.py:12-20` — `extrinsics_to_params()` / `params_to_extrinsics()` — 6-DOF parameterization for optimization
  - `aruco/depth_refine.py:71-180` — `refine_extrinsics_with_depth()` — pattern for bounded least_squares optimization of camera pose
  - `aruco/depth_verify.py:18-24` — `project_point_to_pixel(P_cam, K)` — projection math
  - `aruco/pose_math.py:22-28` — `invert_transform(T)` — efficient SE3 inversion

  **API/Type References**:
  - `aruco/alignment.py:7-16` — Type aliases: `Vec3`, `Mat33`, `Mat44`, `CornersNC`
  - `aruco/depth_verify.py:8-15` — `DepthVerificationResult` dataclass pattern

  **Test References**:
  - `tests/test_alignment.py` — Testing convention for geometric functions (synthetic inputs, tolerance assertions)
  - `tests/test_depth_refine.py` — Testing convention for optimization functions (before/after metrics)

  **External References**:
  - Open3D docs: `segment_plane(distance_threshold, ransac_n, num_iterations)` — returns `[a, b, c, d]` plane model + inlier indices

  **WHY Each Reference Matters**:
  - `alignment.py` provides the exact rotation-alignment primitives we need — no need to reimplement
  - `depth_refine.py` establishes the bounded least-squares pattern with regularization — correction should follow the same style
  - `test_alignment.py` shows how geometric tests are structured in this project (synthetic data, `assert_allclose`)
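To make primitives A and C of this task more concrete, here is a minimal NumPy-only sketch of depth unprojection and of the SVD plane fit named for the consensus step. It assumes a pinhole model with no distortion and a `T_world_cam` that maps camera coordinates to world coordinates. `fit_plane_svd` is a hypothetical helper name, and the sketch skips the Open3D RANSAC stage and the inlier-count weighting entirely.

```python
import numpy as np


def unproject_depth_to_points(depth_map, K, T_world_cam, stride=4):
    """Back-project valid depth pixels to an (N, 3) world-frame point cloud."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    h, w = depth_map.shape
    # Pixel coordinates of the subsampled grid (v = row, u = column).
    v, u = np.mgrid[0:h:stride, 0:w:stride]
    z = depth_map[::stride, ::stride].astype(np.float64)
    # Drop NaN, infinite, zero, and negative depths.
    valid = np.isfinite(z) & (z > 0)
    z = z[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    pts_cam = np.stack([x, y, z], axis=1)
    R, t = T_world_cam[:3, :3], T_world_cam[:3, 3]
    # Row-vector form of p_world = R @ p_cam + t.
    return pts_cam @ R.T + t


def fit_plane_svd(points):
    """Least-squares plane through points; returns (unit normal, offset d)
    such that normal . p + d = 0. The sign of the normal is arbitrary."""
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the normal.
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]
    offset = -float(normal @ centroid)
    return normal, offset
```

Because the SVD leaves the sign of the normal undetermined, a real implementation would probably flip it to agree with the expected up direction before comparing angles across cameras.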
  **Acceptance Criteria**:

  **TDD:**
  - [ ] Test file created: `tests/test_ground_plane.py`
  - [ ] Tests cover: unproject, floor detection (happy + noise + wall + failure), consensus, correction (tilt + bounds + yaw preservation)
  - [ ] `uv run pytest tests/test_ground_plane.py -v` → PASS (all tests, 0 failures)

  **Agent-Executed QA Scenarios:**

  ```
  Scenario: Floor detection on synthetic flat plane
    Tool: Bash (pytest)
    Preconditions: aruco/ground_plane.py implemented
    Steps:
      1. uv run pytest tests/test_ground_plane.py -v -k "detect_floor and synthetic_flat"
      2. Assert: exit code 0
      3. Assert: recovered normal within 1° of [0, -1, 0]
    Expected Result: RANSAC correctly identifies flat floor
    Evidence: pytest output captured

  Scenario: Per-camera correction preserves yaw
    Tool: Bash (pytest)
    Preconditions: aruco/ground_plane.py implemented
    Steps:
      1. uv run pytest tests/test_ground_plane.py -v -k "correction and yaw"
      2. Assert: exit code 0
      3. Assert: yaw angle before == yaw angle after (within 0.01°)
    Expected Result: Correction only affects pitch/roll + vertical translation
    Evidence: pytest output captured

  Scenario: Module passes type checking
    Tool: Bash (basedpyright)
    Preconditions: Module implemented with type annotations
    Steps:
      1. uv run basedpyright aruco/ground_plane.py
      2. Assert: exit code 0 or only non-error diagnostics
    Expected Result: No type errors
    Evidence: basedpyright output captured
  ```

  **Commit**: YES
  - Message: `feat(aruco): add ground plane detection and per-camera correction module`
  - Files: `aruco/ground_plane.py`, `tests/test_ground_plane.py`
  - Pre-commit: `uv run pytest tests/test_ground_plane.py`

---

- [x] 4.
Integrate `--save-depth` flag into `calibrate_extrinsics.py`

  **What to do**:
  - Add `--save-depth` Click option (type `click.Path()`, default `None`)
  - When provided, after depth pooling/selection in `apply_depth_verify_refine_postprocess`, call `save_depth_data()` to persist:
    - Pooled depth + confidence per camera
    - Raw best-scored frames (depth + confidence + frame index + score)
    - Camera intrinsics matrix K
    - Pool metadata dict
  - Log the output path and file size

  **Must NOT do**:
  - Do not change existing depth processing behavior
  - Do not make saving mandatory (only when `--save-depth` is provided)
  - Do not save if depth verification/refinement is not enabled (warn and skip)

  **Recommended Agent Profile**:
  - **Category**: `unspecified-high`
    - Reason: Integration into existing CLI with complex data flow, needs careful threading of data through the function
  - **Skills**: []

  **Parallelization**:
  - **Can Run In Parallel**: YES
  - **Parallel Group**: Wave 2 (with Task 5)
  - **Blocks**: Tasks 7, 8
  - **Blocked By**: Tasks 1, 2

  **References**:

  **Pattern References**:
  - `calibrate_extrinsics.py:562-678` — Click option definitions and `main()` function signature — follow exact same patterns for the new flag
  - `calibrate_extrinsics.py:606-611` — `--depth-pool-size` option as example of depth-related flag
  - `calibrate_extrinsics.py:120-305` — `apply_depth_verify_refine_postprocess()` — this is where depth data is available and where save should be triggered
  - `calibrate_extrinsics.py:143-165` — Where `depth_maps` and `confidence_maps` lists are built per camera — data to capture for raw frames
  - `calibrate_extrinsics.py:267-270` — Where `final_depth` and `pool_metadata` are determined — data to capture for pooled result

  **API/Type References**:
  - `aruco/depth_save.py` (Task 2 output) — `save_depth_data(path, camera_data, schema_version=1)` function signature

  **Test References**:
  - `tests/test_depth_cli_postprocess.py` — Existing test patterns for calibrate_extrinsics CLI
    post-processing behavior
  - `tests/test_depth_pool_integration.py` — Integration test patterns with mocked depth data

  **WHY Each Reference Matters**:
  - `calibrate_extrinsics.py:562-678` is the exact location where the new flag must be added, following identical Click patterns
  - `apply_depth_verify_refine_postprocess` is the function that has access to all depth data — save must be called from here or just after it
  - Integration tests show how to mock ZED data for testing the full flow

  **Acceptance Criteria**:

  **TDD:**
  - [ ] Test file updated or created: `tests/test_depth_save_integration.py`
  - [ ] Tests cover: flag appears in help, save is called when flag provided, save is NOT called without flag
  - [ ] `uv run pytest tests/test_depth_save_integration.py -v` → PASS

  **Agent-Executed QA Scenarios:**

  ```
  Scenario: --save-depth flag appears in CLI help
    Tool: Bash
    Preconditions: calibrate_extrinsics.py updated
    Steps:
      1. uv run python calibrate_extrinsics.py --help
      2. Assert: output contains "--save-depth"
      3. Assert: output contains "HDF5" or "depth" in the help text for the flag
    Expected Result: Flag is documented in help
    Evidence: Help output captured

  Scenario: Existing tests still pass after integration
    Tool: Bash (pytest)
    Preconditions: calibrate_extrinsics.py updated
    Steps:
      1. uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py -v
      2. Assert: exit code 0, no regressions
    Expected Result: No existing behavior broken
    Evidence: pytest output captured
  ```

  **Commit**: YES
  - Message: `feat(calibrate): add --save-depth flag for HDF5 depth persistence`
  - Files: `calibrate_extrinsics.py`, `tests/test_depth_save_integration.py`
  - Pre-commit: `uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py`

---

- [x] 5. Extend `aruco/ground_plane.py` with multi-camera workflow orchestration

  **What to do**:

  Add high-level orchestration functions that compose the primitives from Task 3:

  A.
  `refine_ground_from_depth(camera_data, extrinsics, config)`:
  - Main entry point: takes per-camera depth data + current extrinsics → returns corrected extrinsics + metrics
  - Flow:
    1. Per camera: `unproject_depth_to_points` → `detect_floor_plane`
    2. `compute_consensus_plane` from all successful detections
    3. Per camera: `compute_floor_correction` relative to consensus
    4. Return corrected extrinsics dict + `RefinementMetrics`
  - Config dataclass with: `max_rotation_deg`, `max_translation_m`, `ransac_distance_threshold`, `min_inlier_ratio`, `height_range`, `normal_constraint_deg`, `stride`, `seed`
  - Metrics dataclass with: per-camera floor angles (before/after), consensus plane model, correction magnitudes, skipped cameras + reasons

  B. Error handling:
  - If floor detection fails for a camera → skip it, log warning, include in metrics
  - If fewer than 2 cameras have valid floor → abort, return original extrinsics + error reason
  - If correction exceeds bounds → cap at bounds, mark as `clamped` in metrics

  **Must NOT do**:
  - No file I/O in this module — pure computation
  - No visualization — that's Task 6

  **Recommended Agent Profile**:
  - **Category**: `unspecified-high`
    - Reason: Orchestration logic with error handling, config management, metrics collection
  - **Skills**: []

  **Parallelization**:
  - **Can Run In Parallel**: YES
  - **Parallel Group**: Wave 2 (with Task 4)
  - **Blocks**: Tasks 6, 7
  - **Blocked By**: Tasks 1, 3

  **References**:

  **Pattern References**:
  - `calibrate_extrinsics.py:120-131` — `apply_depth_verify_refine_postprocess()` signature — pattern for multi-camera orchestration function
  - `aruco/depth_refine.py:71-227` — `refine_extrinsics_with_depth()` return value pattern: (result, stats_dict)
  - `aruco/depth_verify.py:8-15` — `DepthVerificationResult` dataclass — pattern for structured results

  **API/Type References**:
  - `aruco/ground_plane.py` (Task 3 output) — All primitive functions: `unproject_depth_to_points`, `detect_floor_plane`, `compute_consensus_plane`,
    `compute_floor_correction`

  **Test References**:
  - `tests/test_ground_plane.py` (Task 3 output) — Unit test patterns to follow for orchestration tests

  **WHY Each Reference Matters**:
  - `apply_depth_verify_refine_postprocess` shows how multi-camera iteration with fallback is done in this codebase
  - `depth_refine.py` shows the (result, stats) return pattern that should be followed

  **Acceptance Criteria**:

  **TDD:**
  - [ ] Tests added to `tests/test_ground_plane.py` for orchestration functions
  - [ ] Tests cover: full pipeline with 2-camera synthetic data, single-camera skip, all-cameras-fail abort, config bounds
  - [ ] `uv run pytest tests/test_ground_plane.py -v` → PASS (all tests, 0 failures)

  **Agent-Executed QA Scenarios:**

  ```
  Scenario: Two-camera synthetic refinement produces level ground
    Tool: Bash (pytest)
    Preconditions: Orchestration functions implemented
    Steps:
      1. uv run pytest tests/test_ground_plane.py -v -k "refine_ground_from_depth and two_camera"
      2. Assert: exit code 0
      3. Assert: after correction, floor angle disagreement < 0.5°
    Expected Result: Per-camera corrections level the ground
    Evidence: pytest output captured

  Scenario: Graceful fallback when floor detection fails for one camera
    Tool: Bash (pytest)
    Preconditions: Orchestration functions implemented
    Steps:
      1. uv run pytest tests/test_ground_plane.py -v -k "skip_camera"
      2. Assert: exit code 0
      3. Assert: skipped camera's extrinsics unchanged, other cameras corrected
    Expected Result: Partial failure handled gracefully
    Evidence: pytest output captured
  ```

  **Commit**: YES
  - Message: `feat(aruco): add multi-camera ground plane refinement orchestration`
  - Files: `aruco/ground_plane.py`, `tests/test_ground_plane.py`
  - Pre-commit: `uv run pytest tests/test_ground_plane.py`

---

- [x] 6.
Create Plotly diagnostic visualization for ground plane refinement

  **What to do**:
  - Add a function `create_ground_diagnostic_plot(metrics, camera_data, extrinsics_before, extrinsics_after)` → returns `plotly.graph_objects.Figure`
  - Add a function `save_diagnostic_plot(fig, path)` → writes HTML file
  - Visualization contents:
    - 3D scatter: floor inlier points per camera (color-coded by camera serial)
    - Surface: consensus plane (semi-transparent)
    - Camera frustums: before (dashed/faded) and after (solid) positions
    - Annotation: per-camera correction magnitude (degrees + cm)
    - Title: summary metrics (total disagreement before/after)
  - Follow existing Plotly patterns from `visualize_extrinsics.py` and `compare_pose_sets.py`

  **Must NOT do**:
  - No interactive server or GUI — static HTML file only
  - No Open3D visualization (use Plotly only, already a dep)
  - No complex camera frustum rendering — simple cone or pyramid is fine

  **Recommended Agent Profile**:
  - **Category**: `unspecified-low`
    - Reason: Visualization code following existing Plotly patterns, no complex algorithm
  - **Skills**: []

  **Parallelization**:
  - **Can Run In Parallel**: NO
  - **Parallel Group**: Wave 3 (sequential after Task 5)
  - **Blocks**: Task 7
  - **Blocked By**: Task 5

  **References**:

  **Pattern References**:
  - `compare_pose_sets.py:145-200` — `add_camera_trace()` — Plotly camera visualization pattern (frustum + axes + labels)
  - `visualize_extrinsics.py` — Full Plotly 3D scene setup with layout, ground plane, axis labels (check head of file for imports and patterns)

  **Test References**:
  - No heavy test required — visualization is a "nice to have". A smoke test that the function returns a `go.Figure` with expected trace count is sufficient.
**WHY Each Reference Matters**:
- `compare_pose_sets.py` already has Plotly camera rendering code that can be adapted
- `visualize_extrinsics.py` shows the full 3D scene pattern including ground plane rendering

**Acceptance Criteria**:
- [ ] Function `create_ground_diagnostic_plot` returns a `plotly.graph_objects.Figure`
- [ ] Figure contains traces for: floor points per camera, consensus plane surface, camera markers
- [ ] Smoke test: `uv run pytest tests/test_ground_plane.py -v -k "diagnostic_plot"` → PASS

**Agent-Executed QA Scenarios:**

```
Scenario: Diagnostic plot generates valid HTML
  Tool: Bash (pytest)
  Preconditions: Visualization function implemented
  Steps:
    1. uv run pytest tests/test_ground_plane.py -v -k "diagnostic_plot"
    2. Assert: exit code 0
    3. Assert: test verifies Figure has correct number of traces
  Expected Result: Plot function produces valid Plotly figure
  Evidence: pytest output captured
```

**Commit**: YES (groups with Task 7)
- Message: `feat(aruco): add Plotly diagnostic visualization for ground plane`
- Files: `aruco/ground_plane.py` (viz function added), `tests/test_ground_plane.py`
- Pre-commit: `uv run pytest tests/test_ground_plane.py`

---

- [x] 7. Create `refine_ground_plane.py`: standalone CLI tool

**What to do**:
- Click CLI tool with the following options:
  - `--input-depth` / `-d`: Path to HDF5 depth file (from `--save-depth`)
  - `--input-extrinsics` / `-i`: Path to extrinsics JSON (from `calibrate_extrinsics.py`)
  - `--output-extrinsics` / `-o`: Path for corrected extrinsics JSON
  - `--metrics-json`: Optional path for machine-readable metrics output
  - `--plot` / `--no-plot`: Generate Plotly diagnostic (default: `--plot`)
  - `--plot-output`: Path for diagnostic HTML (default: `/ground_diagnostic.html`)
  - `--max-rotation-deg`: Max correction rotation (default: 5.0)
  - `--max-translation-m`: Max correction translation (default: 0.05)
  - `--ransac-threshold`: RANSAC distance threshold in meters (default: 0.02)
  - `--min-inlier-ratio`: Minimum inlier ratio to accept floor detection (default: 0.3)
  - `--height-range`: Expected floor height range as "min,max" (default: auto from data)
  - `--stride`: Depth map downsampling stride (default: 4)
  - `--seed`: Random seed for reproducibility (default: 42)
  - `--debug / --no-debug`: Verbose logging
- Flow:
  1. Load extrinsics JSON (reuse `compare_pose_sets.py:load_poses_from_json`)
  2. Load depth data from HDF5 (use `depth_save.load_depth_data`)
  3. Call `refine_ground_from_depth()` orchestration function
  4. Save corrected extrinsics (same JSON format as input, with `_meta.ground_refined: true`)
  5. Print summary metrics to stdout
  6. Optionally save metrics JSON
  7. Optionally generate diagnostic Plotly HTML
- Output extrinsics format: identical to `calibrate_extrinsics.py` output, with added `_meta.ground_refined` flag

**Must NOT do**:
- No ZED SDK dependency: works entirely from saved files
- No modification of input files
- No interactive prompts

**Recommended Agent Profile**:
- **Category**: `unspecified-high`
  - Reason: Full CLI tool composing multiple modules, end-to-end data flow, error handling, multiple output formats
- **Skills**: []

**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Wave 3 (depends on 2, 5, 6)
- **Blocks**: Task 8
- **Blocked By**: Tasks 2, 5, 6

**References**:

**Pattern References**:
- `calibrate_extrinsics.py:562-678`: Click CLI pattern with extensive options, logging, error handling
- `compare_pose_sets.py:52-88`: `load_poses_from_json()`, the JSON extrinsics loading pattern
- `compare_pose_sets.py:91-92`: `serialize_pose()`, the JSON extrinsics saving pattern
- `visualize_extrinsics.py`: CLI tool that loads extrinsics + generates Plotly output

**API/Type References**:
- `aruco/depth_save.py` (Task 2): `load_depth_data(path)` return type
- `aruco/ground_plane.py` (Tasks 3, 5): `refine_ground_from_depth()` signature and return type
- `aruco/ground_plane.py` (Task 6): `create_ground_diagnostic_plot()` signature

**WHY Each Reference Matters**:
- `calibrate_extrinsics.py` CLI is the canonical pattern for Click tools in this project
- `compare_pose_sets.py` shows how to load/save the extrinsics JSON format correctly
- The ground_plane module provides all computation; the CLI just wires I/O to computation

**Acceptance Criteria**:

**TDD:**
- [ ] Test file created: `tests/test_refine_ground_cli.py`
- [ ] Tests cover: help output, valid invocation with synthetic data, missing input error, output file creation, metrics JSON format
- [ ] `uv run pytest tests/test_refine_ground_cli.py -v` → PASS

**Agent-Executed QA Scenarios:**

```
Scenario: CLI help shows all expected options
  Tool: Bash
  Preconditions: refine_ground_plane.py created
  Steps:
    1. uv run python refine_ground_plane.py --help
    2. Assert: output contains "--input-depth", "--input-extrinsics", "--output-extrinsics"
    3. Assert: output contains "--max-rotation-deg", "--ransac-threshold", "--seed"
    4. Assert: exit code 0
  Expected Result: All options documented
  Evidence: Help output captured

Scenario: Tool produces valid extrinsics JSON
  Tool: Bash
  Preconditions: Synthetic HDF5 and extrinsics JSON created by test fixtures
  Steps:
    1. uv run pytest tests/test_refine_ground_cli.py -v -k "produces_valid_json"
    2. Assert: exit code 0
    3. Assert: output JSON is valid, contains all camera serials, has _meta.ground_refined
  Expected Result: Output matches extrinsics JSON schema
  Evidence: pytest output captured

Scenario: Metrics JSON contains before/after comparison
  Tool: Bash
  Preconditions: Test creates and runs CLI with --metrics-json
  Steps:
    1. uv run pytest tests/test_refine_ground_cli.py -v -k "metrics_json"
    2. Assert: exit code 0
    3. Assert: metrics has 'floor.angle_disagreement_deg_before' and 'floor.angle_disagreement_deg_after'
  Expected Result: Machine-readable improvement metrics produced
  Evidence: pytest output captured
```

**Commit**: YES
- Message: `feat: add refine_ground_plane.py standalone CLI tool`
- Files: `refine_ground_plane.py`, `tests/test_refine_ground_cli.py`
- Pre-commit: `uv run pytest tests/test_refine_ground_cli.py`

---

- [x] 8. Final integration: full test suite pass + basedpyright + README update

**What to do**:
- Run the FULL test suite: `uv run pytest -x -vv`
- Run basedpyright on all new files: `uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py`
- Fix any regressions or type errors
- Add usage example to `README.md` showing the depth-save → ground-refine workflow:
  ```bash
  # Step 1: Calibrate with depth saving
  uv run calibrate_extrinsics.py ... --refine-depth --save-depth output/depth_data.h5

  # Step 2: Refine ground plane
  uv run refine_ground_plane.py \
    --input-depth output/depth_data.h5 \
    --input-extrinsics output/extrinsics.json \
    --output-extrinsics output/extrinsics_ground_refined.json \
    --plot-output output/ground_diagnostic.html
  ```

**Must NOT do**:
- Do not modify any test behavior; only fix genuine regressions
- Do not add features; this is stabilization only

**Recommended Agent Profile**:
- **Category**: `unspecified-low`
  - Reason: Verification and minor fixups, no new features
- **Skills**: []

**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Wave 3 (final, sequential)
- **Blocks**: None (terminal)
- **Blocked By**: All previous tasks

**References**:

**Pattern References**:
- `README.md`: Existing usage examples for `calibrate_extrinsics.py` and `visualize_extrinsics.py`
- `pyproject.toml:39-41`: pytest configuration (`testpaths`, `norecursedirs`)

**WHY Each Reference Matters**:
- README has existing command examples that the new workflow should follow in format/style
- pyproject.toml pytest config ensures all test directories are covered

**Acceptance Criteria**:
- [ ] `uv run pytest -x -vv` → ALL tests pass, 0 failures, 0 errors
- [ ] `uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py` → 0 errors
- [ ] README.md contains usage example for the new ground refinement workflow

**Agent-Executed QA Scenarios:**

```
Scenario: Full test suite passes
  Tool: Bash (pytest)
  Preconditions: All previous tasks completed
  Steps:
    1. uv run pytest -x -vv
    2. Assert: exit code 0
    3. Assert: all tests pass, 0 failures
  Expected Result: No regressions introduced
  Evidence: Full pytest output captured

Scenario: Type checking passes
  Tool: Bash (basedpyright)
  Preconditions: All new modules written
  Steps:
    1. uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py
    2. Assert: no error-level diagnostics
  Expected Result: Type-safe code
  Evidence: basedpyright output captured
```

**Commit**: YES
- Message: `chore: final integration pass — tests, types, README for ground plane refinement`
- Files: `README.md`, any fixup files
- Pre-commit: `uv run pytest -x -vv`

---

## Commit Strategy

| After Task | Message | Files | Verification |
|------------|---------|-------|--------------|
| 1 | `build(deps): add open3d and h5py for ground plane refinement` | `pyproject.toml`, `uv.lock` | `uv run python -c "import open3d; import h5py"` |
| 2 | `feat(aruco): add HDF5 depth map persistence module` | `aruco/depth_save.py`, `tests/test_depth_save.py` | `uv run pytest tests/test_depth_save.py` |
| 3 | `feat(aruco): add ground plane detection and per-camera correction module` | `aruco/ground_plane.py`, `tests/test_ground_plane.py` | `uv run pytest tests/test_ground_plane.py` |
| 4 | `feat(calibrate): add --save-depth flag for HDF5 depth persistence` | `calibrate_extrinsics.py`, `tests/test_depth_save_integration.py` | `uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py` |
| 5 | `feat(aruco): add multi-camera ground plane refinement orchestration` | `aruco/ground_plane.py`, `tests/test_ground_plane.py` | `uv run pytest tests/test_ground_plane.py` |
| 6 | `feat(aruco): add Plotly diagnostic visualization for ground plane` | `aruco/ground_plane.py`, `tests/test_ground_plane.py` | `uv run pytest tests/test_ground_plane.py` |
| 7 | `feat: add refine_ground_plane.py standalone CLI tool` | `refine_ground_plane.py`, `tests/test_refine_ground_cli.py` | `uv run pytest tests/test_refine_ground_cli.py` |
| 8 | `chore: final integration pass — tests, types, README for ground plane refinement` | `README.md`, fixups | `uv run pytest -x -vv` |

---

## Success Criteria

### Verification Commands

```bash
# All tests pass
uv run pytest -x -vv
# Expected: 0 failures

# Type checking passes
uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py
# Expected: 0 errors

# CLI tools have correct help
uv run python calibrate_extrinsics.py --help | grep "save-depth"
# Expected: --save-depth appears

uv run python refine_ground_plane.py --help
# Expected: all options listed

# Dependencies installed
uv run python -c "import open3d; import h5py; print('ok')"
# Expected: ok
```

### Final Checklist

- [x] All "Must Have" requirements present
- [x] All "Must NOT Have" exclusions absent (no core pipeline changes, no ML, no non-flat floors)
- [x] All tests pass (`uv run pytest -x -vv`)
- [x] Type checking passes (`uv run basedpyright`)
- [x] HDF5 depth saving works end-to-end (save → load round-trip)
- [x] Ground plane refinement produces measurably improved floor alignment
- [x] Output extrinsics JSON matches existing format (compatible with `visualize_extrinsics.py`)
- [x] Diagnostic Plotly HTML generated successfully
- [x] README updated with usage workflow
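
The "measurably improved floor alignment" item relies on a before/after number such as the `floor.angle_disagreement_deg_before` and `floor.angle_disagreement_deg_after` metrics named in Task 7. The plan does not define how that number is computed, so the sketch below is only one plausible reading: the largest pairwise angle between per-camera floor normals expressed in world coordinates. The function names, the camera keys, and the choice of "max pairwise" are all assumptions.

```python
import math
from itertools import combinations


def angle_between_deg(n1, n2):
    """Angle in degrees between two plane normals (sign-insensitive)."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm = math.hypot(*n1) * math.hypot(*n2)
    # abs(): a plane normal and its negation describe the same plane.
    cos_angle = max(-1.0, min(1.0, abs(dot) / norm))
    return math.degrees(math.acos(cos_angle))


def floor_angle_disagreement_deg(normals_by_camera):
    """Largest pairwise angle between per-camera floor normals.

    Assumption: one world-frame normal per camera; returns 0.0 when
    fewer than two cameras are present.
    """
    pairs = combinations(normals_by_camera.values(), 2)
    return max((angle_between_deg(a, b) for a, b in pairs), default=0.0)


def tilted_normal(deg):
    """Unit normal tilted by `deg` degrees away from the +z axis."""
    rad = math.radians(deg)
    return (0.0, math.sin(rad), math.cos(rad))


# Hypothetical camera keys; real runs would use camera serials.
before = {"cam_a": (0.0, 0.0, 1.0), "cam_b": tilted_normal(2.0)}
after = {"cam_a": (0.0, 0.0, 1.0), "cam_b": tilted_normal(0.2)}

metrics = {
    "floor.angle_disagreement_deg_before": floor_angle_disagreement_deg(before),
    "floor.angle_disagreement_deg_after": floor_angle_disagreement_deg(after),
}
print(metrics)
```

With a metric of this kind, the "< 0.5°" assertion in the two-camera QA scenario becomes a direct comparison against the `after` value.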