# Ground Plane Refinement & Depth Map Persistence
## TL;DR
> **Quick Summary**: Fix inter-camera ground plane disagreement by adding depth-based floor detection and per-camera extrinsic correction as a standalone post-processing tool. Also add HDF5 depth map persistence so SVO re-reading is not needed for iterative refinement.
>
> **Deliverables**:
> - `--save-depth` flag in `calibrate_extrinsics.py` → HDF5 depth persistence
> - New `aruco/depth_save.py` module for HDF5 read/write
> - New `aruco/ground_plane.py` module for floor detection + consensus alignment
> - New `refine_ground_plane.py` standalone CLI tool
> - Plotly diagnostic visualization (before/after floor alignment)
> - Full TDD test suite for all new modules
> - New dependencies: `open3d`, `h5py`
>
> **Estimated Effort**: Large
> **Parallel Execution**: YES — 3 waves
> **Critical Path**: Task 1 (deps) → Task 3 (ground plane core) → Task 5 (orchestration) → Task 6 (visualization) → Task 7 (CLI tool) → Task 8 (integration tests)
---
## Context
### Original Request
User's `calibrate_extrinsics.py` produces extrinsics where the ground plane is not level — specifically, different cameras disagree about where the ground is when overlaying world-coordinate point clouds. The error is small (1-3° tilt, <2cm offset) across a 2-4 camera ZED setup. User wants:
1. A way to refine the calibration using actual floor depth observations
2. Saved pooled depth maps to avoid re-reading SVOs for iterative refinement
### Interview Summary
**Key Discussions**:
- **Core problem**: Inter-camera disagreement, not just global tilt. Point clouds from different cameras don't align on the floor surface.
- **Integration approach**: Post-processing tool (standalone CLI), not integrated into existing pipeline.
- **Library choice**: Open3D for point cloud operations (user wants it available for future work). h5py for HDF5 persistence.
- **Refinement granularity**: Per-camera correction (each camera gets its own correction based on its floor observations).
- **Depth saving**: Opt-in via `--save-depth` flag. Save pooled + raw best frames per camera.
- **Save format**: HDF5 via h5py with versioned schema.
- **Visualization**: Plotly HTML diagnostic (floor points per camera, consensus plane, before/after).
- **Test strategy**: TDD with pytest, following existing test patterns.
**Research Findings**:
- `alignment.py` has `rotation_align_vectors()` for aligning normals — reusable for floor alignment
- `depth_pool.py` does median pooling but never persists results
- `depth_refine.py` has `scipy.optimize.least_squares` infrastructure for pose optimization
- `compare_pose_sets.py` has Kabsch `rigid_transform_3d()` for rigid alignment
- `depth_verify.py` has `project_point_to_pixel()` and depth residual computation
- Current pipeline: ArUco → PnP → RANSAC averaging → depth refinement (sparse, marker corners only) → alignment (marker normals only)
- Open3D provides `segment_plane()` for RANSAC plane fitting on point clouds
### Metis Review
**Identified Gaps** (addressed):
- **Correction DOF**: Must constrain to pitch/roll + vertical translation only (no yaw drift, no lateral drift). Addressed via bounded optimization.
- **RANSAC plane robustness**: Must constrain plane normal to near-vertical and height to expected range, plus ROI masking. Addressed via configurable constraints.
- **HDF5 schema versioning**: Must include `/meta/schema_version`, units, intrinsics, coordinate frame. Addressed in schema design.
- **Failure mode for missing floor**: If plane detection fails for one camera, skip that camera and warn (don't fail entire run). Addressed in error handling design.
- **Reproducibility**: RANSAC seed control for deterministic tests. Addressed via `seed` parameter.
- **Per-camera correction risk**: May break inter-camera rigidity. Addressed via correction bounds + pre/post metrics reporting.
- **Consensus plane definition**: Use merged inlier points from all cameras, weighted by inlier count. Addressed in algorithm design.
---
## Work Objectives
### Core Objective
Enable depth-based ground plane refinement that corrects per-camera extrinsic errors (1-3° tilt, <2cm vertical offset) by detecting the actual physical floor surface from ZED depth maps and aligning all cameras to a consensus ground plane.
### Concrete Deliverables
- `aruco/depth_save.py`: HDF5 read/write module for depth maps + metadata
- `aruco/ground_plane.py`: Floor detection (RANSAC), consensus plane fitting, per-camera correction
- `refine_ground_plane.py`: Standalone Click CLI tool
- `--save-depth` flag added to `calibrate_extrinsics.py`
- `tests/test_depth_save.py`: TDD tests for depth persistence
- `tests/test_ground_plane.py`: TDD tests for floor detection + alignment
- `tests/test_refine_ground_cli.py`: TDD tests for CLI tool
- Plotly diagnostic HTML output
### Definition of Done
- [x] `uv run pytest tests/test_depth_save.py` → all tests pass
- [x] `uv run pytest tests/test_ground_plane.py` → all tests pass
- [x] `uv run pytest tests/test_refine_ground_cli.py` → all tests pass
- [x] `uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py` → no errors
- [x] `uv run python calibrate_extrinsics.py --help` shows `--save-depth` flag
- [x] `uv run python refine_ground_plane.py --help` shows expected options
- [x] End-to-end: calibrate → save depth → refine ground → produces valid extrinsics JSON
### Must Have
- Per-camera RANSAC floor plane detection from depth maps
- Consensus plane fitting from merged floor points
- Constrained per-camera correction (pitch/roll + vertical translation, no yaw/lateral)
- Correction bounds with configurable limits (default: max 5° rotation, max 5cm translation)
- "No-op if not confident" threshold — skip correction if RANSAC inlier ratio is too low
- HDF5 schema with versioning and full metadata (intrinsics, units, resolution, frame indices)
- Diagnostic metrics: per-camera plane normal angles, consensus disagreement before/after, correction magnitudes
- Plotly visualization of floor points + consensus plane + before/after camera poses
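The "consensus disagreement" metric above can be made concrete as the largest angle between any two cameras' floor normals. The sketch below is one possible definition, not the plan's confirmed one; the function names are hypothetical and it treats `n` and `-n` as the same plane.

```python
import math
from itertools import combinations
from typing import Mapping, Sequence


def angle_between_normals_deg(n1: Sequence[float], n2: Sequence[float]) -> float:
    """Unsigned angle in degrees between two plane normals (normal sign is ignored)."""
    dot = sum(a * b for a, b in zip(n1, n2))
    len1 = math.sqrt(sum(a * a for a in n1))
    len2 = math.sqrt(sum(b * b for b in n2))
    cos_angle = min(1.0, abs(dot) / (len1 * len2))
    return math.degrees(math.acos(cos_angle))


def max_pairwise_disagreement_deg(normals: Mapping[str, Sequence[float]]) -> float:
    """Largest angle between any two cameras' floor normals; 0.0 for fewer than two cameras."""
    pairs = combinations(sorted(normals), 2)
    return max((angle_between_normals_deg(normals[a], normals[b]) for a, b in pairs), default=0.0)
```

Reporting this value before and after correction gives a single number for how well the cameras agree on the floor orientation.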
### Must NOT Have (Guardrails)
- NO changes to ArUco detection, PnP, or RANSAC pose averaging logic
- NO changes to existing `depth_refine.py` or `depth_verify.py` behavior
- NO non-flat floor handling (ramps, stairs, multi-level)
- NO dense multi-view reconstruction beyond the floor plane
- NO automatic scene segmentation or ML-based floor detection
- NO global bundle adjustment across all cameras
- NO saving of every frame's depth data — only pooled + curated best subset
- NO GUI requirements — visualization is optional Plotly HTML output
- NO modification of the extrinsics JSON schema (output format matches existing convention)
---
## Verification Strategy (MANDATORY)
> **UNIVERSAL RULE: ZERO HUMAN INTERVENTION**
>
> ALL tasks in this plan MUST be verifiable WITHOUT any human action.
### Test Decision
- **Infrastructure exists**: YES (`pytest` configured in `pyproject.toml`)
- **Automated tests**: TDD (tests first)
- **Framework**: `pytest` (existing)
### If TDD Enabled
Each TODO follows RED-GREEN-REFACTOR:
**Task Structure:**
1. **RED**: Write failing test first
- Test file: `tests/test_<module>.py`
- Test command: `uv run pytest tests/test_<module>.py`
- Expected: FAIL (test exists, implementation doesn't)
2. **GREEN**: Implement minimum code to pass
- Command: `uv run pytest tests/test_<module>.py`
- Expected: PASS
3. **REFACTOR**: Clean up while keeping green
- Command: `uv run pytest tests/test_<module>.py`
- Expected: PASS (still)
### Agent-Executed QA Scenarios (MANDATORY — ALL tasks)
**Verification Tool by Deliverable Type:**
| Type | Tool | How Agent Verifies |
|------|------|-------------------|
| **Python module** | Bash (pytest) | Run tests, assert pass count, zero failures |
| **CLI tool** | Bash (click --help + invocation) | Check help output, run with test data, verify exit code and output |
| **HDF5 file** | Bash (python -c "import h5py; ...") | Open file, check schema, validate data shapes |
| **Type checking** | Bash (basedpyright) | Run type checker, verify zero errors |
| **Plotly output** | Bash (file existence + python parse) | Check file exists, contains valid HTML, has expected traces |
---
## Execution Strategy
### Parallel Execution Waves
```
Wave 1 (Start Immediately):
├── Task 1: Add open3d + h5py dependencies
├── Task 2: TDD depth save module (aruco/depth_save.py) [after Task 1]
└── Task 3: TDD ground plane core module (aruco/ground_plane.py) [after Task 1]
Wave 2 (After Wave 1):
├── Task 4: Integrate --save-depth into calibrate_extrinsics.py [depends: 1, 2]
└── Task 5: Ground plane consensus + per-camera correction [depends: 1, 3]
Wave 3 (After Wave 2):
├── Task 6: Plotly diagnostic visualization module [depends: 5]
├── Task 7: refine_ground_plane.py CLI tool [depends: 2, 5, 6]
└── Task 8: Integration tests + basedpyright pass [depends: all]
Critical Path: Task 1 → Task 2 → Task 4 (depth save path)
Task 1 → Task 3 → Task 5 → Task 7 (ground plane path)
```
### Dependency Matrix
| Task | Depends On | Blocks | Can Parallelize With |
|------|------------|--------|---------------------|
| 1 | None | 2, 3 | None (must be first) |
| 2 | 1 | 4, 7 | 3 |
| 3 | 1 | 5 | 2 |
| 4 | 1, 2 | 7, 8 | 5 |
| 5 | 1, 3 | 6, 7 | 4 |
| 6 | 5 | 7 | 4 |
| 7 | 2, 5, 6 | 8 | None |
| 8 | All | None | None (final) |
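As a cross-check of the matrix, the "Depends On" column can be fed to the standard-library `graphlib.TopologicalSorter` to derive which tasks become runnable together. This is only a sketch; `DEPENDS_ON` and `execution_levels` are illustrative names, and "All" for Task 8 is read as Tasks 1 through 7.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Transcribed from the "Depends On" column of the dependency matrix.
DEPENDS_ON: Dict[int, Set[int]] = {
    1: set(),
    2: {1},
    3: {1},
    4: {1, 2},
    5: {1, 3},
    6: {5},
    7: {2, 5, 6},
    8: {1, 2, 3, 4, 5, 6, 7},
}


def execution_levels(depends_on: Dict[int, Set[int]]) -> List[List[int]]:
    """Group tasks into levels; every task in a level has all its dependencies done."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the matrix contains a cycle
    levels: List[List[int]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        levels.append(ready)
        sorter.done(*ready)
    return levels
```

For this matrix the levels are `[[1], [2, 3], [4, 5], [6], [7], [8]]`, which shows that Tasks 6, 7 and 8 run one after another inside Wave 3.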
### Agent Dispatch Summary
| Wave | Tasks | Recommended Agents |
|------|-------|-------------------|
| 1 | 1 | task(category="quick", ...) |
| 1→2 | 2, 3 | task(category="unspecified-high", ...) — parallel after Task 1 |
| 2 | 4, 5 | task(category="unspecified-high", ...) — parallel |
| 3 | 6 | task(category="unspecified-low", ...) |
| 3 | 7 | task(category="unspecified-high", ...) |
| 3 | 8 | task(category="unspecified-low", ...) |
---
## TODOs
- [x] 1. Add `open3d` and `h5py` dependencies to `pyproject.toml`
**What to do**:
- Add `open3d` and `h5py` to the `[project] dependencies` list in `pyproject.toml`
- Run `uv sync` to install
- Verify imports work: `uv run python -c "import open3d; import h5py; print('ok')"`
**Must NOT do**:
- Do not add unnecessary deps (no trimesh, no probreg, no pycpd)
- Do not modify any other pyproject.toml sections
**Recommended Agent Profile**:
- **Category**: `quick`
- Reason: Single file edit + one command
- **Skills**: []
- No special skills needed for a dependency addition
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Wave 1 (solo — must complete before 2, 3)
- **Blocks**: Tasks 2, 3, 4, 5, 6, 7, 8
- **Blocked By**: None
**References**:
**Pattern References**:
- `pyproject.toml:7-27` — Existing dependency list format and conventions (e.g., `"scipy>=1.17.0"`)
**Acceptance Criteria**:
- [ ] `pyproject.toml` contains `open3d` and `h5py` in dependencies
- [ ] `uv sync` completes without error
- [ ] `uv run python -c "import open3d; import h5py; print('ok')"` prints `ok` and exits 0
**Agent-Executed QA Scenarios:**
```
Scenario: Dependencies install and import correctly
Tool: Bash
Preconditions: pyproject.toml edited
Steps:
1. uv sync
2. uv run python -c "import open3d; print(open3d.__version__)"
3. Assert: exit code 0, version string printed
4. uv run python -c "import h5py; print(h5py.__version__)"
5. Assert: exit code 0, version string printed
Expected Result: Both libraries installed and importable
Evidence: Command output captured
```
**Commit**: YES
- Message: `build(deps): add open3d and h5py for ground plane refinement`
- Files: `pyproject.toml`, `uv.lock`
- Pre-commit: `uv run python -c "import open3d; import h5py"`
---
- [x] 2. TDD: Create `aruco/depth_save.py` — HDF5 depth map persistence module
**What to do**:
**RED phase** — Write `tests/test_depth_save.py` first with tests for:
- `save_depth_data()`: saves pooled depth + confidence + raw frames + intrinsics + metadata to HDF5
- `load_depth_data()`: loads HDF5 back into structured dict
- Round-trip test: save → load → compare arrays with `np.testing.assert_allclose`
- Schema validation: check `/meta/schema_version`, `/meta/units`, `/meta/coordinate_frame`
- Per-camera groups: `/<serial>/pooled_depth`, `/<serial>/pooled_confidence`, `/<serial>/raw_frames/<index>/depth`, `/<serial>/intrinsics`
- Edge cases: single camera, no confidence map, no raw frames
- Error handling: invalid path, empty data
**GREEN phase** — Implement `aruco/depth_save.py`:
- `save_depth_data(path, camera_data, schema_version=1)` — writes HDF5
- `load_depth_data(path)` — reads HDF5 back to dict
- Schema version 1 layout:
```
/meta/
schema_version: int = 1
units: str = "meters"
coordinate_frame: str = "world_from_cam"
created_at: str (ISO 8601)
/<serial>/
intrinsics: (3, 3) float64 — camera matrix K
resolution: (2,) int — [width, height]
pooled_depth: (H, W) float32
pooled_confidence: (H, W) float32 [optional]
pool_metadata: JSON string (same dict currently in results)
raw_frames/
0/depth: (H, W) float32
0/confidence: (H, W) float32 [optional]
0/frame_index: int
0/score: float
1/depth: ...
```
- Use `h5py` compression: `compression="gzip"`, `compression_opts=4`
- Type annotations on all public functions
**REFACTOR phase** — Clean up, add docstrings, run basedpyright.
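A minimal sketch of how the schema version 1 layout could map onto `h5py` calls follows. Several details are assumptions rather than decisions from this plan: scalar metadata (`schema_version`, `units`, `frame_index`, `score`, `pool_metadata`) is stored as HDF5 attributes, the per-camera group is named after the camera serial, and `load_depth_data` returns a dict with `meta` and `cameras` keys.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict

import h5py
import numpy as np

GZIP = {"compression": "gzip", "compression_opts": 4}


def save_depth_data(path: str, camera_data: Dict[str, Dict[str, Any]], schema_version: int = 1) -> None:
    """Write per-camera depth data to an HDF5 file (sketch of the v1 layout)."""
    with h5py.File(path, "w") as f:
        meta = f.create_group("meta")
        # Assumption: scalar metadata lives in attributes on /meta.
        meta.attrs["schema_version"] = schema_version
        meta.attrs["units"] = "meters"
        meta.attrs["coordinate_frame"] = "world_from_cam"
        meta.attrs["created_at"] = datetime.now(timezone.utc).isoformat()
        for serial, cam in camera_data.items():
            g = f.create_group(str(serial))
            g.create_dataset("intrinsics", data=np.asarray(cam["intrinsics"], dtype=np.float64))
            g.create_dataset("resolution", data=np.asarray(cam["resolution"], dtype=np.int64))
            g.create_dataset("pooled_depth", data=np.asarray(cam["pooled_depth"], dtype=np.float32), **GZIP)
            if cam.get("pooled_confidence") is not None:  # optional dataset
                g.create_dataset("pooled_confidence",
                                 data=np.asarray(cam["pooled_confidence"], dtype=np.float32), **GZIP)
            g.attrs["pool_metadata"] = json.dumps(cam.get("pool_metadata", {}))
            raw = g.create_group("raw_frames")
            for i, frame in enumerate(cam.get("raw_frames", [])):
                fg = raw.create_group(str(i))
                fg.create_dataset("depth", data=np.asarray(frame["depth"], dtype=np.float32), **GZIP)
                fg.attrs["frame_index"] = int(frame["frame_index"])
                fg.attrs["score"] = float(frame["score"])


def load_depth_data(path: str) -> Dict[str, Any]:
    """Read the file back into plain dicts and numpy arrays."""
    with h5py.File(path, "r") as f:
        meta = {k: (v.item() if isinstance(v, np.generic) else v) for k, v in f["meta"].attrs.items()}
        cameras: Dict[str, Any] = {}
        for serial, g in f.items():
            if serial == "meta":
                continue
            frames = sorted(g["raw_frames"].items(), key=lambda kv: int(kv[0]))
            cameras[serial] = {
                "intrinsics": g["intrinsics"][()],
                "resolution": g["resolution"][()],
                "pooled_depth": g["pooled_depth"][()],
                "pooled_confidence": g["pooled_confidence"][()] if "pooled_confidence" in g else None,
                "pool_metadata": json.loads(g.attrs["pool_metadata"]),
                "raw_frames": [
                    {"depth": fg["depth"][()],
                     "frame_index": int(fg.attrs["frame_index"]),
                     "score": float(fg.attrs["score"])}
                    for _, fg in frames
                ],
            }
    return {"meta": meta, "cameras": cameras}
```

Storing scalars as attributes avoids chunked scalar datasets, since gzip compression only applies to chunked arrays. Per-frame confidence maps are left out of the sketch to keep it short.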
**Must NOT do**:
- Do not modify existing `depth_pool.py` or `depth_verify.py`
- Do not add ZED SDK dependency to this module (pure numpy/h5py)
- Do not save uncompressed data
**Recommended Agent Profile**:
- **Category**: `unspecified-high`
- Reason: New module with TDD workflow, HDF5 schema design, comprehensive tests
- **Skills**: []
- No special skills needed — standard Python + h5py
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 1-2 (with Task 3, after Task 1)
- **Blocks**: Tasks 4, 7
- **Blocked By**: Task 1
**References**:
**Pattern References**:
- `aruco/depth_pool.py:1-90` — Data format conventions: depth maps are `(H, W) float` in meters, confidence maps are `(H, W) float` with ZED semantics (lower = more confident)
- `calibrate_extrinsics.py:143-305` — How depth maps and confidence maps are collected per camera, how pool_metadata dict is structured
- `calibrate_extrinsics.py:120-131` — Function signature of `apply_depth_verify_refine_postprocess` showing the `verification_frames` data structure
**API/Type References**:
- `aruco/depth_verify.py:18-24` — `project_point_to_pixel(P_cam, K)` shows intrinsics matrix K format (3x3, fx/fy/cx/cy)
**Test References**:
- `tests/test_depth_pool.py` — Follow this test structure: parametric, synthetic data, edge cases with `pytest.raises`
- `tests/conftest.py` — sys.path setup for imports
**Documentation References**:
- `calibrate_extrinsics.py:338` — `results[str(serial)]["depth_pool"]` shows pool_metadata dict structure
**WHY Each Reference Matters**:
- `depth_pool.py` defines the array contracts (shape, dtype, units) the save module must preserve
- `calibrate_extrinsics.py:143-305` shows exactly where/how depth data is produced — the save module must capture this data
- Test patterns in `test_depth_pool.py` establish the project's testing conventions
**Acceptance Criteria**:
**TDD:**
- [ ] Test file created: `tests/test_depth_save.py`
- [ ] Tests cover: save, load, round-trip, schema validation, edge cases, error handling
- [ ] `uv run pytest tests/test_depth_save.py -v` → PASS (all tests, 0 failures)
**Agent-Executed QA Scenarios:**
```
Scenario: Round-trip save and load preserves data
Tool: Bash (pytest)
Preconditions: aruco/depth_save.py implemented
Steps:
1. uv run pytest tests/test_depth_save.py -v -k "round_trip"
2. Assert: exit code 0
3. Assert: output contains "PASSED"
Expected Result: Saved HDF5 loads back with identical data
Evidence: pytest output captured
Scenario: HDF5 schema has required metadata
Tool: Bash (pytest)
Preconditions: aruco/depth_save.py implemented
Steps:
1. uv run pytest tests/test_depth_save.py -v -k "schema"
2. Assert: exit code 0
3. Assert: tests verify /meta/schema_version, /meta/units, /meta/coordinate_frame
Expected Result: Schema metadata present and correct
Evidence: pytest output captured
Scenario: Module passes type checking
Tool: Bash (basedpyright)
Preconditions: Module implemented with type annotations
Steps:
1. uv run basedpyright aruco/depth_save.py
2. Assert: exit code 0 or only non-error diagnostics
Expected Result: No type errors
Evidence: basedpyright output captured
```
**Commit**: YES
- Message: `feat(aruco): add HDF5 depth map persistence module`
- Files: `aruco/depth_save.py`, `tests/test_depth_save.py`
- Pre-commit: `uv run pytest tests/test_depth_save.py`
---
- [x] 3. TDD: Create `aruco/ground_plane.py` — floor detection & consensus alignment core
**What to do**:
**RED phase** — Write `tests/test_ground_plane.py` first with tests for:
A. `unproject_depth_to_points(depth_map, K, T_world_cam, stride=4)`:
- Takes depth map + intrinsics + extrinsics → returns (N, 3) world-coordinate point cloud
- Test: synthetic depth of a flat plane at known height → verify recovered 3D points match expected positions
- Test: NaN/zero/negative depth values are excluded
- Test: stride parameter reduces output point count proportionally
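The unprojection in item A can be sketched with plain NumPy under a pinhole model. The sketch assumes no lens distortion, that `K` holds `fx`, `fy`, `cx`, `cy` in the usual positions, and that `T_world_cam` maps camera coordinates into world coordinates; none of these conventions are confirmed by the existing code.

```python
import numpy as np


def unproject_depth_to_points(depth_map, K, T_world_cam, stride=4):
    """Back-project a depth map (meters) into an (N, 3) world-frame point cloud."""
    depth = np.asarray(depth_map, dtype=np.float64)
    K = np.asarray(K, dtype=np.float64)
    T = np.asarray(T_world_cam, dtype=np.float64)
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]

    # Pixel coordinates of the sampled grid (v = row, u = column).
    v, u = np.mgrid[0:depth.shape[0]:stride, 0:depth.shape[1]:stride]
    z = depth[::stride, ::stride]

    # Drop NaN, infinite, zero and negative depths.
    with np.errstate(invalid="ignore"):
        valid = np.isfinite(z) & (z > 0)
    u, v, z = u[valid], v[valid], z[valid]

    # Pinhole back-projection into the camera frame, then a rigid transform to world.
    points_cam = np.stack([(u - cx) * z / fx, (v - cy) * z / fy, z], axis=1)
    return points_cam @ T[:3, :3].T + T[:3, 3]
```

With `stride=4` only every fourth row and column is sampled, so the output holds roughly one sixteenth of the valid pixels.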
B. `detect_floor_plane(points, normal_constraint, height_range, min_inlier_ratio, seed)`:
- Uses Open3D RANSAC `segment_plane()` on the point-cloud
- Returns `FloorPlaneResult(normal, offset, inliers, inlier_ratio, plane_model)`
- Test: synthetic flat floor + random noise → recovers correct plane within tolerance
- Test: synthetic floor + wall points → RANSAC ignores wall, finds floor (normal_constraint filters)
- Test: normal_constraint rejects planes whose normal is not near-vertical (e.g., a wall plane)
- Test: height_range rejects planes too far from expected floor height
- Test: too few inliers → returns None (below min_inlier_ratio)
- Test: seed parameter produces deterministic results
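To illustrate how the constraints in item B interact, here is a NumPy-only constrained RANSAC. It is a stand-in for illustration, not the planned implementation, which uses Open3D's `segment_plane()`. The parameters `up`, `max_normal_angle_deg`, `distance_threshold` and `num_iterations` are hypothetical names, the world up axis is assumed to be `[0, -1, 0]`, and `height_range` is interpreted as the plane's signed distance from the origin along its normal.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class FloorPlaneResult:
    normal: np.ndarray    # unit normal, oriented toward `up`
    offset: float         # d in normal . p + d = 0
    inliers: np.ndarray   # indices into the input points
    inlier_ratio: float


def detect_floor_plane(points, up=(0.0, -1.0, 0.0), max_normal_angle_deg=15.0,
                       height_range=(-0.5, 0.5), min_inlier_ratio=0.3,
                       distance_threshold=0.02, num_iterations=500, seed=0):
    """Constrained RANSAC plane fit; returns None when no confident floor is found."""
    pts = np.asarray(points, dtype=np.float64)
    if len(pts) < 3:
        return None
    up_vec = np.asarray(up, dtype=np.float64)
    up_vec = up_vec / np.linalg.norm(up_vec)
    min_cos = np.cos(np.radians(max_normal_angle_deg))
    rng = np.random.default_rng(seed)  # fixed seed gives deterministic results
    best = None
    for _ in range(num_iterations):
        p0, p1, p2 = pts[rng.choice(len(pts), size=3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        length = np.linalg.norm(normal)
        if length < 1e-12:
            continue  # degenerate (collinear) sample
        normal = normal / length
        if normal @ up_vec < 0:
            normal = -normal
        offset = -float(normal @ p0)
        # Reject hypotheses that are not floor-like before counting inliers.
        if normal @ up_vec < min_cos:
            continue
        if not (height_range[0] <= -offset <= height_range[1]):
            continue
        inliers = np.flatnonzero(np.abs(pts @ normal + offset) < distance_threshold)
        if best is None or len(inliers) > len(best.inliers):
            best = FloorPlaneResult(normal, offset, inliers, len(inliers) / len(pts))
    if best is None or best.inlier_ratio < min_inlier_ratio:
        return None
    return best
```

Filtering hypotheses inside the loop means a dominant wall can never win. With `segment_plane()` the same effect would likely need a check on the returned model, possibly followed by removing its inliers and retrying.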
C. `compute_consensus_plane(floor_results, camera_weights=None)`:
- Takes per-camera FloorPlaneResult list → fits a single consensus plane
- Method: concatenate all inlier points, weighted by inlier count, fit plane via SVD
- Test: two cameras seeing same plane → consensus matches individual planes
- Test: two cameras with slight disagreement → consensus is between them
- Test: camera weights affect result appropriately
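One reading of item C is that concatenating the inlier points already weights each camera by its inlier count, and `camera_weights` adds an optional per-camera multiplier. The sketch below follows that reading. It takes a dict of inlier point arrays keyed by serial instead of a `FloorPlaneResult` list, and the `fit_plane_svd` helper and `up` parameter are hypothetical.

```python
import numpy as np


def fit_plane_svd(points, weights=None):
    """Weighted least-squares plane through points; returns (unit normal, offset d)."""
    pts = np.asarray(points, dtype=np.float64)
    w = np.ones(len(pts)) if weights is None else np.asarray(weights, dtype=np.float64)
    centroid = (w[:, None] * pts).sum(axis=0) / w.sum()
    centered = (pts - centroid) * np.sqrt(w)[:, None]
    # The right singular vector with the smallest singular value is the plane normal.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]
    return normal, -float(normal @ centroid)


def compute_consensus_plane(inlier_points, camera_weights=None, up=(0.0, -1.0, 0.0)):
    """Fit one plane to the merged floor inliers of all cameras."""
    serials = sorted(inlier_points)
    pts = np.concatenate([np.asarray(inlier_points[s], dtype=np.float64) for s in serials])
    weights = None
    if camera_weights is not None:
        # Every point inherits the weight of the camera that observed it.
        weights = np.concatenate(
            [np.full(len(inlier_points[s]), float(camera_weights[s])) for s in serials]
        )
    normal, offset = fit_plane_svd(pts, weights)
    if normal @ np.asarray(up, dtype=np.float64) < 0:
        normal, offset = -normal, -offset  # keep a consistent orientation
    return normal, offset
```

For two cameras that see parallel floors 2 cm apart with equal point counts, the consensus lands halfway between them. Weighting one camera three times higher moves it to a quarter of the gap.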
D. `compute_floor_correction(T_world_cam, floor_result, consensus_plane, max_rotation_deg=5.0, max_translation_m=0.05)`:
- Computes constrained correction for a single camera
- Allowed DOF: pitch/roll + vertical translation ONLY (no yaw, no lateral)
- Uses `scipy.optimize.least_squares` with bounds
- Returns `CorrectionResult(T_corrected, delta_rotation_deg, delta_translation_m, applied)`
- Test: camera with 2° tilt from consensus → correction brings it within 0.1°
- Test: correction respects max_rotation_deg bound
- Test: correction respects max_translation_m bound
- Test: yaw component is preserved (no yaw drift)
- Test: lateral translation is preserved (no X/Z drift)
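The DOF constraint in item D can also be satisfied in closed form, which is useful as a reference when testing the bounded `least_squares` version. The sketch below is that closed-form alternative, not the optimizer the plan specifies. It assumes both normals are unit-length and point the same way, takes the plane parameters directly instead of a `FloorPlaneResult`, and clamps instead of optimizing.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class CorrectionResult:
    T_corrected: np.ndarray
    delta_rotation_deg: float
    delta_translation_m: float
    applied: bool


def _axis_angle_to_matrix(axis, angle):
    """Rodrigues' formula for a unit axis and an angle in radians."""
    x, y, z = axis
    k = np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])
    return np.eye(3) + np.sin(angle) * k + (1.0 - np.cos(angle)) * (k @ k)


def compute_floor_correction(T_world_cam, cam_normal, cam_offset, consensus_normal,
                             consensus_offset, max_rotation_deg=5.0, max_translation_m=0.05):
    """Tilt about a horizontal axis plus a shift along the consensus normal, both clamped."""
    T = np.asarray(T_world_cam, dtype=np.float64)
    n_cam = np.asarray(cam_normal, dtype=np.float64)
    n_cam = n_cam / np.linalg.norm(n_cam)
    n_con = np.asarray(consensus_normal, dtype=np.float64)
    n_con = n_con / np.linalg.norm(n_con)
    cam_pos = T[:3, 3]

    # The minimal rotation from n_cam to n_con has its axis perpendicular to n_con,
    # so it carries no rotation about the consensus up axis (no yaw).
    axis = np.cross(n_cam, n_con)
    sin_a = np.linalg.norm(axis)
    angle = min(float(np.arctan2(sin_a, n_cam @ n_con)), float(np.radians(max_rotation_deg)))
    R = np.eye(3) if sin_a < 1e-12 else _axis_angle_to_matrix(axis / sin_a, angle)

    # Rotate about the camera centre (position unchanged), then slide along n_con
    # until the camera's floor plane meets the consensus plane.
    n_new = R @ n_cam
    d_new = cam_offset + float(n_cam @ cam_pos) - float(n_new @ cam_pos)
    shift = (d_new - consensus_offset) / float(n_new @ n_con)
    shift = float(np.clip(shift, -max_translation_m, max_translation_m))

    T_corrected = np.eye(4)
    T_corrected[:3, :3] = R @ T[:3, :3]
    T_corrected[:3, 3] = cam_pos + shift * n_con
    applied = angle > 0.0 or shift != 0.0
    return CorrectionResult(T_corrected, float(np.degrees(angle)), shift, applied)
```

Because the translation only moves along the consensus normal, lateral position is preserved by construction. A bounded optimizer should reproduce the same answer whenever neither limit is active.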
**GREEN phase** — Implement `aruco/ground_plane.py`:
- Import `open3d` for RANSAC plane segmentation
- Import `scipy.optimize.least_squares` for constrained correction
- Reuse `aruco.alignment.rotation_align_vectors` where appropriate
- Reuse `aruco.pose_math.invert_transform` and `matrix_to_rvec_tvec`
- Use dataclasses for `FloorPlaneResult` and `CorrectionResult`
- All functions are pure (no side effects, no file I/O)
**REFACTOR phase** — Docstrings, type annotations, basedpyright.
**Must NOT do**:
- No ML/segmentation — RANSAC + geometric constraints only
- No global bundle adjustment
- No modification to existing alignment.py
- No dense reconstruction beyond floor plane extraction
**Recommended Agent Profile**:
- **Category**: `unspecified-high`
- Reason: Core algorithmic module with 4 major functions, each with multiple test cases. Requires understanding of SE3 geometry, RANSAC, and constrained optimization.
- **Skills**: []
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 1-2 (with Task 2, after Task 1)
- **Blocks**: Task 5
- **Blocked By**: Task 1
**References**:
**Pattern References**:
- `aruco/alignment.py:54-114` — `rotation_align_vectors(from_vec, to_vec)` — reuse for aligning floor normal to target up vector
- `aruco/alignment.py:117-137` — `apply_alignment_to_pose(T, R_align)` — pattern for applying global rotation to extrinsics
- `aruco/alignment.py:140-202` — `estimate_up_vector_from_cameras()` — existing camera-based "up" estimation, useful as initial guess for floor normal
- `aruco/depth_refine.py:12-20` — `extrinsics_to_params()` / `params_to_extrinsics()` — 6-DOF parameterization for optimization
- `aruco/depth_refine.py:71-180` — `refine_extrinsics_with_depth()` — pattern for bounded least_squares optimization of camera pose
- `aruco/depth_verify.py:18-24` — `project_point_to_pixel(P_cam, K)` — projection math
- `aruco/pose_math.py:22-28` — `invert_transform(T)` — efficient SE3 inversion
**API/Type References**:
- `aruco/alignment.py:7-16` — Type aliases: `Vec3`, `Mat33`, `Mat44`, `CornersNC`
- `aruco/depth_verify.py:8-15` — `DepthVerificationResult` dataclass pattern
**Test References**:
- `tests/test_alignment.py` — Testing convention for geometric functions (synthetic inputs, tolerance assertions)
- `tests/test_depth_refine.py` — Testing convention for optimization functions (before/after metrics)
**External References**:
- Open3D docs: `segment_plane(distance_threshold, ransac_n, num_iterations)` — returns `[a, b, c, d]` plane model + inlier indices
**WHY Each Reference Matters**:
- `alignment.py` provides the exact rotation-alignment primitives we need — no need to reimplement
- `depth_refine.py` establishes the bounded least-squares pattern with regularization — correction should follow the same style
- `test_alignment.py` shows how geometric tests are structured in this project (synthetic data, `assert_allclose`)
**Acceptance Criteria**:
**TDD:**
- [ ] Test file created: `tests/test_ground_plane.py`
- [ ] Tests cover: unproject, floor detection (happy + noise + wall + failure), consensus, correction (tilt + bounds + yaw preservation)
- [ ] `uv run pytest tests/test_ground_plane.py -v` → PASS (all tests, 0 failures)
**Agent-Executed QA Scenarios:**
```
Scenario: Floor detection on synthetic flat plane
Tool: Bash (pytest)
Preconditions: aruco/ground_plane.py implemented
Steps:
1. uv run pytest tests/test_ground_plane.py -v -k "detect_floor and synthetic_flat"
2. Assert: exit code 0
3. Assert: recovered normal within 1° of [0, -1, 0]
Expected Result: RANSAC correctly identifies flat floor
Evidence: pytest output captured
Scenario: Per-camera correction preserves yaw
Tool: Bash (pytest)
Preconditions: aruco/ground_plane.py implemented
Steps:
1. uv run pytest tests/test_ground_plane.py -v -k "correction and yaw"
2. Assert: exit code 0
3. Assert: yaw angle before == yaw angle after (within 0.01°)
Expected Result: Correction only affects pitch/roll + vertical translation
Evidence: pytest output captured
Scenario: Module passes type checking
Tool: Bash (basedpyright)
Preconditions: Module implemented with type annotations
Steps:
1. uv run basedpyright aruco/ground_plane.py
2. Assert: exit code 0 or only non-error diagnostics
Expected Result: No type errors
Evidence: basedpyright output captured
```
**Commit**: YES
- Message: `feat(aruco): add ground plane detection and per-camera correction module`
- Files: `aruco/ground_plane.py`, `tests/test_ground_plane.py`
- Pre-commit: `uv run pytest tests/test_ground_plane.py`
---
- [x] 4. Integrate `--save-depth` flag into `calibrate_extrinsics.py`
**What to do**:
- Add `--save-depth` Click option (type `click.Path()`, default `None`)
- When provided, after depth pooling/selection in `apply_depth_verify_refine_postprocess`, call `save_depth_data()` to persist:
- Pooled depth + confidence per camera
- Raw best-scored frames (depth + confidence + frame index + score)
- Camera intrinsics matrix K
- Pool metadata dict
- Log the output path and file size
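The opt-in behaviour can be sketched as a self-contained Click command. The `--refine-depth/--no-refine-depth` switch is a hypothetical stand-in for whatever the existing CLI uses to enable depth verification and refinement, and the `echo` calls stand in for the real `save_depth_data()` call and logging.

```python
from typing import Optional

import click


@click.command()
@click.option("--save-depth", "save_depth", type=click.Path(dir_okay=False), default=None,
              help="Write pooled and best-frame depth maps to this HDF5 file.")
@click.option("--refine-depth/--no-refine-depth", default=False,
              help="Hypothetical stand-in for the existing depth refinement switches.")
def main(save_depth: Optional[str], refine_depth: bool) -> None:
    if save_depth is None:
        return  # opt-in: nothing is written unless the flag is given
    if not refine_depth:
        # Depth data is only collected when depth post-processing runs.
        click.echo("warning: --save-depth ignored, depth refinement is disabled")
        return
    # In the real tool this is where save_depth_data(...) would be called.
    click.echo(f"saving depth data to {save_depth}")
```

Keeping `default=None` means existing invocations behave exactly as before, which matches the guardrail about not changing current depth processing.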
**Must NOT do**:
- Do not change existing depth processing behavior
- Do not make saving mandatory (only when `--save-depth` is provided)
- Do not save if depth verification/refinement is not enabled (warn and skip)
**Recommended Agent Profile**:
- **Category**: `unspecified-high`
- Reason: Integration into existing CLI with complex data flow, needs careful threading of data through the function
- **Skills**: []
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 2 (with Task 5)
- **Blocks**: Tasks 7, 8
- **Blocked By**: Tasks 1, 2
**References**:
**Pattern References**:
- `calibrate_extrinsics.py:562-678` — Click option definitions and `main()` function signature — follow exact same patterns for the new flag
- `calibrate_extrinsics.py:606-611` — `--depth-pool-size` option as example of depth-related flag
- `calibrate_extrinsics.py:120-305` — `apply_depth_verify_refine_postprocess()` — this is where depth data is available and where save should be triggered
- `calibrate_extrinsics.py:143-165` — Where `depth_maps` and `confidence_maps` lists are built per camera — data to capture for raw frames
- `calibrate_extrinsics.py:267-270` — Where `final_depth` and `pool_metadata` are determined — data to capture for pooled result
**API/Type References**:
- `aruco/depth_save.py` (Task 2 output) — `save_depth_data(path, camera_data, schema_version=1)` function signature
**Test References**:
- `tests/test_depth_cli_postprocess.py` — Existing test patterns for calibrate_extrinsics CLI post-processing behavior
- `tests/test_depth_pool_integration.py` — Integration test patterns with mocked depth data
**WHY Each Reference Matters**:
- `calibrate_extrinsics.py:562-678` is the exact location where the new flag must be added, following identical Click patterns
- `apply_depth_verify_refine_postprocess` is the function that has access to all depth data — save must be called from here or just after it
- Integration tests show how to mock ZED data for testing the full flow
**Acceptance Criteria**:
**TDD:**
- [ ] Test file updated or created: `tests/test_depth_save_integration.py`
- [ ] Tests cover: flag appears in help, save is called when flag provided, save is NOT called without flag
- [ ] `uv run pytest tests/test_depth_save_integration.py -v` → PASS
**Agent-Executed QA Scenarios:**
```
Scenario: --save-depth flag appears in CLI help
Tool: Bash
Preconditions: calibrate_extrinsics.py updated
Steps:
1. uv run python calibrate_extrinsics.py --help
2. Assert: output contains "--save-depth"
3. Assert: output contains "HDF5" or "depth" in the help text for the flag
Expected Result: Flag is documented in help
Evidence: Help output captured
Scenario: Existing tests still pass after integration
Tool: Bash (pytest)
Preconditions: calibrate_extrinsics.py updated
Steps:
1. uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py -v
2. Assert: exit code 0, no regressions
Expected Result: No existing behavior broken
Evidence: pytest output captured
```
**Commit**: YES
- Message: `feat(calibrate): add --save-depth flag for HDF5 depth persistence`
- Files: `calibrate_extrinsics.py`, `tests/test_depth_save_integration.py`
- Pre-commit: `uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py`
---
- [x] 5. Extend `aruco/ground_plane.py` with multi-camera workflow orchestration
**What to do**:
Add high-level orchestration functions that compose the primitives from Task 3:
A. `refine_ground_from_depth(camera_data, extrinsics, config)`:
- Main entry point: takes per-camera depth data + current extrinsics → returns corrected extrinsics + metrics
- Flow:
1. Per camera: `unproject_depth_to_points` → `detect_floor_plane`
2. `compute_consensus_plane` from all successful detections
3. Per camera: `compute_floor_correction` relative to consensus
4. Return corrected extrinsics dict + `RefinementMetrics`
- Config dataclass with: `max_rotation_deg`, `max_translation_m`, `ransac_distance_threshold`, `min_inlier_ratio`, `height_range`, `normal_constraint_deg`, `stride`, `seed`
- Metrics dataclass with: per-camera floor angles (before/after), consensus plane model, correction magnitudes, skipped cameras + reasons
B. Error handling:
- If floor detection fails for a camera → skip it, log warning, include in metrics
- If fewer than 2 cameras have valid floor → abort, return original extrinsics + error reason
- If correction exceeds bounds → cap at bounds, mark as `clamped` in metrics
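The control flow and the first two error-handling rules can be sketched independently of the geometry by passing the Task 3 primitives in as callables. The name `run_refinement`, the injected parameters `detect`, `fit_consensus` and `correct`, and the reduced `RefinementMetrics` fields are all illustrative; the real entry point takes a config object instead.

```python
import logging
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Tuple

logger = logging.getLogger(__name__)


@dataclass
class RefinementMetrics:
    corrected: List[str] = field(default_factory=list)
    skipped: Dict[str, str] = field(default_factory=dict)  # serial -> reason
    error: Optional[str] = None


def run_refinement(
    camera_data: Dict[str, Any],
    extrinsics: Dict[str, Any],
    detect: Callable[[Any, Any], Any],
    fit_consensus: Callable[[Dict[str, Any]], Any],
    correct: Callable[[Any, Any, Any], Any],
) -> Tuple[Dict[str, Any], RefinementMetrics]:
    metrics = RefinementMetrics()

    # Step 1: per-camera floor detection; a failure skips that camera only.
    floors: Dict[str, Any] = {}
    for serial, data in camera_data.items():
        floor = detect(data, extrinsics[serial])
        if floor is None:
            logger.warning("camera %s: floor not detected, skipping", serial)
            metrics.skipped[serial] = "floor not detected"
        else:
            floors[serial] = floor

    # Step 2: a consensus needs at least two cameras; otherwise return the input unchanged.
    if len(floors) < 2:
        metrics.error = "fewer than 2 cameras with a valid floor plane"
        return dict(extrinsics), metrics
    consensus = fit_consensus(floors)

    # Step 3: correct only the cameras that contributed a floor plane.
    refined = dict(extrinsics)
    for serial, floor in floors.items():
        refined[serial] = correct(extrinsics[serial], floor, consensus)
        metrics.corrected.append(serial)
    return refined, metrics
```

Injecting the primitives also makes the orchestration tests cheap, since stubs can force each failure path without building synthetic depth maps.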
**Must NOT do**:
- No file I/O in this module — pure computation
- No visualization — that's Task 6
**Recommended Agent Profile**:
- **Category**: `unspecified-high`
- Reason: Orchestration logic with error handling, config management, metrics collection
- **Skills**: []
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 2 (with Task 4)
- **Blocks**: Tasks 6, 7
- **Blocked By**: Tasks 1, 3
**References**:
**Pattern References**:
- `calibrate_extrinsics.py:120-131` — `apply_depth_verify_refine_postprocess()` signature — pattern for multi-camera orchestration function
- `aruco/depth_refine.py:71-227` — `refine_extrinsics_with_depth()` return value pattern: (result, stats_dict)
- `aruco/depth_verify.py:8-15` — `DepthVerificationResult` dataclass — pattern for structured results
**API/Type References**:
- `aruco/ground_plane.py` (Task 3 output) — All primitive functions: `unproject_depth_to_points`, `detect_floor_plane`, `compute_consensus_plane`, `compute_floor_correction`
**Test References**:
- `tests/test_ground_plane.py` (Task 3 output) — Unit test patterns to follow for orchestration tests
**WHY Each Reference Matters**:
- `apply_depth_verify_refine_postprocess` shows how multi-camera iteration with fallback is done in this codebase
- `depth_refine.py` shows the (result, stats) return pattern that should be followed
**Acceptance Criteria**:
**TDD:**
- [ ] Tests added to `tests/test_ground_plane.py` for orchestration functions
- [ ] Tests cover: full pipeline with 2-camera synthetic data, single-camera skip, all-cameras-fail abort, config bounds
- [ ] `uv run pytest tests/test_ground_plane.py -v` → PASS (all tests, 0 failures)
**Agent-Executed QA Scenarios:**
```
Scenario: Two-camera synthetic refinement produces level ground
Tool: Bash (pytest)
Preconditions: Orchestration functions implemented
Steps:
1. uv run pytest tests/test_ground_plane.py -v -k "refine_ground_from_depth and two_camera"
2. Assert: exit code 0
3. Assert: after correction, floor angle disagreement < 0.5°
Expected Result: Per-camera corrections level the ground
Evidence: pytest output captured
Scenario: Graceful fallback when floor detection fails for one camera
Tool: Bash (pytest)
Preconditions: Orchestration functions implemented
Steps:
1. uv run pytest tests/test_ground_plane.py -v -k "skip_camera"
2. Assert: exit code 0
3. Assert: skipped camera's extrinsics unchanged, other cameras corrected
Expected Result: Partial failure handled gracefully
Evidence: pytest output captured
```
**Commit**: YES
- Message: `feat(aruco): add multi-camera ground plane refinement orchestration`
- Files: `aruco/ground_plane.py`, `tests/test_ground_plane.py`
- Pre-commit: `uv run pytest tests/test_ground_plane.py`
---
- [x] 6. Create Plotly diagnostic visualization for ground plane refinement
**What to do**:
- Add a function `create_ground_diagnostic_plot(metrics, camera_data, extrinsics_before, extrinsics_after)` → returns `plotly.graph_objects.Figure`
- Add a function `save_diagnostic_plot(fig, path)` → writes HTML file
- Visualization contents:
- 3D scatter: floor inlier points per camera (color-coded by camera serial)
- Surface: consensus plane (semi-transparent)
- Camera frustums: before (dashed/faded) and after (solid) positions
- Annotation: per-camera correction magnitude (degrees + cm)
- Title: summary metrics (total disagreement before/after)
- Follow existing Plotly patterns from `visualize_extrinsics.py` and `compare_pose_sets.py`
**Must NOT do**:
- No interactive server or GUI — static HTML file only
- No Open3D visualization (use Plotly only, already a dep)
- No complex camera frustum rendering — simple cone or pyramid is fine
**Recommended Agent Profile**:
- **Category**: `unspecified-low`
- Reason: Visualization code following existing Plotly patterns, no complex algorithm
- **Skills**: []
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Wave 3 (sequential after Task 5)
- **Blocks**: Task 7
- **Blocked By**: Task 5
**References**:
**Pattern References**:
- `compare_pose_sets.py:145-200` — `add_camera_trace()` — Plotly camera visualization pattern (frustum + axes + labels)
- `visualize_extrinsics.py` — Full Plotly 3D scene setup with layout, ground plane, axis labels (check head of file for imports and patterns)
**Test References**:
- No heavy test required — visualization is a "nice to have". A smoke test that the function returns a `go.Figure` with expected trace count is sufficient.
**WHY Each Reference Matters**:
- `compare_pose_sets.py` already has Plotly camera rendering code that can be adapted
- `visualize_extrinsics.py` shows the full 3D scene pattern including ground plane rendering
**Acceptance Criteria**:
- [ ] Function `create_ground_diagnostic_plot` returns a `plotly.graph_objects.Figure`
- [ ] Figure contains traces for: floor points per camera, consensus plane surface, camera markers
- [ ] Smoke test: `uv run pytest tests/test_ground_plane.py -v -k "diagnostic_plot"` → PASS
**Agent-Executed QA Scenarios:**
```
Scenario: Diagnostic plot function returns a valid Plotly figure
Tool: Bash (pytest)
Preconditions: Visualization function implemented
Steps:
1. uv run pytest tests/test_ground_plane.py -v -k "diagnostic_plot"
2. Assert: exit code 0
3. Assert: test verifies Figure has correct number of traces
Expected Result: Plot function produces valid Plotly figure
Evidence: pytest output captured
```
**Commit**: YES (groups with Task 7)
- Message: `feat(aruco): add Plotly diagnostic visualization for ground plane`
- Files: `aruco/ground_plane.py` (viz function added), `tests/test_ground_plane.py`
- Pre-commit: `uv run pytest tests/test_ground_plane.py`
---
- [x] 7. Create `refine_ground_plane.py` — standalone CLI tool
**What to do**:
- Click CLI tool with the following options:
  - `--input-depth` / `-d`: Path to HDF5 depth file (from `--save-depth`)
  - `--input-extrinsics` / `-i`: Path to extrinsics JSON (from `calibrate_extrinsics.py`)
  - `--output-extrinsics` / `-o`: Path for corrected extrinsics JSON
  - `--metrics-json`: Optional path for machine-readable metrics output
  - `--plot` / `--no-plot`: Generate Plotly diagnostic (default: `--plot`)
  - `--plot-output`: Path for diagnostic HTML (default: `/ground_diagnostic.html`)
  - `--max-rotation-deg`: Max correction rotation (default: 5.0)
  - `--max-translation-m`: Max correction translation (default: 0.05)
  - `--ransac-threshold`: RANSAC distance threshold in meters (default: 0.02)
  - `--min-inlier-ratio`: Minimum inlier ratio to accept floor detection (default: 0.3)
  - `--height-range`: Expected floor height range as "min,max" (default: auto from data)
  - `--stride`: Depth map downsampling stride (default: 4)
  - `--seed`: Random seed for reproducibility (default: 42)
  - `--debug` / `--no-debug`: Verbose logging
- Flow:
  1. Load extrinsics JSON (reuse `compare_pose_sets.py:load_poses_from_json`)
  2. Load depth data from HDF5 (use `depth_save.load_depth_data`)
  3. Call `refine_ground_from_depth()` orchestration function
  4. Save corrected extrinsics (same JSON format as input, with `_meta.ground_refined: true`)
  5. Print summary metrics to stdout
  6. Optionally save metrics JSON
  7. Optionally generate diagnostic Plotly HTML
- Output extrinsics format: identical to `calibrate_extrinsics.py` output, with added `_meta.ground_refined` flag
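Two small pieces of the I/O wiring above can be sketched without any project code: parsing the `--height-range` string, and tagging the output with `_meta.ground_refined` while leaving the loaded input untouched. The helper names, and the assumption that `_meta` is a top-level object in a JSON dict keyed by camera serial, are illustrative; the real schema is whatever `calibrate_extrinsics.py` writes.

```python
import copy
import json


def parse_height_range(value):
    """Parse a "min,max" string into floats; None means "auto from data"."""
    if value is None:
        return None
    parts = value.split(",")
    if len(parts) != 2:
        raise ValueError(f"expected 'min,max', got {value!r}")
    low, high = (float(p) for p in parts)
    if low >= high:
        raise ValueError(f"min must be below max, got {value!r}")
    return low, high


def tag_ground_refined(extrinsics):
    """Return a deep copy with _meta.ground_refined set; the input is untouched."""
    out = copy.deepcopy(extrinsics)
    out.setdefault("_meta", {})["ground_refined"] = True
    return out


# Hypothetical input: one camera serial mapped to a placeholder pose entry.
loaded = {"12345678": {"pose": "placeholder"}}
tagged = tag_ground_refined(loaded)
print(json.dumps(tagged, indent=2))
print(parse_height_range("-0.1,0.1"))
```

Working on a deep copy keeps the "no modification of input files" rule easy to verify in tests, since the loaded dictionary can be compared against the original after the run.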
**Must NOT do**:
- No ZED SDK dependency — works entirely from saved files
- No modification of input files
- No interactive prompts
**Recommended Agent Profile**:
- **Category**: `unspecified-high`
- Reason: Full CLI tool composing multiple modules, end-to-end data flow, error handling, multiple output formats
- **Skills**: []
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Wave 3 (depends on 2, 5, 6)
- **Blocks**: Task 8
- **Blocked By**: Tasks 2, 5, 6
**References**:
**Pattern References**:
- `calibrate_extrinsics.py:562-678` — Click CLI pattern with extensive options, logging, error handling
- `compare_pose_sets.py:52-88` — `load_poses_from_json()` — JSON extrinsics loading pattern
- `compare_pose_sets.py:91-92` — `serialize_pose()` — JSON extrinsics saving pattern
- `visualize_extrinsics.py` — CLI tool that loads extrinsics + generates Plotly output
**API/Type References**:
- `aruco/depth_save.py` (Task 2) — `load_depth_data(path)` return type
- `aruco/ground_plane.py` (Tasks 3, 5) — `refine_ground_from_depth()` signature and return type
- `aruco/ground_plane.py` (Task 6) — `create_ground_diagnostic_plot()` signature
**WHY Each Reference Matters**:
- `calibrate_extrinsics.py` CLI is the canonical pattern for Click tools in this project
- `compare_pose_sets.py` shows how to load/save the extrinsics JSON format correctly
- The ground_plane module provides all computation — CLI just wires I/O to computation
**Acceptance Criteria**:
**TDD:**
- [ ] Test file created: `tests/test_refine_ground_cli.py`
- [ ] Tests cover: help output, valid invocation with synthetic data, missing input error, output file creation, metrics JSON format
- [ ] `uv run pytest tests/test_refine_ground_cli.py -v` → PASS
**Agent-Executed QA Scenarios:**
```
Scenario: CLI help shows all expected options
Tool: Bash
Preconditions: refine_ground_plane.py created
Steps:
1. uv run python refine_ground_plane.py --help
2. Assert: output contains "--input-depth", "--input-extrinsics", "--output-extrinsics"
3. Assert: output contains "--max-rotation-deg", "--ransac-threshold", "--seed"
4. Assert: exit code 0
Expected Result: All options documented
Evidence: Help output captured
Scenario: Tool produces valid extrinsics JSON
Tool: Bash
Preconditions: Synthetic HDF5 and extrinsics JSON created by test fixtures
Steps:
1. uv run pytest tests/test_refine_ground_cli.py -v -k "produces_valid_json"
2. Assert: exit code 0
3. Assert: output JSON is valid, contains all camera serials, has _meta.ground_refined
Expected Result: Output matches extrinsics JSON schema
Evidence: pytest output captured
Scenario: Metrics JSON contains before/after comparison
Tool: Bash
Preconditions: Test creates and runs CLI with --metrics-json
Steps:
1. uv run pytest tests/test_refine_ground_cli.py -v -k "metrics_json"
2. Assert: exit code 0
3. Assert: metrics has 'floor.angle_disagreement_deg_before' and 'floor.angle_disagreement_deg_after'
Expected Result: Machine-readable improvement metrics produced
Evidence: pytest output captured
```
**Commit**: YES
- Message: `feat: add refine_ground_plane.py standalone CLI tool`
- Files: `refine_ground_plane.py`, `tests/test_refine_ground_cli.py`
- Pre-commit: `uv run pytest tests/test_refine_ground_cli.py`
---
- [x] 8. Final integration: full test suite pass + basedpyright + README update
**What to do**:
- Run the FULL test suite: `uv run pytest -x -vv`
- Run basedpyright on all new files: `uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py`
- Fix any regressions or type errors
- Add usage example to `README.md` showing the depth-save → ground-refine workflow:
```bash
# Step 1: Calibrate with depth saving
uv run calibrate_extrinsics.py ... --refine-depth --save-depth output/depth_data.h5
# Step 2: Refine ground plane
uv run refine_ground_plane.py \
--input-depth output/depth_data.h5 \
--input-extrinsics output/extrinsics.json \
--output-extrinsics output/extrinsics_ground_refined.json \
--plot-output output/ground_diagnostic.html
```
**Must NOT do**:
- Do not modify any test behavior — only fix genuine regressions
- Do not add features — this is stabilization only
**Recommended Agent Profile**:
- **Category**: `unspecified-low`
- Reason: Verification and minor fixups, no new features
- **Skills**: []
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Wave 3 (final, sequential)
- **Blocks**: None (terminal)
- **Blocked By**: All previous tasks
**References**:
**Pattern References**:
- `README.md` — Existing usage examples for `calibrate_extrinsics.py` and `visualize_extrinsics.py`
- `pyproject.toml:39-41` — pytest configuration (`testpaths`, `norecursedirs`)
**WHY Each Reference Matters**:
- README has existing command examples that the new workflow should follow in format/style
- pyproject.toml pytest config ensures all test directories are covered
**Acceptance Criteria**:
- [ ] `uv run pytest -x -vv` → ALL tests pass, 0 failures, 0 errors
- [ ] `uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py` → 0 errors
- [ ] README.md contains usage example for the new ground refinement workflow
**Agent-Executed QA Scenarios:**
```
Scenario: Full test suite passes
Tool: Bash (pytest)
Preconditions: All previous tasks completed
Steps:
1. uv run pytest -x -vv
2. Assert: exit code 0
3. Assert: all tests pass, 0 failures
Expected Result: No regressions introduced
Evidence: Full pytest output captured
Scenario: Type checking passes
Tool: Bash (basedpyright)
Preconditions: All new modules written
Steps:
1. uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py
2. Assert: no error-level diagnostics
Expected Result: Type-safe code
Evidence: basedpyright output captured
```
**Commit**: YES
- Message: `chore: final integration pass — tests, types, README for ground plane refinement`
- Files: `README.md`, any fixup files
- Pre-commit: `uv run pytest -x -vv`
---
## Commit Strategy
| After Task | Message | Files | Verification |
|------------|---------|-------|--------------|
| 1 | `build(deps): add open3d and h5py for ground plane refinement` | `pyproject.toml`, `uv.lock` | `uv run python -c "import open3d; import h5py"` |
| 2 | `feat(aruco): add HDF5 depth map persistence module` | `aruco/depth_save.py`, `tests/test_depth_save.py` | `uv run pytest tests/test_depth_save.py` |
| 3 | `feat(aruco): add ground plane detection and per-camera correction module` | `aruco/ground_plane.py`, `tests/test_ground_plane.py` | `uv run pytest tests/test_ground_plane.py` |
| 4 | `feat(calibrate): add --save-depth flag for HDF5 depth persistence` | `calibrate_extrinsics.py`, `tests/test_depth_save_integration.py` | `uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py` |
| 5 | `feat(aruco): add multi-camera ground plane refinement orchestration` | `aruco/ground_plane.py`, `tests/test_ground_plane.py` | `uv run pytest tests/test_ground_plane.py` |
| 6 | `feat(aruco): add Plotly diagnostic visualization for ground plane` | `aruco/ground_plane.py`, `tests/test_ground_plane.py` | `uv run pytest tests/test_ground_plane.py` |
| 7 | `feat: add refine_ground_plane.py standalone CLI tool` | `refine_ground_plane.py`, `tests/test_refine_ground_cli.py` | `uv run pytest tests/test_refine_ground_cli.py` |
| 8 | `chore: final integration pass — tests, types, README for ground plane refinement` | `README.md`, fixups | `uv run pytest -x -vv` |
---
## Success Criteria
### Verification Commands
```bash
# All tests pass
uv run pytest -x -vv # Expected: 0 failures
# Type checking passes
uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py # Expected: 0 errors
# CLI tools have correct help
uv run python calibrate_extrinsics.py --help | grep "save-depth" # Expected: --save-depth appears
uv run python refine_ground_plane.py --help # Expected: all options listed
# Dependencies installed
uv run python -c "import open3d; import h5py; print('ok')" # Expected: ok
```
### Final Checklist
- [x] All "Must Have" requirements present
- [x] All "Must NOT Have" exclusions absent (no core pipeline changes, no ML, no non-flat floors)
- [x] All tests pass (`uv run pytest -x -vv`)
- [x] Type checking passes (`uv run basedpyright`)
- [x] HDF5 depth saving works end-to-end (save → load round-trip)
- [x] Ground plane refinement produces measurably improved floor alignment
- [x] Output extrinsics JSON matches existing format (compatible with `visualize_extrinsics.py`)
- [x] Diagnostic Plotly HTML generated successfully
- [x] README updated with usage workflow