Ground Plane Refinement & Depth Map Persistence
TL;DR
Quick Summary: Fix inter-camera ground plane disagreement by adding depth-based floor detection and per-camera extrinsic correction as a standalone post-processing tool. Also add HDF5 depth map persistence so SVO re-reading is not needed for iterative refinement.
Deliverables:
- `--save-depth` flag in `calibrate_extrinsics.py` → HDF5 depth persistence
- New `aruco/depth_save.py` module for HDF5 read/write
- New `aruco/ground_plane.py` module for floor detection + consensus alignment
- New `refine_ground_plane.py` standalone CLI tool
- Plotly diagnostic visualization (before/after floor alignment)
- Full TDD test suite for all new modules
- New dependencies: `open3d`, `h5py`
Estimated Effort: Large
Parallel Execution: YES (3 waves)
Critical Path: Task 1 (deps) → Task 3 (ground plane core) → Task 5 (orchestration) → Task 6 (visualization) → Task 7 (CLI tool) → Task 8 (integration tests)
Context
Original Request
User's calibrate_extrinsics.py produces extrinsics where the ground plane is not level — specifically, different cameras disagree about where the ground is when overlaying world-coordinate point clouds. The error is small (1-3° tilt, <2cm offset) across a 2-4 camera ZED setup. User wants:
- A way to refine the calibration using actual floor depth observations
- Saved pooled depth maps to avoid re-reading SVOs for iterative refinement
Interview Summary
Key Discussions:
- Core problem: Inter-camera disagreement, not just global tilt. Point clouds from different cameras don't align on the floor surface.
- Integration approach: Post-processing tool (standalone CLI), not integrated into existing pipeline.
- Library choice: Open3D for point cloud operations (user wants it available for future work). h5py for HDF5 persistence.
- Refinement granularity: Per-camera correction (each camera gets its own correction based on its floor observations).
- Depth saving: Opt-in via `--save-depth <dir>` flag. Save pooled + raw best frames per camera.
- Save format: HDF5 via h5py with versioned schema.
- Visualization: Plotly HTML diagnostic (floor points per camera, consensus plane, before/after).
- Test strategy: TDD with pytest, following existing test patterns.
Research Findings:
- `alignment.py` has `rotation_align_vectors()` for aligning normals, reusable for floor alignment
- `depth_pool.py` does median pooling but never persists results
- `depth_refine.py` has `scipy.optimize.least_squares` infrastructure for pose optimization
- `compare_pose_sets.py` has Kabsch `rigid_transform_3d()` for rigid alignment
- `depth_verify.py` has `project_point_to_pixel()` and depth residual computation
- Current pipeline: ArUco → PnP → RANSAC averaging → depth refinement (sparse, marker corners only) → alignment (marker normals only)
- Open3D provides `segment_plane()` for RANSAC plane fitting on point clouds
Metis Review
Identified Gaps (addressed):
- Correction DOF: Must constrain to pitch/roll + vertical translation only (no yaw drift, no lateral drift). Addressed via bounded optimization.
- RANSAC plane robustness: Must constrain plane normal to near-vertical and height to expected range, plus ROI masking. Addressed via configurable constraints.
- HDF5 schema versioning: Must include `/meta/schema_version`, units, intrinsics, coordinate frame. Addressed in schema design.
- Failure mode for missing floor: If plane detection fails for one camera, skip that camera and warn (don't fail entire run). Addressed in error handling design.
- Reproducibility: RANSAC seed control for deterministic tests. Addressed via `seed` parameter.
- Per-camera correction risk: May break inter-camera rigidity. Addressed via correction bounds + pre/post metrics reporting.
- Consensus plane definition: Use merged inlier points from all cameras, weighted by inlier count. Addressed in algorithm design.
Work Objectives
Core Objective
Enable depth-based ground plane refinement that corrects per-camera extrinsic errors (1-3° tilt, <2cm vertical offset) by detecting the actual physical floor surface from ZED depth maps and aligning all cameras to a consensus ground plane.
Concrete Deliverables
- `aruco/depth_save.py`: HDF5 read/write module for depth maps + metadata
- `aruco/ground_plane.py`: Floor detection (RANSAC), consensus plane fitting, per-camera correction
- `refine_ground_plane.py`: Standalone Click CLI tool
- `--save-depth` flag added to `calibrate_extrinsics.py`
- `tests/test_depth_save.py`: TDD tests for depth persistence
- `tests/test_ground_plane.py`: TDD tests for floor detection + alignment
- `tests/test_refine_ground_cli.py`: TDD tests for CLI tool
- Plotly diagnostic HTML output
Definition of Done
- `uv run pytest tests/test_depth_save.py` → all tests pass
- `uv run pytest tests/test_ground_plane.py` → all tests pass
- `uv run pytest tests/test_refine_ground_cli.py` → all tests pass
- `uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py` → no errors
- `uv run python calibrate_extrinsics.py --help` shows `--save-depth` flag
- `uv run python refine_ground_plane.py --help` shows expected options
- End-to-end: calibrate → save depth → refine ground → produces valid extrinsics JSON
Must Have
- Per-camera RANSAC floor plane detection from depth maps
- Consensus plane fitting from merged floor points
- Constrained per-camera correction (pitch/roll + vertical translation, no yaw/lateral)
- Correction bounds with configurable limits (default: max 5° rotation, max 5cm translation)
- "No-op if not confident" threshold — skip correction if RANSAC inlier ratio is too low
- HDF5 schema with versioning and full metadata (intrinsics, units, resolution, frame indices)
- Diagnostic metrics: per-camera plane normal angles, consensus disagreement before/after, correction magnitudes
- Plotly visualization of floor points + consensus plane + before/after camera poses
Must NOT Have (Guardrails)
- NO changes to ArUco detection, PnP, or RANSAC pose averaging logic
- NO changes to existing `depth_refine.py` or `depth_verify.py` behavior
- NO non-flat floor handling (ramps, stairs, multi-level)
- NO dense multi-view reconstruction beyond the floor plane
- NO automatic scene segmentation or ML-based floor detection
- NO global bundle adjustment across all cameras
- NO saving of every frame's depth data — only pooled + curated best subset
- NO GUI requirements — visualization is optional Plotly HTML output
- NO modification of the extrinsics JSON schema (output format matches existing convention)
Verification Strategy (MANDATORY)
UNIVERSAL RULE: ZERO HUMAN INTERVENTION
ALL tasks in this plan MUST be verifiable WITHOUT any human action.
Test Decision
- Infrastructure exists: YES (`pytest` configured in `pyproject.toml`)
- Automated tests: TDD (tests first)
- Framework: `pytest` (existing)
If TDD Enabled
Each TODO follows RED-GREEN-REFACTOR:
Task Structure:
- RED: Write failing test first
  - Test file: `tests/test_<module>.py`
  - Test command: `uv run pytest tests/test_<module>.py`
  - Expected: FAIL (test exists, implementation doesn't)
- GREEN: Implement minimum code to pass
  - Command: `uv run pytest tests/test_<module>.py`
  - Expected: PASS
- REFACTOR: Clean up while keeping green
  - Command: `uv run pytest tests/test_<module>.py`
  - Expected: PASS (still)
Agent-Executed QA Scenarios (MANDATORY — ALL tasks)
Verification Tool by Deliverable Type:
| Type | Tool | How Agent Verifies |
|---|---|---|
| Python module | Bash (pytest) | Run tests, assert pass count, zero failures |
| CLI tool | Bash (click --help + invocation) | Check help output, run with test data, verify exit code and output |
| HDF5 file | Bash (python -c "import h5py; ...") | Open file, check schema, validate data shapes |
| Type checking | Bash (basedpyright) | Run type checker, verify zero errors |
| Plotly output | Bash (file existence + python parse) | Check file exists, contains valid HTML, has expected traces |
Execution Strategy
Parallel Execution Waves
Wave 1 (Start Immediately):
├── Task 1: Add open3d + h5py dependencies
├── Task 2: TDD depth save module (aruco/depth_save.py) [after Task 1]
└── Task 3: TDD ground plane core module (aruco/ground_plane.py) [after Task 1]
Wave 2 (After Wave 1):
├── Task 4: Integrate --save-depth into calibrate_extrinsics.py [depends: 1, 2]
└── Task 5: Ground plane consensus + per-camera correction [depends: 1, 3]
Wave 3 (After Wave 2):
├── Task 6: Plotly diagnostic visualization module [depends: 5]
├── Task 7: refine_ground_plane.py CLI tool [depends: 2, 5, 6]
└── Task 8: Integration tests + basedpyright pass [depends: all]
Critical Path: Task 1 → Task 2 → Task 4 (depth save path)
Task 1 → Task 3 → Task 5 → Task 7 (ground plane path)
Dependency Matrix
| Task | Depends On | Blocks | Can Parallelize With |
|---|---|---|---|
| 1 | None | 2, 3 | None (must be first) |
| 2 | 1 | 4, 7 | 3 |
| 3 | 1 | 5 | 2 |
| 4 | 1, 2 | 7, 8 | 5 |
| 5 | 1, 3 | 6, 7 | 4 |
| 6 | 5 | 7 | 4 |
| 7 | 2, 5, 6 | 8 | None |
| 8 | All | None | None (final) |
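As a cross-check of the matrix, the following sketch derives each task's dependency depth and the longest prerequisite chain from the "Depends On" column alone. The function names are illustrative and not part of the plan; "All" for Task 8 is read as Tasks 1 through 7.

```python
# Dependency edges copied from the "Depends On" column of the matrix.
DEPENDS_ON: dict[int, list[int]] = {
    1: [],
    2: [1],
    3: [1],
    4: [1, 2],
    5: [1, 3],
    6: [5],
    7: [2, 5, 6],
    8: [1, 2, 3, 4, 5, 6, 7],  # "All" interpreted as every earlier task
}


def dependency_depths(deps: dict[int, list[int]]) -> dict[int, int]:
    """Return, for each task, the length of its longest prerequisite chain."""
    depths: dict[int, int] = {}

    def depth(task: int) -> int:
        if task not in depths:
            # A task with no prerequisites gets depth 0.
            depths[task] = 1 + max((depth(d) for d in deps[task]), default=-1)
        return depths[task]

    for task in deps:
        depth(task)
    return depths


def longest_chain(deps: dict[int, list[int]]) -> list[int]:
    """Walk back from the deepest task, always following the deepest prerequisite."""
    depths = dependency_depths(deps)
    task = max(depths, key=lambda t: depths[t])
    chain = [task]
    while deps[task]:
        task = max(deps[task], key=lambda t: depths[t])
        chain.append(task)
    return chain[::-1]


print(dependency_depths(DEPENDS_ON))
print(longest_chain(DEPENDS_ON))
```

With these edges the longest chain is 1, 3, 5, 6, 7, 8, so the ground plane path is the one that bounds total duration, while Tasks 2 and 4 have slack.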
Agent Dispatch Summary
| Wave | Tasks | Recommended Agents |
|---|---|---|
| 1 | 1 | task(category="quick", ...) |
| 1→2 | 2, 3 | task(category="unspecified-high", ...) — parallel after Task 1 |
| 2 | 4, 5 | task(category="unspecified-high", ...) — parallel |
| 3 | 6 | task(category="unspecified-low", ...) |
| 3 | 7 | task(category="unspecified-high", ...) |
| 3 | 8 | task(category="unspecified-low", ...) |
TODOs
- 1. Add `open3d` and `h5py` dependencies to `pyproject.toml`
What to do:
- Add `open3d` and `h5py` to the `[project] dependencies` list in `pyproject.toml`
- Run `uv sync` to install
- Verify imports work: `uv run python -c "import open3d; import h5py; print('ok')"`
Must NOT do:
- Do not add unnecessary deps (no trimesh, no probreg, no pycpd)
- Do not modify any other pyproject.toml sections
Recommended Agent Profile:
- Category: `quick`
  - Reason: Single file edit + one command
- Skills: []
  - No special skills needed for a dependency addition
Parallelization:
- Can Run In Parallel: NO
- Parallel Group: Wave 1 (solo — must complete before 2, 3)
- Blocks: Tasks 2, 3, 4, 5, 6, 7, 8
- Blocked By: None
References:
Pattern References:
- `pyproject.toml:7-27`: Existing dependency list format and conventions (e.g., `"scipy>=1.17.0"`)
Acceptance Criteria:
- `pyproject.toml` contains `open3d` and `h5py` in dependencies
- `uv sync` completes without error
- `uv run python -c "import open3d; import h5py; print('ok')"` prints `ok` and exits 0
Agent-Executed QA Scenarios:
Scenario: Dependencies install and import correctly
  Tool: Bash
  Preconditions: pyproject.toml edited
  Steps:
    1. uv sync
    2. uv run python -c "import open3d; print(open3d.__version__)"
    3. Assert: exit code 0, version string printed
    4. uv run python -c "import h5py; print(h5py.__version__)"
    5. Assert: exit code 0, version string printed
  Expected Result: Both libraries installed and importable
  Evidence: Command output captured
Commit: YES
- Message: `build(deps): add open3d and h5py for ground plane refinement`
- Files: `pyproject.toml`, `uv.lock`
- Pre-commit: `uv run python -c "import open3d; import h5py"`
- 2. TDD: Create `aruco/depth_save.py`: HDF5 depth map persistence module
What to do:
RED phase: Write `tests/test_depth_save.py` first with tests for:
- `save_depth_data()`: saves pooled depth + confidence + raw frames + intrinsics + metadata to HDF5
- `load_depth_data()`: loads HDF5 back into structured dict
- Round-trip test: save → load → compare arrays with `np.testing.assert_allclose`
- Schema validation: check `/meta/schema_version`, `/meta/units`, `/meta/coordinate_frame`
- Per-camera groups: `/<serial>/pooled_depth`, `/<serial>/pooled_confidence`, `/<serial>/raw_frames/<idx>/depth`, `/<serial>/intrinsics`
- Edge cases: single camera, no confidence map, no raw frames
- Error handling: invalid path, empty data
GREEN phase: Implement `aruco/depth_save.py`:
- `save_depth_data(path, camera_data, schema_version=1)`: writes HDF5
- `load_depth_data(path)`: reads HDF5 back to dict
- Schema version 1 layout:
    /meta/
      schema_version: int = 1
      units: str = "meters"
      coordinate_frame: str = "world_from_cam"
      created_at: str (ISO 8601)
    /<serial>/
      intrinsics: (3, 3) float64, camera matrix K
      resolution: (2,) int, [width, height]
      pooled_depth: (H, W) float32
      pooled_confidence: (H, W) float32 [optional]
      pool_metadata: JSON string (same dict currently in results)
      raw_frames/
        0/depth: (H, W) float32
        0/confidence: (H, W) float32 [optional]
        0/frame_index: int
        0/score: float
        1/depth: ...
- Use `h5py` compression: `compression="gzip"`, `compression_opts=4`
- Type annotations on all public functions
REFACTOR phase: Clean up, add docstrings, run basedpyright.
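To make the schema v1 layout concrete, here is a minimal round-trip sketch for a single camera. It is not the planned `save_depth_data` / `load_depth_data` implementation: the function names, the serial `"12345678"`, and the `pool_size` metadata key are placeholders, raw frames and confidence maps are left out, and storing the `/meta` scalars and `pool_metadata` as HDF5 attributes is an assumption, since the layout does not say whether they are attributes or datasets.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

import h5py
import numpy as np


def save_depth_sketch(path, serial, pooled_depth, K, pool_metadata):
    """Write one camera in the schema v1 layout (raw_frames omitted for brevity)."""
    with h5py.File(path, "w") as f:
        meta = f.create_group("meta")
        # Assumption: scalar metadata lives in HDF5 attributes of /meta.
        meta.attrs["schema_version"] = 1
        meta.attrs["units"] = "meters"
        meta.attrs["coordinate_frame"] = "world_from_cam"
        meta.attrs["created_at"] = datetime.now(timezone.utc).isoformat()

        cam = f.create_group(serial)
        height, width = pooled_depth.shape
        cam.create_dataset("intrinsics", data=np.asarray(K, dtype=np.float64))
        cam.create_dataset("resolution", data=np.array([width, height], dtype=np.int64))
        cam.create_dataset(
            "pooled_depth",
            data=np.asarray(pooled_depth, dtype=np.float32),
            compression="gzip",
            compression_opts=4,
        )
        # Assumption: pool_metadata is stored as a JSON string attribute.
        cam.attrs["pool_metadata"] = json.dumps(pool_metadata)


def load_depth_sketch(path):
    """Read the file back into plain Python/NumPy objects."""
    out = {}
    with h5py.File(path, "r") as f:
        out["meta"] = {key: f["meta"].attrs[key] for key in f["meta"].attrs}
        for serial in f:
            if serial == "meta":
                continue
            cam = f[serial]
            out[serial] = {
                "intrinsics": cam["intrinsics"][()],
                "resolution": cam["resolution"][()],
                "pooled_depth": cam["pooled_depth"][()],
                "pool_metadata": json.loads(cam.attrs["pool_metadata"]),
            }
    return out


# Round trip on synthetic data.
K = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
depth = np.random.default_rng(0).uniform(0.5, 5.0, size=(48, 64)).astype(np.float32)
with tempfile.TemporaryDirectory() as tmp:
    h5_path = Path(tmp) / "depth.h5"
    save_depth_sketch(h5_path, "12345678", depth, K, {"pool_size": 5})
    loaded = load_depth_sketch(h5_path)
np.testing.assert_allclose(loaded["12345678"]["pooled_depth"], depth)
```

Only the image-sized dataset is compressed here; gzip needs chunked storage, which small fixed arrays such as `intrinsics` do not benefit from.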
Must NOT do:
- Do not modify existing `depth_pool.py` or `depth_verify.py`
- Do not add ZED SDK dependency to this module (pure numpy/h5py)
- Do not save uncompressed data
Recommended Agent Profile:
- Category: `unspecified-high`
  - Reason: New module with TDD workflow, HDF5 schema design, comprehensive tests
- Skills: []
  - No special skills needed (standard Python + h5py)
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 1-2 (with Task 3, after Task 1)
- Blocks: Tasks 4, 7
- Blocked By: Task 1
References:
Pattern References:
- `aruco/depth_pool.py:1-90`: Data format conventions: depth maps are `(H, W) float` in meters, confidence maps are `(H, W) float` with ZED semantics (lower = more confident)
- `calibrate_extrinsics.py:143-305`: How depth maps and confidence maps are collected per camera, how the pool_metadata dict is structured
- `calibrate_extrinsics.py:120-131`: Function signature of `apply_depth_verify_refine_postprocess` showing the `verification_frames` data structure
API/Type References:
- `aruco/depth_verify.py:18-24`: `project_point_to_pixel(P_cam, K)` shows intrinsics matrix K format (3x3, fx/fy/cx/cy)
Test References:
- `tests/test_depth_pool.py`: Follow this test structure: parametric, synthetic data, edge cases with `pytest.raises`
- `tests/conftest.py`: sys.path setup for imports
Documentation References:
- `calibrate_extrinsics.py:338`: `results[str(serial)]["depth_pool"]` shows pool_metadata dict structure
WHY Each Reference Matters:
- `depth_pool.py` defines the array contracts (shape, dtype, units) the save module must preserve
- `calibrate_extrinsics.py:143-305` shows exactly where/how depth data is produced; the save module must capture this data
- Test patterns in `test_depth_pool.py` establish the project's testing conventions
Acceptance Criteria:
TDD:
- Test file created: `tests/test_depth_save.py`
- Tests cover: save, load, round-trip, schema validation, edge cases, error handling
- `uv run pytest tests/test_depth_save.py -v` → PASS (all tests, 0 failures)
Agent-Executed QA Scenarios:
Scenario: Round-trip save and load preserves data
  Tool: Bash (pytest)
  Preconditions: aruco/depth_save.py implemented
  Steps:
    1. uv run pytest tests/test_depth_save.py -v -k "round_trip"
    2. Assert: exit code 0
    3. Assert: output contains "PASSED"
  Expected Result: Saved HDF5 loads back with identical data
  Evidence: pytest output captured
Scenario: HDF5 schema has required metadata
  Tool: Bash (pytest)
  Preconditions: aruco/depth_save.py implemented
  Steps:
    1. uv run pytest tests/test_depth_save.py -v -k "schema"
    2. Assert: exit code 0
    3. Assert: tests verify /meta/schema_version, /meta/units, /meta/coordinate_frame
  Expected Result: Schema metadata present and correct
  Evidence: pytest output captured
Scenario: Module passes type checking
  Tool: Bash (basedpyright)
  Preconditions: Module implemented with type annotations
  Steps:
    1. uv run basedpyright aruco/depth_save.py
    2. Assert: exit code 0 or only non-error diagnostics
  Expected Result: No type errors
  Evidence: basedpyright output captured
Commit: YES
- Message: `feat(aruco): add HDF5 depth map persistence module`
- Files: `aruco/depth_save.py`, `tests/test_depth_save.py`
- Pre-commit: `uv run pytest tests/test_depth_save.py`
- 3. TDD: Create `aruco/ground_plane.py`: floor detection & consensus alignment core
What to do:
RED phase: Write `tests/test_ground_plane.py` first with tests for:
A. `unproject_depth_to_points(depth_map, K, T_world_cam, stride=4)`:
- Takes depth map + intrinsics + extrinsics → returns (N, 3) world-coordinate point cloud
- Test: synthetic depth of a flat plane at known height → verify recovered 3D points match expected positions
- Test: NaN/zero/negative depth values are excluded
- Test: stride parameter reduces output point count proportionally
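A minimal NumPy sketch of the unprojection described in A, assuming a pinhole model with `K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]` and assuming `T_world_cam` is a 4x4 matrix that maps camera-frame points into the world frame. The planned function may differ in details such as dtype handling.

```python
import numpy as np


def unproject_depth_to_points(depth_map, K, T_world_cam, stride=4):
    """Back-project valid depth pixels to an (N, 3) world-frame point cloud."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    height, width = depth_map.shape
    sub = depth_map[::stride, ::stride]
    # Pixel coordinates (row v, column u) of the sampled grid in the full image.
    v, u = np.mgrid[0:height:stride, 0:width:stride]
    with np.errstate(invalid="ignore"):
        valid = np.isfinite(sub) & (sub > 0)  # drop NaN, inf, zero, negative
    z = sub[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    pts_cam = np.stack([x, y, z], axis=1)
    # Rigid transform from camera frame to world frame.
    return pts_cam @ T_world_cam[:3, :3].T + T_world_cam[:3, 3]
```

Because the stride applies to both image axes, the point count shrinks by roughly the square of the stride, which is what the stride test should expect.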
B. `detect_floor_plane(points, normal_constraint, height_range, min_inlier_ratio, seed)`:
- Uses Open3D RANSAC `segment_plane()` on the point cloud
- Returns `FloorPlaneResult(normal, offset, inliers, inlier_ratio, plane_model)`
FloorPlaneResult(normal, offset, inliers, inlier_ratio, plane_model) - Test: synthetic flat floor + random noise → recovers correct plane within tolerance
- Test: synthetic floor + wall points → RANSAC ignores wall, finds floor (normal_constraint filters)
- Test: normal_constraint rejects planes whose normal isn't near-vertical (e.g., a wall plane)
- Test: height_range rejects planes too far from expected floor height
- Test: too few inliers → returns None (below min_inlier_ratio)
- Test: seed parameter produces deterministic results
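The constraints in B can be illustrated without Open3D. The sketch below is a plain NumPy RANSAC loop standing in for `segment_plane()`, so that the normal constraint, height range, inlier-ratio threshold, and seed are all visible in one place. The names `FloorPlaneSketch` and `detect_floor_plane_sketch`, the default up vector `(0, -1, 0)`, and the default numeric limits are assumptions for illustration.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class FloorPlaneSketch:
    normal: np.ndarray  # unit normal, oriented toward `up`
    offset: float  # d in normal . p + d = 0
    inliers: np.ndarray  # indices into the input points
    inlier_ratio: float


def detect_floor_plane_sketch(
    points,
    up=(0.0, -1.0, 0.0),  # assumed world "up"; adjust to the project's convention
    max_tilt_deg=15.0,
    height_range=(-0.2, 0.2),
    distance_threshold=0.02,
    min_inlier_ratio=0.3,
    num_iterations=200,
    seed=42,
):
    rng = np.random.default_rng(seed)  # fixed seed gives deterministic sampling
    up = np.asarray(up, dtype=float)
    up = up / np.linalg.norm(up)
    cos_limit = np.cos(np.deg2rad(max_tilt_deg))
    best = None
    for _ in range(num_iterations):
        p0, p1, p2 = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        length = np.linalg.norm(normal)
        if length < 1e-9:
            continue  # collinear sample, no plane
        normal = normal / length
        if normal @ up < 0:
            normal = -normal
        if normal @ up < cos_limit:
            continue  # normal too far from vertical, e.g. a wall
        offset = -float(normal @ p0)
        height = -offset / float(normal @ up)  # where the plane crosses the up axis
        if not height_range[0] <= height <= height_range[1]:
            continue
        inliers = np.flatnonzero(np.abs(points @ normal + offset) < distance_threshold)
        if best is None or len(inliers) > len(best.inliers):
            best = FloorPlaneSketch(normal, offset, inliers, len(inliers) / len(points))
    if best is None or best.inlier_ratio < min_inlier_ratio:
        return None  # "no-op if not confident"
    return best
```

Rejecting candidates inside the loop, rather than filtering one final plane, lets a floor be found even when a wall has more points than the floor.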
C. `compute_consensus_plane(floor_results, camera_weights=None)`:
- Takes per-camera FloorPlaneResult list → fits a single consensus plane
- Method: concatenate all inlier points, weighted by inlier count, fit plane via SVD
- Test: two cameras seeing same plane → consensus matches individual planes
- Test: two cameras with slight disagreement → consensus is between them
- Test: camera weights affect result appropriately
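One way to read the method in C: pooling the inlier points already weights each camera by its inlier count, and `camera_weights` can rescale that. The sketch below fits a weighted total-least-squares plane via SVD. It takes raw point arrays rather than `FloorPlaneResult` objects, and the function name and default up vector are assumptions.

```python
import numpy as np


def fit_consensus_plane(inlier_sets, camera_weights=None, up=(0.0, -1.0, 0.0)):
    """Fit one plane (normal, offset) to the pooled per-camera floor inliers."""
    if camera_weights is None:
        camera_weights = [1.0] * len(inlier_sets)
    points = np.concatenate(inlier_sets, axis=0)
    # One weight per point; cameras with more inliers contribute more rows.
    weights = np.concatenate(
        [np.full(len(pts), float(w)) for pts, w in zip(inlier_sets, camera_weights)]
    )
    centroid = np.average(points, axis=0, weights=weights)
    centered = (points - centroid) * np.sqrt(weights)[:, None]
    # The right singular vector with the smallest singular value is the normal.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]
    if normal @ np.asarray(up, dtype=float) < 0:
        normal = -normal  # orient consistently toward `up`
    offset = -float(normal @ centroid)  # plane: normal . p + offset = 0
    return normal, offset
```

For two cameras that see parallel floors 2 cm apart with equal inlier counts, this places the consensus plane halfway between them; unequal weights pull it toward the heavier camera.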
D. `compute_floor_correction(T_world_cam, floor_result, consensus_plane, max_rotation_deg=5.0, max_translation_m=0.05)`:
- Computes constrained correction for a single camera
- Allowed DOF: pitch/roll + vertical translation ONLY (no yaw, no lateral)
- Uses `scipy.optimize.least_squares` with bounds
- Returns `CorrectionResult(T_corrected, delta_rotation_deg, delta_translation_m, applied)`
CorrectionResult(T_corrected, delta_rotation_deg, delta_translation_m, applied) - Test: camera with 2° tilt from consensus → correction brings it within 0.1°
- Test: correction respects max_rotation_deg bound
- Test: correction respects max_translation_m bound
- Test: yaw component is preserved (no yaw drift)
- Test: lateral translation is preserved (no X/Z drift)
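The restricted degrees of freedom in D can be sketched as a three-parameter bounded least-squares problem. This is an illustration under several assumptions, not the planned implementation: Y is taken as the vertical axis (consistent with the `[0, -1, 0]` floor normal used in the QA scenarios), pitch/roll are modelled as rotations about world X and Z, planes are passed as `(normal, offset)` with `normal . p + offset = 0`, and the correction is applied in the world frame about the world origin. A world point `p` moves to `R p + t`, so a plane `(n, d)` becomes `(R n, d - (R n) . t)`.

```python
import numpy as np
from scipy.optimize import least_squares


def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def floor_correction_sketch(T_world_cam, cam_normal, cam_offset, cons_normal, cons_offset,
                            max_rotation_deg=5.0, max_translation_m=0.05):
    """Solve for (rot about X, rot about Z, shift along Y) aligning a camera's floor to consensus."""

    def residuals(params):
        a, b, dy = params
        n_new = rot_x(a) @ rot_z(b) @ cam_normal
        d_new = cam_offset - n_new @ np.array([0.0, dy, 0.0])
        return np.append(n_new - cons_normal, d_new - cons_offset)

    max_rot = np.deg2rad(max_rotation_deg)
    bound = np.array([max_rot, max_rot, max_translation_m])
    sol = least_squares(residuals, x0=np.zeros(3), bounds=(-bound, bound))
    a, b, dy = sol.x

    # Only three parameters exist, so yaw and lateral shift cannot be optimized.
    T_corr = np.eye(4)
    T_corr[:3, :3] = rot_x(a) @ rot_z(b)
    T_corr[1, 3] = dy
    # Assumption: the rotation pivots about the world origin.
    return T_corr @ T_world_cam, sol.x
```

Note that rotating about the world origin still moves a camera that sits away from the origin; an implementation that must keep the camera's X/Z position fixed would need to choose a different pivot, for example the camera center.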
GREEN phase: Implement `aruco/ground_plane.py`:
- Import `open3d` for RANSAC plane segmentation
- Import `scipy.optimize.least_squares` for constrained correction
- Reuse `aruco.alignment.rotation_align_vectors` where appropriate
- Reuse `aruco.pose_math.invert_transform` and `matrix_to_rvec_tvec`
- Use dataclasses for `FloorPlaneResult` and `CorrectionResult`
- All functions are pure (no side effects, no file I/O)
REFACTOR phase — Docstrings, type annotations, basedpyright.
Must NOT do:
- No ML/segmentation — RANSAC + geometric constraints only
- No global bundle adjustment
- No modification to existing alignment.py
- No dense reconstruction beyond floor plane extraction
Recommended Agent Profile:
- Category: `unspecified-high`
  - Reason: Core algorithmic module with 4 major functions, each with multiple test cases. Requires understanding of SE3 geometry, RANSAC, and constrained optimization.
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 1-2 (with Task 2, after Task 1)
- Blocks: Task 5
- Blocked By: Task 1
References:
Pattern References:
- `aruco/alignment.py:54-114`: `rotation_align_vectors(from_vec, to_vec)`, reuse for aligning floor normal to target up vector
- `aruco/alignment.py:117-137`: `apply_alignment_to_pose(T, R_align)`, pattern for applying global rotation to extrinsics
- `aruco/alignment.py:140-202`: `estimate_up_vector_from_cameras()`, existing camera-based "up" estimation, useful as initial guess for floor normal
- `aruco/depth_refine.py:12-20`: `extrinsics_to_params()` / `params_to_extrinsics()`, 6-DOF parameterization for optimization
- `aruco/depth_refine.py:71-180`: `refine_extrinsics_with_depth()`, pattern for bounded least_squares optimization of camera pose
- `aruco/depth_verify.py:18-24`: `project_point_to_pixel(P_cam, K)`, projection math
- `aruco/pose_math.py:22-28`: `invert_transform(T)`, efficient SE3 inversion
API/Type References:
- `aruco/alignment.py:7-16`: Type aliases: `Vec3`, `Mat33`, `Mat44`, `CornersNC`
- `aruco/depth_verify.py:8-15`: `DepthVerificationResult` dataclass pattern
Test References:
- `tests/test_alignment.py`: Testing convention for geometric functions (synthetic inputs, tolerance assertions)
- `tests/test_depth_refine.py`: Testing convention for optimization functions (before/after metrics)
External References:
- Open3D docs: `segment_plane(distance_threshold, ransac_n, num_iterations)` returns `[a, b, c, d]` plane model + inlier indices
WHY Each Reference Matters:
- `alignment.py` provides the exact rotation-alignment primitives we need, so there is no need to reimplement them
- `depth_refine.py` establishes the bounded least-squares pattern with regularization; correction should follow the same style
- `test_alignment.py` shows how geometric tests are structured in this project (synthetic data, `assert_allclose`)
Acceptance Criteria:
TDD:
- Test file created: `tests/test_ground_plane.py`
- Tests cover: unproject, floor detection (happy + noise + wall + failure), consensus, correction (tilt + bounds + yaw preservation)
- `uv run pytest tests/test_ground_plane.py -v` → PASS (all tests, 0 failures)
Agent-Executed QA Scenarios:
Scenario: Floor detection on synthetic flat plane
  Tool: Bash (pytest)
  Preconditions: aruco/ground_plane.py implemented
  Steps:
    1. uv run pytest tests/test_ground_plane.py -v -k "detect_floor and synthetic_flat"
    2. Assert: exit code 0
    3. Assert: recovered normal within 1° of [0, -1, 0]
  Expected Result: RANSAC correctly identifies flat floor
  Evidence: pytest output captured
Scenario: Per-camera correction preserves yaw
  Tool: Bash (pytest)
  Preconditions: aruco/ground_plane.py implemented
  Steps:
    1. uv run pytest tests/test_ground_plane.py -v -k "correction and yaw"
    2. Assert: exit code 0
    3. Assert: yaw angle before == yaw angle after (within 0.01°)
  Expected Result: Correction only affects pitch/roll + vertical translation
  Evidence: pytest output captured
Scenario: Module passes type checking
  Tool: Bash (basedpyright)
  Preconditions: Module implemented with type annotations
  Steps:
    1. uv run basedpyright aruco/ground_plane.py
    2. Assert: exit code 0 or only non-error diagnostics
  Expected Result: No type errors
  Evidence: basedpyright output captured
Commit: YES
- Message: `feat(aruco): add ground plane detection and per-camera correction module`
- Files: `aruco/ground_plane.py`, `tests/test_ground_plane.py`
- Pre-commit: `uv run pytest tests/test_ground_plane.py`
- 4. Integrate `--save-depth` flag into `calibrate_extrinsics.py`
What to do:
- Add `--save-depth` Click option (type `click.Path()`, default `None`)
- When provided, after depth pooling/selection in `apply_depth_verify_refine_postprocess`, call `save_depth_data()` to persist:
  - Pooled depth + confidence per camera
  - Raw best-scored frames (depth + confidence + frame index + score)
  - Camera intrinsics matrix K
  - Pool metadata dict
- Log the output path and file size
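The opt-in behaviour can be sketched with a stripped-down Click command. Only `--save-depth` comes from this plan; the `--refine-depth/--no-refine-depth` switch is a hypothetical stand-in for whatever flags enable depth verification/refinement in the real CLI, and the actual save call is replaced by an echo.

```python
import click


@click.command()
@click.option(
    "--save-depth",
    type=click.Path(),
    default=None,
    help="Directory for HDF5 depth output (pooled + best raw frames).",
)
@click.option(
    "--refine-depth/--no-refine-depth",  # hypothetical stand-in for the real depth flags
    default=False,
)
def main(save_depth, refine_depth):
    if save_depth is not None and not refine_depth:
        # No depth data is produced in this mode, so warn and skip saving.
        click.echo("warning: --save-depth ignored, depth refinement is disabled", err=True)
        save_depth = None
    # ... existing calibration and depth post-processing would run here ...
    if save_depth is not None:
        # The real tool would call save_depth_data() here and log path + size.
        click.echo(f"depth data would be saved under {save_depth}")
```

Keeping the default at `None` means existing invocations behave exactly as before, which is what the "Must NOT do" list below requires.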
Must NOT do:
- Do not change existing depth processing behavior
- Do not make saving mandatory (only when `--save-depth` is provided)
- Do not save if depth verification/refinement is not enabled (warn and skip)
Recommended Agent Profile:
- Category: `unspecified-high`
  - Reason: Integration into existing CLI with complex data flow, needs careful threading of data through the function
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 2 (with Task 5)
- Blocks: Tasks 7, 8
- Blocked By: Tasks 1, 2
References:
Pattern References:
- `calibrate_extrinsics.py:562-678`: Click option definitions and `main()` function signature; follow exact same patterns for the new flag
- `calibrate_extrinsics.py:606-611`: `--depth-pool-size` option as example of depth-related flag
- `calibrate_extrinsics.py:120-305`: `apply_depth_verify_refine_postprocess()`, this is where depth data is available and where save should be triggered
- `calibrate_extrinsics.py:143-165`: Where `depth_maps` and `confidence_maps` lists are built per camera (data to capture for raw frames)
- `calibrate_extrinsics.py:267-270`: Where `final_depth` and `pool_metadata` are determined (data to capture for pooled result)
API/Type References:
- `aruco/depth_save.py` (Task 2 output): `save_depth_data(path, camera_data, schema_version=1)` function signature
Test References:
- `tests/test_depth_cli_postprocess.py`: Existing test patterns for calibrate_extrinsics CLI post-processing behavior
- `tests/test_depth_pool_integration.py`: Integration test patterns with mocked depth data
WHY Each Reference Matters:
- `calibrate_extrinsics.py:562-678` is the exact location where the new flag must be added, following identical Click patterns
- `apply_depth_verify_refine_postprocess` is the function that has access to all depth data; save must be called from here or just after it
- Integration tests show how to mock ZED data for testing the full flow
Acceptance Criteria:
TDD:
- Test file updated or created: `tests/test_depth_save_integration.py`
- Tests cover: flag appears in help, save is called when flag provided, save is NOT called without flag
- `uv run pytest tests/test_depth_save_integration.py -v` → PASS
Agent-Executed QA Scenarios:
Scenario: --save-depth flag appears in CLI help
  Tool: Bash
  Preconditions: calibrate_extrinsics.py updated
  Steps:
    1. uv run python calibrate_extrinsics.py --help
    2. Assert: output contains "--save-depth"
    3. Assert: output contains "HDF5" or "depth" in the help text for the flag
  Expected Result: Flag is documented in help
  Evidence: Help output captured
Scenario: Existing tests still pass after integration
  Tool: Bash (pytest)
  Preconditions: calibrate_extrinsics.py updated
  Steps:
    1. uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py -v
    2. Assert: exit code 0, no regressions
  Expected Result: No existing behavior broken
  Evidence: pytest output captured
Commit: YES
- Message: `feat(calibrate): add --save-depth flag for HDF5 depth persistence`
- Files: `calibrate_extrinsics.py`, `tests/test_depth_save_integration.py`
- Pre-commit: `uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py`
- 5. Extend `aruco/ground_plane.py` with multi-camera workflow orchestration
What to do:
Add high-level orchestration functions that compose the primitives from Task 3:
A. `refine_ground_from_depth(camera_data, extrinsics, config)`:
- Main entry point: takes per-camera depth data + current extrinsics → returns corrected extrinsics + metrics
- Flow:
  1. Per camera: `unproject_depth_to_points` → `detect_floor_plane`
  2. `compute_consensus_plane` from all successful detections
  3. Per camera: `compute_floor_correction` relative to consensus
  4. Return corrected extrinsics dict + `RefinementMetrics`
- Config dataclass with: `max_rotation_deg`, `max_translation_m`, `ransac_distance_threshold`, `min_inlier_ratio`, `height_range`, `normal_constraint_deg`, `stride`, `seed`
- Metrics dataclass with: per-camera floor angles (before/after), consensus plane model, correction magnitudes, skipped cameras + reasons
B. Error handling:
- If floor detection fails for a camera → skip it, log warning, include in metrics
- If fewer than 2 cameras have valid floor → abort, return original extrinsics + error reason
- If correction exceeds bounds → cap at bounds, mark as `clamped` in metrics
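The skip/abort rules above can be sketched independently of the geometry by injecting the three primitives as callables. The dataclass and function names here are placeholders, and the defaults for `normal_constraint_deg` and `height_range` are assumptions (the plan only fixes the other defaults via the CLI options).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GroundRefineConfigSketch:
    max_rotation_deg: float = 5.0
    max_translation_m: float = 0.05
    ransac_distance_threshold: float = 0.02
    min_inlier_ratio: float = 0.3
    height_range: Optional[tuple] = None  # None = derive from data (assumed)
    normal_constraint_deg: float = 15.0  # assumed default, not from the plan
    stride: int = 4
    seed: int = 42


@dataclass
class RefinementMetricsSketch:
    corrected: list = field(default_factory=list)
    skipped: dict = field(default_factory=dict)  # serial -> reason
    error: Optional[str] = None


def refine_sketch(camera_data, extrinsics, detect, consensus, correct, config):
    """Orchestrate detect -> consensus -> correct with per-camera fallback."""
    metrics = RefinementMetricsSketch()
    floors = {}
    for serial, data in camera_data.items():
        floor = detect(data, extrinsics[serial], config)
        if floor is None:
            metrics.skipped[serial] = "floor detection failed"
        else:
            floors[serial] = floor
    if len(floors) < 2:
        # Not enough evidence for a consensus: return the input unchanged.
        metrics.error = "fewer than 2 cameras with a valid floor plane"
        return dict(extrinsics), metrics
    plane = consensus(list(floors.values()))
    result = dict(extrinsics)  # skipped cameras keep their original pose
    for serial, floor in floors.items():
        result[serial] = correct(extrinsics[serial], floor, plane, config)
        metrics.corrected.append(serial)
    return result, metrics
```

Passing the primitives in as arguments keeps this layer free of file I/O and makes the skip and abort paths easy to test with stubs.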
Must NOT do:
- No file I/O in this module — pure computation
- No visualization — that's Task 6
Recommended Agent Profile:
- Category: `unspecified-high`
  - Reason: Orchestration logic with error handling, config management, metrics collection
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 2 (with Task 4)
- Blocks: Tasks 6, 7
- Blocked By: Tasks 1, 3
References:
Pattern References:
- `calibrate_extrinsics.py:120-131`: `apply_depth_verify_refine_postprocess()` signature, pattern for multi-camera orchestration function
- `aruco/depth_refine.py:71-227`: `refine_extrinsics_with_depth()` return value pattern: (result, stats_dict)
- `aruco/depth_verify.py:8-15`: `DepthVerificationResult` dataclass, pattern for structured results
API/Type References:
- `aruco/ground_plane.py` (Task 3 output): All primitive functions: `unproject_depth_to_points`, `detect_floor_plane`, `compute_consensus_plane`, `compute_floor_correction`
Test References:
- `tests/test_ground_plane.py` (Task 3 output): Unit test patterns to follow for orchestration tests
WHY Each Reference Matters:
- `apply_depth_verify_refine_postprocess` shows how multi-camera iteration with fallback is done in this codebase
- `depth_refine.py` shows the (result, stats) return pattern that should be followed
Acceptance Criteria:
TDD:
- Tests added to `tests/test_ground_plane.py` for orchestration functions
- Tests cover: full pipeline with 2-camera synthetic data, single-camera skip, all-cameras-fail abort, config bounds
- `uv run pytest tests/test_ground_plane.py -v` → PASS (all tests, 0 failures)
Agent-Executed QA Scenarios:
Scenario: Two-camera synthetic refinement produces level ground
  Tool: Bash (pytest)
  Preconditions: Orchestration functions implemented
  Steps:
    1. uv run pytest tests/test_ground_plane.py -v -k "refine_ground_from_depth and two_camera"
    2. Assert: exit code 0
    3. Assert: after correction, floor angle disagreement < 0.5°
  Expected Result: Per-camera corrections level the ground
  Evidence: pytest output captured
Scenario: Graceful fallback when floor detection fails for one camera
  Tool: Bash (pytest)
  Preconditions: Orchestration functions implemented
  Steps:
    1. uv run pytest tests/test_ground_plane.py -v -k "skip_camera"
    2. Assert: exit code 0
    3. Assert: skipped camera's extrinsics unchanged, other cameras corrected
  Expected Result: Partial failure handled gracefully
  Evidence: pytest output captured
Commit: YES
- Message: `feat(aruco): add multi-camera ground plane refinement orchestration`
- Files: `aruco/ground_plane.py`, `tests/test_ground_plane.py`
- Pre-commit: `uv run pytest tests/test_ground_plane.py`
- 6. Create Plotly diagnostic visualization for ground plane refinement
What to do:
- Add a function `create_ground_diagnostic_plot(metrics, camera_data, extrinsics_before, extrinsics_after)` → returns `plotly.graph_objects.Figure`
- Add a function `save_diagnostic_plot(fig, path)` → writes HTML file
- Visualization contents:
- 3D scatter: floor inlier points per camera (color-coded by camera serial)
- Surface: consensus plane (semi-transparent)
- Camera frustums: before (dashed/faded) and after (solid) positions
- Annotation: per-camera correction magnitude (degrees + cm)
- Title: summary metrics (total disagreement before/after)
- Follow existing Plotly patterns from `visualize_extrinsics.py` and `compare_pose_sets.py`
Must NOT do:
- No interactive server or GUI — static HTML file only
- No Open3D visualization (use Plotly only, already a dep)
- No complex camera frustum rendering — simple cone or pyramid is fine
Recommended Agent Profile:
- Category: `unspecified-low`
  - Reason: Visualization code following existing Plotly patterns, no complex algorithm
- Skills: []
Parallelization:
- Can Run In Parallel: NO
- Parallel Group: Wave 3 (sequential after Task 5)
- Blocks: Task 7
- Blocked By: Task 5
References:
Pattern References:
- `compare_pose_sets.py:145-200`: `add_camera_trace()`, Plotly camera visualization pattern (frustum + axes + labels)
- `visualize_extrinsics.py`: Full Plotly 3D scene setup with layout, ground plane, axis labels (check head of file for imports and patterns)
Test References:
- No heavy test required: visualization is a "nice to have". A smoke test that the function returns a `go.Figure` with expected trace count is sufficient.
WHY Each Reference Matters:
- `compare_pose_sets.py` already has Plotly camera rendering code that can be adapted
- `visualize_extrinsics.py` shows the full 3D scene pattern including ground plane rendering
Acceptance Criteria:
- Function
create_ground_diagnostic_plotreturns aplotly.graph_objects.Figure - Figure contains traces for: floor points per camera, consensus plane surface, camera markers
- Smoke test:
uv run pytest tests/test_ground_plane.py -v -k "diagnostic_plot"→ PASS
Agent-Executed QA Scenarios:
- Scenario: Diagnostic plot generates valid HTML
  - Tool: Bash (pytest)
  - Preconditions: Visualization function implemented
  - Steps:
    1. `uv run pytest tests/test_ground_plane.py -v -k "diagnostic_plot"`
    2. Assert: exit code 0
    3. Assert: test verifies Figure has correct number of traces
  - Expected Result: Plot function produces valid Plotly figure
  - Evidence: pytest output captured
Commit: YES (groups with Task 7)
- Message: `feat(aruco): add Plotly diagnostic visualization for ground plane`
- Files: `aruco/ground_plane.py` (viz function added), `tests/test_ground_plane.py`
- Pre-commit: `uv run pytest tests/test_ground_plane.py`
7. Create `refine_ground_plane.py`: standalone CLI tool
What to do:
- Click CLI tool with the following options:
  - `--input-depth` / `-d`: Path to HDF5 depth file (from `--save-depth`)
  - `--input-extrinsics` / `-i`: Path to extrinsics JSON (from `calibrate_extrinsics.py`)
  - `--output-extrinsics` / `-o`: Path for corrected extrinsics JSON
  - `--metrics-json`: Optional path for machine-readable metrics output
  - `--plot` / `--no-plot`: Generate Plotly diagnostic (default: `--plot`)
  - `--plot-output`: Path for diagnostic HTML (default: `<output_dir>/ground_diagnostic.html`)
  - `--max-rotation-deg`: Max correction rotation (default: 5.0)
  - `--max-translation-m`: Max correction translation (default: 0.05)
  - `--ransac-threshold`: RANSAC distance threshold in meters (default: 0.02)
  - `--min-inlier-ratio`: Minimum inlier ratio to accept floor detection (default: 0.3)
  - `--height-range`: Expected floor height range as "min,max" (default: auto from data)
  - `--stride`: Depth map downsampling stride (default: 4)
  - `--seed`: Random seed for reproducibility (default: 42)
  - `--debug` / `--no-debug`: Verbose logging
- Flow:
  - Load extrinsics JSON (reuse `compare_pose_sets.py:load_poses_from_json`)
  - Load depth data from HDF5 (use `depth_save.load_depth_data`)
  - Call `refine_ground_from_depth()` orchestration function
  - Save corrected extrinsics (same JSON format as input, with `_meta.ground_refined: true`)
  - Print summary metrics to stdout
  - Optionally save metrics JSON
  - Optionally generate diagnostic Plotly HTML
- Output extrinsics format: identical to `calibrate_extrinsics.py` output, with added `_meta.ground_refined` flag
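Two of the options listed above need a little handling before they reach the computation: `--height-range` arrives as a "min,max" string, and `--max-rotation-deg` / `--max-translation-m` bound the per-camera correction. The following is a minimal sketch using only the standard library. It assumes a correction is summarised by a rotation angle in degrees and a translation vector in metres, and that an out-of-bounds correction is rejected rather than clamped (the plan does not say which). `parse_height_range` and `correction_within_limits` are hypothetical helper names.

```python
import math
from typing import Optional, Sequence, Tuple


def parse_height_range(value: Optional[str]) -> Optional[Tuple[float, float]]:
    """Parse "min,max" into a (min, max) tuple; None means "auto from data"."""
    if value is None:
        return None
    parts = value.split(",")
    if len(parts) != 2:
        raise ValueError(f"expected 'min,max', got {value!r}")
    lo, hi = float(parts[0]), float(parts[1])
    if lo >= hi:
        raise ValueError(f"min must be below max, got {value!r}")
    return lo, hi


def correction_within_limits(
    rotation_deg: float,
    translation_m: Sequence[float],
    max_rotation_deg: float = 5.0,
    max_translation_m: float = 0.05,
) -> bool:
    """Return True if the correction magnitude respects both bounds."""
    translation_norm = math.sqrt(sum(t * t for t in translation_m))
    return (
        abs(rotation_deg) <= max_rotation_deg
        and translation_norm <= max_translation_m
    )
```

Rejecting keeps a camera's original extrinsics when its floor fit implies an implausibly large change. Clamping would be an equally valid reading of the two limits.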
Must NOT do:
- No ZED SDK dependency — works entirely from saved files
- No modification of input files
- No interactive prompts
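The "no modification of input files" rule and the `_meta.ground_refined` flag can both be satisfied by copying the loaded document before editing it and writing only to the output path. The following is a minimal sketch using the standard library. It assumes the extrinsics JSON is a mapping from camera serial to an entry with a `pose` field; the actual schema is defined by `calibrate_extrinsics.py` and is not reproduced here. `write_refined_extrinsics` is a hypothetical name.

```python
import copy
import json
from pathlib import Path
from typing import Any, Dict


def write_refined_extrinsics(
    original: Dict[str, Any],
    corrected_poses: Dict[str, Any],
    output_path: Path,
) -> Dict[str, Any]:
    """Write corrected extrinsics to `output_path`, leaving `original` untouched."""
    refined = copy.deepcopy(original)  # never edit the loaded input in place
    for serial, pose in corrected_poses.items():
        # Assumed layout: one entry per camera serial with a "pose" field.
        refined.setdefault(serial, {})["pose"] = pose
    meta = refined.setdefault("_meta", {})
    meta["ground_refined"] = True
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_text(json.dumps(refined, indent=2))
    return refined
```

A CLI built on this would probably also refuse an output path that resolves to the input path, since that would overwrite the input.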
Recommended Agent Profile:
- Category: `unspecified-high`
  - Reason: Full CLI tool composing multiple modules, end-to-end data flow, error handling, multiple output formats
- Skills: []
Parallelization:
- Can Run In Parallel: NO
- Parallel Group: Wave 3 (depends on 2, 5, 6)
- Blocks: Task 8
- Blocked By: Tasks 2, 5, 6
References:
Pattern References:
- `calibrate_extrinsics.py:562-678`: Click CLI pattern with extensive options, logging, error handling
- `compare_pose_sets.py:52-88`: `load_poses_from_json()`, the JSON extrinsics loading pattern
- `compare_pose_sets.py:91-92`: `serialize_pose()`, the JSON extrinsics saving pattern
- `visualize_extrinsics.py`: CLI tool that loads extrinsics + generates Plotly output
API/Type References:
- `aruco/depth_save.py` (Task 2): `load_depth_data(path)` return type
- `aruco/ground_plane.py` (Tasks 3, 5): `refine_ground_from_depth()` signature and return type
- `aruco/ground_plane.py` (Task 6): `create_ground_diagnostic_plot()` signature
WHY Each Reference Matters:
- `calibrate_extrinsics.py` CLI is the canonical pattern for Click tools in this project
- `compare_pose_sets.py` shows how to load/save the extrinsics JSON format correctly
- The ground_plane module provides all computation; the CLI just wires I/O to computation
Acceptance Criteria:
TDD:
- Test file created: `tests/test_refine_ground_cli.py`
- Tests cover: help output, valid invocation with synthetic data, missing input error, output file creation, metrics JSON format
- `uv run pytest tests/test_refine_ground_cli.py -v` → PASS
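The metrics JSON test needs a concrete before/after quantity to assert on. The following is a minimal sketch of one candidate. It assumes each camera contributes a fitted floor normal in world coordinates, that "angle disagreement" means the largest pairwise angle between those normals, and that the dotted key `floor.angle_disagreement_deg_before` denotes a nested `floor` object. The plan names the keys but defines neither the metric nor the nesting. `max_pairwise_angle_deg` and `build_metrics` are hypothetical names.

```python
import itertools
import json
import math
from typing import Any, Dict, Sequence

Normals = Dict[str, Sequence[float]]


def angle_between_deg(a: Sequence[float], b: Sequence[float]) -> float:
    """Angle in degrees between two plane normals, ignoring their sign."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # abs() treats n and -n as the same plane; min() guards against rounding above 1.
    return math.degrees(math.acos(min(1.0, abs(dot) / norm)))


def max_pairwise_angle_deg(normals: Normals) -> float:
    """Largest angle between any two cameras' floor normals (0.0 for < 2 cameras)."""
    pairs = itertools.combinations(normals.values(), 2)
    return max((angle_between_deg(a, b) for a, b in pairs), default=0.0)


def build_metrics(before: Normals, after: Normals) -> Dict[str, Any]:
    """Assemble the before/after comparison as a JSON-serialisable dict."""
    return {
        "floor": {
            "angle_disagreement_deg_before": max_pairwise_angle_deg(before),
            "angle_disagreement_deg_after": max_pairwise_angle_deg(after),
        }
    }


if __name__ == "__main__":
    tilt = math.radians(2.0)
    before = {"cam_a": [0.0, 0.0, 1.0], "cam_b": [0.0, math.sin(tilt), math.cos(tilt)]}
    after = {"cam_a": [0.0, 0.0, 1.0], "cam_b": [0.0, 0.0, 1.0]}
    print(json.dumps(build_metrics(before, after), indent=2))
```

A test can then assert that the "after" value is no larger than the "before" value on synthetic data with a known tilt.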
Agent-Executed QA Scenarios:
- Scenario: CLI help shows all expected options
  - Tool: Bash
  - Preconditions: `refine_ground_plane.py` created
  - Steps:
    1. `uv run python refine_ground_plane.py --help`
    2. Assert: output contains "--input-depth", "--input-extrinsics", "--output-extrinsics"
    3. Assert: output contains "--max-rotation-deg", "--ransac-threshold", "--seed"
    4. Assert: exit code 0
  - Expected Result: All options documented
  - Evidence: Help output captured
- Scenario: Tool produces valid extrinsics JSON
  - Tool: Bash
  - Preconditions: Synthetic HDF5 and extrinsics JSON created by test fixtures
  - Steps:
    1. `uv run pytest tests/test_refine_ground_cli.py -v -k "produces_valid_json"`
    2. Assert: exit code 0
    3. Assert: output JSON is valid, contains all camera serials, has `_meta.ground_refined`
  - Expected Result: Output matches extrinsics JSON schema
  - Evidence: pytest output captured
- Scenario: Metrics JSON contains before/after comparison
  - Tool: Bash
  - Preconditions: Test creates and runs CLI with `--metrics-json`
  - Steps:
    1. `uv run pytest tests/test_refine_ground_cli.py -v -k "metrics_json"`
    2. Assert: exit code 0
    3. Assert: metrics has 'floor.angle_disagreement_deg_before' and 'floor.angle_disagreement_deg_after'
  - Expected Result: Machine-readable improvement metrics produced
  - Evidence: pytest output captured
Commit: YES
- Message: `feat: add refine_ground_plane.py standalone CLI tool`
- Files: `refine_ground_plane.py`, `tests/test_refine_ground_cli.py`
- Pre-commit: `uv run pytest tests/test_refine_ground_cli.py`
8. Final integration: full test suite pass + basedpyright + README update
What to do:
- Run the FULL test suite: `uv run pytest -x -vv`
- Run basedpyright on all new files: `uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py`
- Fix any regressions or type errors
- Add usage example to `README.md` showing the depth-save → ground-refine workflow:

      # Step 1: Calibrate with depth saving
      uv run calibrate_extrinsics.py ... --refine-depth --save-depth output/depth_data.h5

      # Step 2: Refine ground plane
      uv run refine_ground_plane.py \
        --input-depth output/depth_data.h5 \
        --input-extrinsics output/extrinsics.json \
        --output-extrinsics output/extrinsics_ground_refined.json \
        --plot-output output/ground_diagnostic.html

Must NOT do:
- Do not modify any test behavior — only fix genuine regressions
- Do not add features — this is stabilization only
Recommended Agent Profile:
- Category: `unspecified-low`
  - Reason: Verification and minor fixups, no new features
- Skills: []
Parallelization:
- Can Run In Parallel: NO
- Parallel Group: Wave 3 (final, sequential)
- Blocks: None (terminal)
- Blocked By: All previous tasks
References:
Pattern References:
- `README.md`: existing usage examples for `calibrate_extrinsics.py` and `visualize_extrinsics.py`
- `pyproject.toml:39-41`: pytest configuration (`testpaths`, `norecursedirs`)
WHY Each Reference Matters:
- README has existing command examples that the new workflow should follow in format/style
- pyproject.toml pytest config ensures all test directories are covered
Acceptance Criteria:
- `uv run pytest -x -vv` → ALL tests pass, 0 failures, 0 errors
- `uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py` → 0 errors
- README.md contains usage example for the new ground refinement workflow
Agent-Executed QA Scenarios:
- Scenario: Full test suite passes
  - Tool: Bash (pytest)
  - Preconditions: All previous tasks completed
  - Steps:
    1. `uv run pytest -x -vv`
    2. Assert: exit code 0
    3. Assert: all tests pass, 0 failures
  - Expected Result: No regressions introduced
  - Evidence: Full pytest output captured
- Scenario: Type checking passes
  - Tool: Bash (basedpyright)
  - Preconditions: All new modules written
  - Steps:
    1. `uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py`
    2. Assert: no error-level diagnostics
  - Expected Result: Type-safe code
  - Evidence: basedpyright output captured
Commit: YES
- Message: `chore: final integration pass - tests, types, README for ground plane refinement`
- Files: `README.md`, any fixup files
- Pre-commit: `uv run pytest -x -vv`
Commit Strategy
| After Task | Message | Files | Verification |
|---|---|---|---|
| 1 | `build(deps): add open3d and h5py for ground plane refinement` | `pyproject.toml`, `uv.lock` | `uv run python -c "import open3d; import h5py"` |
| 2 | `feat(aruco): add HDF5 depth map persistence module` | `aruco/depth_save.py`, `tests/test_depth_save.py` | `uv run pytest tests/test_depth_save.py` |
| 3 | `feat(aruco): add ground plane detection and per-camera correction module` | `aruco/ground_plane.py`, `tests/test_ground_plane.py` | `uv run pytest tests/test_ground_plane.py` |
| 4 | `feat(calibrate): add --save-depth flag for HDF5 depth persistence` | `calibrate_extrinsics.py`, `tests/test_depth_save_integration.py` | `uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py` |
| 5 | `feat(aruco): add multi-camera ground plane refinement orchestration` | `aruco/ground_plane.py`, `tests/test_ground_plane.py` | `uv run pytest tests/test_ground_plane.py` |
| 6 | `feat(aruco): add Plotly diagnostic visualization for ground plane` | `aruco/ground_plane.py`, `tests/test_ground_plane.py` | `uv run pytest tests/test_ground_plane.py` |
| 7 | `feat: add refine_ground_plane.py standalone CLI tool` | `refine_ground_plane.py`, `tests/test_refine_ground_cli.py` | `uv run pytest tests/test_refine_ground_cli.py` |
| 8 | `chore: final integration pass - tests, types, README for ground plane refinement` | `README.md`, fixups | `uv run pytest -x -vv` |
Success Criteria
Verification Commands
# All tests pass
uv run pytest -x -vv # Expected: 0 failures
# Type checking passes
uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py # Expected: 0 errors
# CLI tools have correct help
uv run python calibrate_extrinsics.py --help | grep "save-depth" # Expected: --save-depth appears
uv run python refine_ground_plane.py --help # Expected: all options listed
# Dependencies installed
uv run python -c "import open3d; import h5py; print('ok')" # Expected: ok
Final Checklist
- All "Must Have" requirements present
- All "Must NOT Have" exclusions absent (no core pipeline changes, no ML, no non-flat floors)
- All tests pass (`uv run pytest -x -vv`)
- Type checking passes (`uv run basedpyright`)
- HDF5 depth saving works end-to-end (save → load round-trip)
- Ground plane refinement produces measurably improved floor alignment
- Output extrinsics JSON matches existing format (compatible with `visualize_extrinsics.py`)
- Diagnostic Plotly HTML generated successfully
- README updated with usage workflow