zed-playground/py_workspace/.sisyphus/plans/ground-plane-refinement.md

Ground Plane Refinement & Depth Map Persistence

TL;DR

Quick Summary: Fix inter-camera ground plane disagreement by adding depth-based floor detection and per-camera extrinsic correction as a standalone post-processing tool. Also add HDF5 depth map persistence so SVO re-reading is not needed for iterative refinement.

Deliverables:

  • --save-depth flag in calibrate_extrinsics.py → HDF5 depth persistence
  • New aruco/depth_save.py module for HDF5 read/write
  • New aruco/ground_plane.py module for floor detection + consensus alignment
  • New refine_ground_plane.py standalone CLI tool
  • Plotly diagnostic visualization (before/after floor alignment)
  • Full TDD test suite for all new modules
  • New dependencies: open3d, h5py

Estimated Effort: Large
Parallel Execution: YES — 3 waves
Critical Path: Task 1 (deps) → Task 2 (depth save module) → Task 4 (CLI integration) → Task 5 (ground plane module) → Task 7 (CLI tool) → Task 8 (visualization)


Context

Original Request

User's calibrate_extrinsics.py produces extrinsics where the ground plane is not level — specifically, different cameras disagree about where the ground is when overlaying world-coordinate point clouds. The error is small (1-3° tilt, <2cm offset) across a 2-4 camera ZED setup. User wants:

  1. A way to refine the calibration using actual floor depth observations
  2. Saved pooled depth maps to avoid re-reading SVOs for iterative refinement

Interview Summary

Key Discussions:

  • Core problem: Inter-camera disagreement, not just global tilt. Point clouds from different cameras don't align on the floor surface.
  • Integration approach: Post-processing tool (standalone CLI), not integrated into existing pipeline.
  • Library choice: Open3D for point cloud operations (user wants it available for future work). h5py for HDF5 persistence.
  • Refinement granularity: Per-camera correction (each camera gets its own correction based on its floor observations).
  • Depth saving: Opt-in via --save-depth <dir> flag. Save pooled + raw best frames per camera.
  • Save format: HDF5 via h5py with versioned schema.
  • Visualization: Plotly HTML diagnostic (floor points per camera, consensus plane, before/after).
  • Test strategy: TDD with pytest, following existing test patterns.

Research Findings:

  • alignment.py has rotation_align_vectors() for aligning normals — reusable for floor alignment
  • depth_pool.py does median pooling but never persists results
  • depth_refine.py has scipy.optimize.least_squares infrastructure for pose optimization
  • compare_pose_sets.py has Kabsch rigid_transform_3d() for rigid alignment
  • depth_verify.py has project_point_to_pixel() and depth residual computation
  • Current pipeline: ArUco → PnP → RANSAC averaging → depth refinement (sparse, marker corners only) → alignment (marker normals only)
  • Open3D provides segment_plane() for RANSAC plane fitting on point clouds

Metis Review

Identified Gaps (addressed):

  • Correction DOF: Must constrain to pitch/roll + vertical translation only (no yaw drift, no lateral drift). Addressed via bounded optimization.
  • RANSAC plane robustness: Must constrain plane normal to near-vertical and height to expected range, plus ROI masking. Addressed via configurable constraints.
  • HDF5 schema versioning: Must include /meta/schema_version, units, intrinsics, coordinate frame. Addressed in schema design.
  • Failure mode for missing floor: If plane detection fails for one camera, skip that camera and warn (don't fail entire run). Addressed in error handling design.
  • Reproducibility: RANSAC seed control for deterministic tests. Addressed via seed parameter.
  • Per-camera correction risk: May break inter-camera rigidity. Addressed via correction bounds + pre/post metrics reporting.
  • Consensus plane definition: Use merged inlier points from all cameras, weighted by inlier count. Addressed in algorithm design.

Work Objectives

Core Objective

Enable depth-based ground plane refinement that corrects per-camera extrinsic errors (1-3° tilt, <2cm vertical offset) by detecting the actual physical floor surface from ZED depth maps and aligning all cameras to a consensus ground plane.

Concrete Deliverables

  • aruco/depth_save.py: HDF5 read/write module for depth maps + metadata
  • aruco/ground_plane.py: Floor detection (RANSAC), consensus plane fitting, per-camera correction
  • refine_ground_plane.py: Standalone Click CLI tool
  • --save-depth flag added to calibrate_extrinsics.py
  • tests/test_depth_save.py: TDD tests for depth persistence
  • tests/test_ground_plane.py: TDD tests for floor detection + alignment
  • tests/test_refine_ground_cli.py: TDD tests for CLI tool
  • Plotly diagnostic HTML output

Definition of Done

  • uv run pytest tests/test_depth_save.py → all tests pass
  • uv run pytest tests/test_ground_plane.py → all tests pass
  • uv run pytest tests/test_refine_ground_cli.py → all tests pass
  • uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py → no errors
  • uv run python calibrate_extrinsics.py --help shows --save-depth flag
  • uv run python refine_ground_plane.py --help shows expected options
  • End-to-end: calibrate → save depth → refine ground → produces valid extrinsics JSON

Must Have

  • Per-camera RANSAC floor plane detection from depth maps
  • Consensus plane fitting from merged floor points
  • Constrained per-camera correction (pitch/roll + vertical translation, no yaw/lateral)
  • Correction bounds with configurable limits (default: max 5° rotation, max 5cm translation)
  • "No-op if not confident" threshold — skip correction if RANSAC inlier ratio is too low
  • HDF5 schema with versioning and full metadata (intrinsics, units, resolution, frame indices)
  • Diagnostic metrics: per-camera plane normal angles, consensus disagreement before/after, correction magnitudes
  • Plotly visualization of floor points + consensus plane + before/after camera poses

Must NOT Have (Guardrails)

  • NO changes to ArUco detection, PnP, or RANSAC pose averaging logic
  • NO changes to existing depth_refine.py or depth_verify.py behavior
  • NO non-flat floor handling (ramps, stairs, multi-level)
  • NO dense multi-view reconstruction beyond the floor plane
  • NO automatic scene segmentation or ML-based floor detection
  • NO global bundle adjustment across all cameras
  • NO saving of every frame's depth data — only pooled + curated best subset
  • NO GUI requirements — visualization is optional Plotly HTML output
  • NO modification of the extrinsics JSON schema (output format matches existing convention)

Verification Strategy (MANDATORY)

UNIVERSAL RULE: ZERO HUMAN INTERVENTION

ALL tasks in this plan MUST be verifiable WITHOUT any human action.

Test Decision

  • Infrastructure exists: YES (pytest configured in pyproject.toml)
  • Automated tests: TDD (tests first)
  • Framework: pytest (existing)

If TDD Enabled

Each TODO follows RED-GREEN-REFACTOR:

Task Structure:

  1. RED: Write failing test first
    • Test file: tests/test_<module>.py
    • Test command: uv run pytest tests/test_<module>.py
    • Expected: FAIL (test exists, implementation doesn't)
  2. GREEN: Implement minimum code to pass
    • Command: uv run pytest tests/test_<module>.py
    • Expected: PASS
  3. REFACTOR: Clean up while keeping green
    • Command: uv run pytest tests/test_<module>.py
    • Expected: PASS (still)

Agent-Executed QA Scenarios (MANDATORY — ALL tasks)

Verification Tool by Deliverable Type:

| Type          | Tool                                 | How Agent Verifies                                                 |
|---------------|--------------------------------------|--------------------------------------------------------------------|
| Python module | Bash (pytest)                        | Run tests, assert pass count, zero failures                        |
| CLI tool      | Bash (click --help + invocation)     | Check help output, run with test data, verify exit code and output |
| HDF5 file     | Bash (python -c "import h5py; ...")  | Open file, check schema, validate data shapes                      |
| Type checking | Bash (basedpyright)                  | Run type checker, verify zero errors                               |
| Plotly output | Bash (file existence + python parse) | Check file exists, contains valid HTML, has expected traces        |

Execution Strategy

Parallel Execution Waves

Wave 1 (Start Immediately):
├── Task 1: Add open3d + h5py dependencies
├── Task 2: TDD depth save module (aruco/depth_save.py) [after Task 1]
└── Task 3: TDD ground plane core module (aruco/ground_plane.py) [after Task 1]

Wave 2 (After Wave 1):
├── Task 4: Integrate --save-depth into calibrate_extrinsics.py [depends: 1, 2]
└── Task 5: Ground plane consensus + per-camera correction [depends: 1, 3]

Wave 3 (After Wave 2):
├── Task 6: Plotly diagnostic visualization module [depends: 5]
├── Task 7: refine_ground_plane.py CLI tool [depends: 2, 5, 6]
└── Task 8: Integration tests + basedpyright pass [depends: all]

Critical Path: Task 1 → Task 2 → Task 4 (depth save path)
              Task 1 → Task 3 → Task 5 → Task 7 (ground plane path)

Dependency Matrix

| Task | Depends On | Blocks | Can Parallelize With |
|------|------------|--------|----------------------|
| 1    | None       | 2, 3   | None (must be first) |
| 2    | 1          | 4, 7   | 3                    |
| 3    | 1          | 5      | 2                    |
| 4    | 1, 2       | 7, 8   | 5                    |
| 5    | 1, 3       | 6, 7   | 4                    |
| 6    | 5          | 7      | 4                    |
| 7    | 2, 5, 6    | 8      | None                 |
| 8    | All        | None   | None (final)         |

Agent Dispatch Summary

| Wave | Tasks | Recommended Agents                                             |
|------|-------|----------------------------------------------------------------|
| 1    | 1     | task(category="quick", ...)                                    |
| 1→2  | 2, 3  | task(category="unspecified-high", ...) — parallel after Task 1 |
| 2    | 4, 5  | task(category="unspecified-high", ...) — parallel              |
| 3    | 6     | task(category="unspecified-low", ...)                          |
| 3    | 7     | task(category="unspecified-high", ...)                         |
| 3    | 8     | task(category="unspecified-low", ...)                          |

TODOs

  • 1. Add open3d and h5py dependencies to pyproject.toml

    What to do:

    • Add open3d and h5py to the [project] dependencies list in pyproject.toml
    • Run uv sync to install
    • Verify imports work: uv run python -c "import open3d; import h5py; print('ok')"

    Must NOT do:

    • Do not add unnecessary deps (no trimesh, no probreg, no pycpd)
    • Do not modify any other pyproject.toml sections

    Recommended Agent Profile:

    • Category: quick
      • Reason: Single file edit + one command
    • Skills: []
      • No special skills needed for a dependency addition

    Parallelization:

    • Can Run In Parallel: NO
    • Parallel Group: Wave 1 (solo — must complete before 2, 3)
    • Blocks: Tasks 2, 3, 4, 5, 6, 7, 8
    • Blocked By: None

    References:

    Pattern References:

    • pyproject.toml:7-27 — Existing dependency list format and conventions (e.g., "scipy>=1.17.0")

    Acceptance Criteria:

    • pyproject.toml contains open3d and h5py in dependencies
    • uv sync completes without error
    • uv run python -c "import open3d; import h5py; print('ok')" prints ok and exits 0

    Agent-Executed QA Scenarios:

    Scenario: Dependencies install and import correctly
      Tool: Bash
      Preconditions: pyproject.toml edited
      Steps:
        1. uv sync
        2. uv run python -c "import open3d; print(open3d.__version__)"
        3. Assert: exit code 0, version string printed
        4. uv run python -c "import h5py; print(h5py.__version__)"
        5. Assert: exit code 0, version string printed
      Expected Result: Both libraries installed and importable
      Evidence: Command output captured
    

    Commit: YES

    • Message: build(deps): add open3d and h5py for ground plane refinement
    • Files: pyproject.toml, uv.lock
    • Pre-commit: uv run python -c "import open3d; import h5py"

  • 2. TDD: Create aruco/depth_save.py — HDF5 depth map persistence module

    What to do:

    RED phase — Write tests/test_depth_save.py first with tests for:

    • save_depth_data(): saves pooled depth + confidence + raw frames + intrinsics + metadata to HDF5
    • load_depth_data(): loads HDF5 back into structured dict
    • Round-trip test: save → load → compare arrays with np.testing.assert_allclose
    • Schema validation: check /meta/schema_version, /meta/units, /meta/coordinate_frame
    • Per-camera groups: /<serial>/pooled_depth, /<serial>/pooled_confidence, /<serial>/raw_frames/<idx>/depth, /<serial>/intrinsics
    • Edge cases: single camera, no confidence map, no raw frames
    • Error handling: invalid path, empty data

    GREEN phase — Implement aruco/depth_save.py:

    • save_depth_data(path, camera_data, schema_version=1) — writes HDF5
    • load_depth_data(path) — reads HDF5 back to dict
    • Schema version 1 layout:
      /meta/
        schema_version: int = 1
        units: str = "meters"
        coordinate_frame: str = "world_from_cam"
        created_at: str (ISO 8601)
      /<serial>/
        intrinsics: (3, 3) float64  — camera matrix K
        resolution: (2,) int — [width, height]
        pooled_depth: (H, W) float32
        pooled_confidence: (H, W) float32  [optional]
        pool_metadata: JSON string (same dict currently in results)
        raw_frames/
          0/depth: (H, W) float32
          0/confidence: (H, W) float32  [optional]
          0/frame_index: int
          0/score: float
          1/depth: ...
      
    • Use h5py compression: compression="gzip", compression_opts=4
    • Type annotations on all public functions

    REFACTOR phase — Clean up, add docstrings, run basedpyright.
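The schema v1 layout above could be written with h5py roughly as follows. This is a minimal sketch of the planned save_depth_data (raw-frame groups omitted for brevity); the shape of the camera_data dict is an assumption, not the final API, and meta fields are stored as scalar datasets so that paths like /meta/schema_version resolve as described.

```python
import datetime
import json

import h5py
import numpy as np


def save_depth_data(path, camera_data, schema_version=1):
    """Sketch: write pooled depth maps per camera to HDF5 (schema v1).

    camera_data maps serial -> dict with keys: intrinsics (3x3), resolution,
    pooled_depth (H, W), optional pooled_confidence, pool_metadata dict.
    """
    with h5py.File(path, "w") as f:
        meta = f.create_group("meta")
        meta.create_dataset("schema_version", data=schema_version)
        meta.create_dataset("units", data="meters")
        meta.create_dataset("coordinate_frame", data="world_from_cam")
        meta.create_dataset(
            "created_at",
            data=datetime.datetime.now(datetime.timezone.utc).isoformat())
        for serial, cam in camera_data.items():
            g = f.create_group(str(serial))
            g.create_dataset("intrinsics",
                             data=np.asarray(cam["intrinsics"], dtype=np.float64))
            g.create_dataset("resolution",
                             data=np.asarray(cam["resolution"], dtype=np.int64))
            # Depth arrays are always gzip-compressed per the plan.
            g.create_dataset("pooled_depth",
                             data=np.asarray(cam["pooled_depth"], dtype=np.float32),
                             compression="gzip", compression_opts=4)
            if cam.get("pooled_confidence") is not None:
                g.create_dataset(
                    "pooled_confidence",
                    data=np.asarray(cam["pooled_confidence"], dtype=np.float32),
                    compression="gzip", compression_opts=4)
            g.attrs["pool_metadata"] = json.dumps(cam.get("pool_metadata", {}))
```

Storing pool_metadata as a JSON string attribute keeps the schema flat while preserving the existing results dict verbatim.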

    Must NOT do:

    • Do not modify existing depth_pool.py or depth_verify.py
    • Do not add ZED SDK dependency to this module (pure numpy/h5py)
    • Do not save uncompressed data

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: New module with TDD workflow, HDF5 schema design, comprehensive tests
    • Skills: []
      • No special skills needed — standard Python + h5py

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 1-2 (with Task 3, after Task 1)
    • Blocks: Tasks 4, 7
    • Blocked By: Task 1

    References:

    Pattern References:

    • aruco/depth_pool.py:1-90 — Data format conventions: depth maps are (H, W) float in meters, confidence maps are (H, W) float with ZED semantics (lower = more confident)
    • calibrate_extrinsics.py:143-305 — How depth maps and confidence maps are collected per camera, how pool_metadata dict is structured
    • calibrate_extrinsics.py:120-131 — Function signature of apply_depth_verify_refine_postprocess showing the verification_frames data structure

    API/Type References:

    • aruco/depth_verify.py:18-24 — project_point_to_pixel(P_cam, K) shows intrinsics matrix K format (3x3, fx/fy/cx/cy)

    Test References:

    • tests/test_depth_pool.py — Follow this test structure: parametric, synthetic data, edge cases with pytest.raises
    • tests/conftest.py — sys.path setup for imports

    Documentation References:

    • calibrate_extrinsics.py:338 — results[str(serial)]["depth_pool"] shows pool_metadata dict structure

    WHY Each Reference Matters:

    • depth_pool.py defines the array contracts (shape, dtype, units) the save module must preserve
    • calibrate_extrinsics.py:143-305 shows exactly where/how depth data is produced — the save module must capture this data
    • Test patterns in test_depth_pool.py establish the project's testing conventions

    Acceptance Criteria:

    TDD:

    • Test file created: tests/test_depth_save.py
    • Tests cover: save, load, round-trip, schema validation, edge cases, error handling
    • uv run pytest tests/test_depth_save.py -v → PASS (all tests, 0 failures)

    Agent-Executed QA Scenarios:

    Scenario: Round-trip save and load preserves data
      Tool: Bash (pytest)
      Preconditions: aruco/depth_save.py implemented
      Steps:
        1. uv run pytest tests/test_depth_save.py -v -k "round_trip"
        2. Assert: exit code 0
        3. Assert: output contains "PASSED"
      Expected Result: Saved HDF5 loads back with identical data
      Evidence: pytest output captured
    
    Scenario: HDF5 schema has required metadata
      Tool: Bash (pytest)
      Preconditions: aruco/depth_save.py implemented
      Steps:
        1. uv run pytest tests/test_depth_save.py -v -k "schema"
        2. Assert: exit code 0
        3. Assert: tests verify /meta/schema_version, /meta/units, /meta/coordinate_frame
      Expected Result: Schema metadata present and correct
      Evidence: pytest output captured
    
    Scenario: Module passes type checking
      Tool: Bash (basedpyright)
      Preconditions: Module implemented with type annotations
      Steps:
        1. uv run basedpyright aruco/depth_save.py
        2. Assert: exit code 0 or only non-error diagnostics
      Expected Result: No type errors
      Evidence: basedpyright output captured
    

    Commit: YES

    • Message: feat(aruco): add HDF5 depth map persistence module
    • Files: aruco/depth_save.py, tests/test_depth_save.py
    • Pre-commit: uv run pytest tests/test_depth_save.py

  • 3. TDD: Create aruco/ground_plane.py — floor detection & consensus alignment core

    What to do:

    RED phase — Write tests/test_ground_plane.py first with tests for:

    A. unproject_depth_to_points(depth_map, K, T_world_cam, stride=4):

    • Takes depth map + intrinsics + extrinsics → returns (N, 3) world-coordinate point cloud
    • Test: synthetic depth of a flat plane at known height → verify recovered 3D points match expected positions
    • Test: NaN/zero/negative depth values are excluded
    • Test: stride parameter reduces output point count proportionally
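The unprojection in A is standard pinhole back-projection. A numpy-only sketch (the function name and argument order follow the plan; everything else, including the homogeneous-transform convention, is an assumption to be confirmed against pose_math.py):

```python
import numpy as np


def unproject_depth_to_points(depth_map, K, T_world_cam, stride=4):
    """Back-project a (H, W) metric depth map to an (N, 3) world point cloud.

    Invalid depths (NaN, zero, negative) are dropped; stride subsamples pixels.
    """
    H, W = depth_map.shape
    vs, us = np.mgrid[0:H:stride, 0:W:stride]
    z = depth_map[vs, us]
    valid = np.isfinite(z) & (z > 0)
    u, v, z = us[valid], vs[valid], z[valid]
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    # Pinhole model: camera-frame coordinates from pixel + depth.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    P_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)  # (N, 4) homogeneous
    P_world = (T_world_cam @ P_cam.T).T
    return P_world[:, :3]
```

With an identity extrinsic and a constant-depth map, every returned point lies on the plane z = depth, which is exactly the first synthetic test above.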

    B. detect_floor_plane(points, normal_constraint, height_range, min_inlier_ratio, seed):

    • Uses Open3D RANSAC segment_plane() on the point cloud
    • Returns FloorPlaneResult(normal, offset, inliers, inlier_ratio, plane_model)
    • Test: synthetic flat floor + random noise → recovers correct plane within tolerance
    • Test: synthetic floor + wall points → RANSAC ignores wall, finds floor (normal_constraint filters)
    • Test: normal_constraint rejects planes whose normal isn't near-vertical (e.g., a wall plane)
    • Test: height_range rejects planes too far from expected floor height
    • Test: too few inliers → returns None (below min_inlier_ratio)
    • Test: seed parameter produces deterministic results
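The plan delegates the plane fit itself to Open3D's segment_plane; purely to illustrate the RANSAC-plus-constraints flow B describes, here is a numpy-only stand-in. Everything below is an assumed sketch: the real module would call Open3D and return the FloorPlaneResult dataclass rather than a tuple, and the default thresholds are placeholders.

```python
import numpy as np


def detect_floor_plane(points, normal_constraint=(0.0, -1.0, 0.0),
                       max_normal_angle_deg=20.0, height_range=(-0.1, 0.1),
                       min_inlier_ratio=0.3, distance_threshold=0.01,
                       num_iterations=200, seed=0):
    """Sketch: RANSAC plane fit with floor constraints.

    Returns (normal, d, inlier_mask) for the plane n.x + d = 0, or None if
    no sufficiently confident floor-like plane is found.
    """
    rng = np.random.default_rng(seed)  # seeded for deterministic tests
    n_pts = len(points)
    best = None
    for _ in range(num_iterations):
        p0, p1, p2 = points[rng.choice(n_pts, 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-12:
            continue  # degenerate (collinear) sample
        n = n / norm
        d = -n @ p0
        inliers = np.abs(points @ n + d) < distance_threshold
        if best is None or inliers.sum() > best[2].sum():
            best = (n, d, inliers)
    if best is None:
        return None
    n, d, inliers = best
    up = np.asarray(normal_constraint, dtype=float)
    up /= np.linalg.norm(up)
    if n @ up < 0:  # orient the normal toward the expected "up"
        n, d = -n, -d
    angle = np.degrees(np.arccos(np.clip(n @ up, -1.0, 1.0)))
    height = -d  # signed plane offset along its normal
    if angle > max_normal_angle_deg:
        return None  # wall-like plane rejected by the normal constraint
    if not (height_range[0] <= height <= height_range[1]):
        return None  # plane outside the expected floor height band
    if inliers.sum() / n_pts < min_inlier_ratio:
        return None  # below confidence threshold: no-op per the guardrails
    return n, d, inliers
```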

    C. compute_consensus_plane(floor_results, camera_weights=None):

    • Takes per-camera FloorPlaneResult list → fits a single consensus plane
    • Method: concatenate all inlier points, weighted by inlier count, fit plane via SVD
    • Test: two cameras seeing same plane → consensus matches individual planes
    • Test: two cameras with slight disagreement → consensus is between them
    • Test: camera weights affect result appropriately
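The merged-inlier SVD fit in C can be sketched as follows. Concatenating each camera's inliers already weights cameras by inlier count implicitly; the optional camera_weights argument is an assumed interpretation of the per-camera weighting, and this version takes raw point arrays rather than FloorPlaneResult objects.

```python
import numpy as np


def compute_consensus_plane(floor_points_per_camera, camera_weights=None):
    """Fit one plane to all cameras' floor inliers via weighted SVD.

    floor_points_per_camera: list of (Ni, 3) world-frame inlier arrays.
    Returns (normal, d) with the plane written as normal.x + d = 0.
    """
    if camera_weights is None:
        camera_weights = [1.0] * len(floor_points_per_camera)
    pts = np.concatenate(floor_points_per_camera, axis=0)
    w = np.concatenate([np.full(len(p), wi)
                        for p, wi in zip(floor_points_per_camera, camera_weights)])
    centroid = (w[:, None] * pts).sum(axis=0) / w.sum()
    centered = (pts - centroid) * np.sqrt(w)[:, None]
    # The plane normal is the right singular vector of the smallest
    # singular value (the direction of least variance).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]
    d = -normal @ centroid
    return normal, d
```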

    D. compute_floor_correction(T_world_cam, floor_result, consensus_plane, max_rotation_deg=5.0, max_translation_m=0.05):

    • Computes constrained correction for a single camera
    • Allowed DOF: pitch/roll + vertical translation ONLY (no yaw, no lateral)
    • Uses scipy.optimize.least_squares with bounds
    • Returns CorrectionResult(T_corrected, delta_rotation_deg, delta_translation_m, applied)
    • Test: camera with 2° tilt from consensus → correction brings it within 0.1°
    • Test: correction respects max_rotation_deg bound
    • Test: correction respects max_translation_m bound
    • Test: yaw component is preserved (no yaw drift)
    • Test: lateral translation is preserved (no X/Z drift)
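The constrained correction in D can be sketched with scipy's bounded least_squares. Because only [pitch, roll, dy] are parameters, yaw and lateral translation cannot drift by construction. This is an assumed formulation, not the final API: it takes the camera's floor inliers and a consensus plane directly, and rotating about the camera position (so the camera tilts in place) is a design choice the real module would need to confirm.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation


def compute_floor_correction(T_world_cam, floor_points_world, consensus_normal,
                             consensus_d, max_rotation_deg=5.0,
                             max_translation_m=0.05):
    """Sketch: bounded pitch/roll + vertical-offset correction for one camera.

    Returns (T_corrected, params) where params = [pitch_deg, roll_deg, dy_m].
    """
    cam_pos = T_world_cam[:3, 3]

    def build_correction(params):
        pitch, roll, dy = params
        # Two-axis Euler rotation only: no yaw component exists.
        R = Rotation.from_euler("xz", [pitch, roll], degrees=True).as_matrix()
        T = np.eye(4)
        T[:3, :3] = R
        # Rotate about the camera position, then shift vertically only.
        T[:3, 3] = cam_pos - R @ cam_pos + np.array([0.0, dy, 0.0])
        return T

    def residuals(params):
        T = build_correction(params)
        pts = (T[:3, :3] @ floor_points_world.T).T + T[:3, 3]
        return pts @ consensus_normal + consensus_d  # signed plane distances

    r, t = max_rotation_deg, max_translation_m
    sol = least_squares(residuals, x0=np.zeros(3),
                        bounds=([-r, -r, -t], [r, r, t]))
    return build_correction(sol.x) @ T_world_cam, sol.x
```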

    GREEN phase — Implement aruco/ground_plane.py:

    • Import open3d for RANSAC plane segmentation
    • Import scipy.optimize.least_squares for constrained correction
    • Reuse aruco.alignment.rotation_align_vectors where appropriate
    • Reuse aruco.pose_math.invert_transform and matrix_to_rvec_tvec
    • Use dataclasses for FloorPlaneResult and CorrectionResult
    • All functions are pure (no side effects, no file I/O)

    REFACTOR phase — Docstrings, type annotations, basedpyright.

    Must NOT do:

    • No ML/segmentation — RANSAC + geometric constraints only
    • No global bundle adjustment
    • No modification to existing alignment.py
    • No dense reconstruction beyond floor plane extraction

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Core algorithmic module with 4 major functions, each with multiple test cases. Requires understanding of SE3 geometry, RANSAC, and constrained optimization.
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 1-2 (with Task 2, after Task 1)
    • Blocks: Task 5
    • Blocked By: Task 1

    References:

    Pattern References:

    • aruco/alignment.py:54-114 — rotation_align_vectors(from_vec, to_vec) — reuse for aligning floor normal to target up vector
    • aruco/alignment.py:117-137 — apply_alignment_to_pose(T, R_align) — pattern for applying global rotation to extrinsics
    • aruco/alignment.py:140-202 — estimate_up_vector_from_cameras() — existing camera-based "up" estimation, useful as initial guess for floor normal
    • aruco/depth_refine.py:12-20 — extrinsics_to_params() / params_to_extrinsics() — 6-DOF parameterization for optimization
    • aruco/depth_refine.py:71-180 — refine_extrinsics_with_depth() — pattern for bounded least_squares optimization of camera pose
    • aruco/depth_verify.py:18-24 — project_point_to_pixel(P_cam, K) — projection math
    • aruco/pose_math.py:22-28 — invert_transform(T) — efficient SE3 inversion

    API/Type References:

    • aruco/alignment.py:7-16 — Type aliases: Vec3, Mat33, Mat44, CornersNC
    • aruco/depth_verify.py:8-15 — DepthVerificationResult dataclass pattern

    Test References:

    • tests/test_alignment.py — Testing convention for geometric functions (synthetic inputs, tolerance assertions)
    • tests/test_depth_refine.py — Testing convention for optimization functions (before/after metrics)

    External References:

    • Open3D docs: segment_plane(distance_threshold, ransac_n, num_iterations) — returns [a, b, c, d] plane model + inlier indices

    WHY Each Reference Matters:

    • alignment.py provides the exact rotation-alignment primitives we need — no need to reimplement
    • depth_refine.py establishes the bounded least-squares pattern with regularization — correction should follow the same style
    • test_alignment.py shows how geometric tests are structured in this project (synthetic data, assert_allclose)

    Acceptance Criteria:

    TDD:

    • Test file created: tests/test_ground_plane.py
    • Tests cover: unproject, floor detection (happy + noise + wall + failure), consensus, correction (tilt + bounds + yaw preservation)
    • uv run pytest tests/test_ground_plane.py -v → PASS (all tests, 0 failures)

    Agent-Executed QA Scenarios:

    Scenario: Floor detection on synthetic flat plane
      Tool: Bash (pytest)
      Preconditions: aruco/ground_plane.py implemented
      Steps:
        1. uv run pytest tests/test_ground_plane.py -v -k "detect_floor and synthetic_flat"
        2. Assert: exit code 0
        3. Assert: recovered normal within 1° of [0, -1, 0]
      Expected Result: RANSAC correctly identifies flat floor
      Evidence: pytest output captured
    
    Scenario: Per-camera correction preserves yaw
      Tool: Bash (pytest)
      Preconditions: aruco/ground_plane.py implemented
      Steps:
        1. uv run pytest tests/test_ground_plane.py -v -k "correction and yaw"
        2. Assert: exit code 0
        3. Assert: yaw angle before == yaw angle after (within 0.01°)
      Expected Result: Correction only affects pitch/roll + vertical translation
      Evidence: pytest output captured
    
    Scenario: Module passes type checking
      Tool: Bash (basedpyright)
      Preconditions: Module implemented with type annotations
      Steps:
        1. uv run basedpyright aruco/ground_plane.py
        2. Assert: exit code 0 or only non-error diagnostics
      Expected Result: No type errors
      Evidence: basedpyright output captured
    

    Commit: YES

    • Message: feat(aruco): add ground plane detection and per-camera correction module
    • Files: aruco/ground_plane.py, tests/test_ground_plane.py
    • Pre-commit: uv run pytest tests/test_ground_plane.py

  • 4. Integrate --save-depth flag into calibrate_extrinsics.py

    What to do:

    • Add --save-depth Click option (type click.Path(), default None)
    • When provided, after depth pooling/selection in apply_depth_verify_refine_postprocess, call save_depth_data() to persist:
      • Pooled depth + confidence per camera
      • Raw best-scored frames (depth + confidence + frame index + score)
      • Camera intrinsics matrix K
      • Pool metadata dict
    • Log the output path and file size
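The flag itself is a single Click option. A minimal sketch of the wiring (the option declaration follows the plan; the command body and help text are placeholders, since the real flag sits among the existing calibrate_extrinsics.py options and triggers save_depth_data after pooling):

```python
import click


@click.command()
@click.option("--save-depth", type=click.Path(), default=None,
              help="Directory in which to persist pooled + raw depth maps as HDF5.")
def main(save_depth):
    # Sketch only: the real CLI calls save_depth_data() here when the
    # flag is provided and depth verification/refinement is enabled.
    if save_depth is None:
        click.echo("depth saving disabled")
    else:
        click.echo(f"would save depth to {save_depth}")
```

Because the option defaults to None, existing invocations are unaffected, satisfying the "only when provided" guardrail below.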

    Must NOT do:

    • Do not change existing depth processing behavior
    • Do not make saving mandatory (only when --save-depth is provided)
    • Do not save if depth verification/refinement is not enabled (warn and skip)

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Integration into existing CLI with complex data flow, needs careful threading of data through the function
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 2 (with Task 5)
    • Blocks: Tasks 7, 8
    • Blocked By: Tasks 1, 2

    References:

    Pattern References:

    • calibrate_extrinsics.py:562-678 — Click option definitions and main() function signature — follow exact same patterns for the new flag
    • calibrate_extrinsics.py:606-611 — --depth-pool-size option as example of depth-related flag
    • calibrate_extrinsics.py:120-305 — apply_depth_verify_refine_postprocess() — this is where depth data is available and where save should be triggered
    • calibrate_extrinsics.py:143-165 — Where depth_maps and confidence_maps lists are built per camera — data to capture for raw frames
    • calibrate_extrinsics.py:267-270 — Where final_depth and pool_metadata are determined — data to capture for pooled result

    API/Type References:

    • aruco/depth_save.py (Task 2 output) — save_depth_data(path, camera_data, schema_version=1) function signature

    Test References:

    • tests/test_depth_cli_postprocess.py — Existing test patterns for calibrate_extrinsics CLI post-processing behavior
    • tests/test_depth_pool_integration.py — Integration test patterns with mocked depth data

    WHY Each Reference Matters:

    • calibrate_extrinsics.py:562-678 is the exact location where the new flag must be added, following identical Click patterns
    • apply_depth_verify_refine_postprocess is the function that has access to all depth data — save must be called from here or just after it
    • Integration tests show how to mock ZED data for testing the full flow

    Acceptance Criteria:

    TDD:

    • Test file updated or created: tests/test_depth_save_integration.py
    • Tests cover: flag appears in help, save is called when flag provided, save is NOT called without flag
    • uv run pytest tests/test_depth_save_integration.py -v → PASS

    Agent-Executed QA Scenarios:

    Scenario: --save-depth flag appears in CLI help
      Tool: Bash
      Preconditions: calibrate_extrinsics.py updated
      Steps:
        1. uv run python calibrate_extrinsics.py --help
        2. Assert: output contains "--save-depth"
        3. Assert: output contains "HDF5" or "depth" in the help text for the flag
      Expected Result: Flag is documented in help
      Evidence: Help output captured
    
    Scenario: Existing tests still pass after integration
      Tool: Bash (pytest)
      Preconditions: calibrate_extrinsics.py updated
      Steps:
        1. uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py -v
        2. Assert: exit code 0, no regressions
      Expected Result: No existing behavior broken
      Evidence: pytest output captured
    

    Commit: YES

    • Message: feat(calibrate): add --save-depth flag for HDF5 depth persistence
    • Files: calibrate_extrinsics.py, tests/test_depth_save_integration.py
    • Pre-commit: uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py

  • 5. Extend aruco/ground_plane.py with multi-camera workflow orchestration

    What to do:

    Add high-level orchestration functions that compose the primitives from Task 3:

    A. refine_ground_from_depth(camera_data, extrinsics, config):

    • Main entry point: takes per-camera depth data + current extrinsics → returns corrected extrinsics + metrics
    • Flow:
      1. Per camera: unproject_depth_to_pointsdetect_floor_plane
      2. compute_consensus_plane from all successful detections
      3. Per camera: compute_floor_correction relative to consensus
      4. Return corrected extrinsics dict + RefinementMetrics
    • Config dataclass with: max_rotation_deg, max_translation_m, ransac_distance_threshold, min_inlier_ratio, height_range, normal_constraint_deg, stride, seed
    • Metrics dataclass with: per-camera floor angles (before/after), consensus plane model, correction magnitudes, skipped cameras + reasons

    B. Error handling:

    • If floor detection fails for a camera → skip it, log warning, include in metrics
    • If fewer than 2 cameras have valid floor → abort, return original extrinsics + error reason
    • If correction exceeds bounds → cap at bounds, mark as clamped in metrics

    Must NOT do:

    • No file I/O in this module — pure computation
    • No visualization — that's Task 6

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Orchestration logic with error handling, config management, metrics collection
    • Skills: []

    Parallelization:

    • Can Run In Parallel: YES
    • Parallel Group: Wave 2 (with Task 4)
    • Blocks: Tasks 6, 7
    • Blocked By: Tasks 1, 3

    References:

    Pattern References:

    • calibrate_extrinsics.py:120-131 — apply_depth_verify_refine_postprocess() signature — pattern for multi-camera orchestration function
    • aruco/depth_refine.py:71-227 — refine_extrinsics_with_depth() return value pattern: (result, stats_dict)
    • aruco/depth_verify.py:8-15 — DepthVerificationResult dataclass — pattern for structured results

    API/Type References:

    • aruco/ground_plane.py (Task 3 output) — All primitive functions: unproject_depth_to_points, detect_floor_plane, compute_consensus_plane, compute_floor_correction

    Test References:

    • tests/test_ground_plane.py (Task 3 output) — Unit test patterns to follow for orchestration tests

    WHY Each Reference Matters:

    • apply_depth_verify_refine_postprocess shows how multi-camera iteration with fallback is done in this codebase
    • depth_refine.py shows the (result, stats) return pattern that should be followed

    Acceptance Criteria:

    TDD:

    • Tests added to tests/test_ground_plane.py for orchestration functions
    • Tests cover: full pipeline with 2-camera synthetic data, single-camera skip, all-cameras-fail abort, config bounds
    • uv run pytest tests/test_ground_plane.py -v → PASS (all tests, 0 failures)

    Agent-Executed QA Scenarios:

    Scenario: Two-camera synthetic refinement produces level ground
      Tool: Bash (pytest)
      Preconditions: Orchestration functions implemented
      Steps:
        1. uv run pytest tests/test_ground_plane.py -v -k "refine_ground_from_depth and two_camera"
        2. Assert: exit code 0
        3. Assert: after correction, floor angle disagreement < 0.5°
      Expected Result: Per-camera corrections level the ground
      Evidence: pytest output captured
    
    Scenario: Graceful fallback when floor detection fails for one camera
      Tool: Bash (pytest)
      Preconditions: Orchestration functions implemented
      Steps:
        1. uv run pytest tests/test_ground_plane.py -v -k "skip_camera"
        2. Assert: exit code 0
        3. Assert: skipped camera's extrinsics unchanged, other cameras corrected
      Expected Result: Partial failure handled gracefully
      Evidence: pytest output captured
    

    Commit: YES

    • Message: feat(aruco): add multi-camera ground plane refinement orchestration
    • Files: aruco/ground_plane.py, tests/test_ground_plane.py
    • Pre-commit: uv run pytest tests/test_ground_plane.py

  • 6. Create Plotly diagnostic visualization for ground plane refinement

    What to do:

    • Add a function create_ground_diagnostic_plot(metrics, camera_data, extrinsics_before, extrinsics_after) → returns plotly.graph_objects.Figure
    • Add a function save_diagnostic_plot(fig, path) → writes HTML file
    • Visualization contents:
      • 3D scatter: floor inlier points per camera (color-coded by camera serial)
      • Surface: consensus plane (semi-transparent)
      • Camera frustums: before (dashed/faded) and after (solid) positions
      • Annotation: per-camera correction magnitude (degrees + cm)
      • Title: summary metrics (total disagreement before/after)
    • Follow existing Plotly patterns from visualize_extrinsics.py and compare_pose_sets.py

    Must NOT do:

    • No interactive server or GUI — static HTML file only
    • No Open3D visualization (use Plotly only, already a dep)
    • No complex camera frustum rendering — simple cone or pyramid is fine

    Recommended Agent Profile:

    • Category: unspecified-low
      • Reason: Visualization code following existing Plotly patterns, no complex algorithm
    • Skills: []

    Parallelization:

    • Can Run In Parallel: NO
    • Parallel Group: Wave 3 (sequential after Task 5)
    • Blocks: Task 7
    • Blocked By: Task 5

    References:

    Pattern References:

    • compare_pose_sets.py:145-200 add_camera_trace() — Plotly camera visualization pattern (frustum + axes + labels)
    • visualize_extrinsics.py — Full Plotly 3D scene setup with layout, ground plane, axis labels (check head of file for imports and patterns)

    Test References:

    • No heavy test required — visualization is a "nice to have". A smoke test that the function returns a go.Figure with expected trace count is sufficient.

    WHY Each Reference Matters:

    • compare_pose_sets.py already has Plotly camera rendering code that can be adapted
    • visualize_extrinsics.py shows the full 3D scene pattern including ground plane rendering

    Acceptance Criteria:

    • Function create_ground_diagnostic_plot returns a plotly.graph_objects.Figure
    • Figure contains traces for: floor points per camera, consensus plane surface, camera markers
    • Smoke test: uv run pytest tests/test_ground_plane.py -v -k "diagnostic_plot" → PASS

    Agent-Executed QA Scenarios:

    Scenario: Diagnostic plot generates valid HTML
      Tool: Bash (pytest)
      Preconditions: Visualization function implemented
      Steps:
        1. uv run pytest tests/test_ground_plane.py -v -k "diagnostic_plot"
        2. Assert: exit code 0
        3. Assert: test verifies Figure has correct number of traces
      Expected Result: Plot function produces valid Plotly figure
      Evidence: pytest output captured
    

    Commit: YES (groups with Task 7)

    • Message: feat(aruco): add Plotly diagnostic visualization for ground plane
    • Files: aruco/ground_plane.py (viz function added), tests/test_ground_plane.py
    • Pre-commit: uv run pytest tests/test_ground_plane.py

  • 7. Create refine_ground_plane.py — standalone CLI tool

    What to do:

    • Click CLI tool with the following options:
      • --input-depth / -d: Path to HDF5 depth file (from --save-depth)
      • --input-extrinsics / -i: Path to extrinsics JSON (from calibrate_extrinsics.py)
      • --output-extrinsics / -o: Path for corrected extrinsics JSON
      • --metrics-json: Optional path for machine-readable metrics output
      • --plot / --no-plot: Generate Plotly diagnostic (default: --plot)
      • --plot-output: Path for diagnostic HTML (default: <output_dir>/ground_diagnostic.html)
      • --max-rotation-deg: Max correction rotation (default: 5.0)
      • --max-translation-m: Max correction translation (default: 0.05)
      • --ransac-threshold: RANSAC distance threshold in meters (default: 0.02)
      • --min-inlier-ratio: Minimum inlier ratio to accept floor detection (default: 0.3)
      • --height-range: Expected floor height range as "min,max" (default: auto from data)
      • --stride: Depth map downsampling stride (default: 4)
      • --seed: Random seed for reproducibility (default: 42)
      • --debug / --no-debug: Verbose logging
    • Flow:
      1. Load extrinsics JSON (reuse compare_pose_sets.py:load_poses_from_json)
      2. Load depth data from HDF5 (use depth_save.load_depth_data)
      3. Call refine_ground_from_depth() orchestration function
      4. Save corrected extrinsics (same JSON format as input, with _meta.ground_refined: true)
      5. Print summary metrics to stdout
      6. Optionally save metrics JSON
      7. Optionally generate diagnostic Plotly HTML
    • Output extrinsics format: identical to calibrate_extrinsics.py output, with added _meta.ground_refined flag
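The flow above could be wired roughly like this — a skeleton only, with the depth loading and refinement calls elided and the option list truncated to a few representatives (nothing here is the final implementation):

```python
import json
from pathlib import Path

import click

@click.command()
@click.option("--input-depth", "-d", type=click.Path(exists=True), required=True,
              help="HDF5 depth file from --save-depth.")
@click.option("--input-extrinsics", "-i", type=click.Path(exists=True), required=True)
@click.option("--output-extrinsics", "-o", type=click.Path(), required=True)
@click.option("--max-rotation-deg", default=5.0, show_default=True)
@click.option("--seed", default=42, show_default=True)
def main(input_depth, input_extrinsics, output_extrinsics, max_rotation_deg, seed):
    """Refine per-camera extrinsics from saved floor depth observations."""
    extrinsics = json.loads(Path(input_extrinsics).read_text())
    # ... load depth via depth_save.load_depth_data(input_depth),
    #     call refine_ground_from_depth(), collect metrics ...
    # (max_rotation_deg and seed are unused in this skeleton)
    extrinsics.setdefault("_meta", {})["ground_refined"] = True
    Path(output_extrinsics).write_text(json.dumps(extrinsics, indent=2))
    click.echo(f"wrote {output_extrinsics}")

if __name__ == "__main__":
    main()
```

Note the input files are only read, never written, and the `_meta.ground_refined` flag is added on the way out — both "Must NOT do" constraints fall out naturally from this shape.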

    Must NOT do:

    • No ZED SDK dependency — works entirely from saved files
    • No modification of input files
    • No interactive prompts

    Recommended Agent Profile:

    • Category: unspecified-high
      • Reason: Full CLI tool composing multiple modules, end-to-end data flow, error handling, multiple output formats
    • Skills: []

    Parallelization:

    • Can Run In Parallel: NO
    • Parallel Group: Wave 3 (depends on 2, 5, 6)
    • Blocks: Task 8
    • Blocked By: Tasks 2, 5, 6

    References:

    Pattern References:

    • calibrate_extrinsics.py:562-678 — Click CLI pattern with extensive options, logging, error handling
    • compare_pose_sets.py:52-88 load_poses_from_json() — JSON extrinsics loading pattern
    • compare_pose_sets.py:91-92 serialize_pose() — JSON extrinsics saving pattern
    • visualize_extrinsics.py — CLI tool that loads extrinsics + generates Plotly output

    API/Type References:

    • aruco/depth_save.py (Task 2) — load_depth_data(path) return type
    • aruco/ground_plane.py (Tasks 3, 5) — refine_ground_from_depth() signature and return type
    • aruco/ground_plane.py (Task 6) — create_ground_diagnostic_plot() signature

    WHY Each Reference Matters:

    • calibrate_extrinsics.py CLI is the canonical pattern for Click tools in this project
    • compare_pose_sets.py shows how to load/save the extrinsics JSON format correctly
    • The ground_plane module provides all computation — CLI just wires I/O to computation

    Acceptance Criteria:

    TDD:

    • Test file created: tests/test_refine_ground_cli.py
    • Tests cover: help output, valid invocation with synthetic data, missing input error, output file creation, metrics JSON format
    • uv run pytest tests/test_refine_ground_cli.py -v → PASS

    Agent-Executed QA Scenarios:

    Scenario: CLI help shows all expected options
      Tool: Bash
      Preconditions: refine_ground_plane.py created
      Steps:
        1. uv run python refine_ground_plane.py --help
        2. Assert: output contains "--input-depth", "--input-extrinsics", "--output-extrinsics"
        3. Assert: output contains "--max-rotation-deg", "--ransac-threshold", "--seed"
        4. Assert: exit code 0
      Expected Result: All options documented
      Evidence: Help output captured
    
    Scenario: Tool produces valid extrinsics JSON
      Tool: Bash
      Preconditions: Synthetic HDF5 and extrinsics JSON created by test fixtures
      Steps:
        1. uv run pytest tests/test_refine_ground_cli.py -v -k "produces_valid_json"
        2. Assert: exit code 0
        3. Assert: output JSON is valid, contains all camera serials, has _meta.ground_refined
      Expected Result: Output matches extrinsics JSON schema
      Evidence: pytest output captured
    
    Scenario: Metrics JSON contains before/after comparison
      Tool: Bash
      Preconditions: Test creates and runs CLI with --metrics-json
      Steps:
        1. uv run pytest tests/test_refine_ground_cli.py -v -k "metrics_json"
        2. Assert: exit code 0
        3. Assert: metrics has 'floor.angle_disagreement_deg_before' and 'floor.angle_disagreement_deg_after'
      Expected Result: Machine-readable improvement metrics produced
      Evidence: pytest output captured
    
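The disagreement metric asserted in the scenario above could be defined as the maximum pairwise angle between per-camera floor normals — a hedged sketch (the authoritative metric definition belongs to the Task 5 orchestration, and `angle_disagreement_deg` is an assumed name):

```python
import itertools

import numpy as np

def angle_disagreement_deg(normals):
    """Max pairwise angle (degrees) between per-camera floor normals.

    Uses the absolute dot product so n and -n describe the same plane.
    """
    worst = 0.0
    for a, b in itertools.combinations(normals, 2):
        c = abs(float(np.dot(a, b))) / (np.linalg.norm(a) * np.linalg.norm(b))
        worst = max(worst, float(np.degrees(np.arccos(np.clip(c, 0.0, 1.0)))))
    return worst
```

Computing this once on the raw per-camera planes and once after applying the corrections gives the `_before` / `_after` pair the metrics JSON is expected to carry.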

    Commit: YES

    • Message: feat: add refine_ground_plane.py standalone CLI tool
    • Files: refine_ground_plane.py, tests/test_refine_ground_cli.py
    • Pre-commit: uv run pytest tests/test_refine_ground_cli.py

  • 8. Final integration: full test suite pass + basedpyright + README update

    What to do:

    • Run the FULL test suite: uv run pytest -x -vv
    • Run basedpyright on all new files: uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py
    • Fix any regressions or type errors
    • Add usage example to README.md showing the depth-save → ground-refine workflow:
      # Step 1: Calibrate with depth saving
      uv run python calibrate_extrinsics.py ... --refine-depth --save-depth output/depth_data.h5
      
      # Step 2: Refine ground plane
      uv run python refine_ground_plane.py \
          --input-depth output/depth_data.h5 \
          --input-extrinsics output/extrinsics.json \
          --output-extrinsics output/extrinsics_ground_refined.json \
          --plot-output output/ground_diagnostic.html
      

    Must NOT do:

    • Do not modify any test behavior — only fix genuine regressions
    • Do not add features — this is stabilization only

    Recommended Agent Profile:

    • Category: unspecified-low
      • Reason: Verification and minor fixups, no new features
    • Skills: []

    Parallelization:

    • Can Run In Parallel: NO
    • Parallel Group: Wave 3 (final, sequential)
    • Blocks: None (terminal)
    • Blocked By: All previous tasks

    References:

    Pattern References:

    • README.md — Existing usage examples for calibrate_extrinsics.py and visualize_extrinsics.py
    • pyproject.toml:39-41 — pytest configuration (testpaths, norecursedirs)

    WHY Each Reference Matters:

    • README has existing command examples that the new workflow should follow in format/style
    • pyproject.toml pytest config ensures all test directories are covered

    Acceptance Criteria:

    • uv run pytest -x -vv → ALL tests pass, 0 failures, 0 errors
    • uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py → 0 errors
    • README.md contains usage example for the new ground refinement workflow

    Agent-Executed QA Scenarios:

    Scenario: Full test suite passes
      Tool: Bash (pytest)
      Preconditions: All previous tasks completed
      Steps:
        1. uv run pytest -x -vv
        2. Assert: exit code 0
        3. Assert: all tests pass, 0 failures
      Expected Result: No regressions introduced
      Evidence: Full pytest output captured
    
    Scenario: Type checking passes
      Tool: Bash (basedpyright)
      Preconditions: All new modules written
      Steps:
        1. uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py
        2. Assert: no error-level diagnostics
      Expected Result: Type-safe code
      Evidence: basedpyright output captured
    

    Commit: YES

    • Message: chore: final integration pass — tests, types, README for ground plane refinement
    • Files: README.md, any fixup files
    • Pre-commit: uv run pytest -x -vv

Commit Strategy

| After Task | Message | Files | Verification |
| --- | --- | --- | --- |
| 1 | build(deps): add open3d and h5py for ground plane refinement | pyproject.toml, uv.lock | uv run python -c "import open3d; import h5py" |
| 2 | feat(aruco): add HDF5 depth map persistence module | aruco/depth_save.py, tests/test_depth_save.py | uv run pytest tests/test_depth_save.py |
| 3 | feat(aruco): add ground plane detection and per-camera correction module | aruco/ground_plane.py, tests/test_ground_plane.py | uv run pytest tests/test_ground_plane.py |
| 4 | feat(calibrate): add --save-depth flag for HDF5 depth persistence | calibrate_extrinsics.py, tests/test_depth_save_integration.py | uv run pytest tests/test_depth_cli_postprocess.py tests/test_depth_pool_integration.py |
| 5 | feat(aruco): add multi-camera ground plane refinement orchestration | aruco/ground_plane.py, tests/test_ground_plane.py | uv run pytest tests/test_ground_plane.py |
| 6 | feat(aruco): add Plotly diagnostic visualization for ground plane | aruco/ground_plane.py, tests/test_ground_plane.py | uv run pytest tests/test_ground_plane.py |
| 7 | feat: add refine_ground_plane.py standalone CLI tool | refine_ground_plane.py, tests/test_refine_ground_cli.py | uv run pytest tests/test_refine_ground_cli.py |
| 8 | chore: final integration pass — tests, types, README for ground plane refinement | README.md, fixups | uv run pytest -x -vv |

Success Criteria

Verification Commands

# All tests pass
uv run pytest -x -vv  # Expected: 0 failures

# Type checking passes
uv run basedpyright aruco/depth_save.py aruco/ground_plane.py refine_ground_plane.py  # Expected: 0 errors

# CLI tools have correct help
uv run python calibrate_extrinsics.py --help | grep "save-depth"  # Expected: --save-depth appears
uv run python refine_ground_plane.py --help  # Expected: all options listed

# Dependencies installed
uv run python -c "import open3d; import h5py; print('ok')"  # Expected: ok
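The save → load round-trip named in the checklist below can be smoke-tested with a sketch like this. The group/dataset layout (one dataset per camera serial under `/depth`) is an assumption for illustration; the real schema is whatever `aruco/depth_save.py` defines in Task 2:

```python
import h5py
import numpy as np

# Assumed layout: /depth/<serial> holds that camera's pooled depth map.
def save_depth_data(path, depth_by_serial):
    with h5py.File(path, "w") as f:
        grp = f.create_group("depth")
        for serial, depth in depth_by_serial.items():
            grp.create_dataset(serial, data=depth, compression="gzip")

def load_depth_data(path):
    with h5py.File(path, "r") as f:
        # ds[()] reads the full dataset into a NumPy array
        return {serial: ds[()] for serial, ds in f["depth"].items()}
```

A round-trip is then `load_depth_data(p)` after `save_depth_data(p, data)`, asserting each array comes back bit-identical per serial.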

Final Checklist

  • All "Must Have" requirements present
  • All "Must NOT Have" exclusions absent (no core pipeline changes, no ML, no non-flat floors)
  • All tests pass (uv run pytest -x -vv)
  • Type checking passes (uv run basedpyright)
  • HDF5 depth saving works end-to-end (save → load round-trip)
  • Ground plane refinement produces measurably improved floor alignment
  • Output extrinsics JSON matches existing format (compatible with visualize_extrinsics.py)
  • Diagnostic Plotly HTML generated successfully
  • README updated with usage workflow