zed-playground/py_workspace/.sisyphus/drafts/ground-plane-refinement.md

Draft: Ground Plane Refinement & Depth Map Persistence

Requirements (confirmed)

  • Core problem: Camera disagreement — different cameras don't agree on where the ground is (floor at different heights/angles)
  • Depth saving: Save BOTH pooled depth maps AND raw best-scored frames per camera, so pooling parameters can be re-tuned without re-reading SVOs
  • Integration: Post-processing step — a new standalone CLI tool that loads existing extrinsics + saved depth data and refines
  • Library: TBD — user wants to understand trade-offs before committing

Technical Decisions

  • Post-processing approach: non-invasive, loads existing calibration JSON + depth data
  • Depth saving happens inside calibrate_extrinsics.py (or triggered by flag)
  • Ground refinement tool is a NEW script (e.g., refine_ground_plane.py)

Research Findings

  • Current alignment.py: Aligns world frame based on marker face normals, NOT actual floor geometry
  • Current depth_pool.py: Per-pixel median pooling exists, but result is discarded after use (never saved)
  • Current depth_refine.py: Optimizes 6-DOF per camera using depth at marker corners only (sparse)
  • compare_pose_sets.py: Has Kabsch rigid_transform_3d() for point-set alignment
  • Available deps: numpy, scipy, opencv — sufficient for RANSAC plane fitting
  • Open3D: Provides ICP, RANSAC, visualization but is ~500MB heavy dep

Open Questions (Resolved)

  • Camera count: 2-4 cameras (small setup, likely some floor overlap)
  • Observation method: Point clouds don't align when overlaid in world coords
  • Error magnitude: Small — 1-3° tilt, <2cm offset (fine-tuning level)
  • Floor type: TBD (assumed flat for now)
  • Library choice: TBD — recommendation below

Library Recommendation Analysis

Given: 2-4 cameras, small errors, flat floor assumption, post-processing tool

numpy/scipy approach:

  • RANSAC plane fitting: trivial with numpy (random sample 3 points, fit plane, count inliers)
  • Plane-to-plane alignment: rotation_align_vectors already exists in alignment.py
  • Point cloud generation from depth+intrinsics: simple numpy vectorized operation
  • Kabsch alignment: already exists in compare_pose_sets.py
  • Verdict: SUFFICIENT for this use case. No ICP needed since we're fitting to a known target (Y=0 plane).
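The numpy-only approach above can be sketched in a few dozen lines. This is an illustrative sketch, not the project's code — `ransac_plane`, its thresholds, and the SVD refit are all assumptions to be tuned against real depth data:

```python
# Minimal RANSAC plane fit with numpy only: sample 3 points, fit a plane,
# count inliers, then least-squares refit on the best inlier set.
import numpy as np

def ransac_plane(points, n_iters=500, inlier_thresh=0.01, rng=None):
    """Fit a plane to an (N, 3) point array; returns (normal, d, inlier_mask)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    best_inliers, best_count = None, -1
    for _ in range(n_iters):
        # Sample 3 points and form the plane through them.
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-12:  # degenerate (collinear) sample
            continue
        normal /= norm
        d = -normal @ p0
        # Inliers: points within inlier_thresh of the candidate plane.
        inliers = np.abs(points @ normal + d) < inlier_thresh
        count = inliers.sum()
        if count > best_count:
            best_count, best_inliers = count, inliers
    # Stable final estimate: SVD refit on the inliers (smallest singular
    # direction of the centered points is the plane normal).
    inlier_pts = points[best_inliers]
    centroid = inlier_pts.mean(axis=0)
    _, _, vt = np.linalg.svd(inlier_pts - centroid)
    normal = vt[-1]
    return normal, -normal @ centroid, best_inliers
```

This backs the "~50 lines of numpy" estimate: the whole fit is one loop plus an SVD refit.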

Open3D approach:

  • Overkill for plane fitting + rotation correction
  • Would be useful if we needed dense ICP between overlapping point clouds
  • 500MB dep for what amounts to ~50 lines of numpy code
  • Verdict: Not needed for the initial version

Decision: Despite the verdict above, use Open3D for point cloud operations — the user wants it available for future work (ICP, dense registration). Also add h5py for HDF5 depth map persistence.

Confirmed Technical Choices

  • Library: Open3D (RANSAC plane segmentation, ICP if needed, point cloud ops)
  • Depth save format: HDF5 via h5py (structured, metadata-rich, one file per camera)
  • Visualization: Plotly HTML (interactive 3D — floor points per camera, consensus plane, before/after)
  • Integration: Standalone post-processing CLI tool (click-based, like existing tools)
  • Implementation split: numpy/scipy for math, Open3D for geometry, following existing alignment.py patterns
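The one-file-per-camera HDF5 layout could look like the sketch below. Dataset and attribute names here are illustrative, not the project's confirmed schema:

```python
# Per-camera HDF5 persistence: pooled depth map, raw best-scored frames,
# and camera metadata as file attributes.
import h5py
import numpy as np

def save_depth(path, pooled, raw_frames, *, serial, fx, fy, cx, cy):
    with h5py.File(path, "w") as f:
        f.attrs["camera_serial"] = serial
        f.attrs["intrinsics"] = [fx, fy, cx, cy]
        # gzip keeps float32 depth maps compact; raw frames stacked (N, H, W).
        f.create_dataset("pooled_depth", data=pooled, compression="gzip")
        f.create_dataset("raw_depth_frames", data=raw_frames, compression="gzip")

def load_depth(path):
    with h5py.File(path, "r") as f:
        return f["pooled_depth"][()], f["raw_depth_frames"][()], dict(f.attrs)
```

Keeping the raw frames alongside the pooled result is what lets pooling parameters be re-tuned later without re-reading SVOs.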

Algorithm (confirmed via research + codebase analysis)

  1. Load existing extrinsics JSON + saved depth maps (HDF5)
  2. Per camera: unproject depth → world-coord point cloud using extrinsics
  3. Per camera: Open3D RANSAC plane segmentation → extract floor points
  4. Consensus: fit a single plane to ALL floor points from all cameras
  5. Compute correction rotation: align consensus plane normal to [0, -1, 0]
  6. Apply correction to all extrinsics (global rotation, like current alignment.py)
  7. Optionally: per-camera ICP refinement on overlapping floor regions
  8. Save corrected extrinsics JSON + generate diagnostic Plotly visualization

Final Decisions (all confirmed)

  • Depth save trigger: --save-depth <dir> flag in calibrate_extrinsics.py
  • Refinement granularity: Per-camera refinement (each camera corrected from its own floor observations)
  • Test strategy: TDD — write tests first, following existing test patterns in tests/

Scope Boundaries

  • INCLUDE: Depth map saving (HDF5), ground plane detection per camera, consensus plane fitting, per-camera extrinsic correction
  • INCLUDE: Standalone post-processing CLI tool (refine_ground_plane.py)
  • INCLUDE: Plotly diagnostic visualization
  • INCLUDE: TDD with pytest
  • INCLUDE: New deps: open3d, h5py
  • EXCLUDE: Modifying the core ArUco detection or PnP pipeline
  • EXCLUDE: Real-time / streaming refinement
  • EXCLUDE: Non-flat floor handling (ramps, stairs)
  • EXCLUDE: Dense multi-view reconstruction beyond floor plane