feat: implement depth bias estimation and correction in ICP pipeline

2026-02-11 14:11:40 +00:00
parent 29eec81ea0
commit 8c6087683f
11 changed files with 1506 additions and 30 deletions
@@ -0,0 +1,820 @@
# Per-Camera Depth Bias Correction
## TL;DR
> **Quick Summary**: Add automatic per-camera depth offset estimation and correction as a pre-pass within the ICP pipeline, eliminating the 0.038m–0.137m cross-camera depth biases that prevent ICP from converging within safety gates.
>
> **Deliverables**:
> - `estimate_depth_biases()` function in `aruco/icp_registration.py`
> - Bias application integrated into `refine_with_icp()` before unprojection
> - `--icp-depth-bias/--no-icp-depth-bias` CLI flag in `refine_ground_plane.py`
> - Comprehensive tests in `tests/test_depth_bias.py`
> - Updated documentation in `README.md`
>
> **Estimated Effort**: Medium
> **Parallel Execution**: YES - 3 waves
> **Critical Path**: Task 1 → Task 2 → Task 3 → Task 4 → Task 5 → Task 7
---
## Context
### Original Request
Implement per-camera depth bias correction as the recommended next remediation step from the ICP depth bias diagnosis. The diagnosis (documented in `docs/icp-depth-bias-diagnosis.md`) confirmed that systematic cross-camera depth biases (up to 13.7cm) are the primary blocker for ICP convergence, not overlap or planarity.
### Interview Summary
**Key Discussions**:
- **Integration style**: Automatic pre-pass within `refine_ground_plane.py --icp` (no separate CLI command)
- **Correction model**: Offset-only (β) first — z' = z + β per camera
- **Estimation method**: Full overlap-region signed residuals (KDTree correspondences), not just floor-plane d-differences
- **Test strategy**: Tests after implementation
**Research Findings**:
- **Librarian**: Affine (α·z+β) is production standard, but offset-only is appropriate for 4 cameras and known-small-range scenes
- **Librarian**: Per-camera (N-1 params) preferred over per-pair (N²) for global loop-closure consistency
- **Explore**: Insertion point is `icp_registration.py:569` — before `unproject_depth_to_points()` in `refine_with_icp()`
- **Explore**: Depth is stored as float32 meters (Z along camera optical axis) in HDF5. Units confirmed via `depth_save.py` schema.
- **Data**: Camera `44435674` is worst outlier; `41831756-44289123` is best-agreeing pair (0.038m bias)
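
Because depth is stored as Z along the optical axis, the offset model z' = z + β moves each unprojected point along its own viewing ray by β / cos θ rather than by β, where θ is the angle between that ray and the optical axis. The following is a minimal pinhole sketch of this relationship. The helper name, the intrinsics, and the pixel coordinates are invented for illustration and are not taken from the repository.

```python
import numpy as np


def ray_shift_for_depth_offset(u, v, z, beta, fx, fy, cx, cy):
    """Return (shift along the viewing ray, cos(theta)) for z' = z + beta.

    Hypothetical helper using a plain pinhole model; not part of the codebase.
    """
    # A pixel at depth z unprojects to z * direction in the camera frame.
    direction = np.array([(u - cx) / fx, (v - cy) / fy, 1.0])
    ray = direction / np.linalg.norm(direction)  # unit viewing ray
    p_before = z * direction
    p_after = (z + beta) * direction
    shift = float((p_after - p_before) @ ray)
    return shift, float(ray[2])


# Made-up intrinsics and an off-axis pixel, for illustration only.
shift, cos_theta = ray_shift_for_depth_offset(
    u=600.0, v=400.0, z=2.0, beta=0.05, fx=500.0, fy=500.0, cx=320.0, cy=240.0
)
```

Near the image centre cos θ is close to 1 and the two quantities agree. Toward the image corners a residual measured along the ray is somewhat larger than β, which may be worth keeping in mind when comparing estimates against the diagnosed values.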
### Metis Review
**Identified Gaps** (addressed):
- **Camera-ray scalar**: Residuals must be projected onto source camera ray direction (not arbitrary world normal) since β shifts depth along the optical axis. Plan uses ray-projected signed residuals.
- **NaN/depth-zero clamping**: After applying β, values ≤ 0 must be masked to NaN. Added to acceptance criteria.
- **Disconnected overlap graph**: Cameras without sufficient overlap to reference get β=0 (safe fallback). Added explicit handling.
- **Minimum sample thresholds**: Pairs with <100 valid correspondences are excluded from the global solve. Added gating.
- **Toggle isolation**: `--no-icp-depth-bias` must skip estimation entirely and produce identical output to current code. Added test.
- **Sign convention**: Deterministic synthetic test with known sign required. Added.
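
The ray-projected residual and the minimum-sample gate described above can be sketched on synthetic data as follows. This is an illustration under stated assumptions, not the repository's implementation: `pairwise_ray_bias` is a hypothetical helper, and `scipy.spatial.cKDTree` is assumed as the KD-tree backend (the plan does not say which library provides it).

```python
import numpy as np
from scipy.spatial import cKDTree


def pairwise_ray_bias(src_pts, src_rays, tgt_pts, max_dist, min_pairs=100):
    """Median of (tgt - src) . ray_src over nearest-neighbour matches.

    Returns None when fewer than `min_pairs` matches fall within `max_dist`.
    Hypothetical helper; names and signature are illustrative only.
    """
    tree = cKDTree(tgt_pts)
    dist, idx = tree.query(src_pts, k=1, distance_upper_bound=max_dist)
    valid = np.isfinite(dist)  # unmatched queries come back with dist == inf
    if int(valid.sum()) < min_pairs:
        return None
    delta = tgt_pts[idx[valid]] - src_pts[valid]
    residuals = np.einsum("ij,ij->i", delta, src_rays[valid])  # row-wise dot
    return float(np.median(residuals))


# Synthetic scene: a flat wall at z = 2 m seen from a camera at the origin.
gx, gy = np.meshgrid(np.linspace(-0.5, 0.5, 101), np.linspace(-0.5, 0.5, 101))
tgt_pts = np.column_stack([gx.ravel(), gy.ravel(), np.full(gx.size, 2.0)])

sx, sy = np.meshgrid(np.linspace(-0.3, 0.3, 40), np.linspace(-0.3, 0.3, 40))
true_pts = np.column_stack([sx.ravel(), sy.ravel(), np.full(sx.size, 2.0)])
src_rays = true_pts / np.linalg.norm(true_pts, axis=1, keepdims=True)
src_pts = true_pts - 0.05 * src_rays  # source reads 5 cm short along each ray

bias = pairwise_ray_bias(src_pts, src_rays, tgt_pts, max_dist=0.1)
gated = pairwise_ray_bias(src_pts[:50], src_rays[:50], tgt_pts, max_dist=0.1)
```

Here `bias` should come out close to 0.05 m with a positive sign, and `gated` is `None` because only 50 source points are supplied. Nearest-neighbour matching pairs each source point with the closest target point rather than the one along its ray, so the recovered value is an approximation of the injected offset.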
---
## Work Objectives
### Core Objective
Estimate and correct per-camera depth offsets so that overlapping point clouds from different cameras agree on surface positions, enabling ICP to converge within existing safety gates.
### Concrete Deliverables
- New function `estimate_depth_biases()` in `aruco/icp_registration.py`
- Modified `refine_with_icp()` with bias pre-pass
- New CLI flag `--icp-depth-bias/--no-icp-depth-bias` in `refine_ground_plane.py`
- New test file `tests/test_depth_bias.py`
- Updated `README.md` documentation
### Definition of Done
- [x] All pairwise median biases reduce by >50% after correction (measured on real data)
- [x] ICP accepts ≥1 non-reference camera update (currently 0)
- [x] `uv run pytest` passes (all existing + new tests)
- [x] `uv run basedpyright` produces no new errors
- [x] `--no-icp-depth-bias` produces identical output to current code
### Must Have
- Per-camera offset estimation from overlap correspondences
- Robust median aggregation (insensitive to 30% outliers)
- Reference camera fixed at β=0 (gauge freedom)
- Minimum correspondence count gating per pair
- NaN/invalid depth handling after bias application
- CLI toggle (on by default when `--icp` is used)
- Logging of estimated biases per camera
### Must NOT Have (Guardrails)
- NO affine model (α·z+β) — defer to future iteration
- NO per-pixel bias maps — single scalar per camera
- NO new persistent config files for biases — runtime-only estimation
- NO changes to ICP convergence criteria or safety gate thresholds
- NO weakening of existing acceptance gates to "make it pass"
- NO over-engineered normal estimation pipelines — use existing normals or camera-ray direction
- NO temporal drift compensation
- NO changes to the depth HDF5 schema
---
## Verification Strategy (MANDATORY)
> **UNIVERSAL RULE: ZERO HUMAN INTERVENTION**
>
> ALL tasks in this plan MUST be verifiable WITHOUT any human action.
### Test Decision
- **Infrastructure exists**: YES
- **Automated tests**: YES (tests after implementation)
- **Framework**: pytest + numpy assertions (existing)
### Agent-Executed QA Scenarios (MANDATORY — ALL tasks)
> Every task includes Agent-Executed QA Scenarios as the PRIMARY verification method.
> The executing agent directly runs the deliverable and verifies it.
**Verification Tool by Deliverable Type:**
| Type | Tool | How Agent Verifies |
|------|------|-------------------|
| **Python module** | Bash (uv run pytest) | Run targeted tests, assert pass |
| **CLI integration** | Bash (uv run refine_ground_plane.py) | Run with flags, check output JSON |
| **Type safety** | Bash (uv run basedpyright) | Run type checker, count new errors |
---
## Execution Strategy
### Parallel Execution Waves
```
Wave 1 (Start Immediately):
├── Task 1: Implement estimate_depth_biases() function
└── (sequential dependency chain follows)
Wave 2 (After Task 4):
├── Task 5: Write tests (tests/test_depth_bias.py)
└── Task 6: Update documentation
Wave 3 (After Tasks 5 and 6):
└── Task 7: Final verification pass
```
### Dependency Matrix
| Task | Depends On | Blocks | Can Parallelize With |
|------|------------|--------|---------------------|
| 1 | None | 2, 3, 5 | None |
| 2 | 1 | 3 | None |
| 3 | 2 | 4 | None |
| 4 | 3 | 5, 6 | None |
| 5 | 4 | 7 | 6 |
| 6 | 4 | 7 | 5 |
| 7 | 5, 6 | None | None |
### Agent Dispatch Summary
| Wave | Tasks | Recommended Agents |
|------|-------|-------------------|
| 1 | 1, 2, 3, 4 (sequential chain) | task(category="deep", load_skills=[], run_in_background=false) |
| 2 | 5, 6 | task(category="quick", ...) in parallel |
| 3 | 7 | task(category="quick", ...) final verification |
---
## TODOs
- [x] 1. Implement `estimate_depth_biases()` function
**What to do**:
- Add `estimate_depth_biases()` to `aruco/icp_registration.py`
- Function signature:
```python
def estimate_depth_biases(
    camera_data: Dict[str, Dict[str, Any]],
    extrinsics: Dict[str, Mat44],
    floor_planes: Dict[str, FloorPlane],
    config: ICPConfig,
    reference_serial: Optional[str] = None,
) -> Dict[str, float]:
```
- Algorithm:
1. For each camera: unproject depth to world points (reuse existing `unproject_depth_to_points` + extrinsics transform, stride=4)
2. Also compute camera-ray directions in world: `ray_dir_world = R @ ray_dir_cam` where `ray_dir_cam = normalize([x_cam, y_cam, z_cam])`
3. For each overlapping pair (i, j): use `compute_overlap_xz` or `compute_overlap_3d` (match the config.overlap_mode setting) to check overlap
4. Build KDTree on target cloud, find nearest neighbors for source points within `3 * config.voxel_size` distance
5. For each correspondence (src_k, tgt_k): compute signed residual projected onto source camera ray: `β_k = (tgt_k - src_k) · ray_dir_src_k`
6. Take robust median of β_k values per pair → `pairwise_bias[(i,j)]`
7. Gate: reject pairs with <100 valid correspondences
8. Solve global system: for each pair (i,j) with median bias `b_ij`, the relationship is `β_j - β_i ≈ b_ij`. Fix reference camera β_ref = 0. Solve via `np.linalg.lstsq` for N-1 unknowns.
9. Cap |β| at a configurable maximum (default: 0.3m) — reject implausible biases
10. For cameras disconnected from reference in the overlap graph, set β=0 (safe fallback)
11. Log all estimated biases: `logger.info(f"Depth bias for {serial}: {bias:.4f}m")`
- Return: `Dict[str, float]` mapping serial number → bias offset in meters
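
Steps 8 to 10 (global solve, cap, disconnected fallback) can be sketched in isolation as below. `solve_camera_biases` is a hypothetical helper, not the function the plan specifies. It uses the relationship exactly as written in step 8 (`β_j - β_i ≈ b_ij`) and treats "cap" as "reset to 0 when implausible", which is only one possible reading of step 9.

```python
import numpy as np


def solve_camera_biases(serials, pairwise_bias, reference, max_abs_bias=0.3):
    """Least-squares solve of beta_j - beta_i ~= b_ij with beta[reference] = 0.

    Hypothetical helper. Cameras with no path to the reference keep beta = 0,
    and any |beta| above `max_abs_bias` is reset to 0 (an assumed policy).
    """
    # Walk the overlap graph to find cameras reachable from the reference.
    neighbours = {s: set() for s in serials}
    for i, j in pairwise_bias:
        neighbours[i].add(j)
        neighbours[j].add(i)
    reachable, stack = {reference}, [reference]
    while stack:
        for nxt in neighbours[stack.pop()] - reachable:
            reachable.add(nxt)
            stack.append(nxt)

    unknowns = [s for s in serials if s in reachable and s != reference]
    column = {s: k for k, s in enumerate(unknowns)}
    rows, rhs = [], []
    for (i, j), b_ij in pairwise_bias.items():
        if i not in reachable:  # both ends share a component, one check suffices
            continue
        row = np.zeros(len(unknowns))
        if j in column:
            row[column[j]] += 1.0
        if i in column:
            row[column[i]] -= 1.0
        rows.append(row)
        rhs.append(b_ij)

    biases = {s: 0.0 for s in serials}
    if unknowns:
        solution, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
        for s in unknowns:
            value = float(solution[column[s]])
            biases[s] = value if abs(value) <= max_abs_bias else 0.0
    return biases


# Synthetic check: four connected cameras plus two that only see each other.
truth = {"A": 0.0, "B": 0.05, "C": 0.12, "D": -0.03}
pairs = {(i, j): truth[j] - truth[i] for i in truth for j in truth if i < j}
pairs[("E", "F")] = 0.02  # no path to the reference camera "A"
result = solve_camera_biases(["A", "B", "C", "D", "E", "F"], pairs, reference="A")
```

With noise-free pairwise medians the solve recovers the synthetic offsets, and cameras `E` and `F` keep β = 0 because no chain of overlapping pairs links them to the reference. Dropping the reference column from the design matrix is what fixes the gauge. Without it the system would be rank-deficient, since adding a constant to every β leaves all differences unchanged.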
**Must NOT do**:
- Do NOT implement affine (scale+offset) — offset only
- Do NOT modify `unproject_depth_to_points()` itself
- Do NOT persist biases to disk
- Do NOT use floor-plane d-differences as the primary estimation (only full overlap residuals)
**Recommended Agent Profile**:
- **Category**: `deep`
- Reason: Core algorithmic work requiring careful math (ray projection, global solve, robust statistics)
- **Skills**: `[]`
- No special skills needed — pure Python/NumPy/Open3D work
- **Skills Evaluated but Omitted**:
- `playwright`: No browser interaction
- `frontend-ui-ux`: No UI work
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Sequential (first task)
- **Blocks**: Tasks 2, 3, 5
- **Blocked By**: None
**References** (CRITICAL):
**Pattern References** (existing code to follow):
- `aruco/icp_registration.py:562-598` — Point cloud creation loop in `refine_with_icp()`. Shows how to iterate cameras, unproject, transform to world. The new function should follow the EXACT same unprojection + world transform pattern.
- `aruco/icp_registration.py:603-622` — Overlap checking loop. Shows how pairs are enumerated and overlap is computed. Reuse same logic for bias estimation pairs.
- `aruco/icp_registration.py:75-87` — `preprocess_point_cloud()` for SOR + voxel downsampling pattern
- `aruco/icp_registration.py:240-290` — `compute_overlap_xz()` and `compute_overlap_3d()` implementations
**API/Type References** (contracts to implement against):
- `aruco/icp_registration.py:20-48` — `ICPConfig` dataclass. The new function should use config fields like `voxel_size`, `overlap_margin`, `min_overlap_area`, `overlap_mode`.
- `aruco/ground_plane.py:20-23` — `FloorPlane` dataclass (normal, d, num_inliers)
- `aruco/ground_plane.py:71-111` — `unproject_depth_to_points()` — input/output contract: depth_map (H,W) float32 meters + K (3,3) → points (N,3) float64 in camera frame
**Documentation References**:
- `docs/icp-depth-bias-diagnosis.md` — Full diagnosis with measured bias values. The estimated biases should be in the same ballpark (0.038m–0.137m between pairs).
**WHY Each Reference Matters**:
- Lines 562-598: MUST match the same unprojection and world-transform code exactly so that bias estimation and bias application see the same point clouds
- Lines 603-622: Reuse overlap logic to ensure bias is estimated only for pairs that will actually be registered by ICP
- FloorPlane/ICPConfig: Must use same config parameters to avoid inconsistency between estimation and registration
**Acceptance Criteria**:
> **AGENT-EXECUTABLE VERIFICATION ONLY**
- [ ] Function `estimate_depth_biases` exists and is importable: `python -c "from aruco.icp_registration import estimate_depth_biases"`
- [ ] Function returns `Dict[str, float]` type
- [ ] Reference camera has bias exactly 0.0
- [ ] `uv run basedpyright aruco/icp_registration.py` — no new type errors introduced
**Agent-Executed QA Scenarios:**
```
Scenario: Function is importable and callable
Tool: Bash
Preconditions: None
Steps:
1. uv run python -c "from aruco.icp_registration import estimate_depth_biases; print('OK')"
2. Assert: stdout contains "OK"
3. Assert: exit code 0
Expected Result: Function imports successfully
Evidence: Terminal output captured
Scenario: Type check passes
Tool: Bash
Preconditions: Implementation complete
Steps:
1. uv run basedpyright aruco/icp_registration.py 2>&1 | grep -c "error" || true
2. Compare error count with baseline (before changes)
Expected Result: No new type errors
Evidence: basedpyright output captured
```
**Commit**: YES
- Message: `feat(icp): add per-camera depth bias estimation function`
- Files: `aruco/icp_registration.py`
- Pre-commit: `uv run basedpyright aruco/icp_registration.py`
---
- [x] 2. Integrate bias correction into `refine_with_icp()`
**What to do**:
- At the top of `refine_with_icp()`, after the serials/reference camera setup but BEFORE the point cloud creation loop:
1. Call `estimate_depth_biases(camera_data, extrinsics, floor_planes, config)` to get biases
2. Log estimated biases
- In the existing point cloud creation loop (line ~562-598), BEFORE the `unproject_depth_to_points()` call:
1. Copy the depth map: `depth_corrected = data["depth"].copy()`
2. Apply bias: `depth_corrected += biases.get(serial, 0.0)`
3. Clamp invalid values: `depth_corrected[depth_corrected <= 0] = np.nan`
4. Pass `depth_corrected` (not `data["depth"]`) to `unproject_depth_to_points()`
- Add `depth_bias: bool = True` field to `ICPConfig` dataclass (default True)
- Gate the bias estimation: `if config.depth_bias: ... else: biases = {}`
- Store estimated biases in `ICPMetrics` for downstream reporting:
- Add field `depth_biases: Dict[str, float] = field(default_factory=dict)` to `ICPMetrics`
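
The copy, shift, and clamp steps listed above can be exercised in isolation with a small helper. `apply_depth_bias` is a hypothetical name used only for this sketch. The second mask, on the original map, is an extra guard that the plan does not call for: it assumes invalid pixels may be encoded as 0 rather than NaN, in which case a positive β would otherwise turn them into small positive depths.

```python
import numpy as np


def apply_depth_bias(depth, bias):
    """Return a corrected copy of `depth` (float, metres); the input is untouched."""
    corrected = depth.copy()
    corrected += bias
    with np.errstate(invalid="ignore"):  # comparisons against NaN are simply False
        corrected[corrected <= 0] = np.nan
        # Extra guard, not in the plan: pixels that were already non-positive
        # (e.g. 0 used as "no data") stay invalid even after a positive bias.
        corrected[depth <= 0] = np.nan
    return corrected


depth = np.array([[1.0, 0.02], [np.nan, 0.0]], dtype=np.float32)
shifted_down = apply_depth_bias(depth, -0.05)  # 0.02 m pixel drops below zero
shifted_up = apply_depth_bias(depth, 0.05)     # 0.0 pixel must not become 0.05
```

NaN regions stay NaN in both directions because NaN plus any offset is NaN and NaN never satisfies `<= 0`.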
**Must NOT do**:
- Do NOT modify `unproject_depth_to_points()` signature or behavior
- Do NOT change depth_map in-place (always `.copy()` first)
- Do NOT change any ICP parameters, gates, or thresholds
- Do NOT apply bias correction to the ground-plane refinement step (only ICP)
**Recommended Agent Profile**:
- **Category**: `quick`
- Reason: Straightforward integration — calling existing function, applying simple arithmetic, gating with a bool
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Sequential (after Task 1)
- **Blocks**: Task 3
- **Blocked By**: Task 1
**References**:
**Pattern References**:
- `aruco/icp_registration.py:540-598` — The `refine_with_icp()` function. Specifically the loop starting at line 562 where `data["depth"]` is accessed. This is where bias must be inserted.
- `aruco/icp_registration.py:20-48` — `ICPConfig` dataclass — add `depth_bias: bool = True` field here
- `aruco/icp_registration.py:61-73` — `ICPMetrics` dataclass — add `depth_biases: Dict[str, float]` field here
**WHY Each Reference Matters**:
- Line 569: The EXACT line where `data["depth"]` is passed to unprojection — this is where we insert `depth_corrected`
- ICPConfig/ICPMetrics: Must extend these dataclasses consistently with existing field patterns
**Acceptance Criteria**:
- [ ] `ICPConfig` has `depth_bias: bool` field with default `True`
- [ ] `ICPMetrics` has `depth_biases: Dict[str, float]` field
- [ ] When `config.depth_bias=True`, biases are estimated and applied before unprojection
- [ ] When `config.depth_bias=False`, no bias estimation occurs, depth maps are unmodified
- [ ] Original `data["depth"]` is never modified in-place (uses `.copy()`)
- [ ] Depth values ≤ 0 after bias application are set to NaN
**Agent-Executed QA Scenarios:**
```
Scenario: Bias field exists in ICPConfig
Tool: Bash
Preconditions: Task 1 and 2 complete
Steps:
1. uv run python -c "from aruco.icp_registration import ICPConfig; c = ICPConfig(); print(c.depth_bias)"
2. Assert: stdout contains "True"
Expected Result: Default is True
Evidence: Terminal output
Scenario: Biases stored in metrics
Tool: Bash
Preconditions: Task 2 complete
Steps:
1. uv run python -c "from aruco.icp_registration import ICPMetrics; m = ICPMetrics(); print(type(m.depth_biases))"
2. Assert: stdout contains "dict"
Expected Result: Field exists and is a dict
Evidence: Terminal output
```
**Commit**: YES
- Message: `feat(icp): integrate depth bias correction into refine_with_icp pipeline`
- Files: `aruco/icp_registration.py`
- Pre-commit: `uv run basedpyright aruco/icp_registration.py`
---
- [x] 3. Wire CLI flag in `refine_ground_plane.py`
**What to do**:
- Add Click option:
```python
@click.option(
    "--icp-depth-bias/--no-icp-depth-bias",
    default=True,
    help="Estimate and correct per-camera depth biases before ICP registration.",
)
```
- Add `icp_depth_bias: bool` parameter to `main()` function
- Pass to ICPConfig: `depth_bias=icp_depth_bias`
- After ICP runs, log bias results from `icp_metrics.depth_biases` if available
- Add bias info to the per-camera diagnostics JSON output (existing pattern at lines 301-320)
**Must NOT do**:
- Do NOT add a separate bias estimation CLI command
- Do NOT add a --depth-biases-file input option (runtime-only)
- Do NOT change existing CLI flag defaults
**Recommended Agent Profile**:
- **Category**: `quick`
- Reason: Mechanical wiring — adding a click.option and passing it through
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Sequential (after Task 2)
- **Blocks**: Task 4
- **Blocked By**: Task 2
**References**:
**Pattern References**:
- `refine_ground_plane.py:89-155` — Existing ICP CLI flags pattern. The new flag should follow the exact same `--icp-*` naming convention and be placed after the existing ICP options.
- `refine_ground_plane.py:270-281` — Where `ICPConfig` is constructed. Add `depth_bias=icp_depth_bias` here.
- `refine_ground_plane.py:290-296` — Where `icp_metrics` is logged. Add bias logging here.
- `refine_ground_plane.py:301-320` — Per-camera diagnostics JSON output. Add bias values here.
**WHY Each Reference Matters**:
- Lines 89-155: Must match naming pattern (`--icp-depth-bias` not `--depth-bias`)
- Lines 270-281: ICPConfig constructor — must add the new field here
- Lines 290-320: Existing logging/output patterns to extend, not reinvent
**Acceptance Criteria**:
- [ ] `--icp-depth-bias` flag exists and defaults to True
- [ ] `--no-icp-depth-bias` disables bias correction
- [ ] `uv run refine_ground_plane.py --help` shows the new flag
- [ ] Bias values appear in output JSON diagnostics when bias correction runs
**Agent-Executed QA Scenarios:**
```
Scenario: CLI flag appears in help
Tool: Bash
Preconditions: Task 3 complete
Steps:
1. uv run python refine_ground_plane.py --help
2. Assert: output contains "--icp-depth-bias"
3. Assert: output contains "--no-icp-depth-bias"
4. Assert: output contains "depth biases"
Expected Result: Flag documented in help
Evidence: Help output captured
```
**Commit**: YES
- Message: `feat(cli): add --icp-depth-bias flag to refine_ground_plane`
- Files: `refine_ground_plane.py`
- Pre-commit: `uv run basedpyright refine_ground_plane.py`
---
- [x] 4. End-to-end validation on real data
**What to do**:
- Run the full pipeline with bias correction enabled:
```bash
uv run refine_ground_plane.py \
--input-extrinsics output/extrinsics.json \
--input-depth output/depth_data.h5 \
--output-extrinsics output/extrinsics_bias_corrected.json \
--icp --icp-region hybrid --icp-depth-bias --debug
```
- Verify from logs:
1. Estimated biases are logged for each camera
2. Bias magnitudes are in the expected range (0.03m–0.15m for non-reference cameras)
3. ICP fitness/RMSE metrics improve compared to `--no-icp-depth-bias` run
4. At least 1 non-reference camera is accepted (currently 0 without bias correction)
- Run comparison without bias correction:
```bash
uv run refine_ground_plane.py \
--input-extrinsics output/extrinsics.json \
--input-depth output/depth_data.h5 \
--output-extrinsics output/extrinsics_no_bias.json \
--icp --icp-region hybrid --no-icp-depth-bias --debug
```
- Compare outputs: the bias-corrected run should show lower residuals and more accepted cameras
- If bias correction does NOT improve acceptance, log the diagnostic info and investigate:
- Are estimated biases in reasonable range?
- Are ICP fitness scores higher with bias correction?
- Are safety gates still too tight?
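
The log checks above could be automated with a small parser. This sketch assumes the log line format given in Task 1 (`Depth bias for {serial}: {bias:.4f}m`). The helper names, the placeholder serials, and the sample values are invented for illustration and are not real measurements.

```python
import re

# Matches the message format from Task 1, wherever it appears in a log line.
BIAS_LINE = re.compile(r"Depth bias for (\S+): (-?\d+(?:\.\d+)?)m")


def parse_bias_log(text):
    """Return {serial: bias_in_metres} for every matching log line."""
    return {serial: float(value) for serial, value in BIAS_LINE.findall(text)}


def implausible_cameras(biases, reference, low=0.03, high=0.15):
    """Non-reference cameras whose |bias| falls outside [low, high] metres."""
    return sorted(
        serial
        for serial, bias in biases.items()
        if serial != reference and not (low <= abs(bias) <= high)
    )


# Invented sample log, for illustration only.
sample_log = """\
INFO Depth bias for cam_ref: 0.0000m
INFO Depth bias for cam_a: 0.0412m
INFO Depth bias for cam_b: -0.2000m
"""
parsed = parse_bias_log(sample_log)
flagged = implausible_cameras(parsed, reference="cam_ref")
```

Run against `/tmp/bias_on.log`, a non-empty `flagged` list would point at cameras worth investigating before looking at the safety gates.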
**Must NOT do**:
- Do NOT relax safety gate thresholds to force acceptance
- Do NOT modify any code in this task — this is validation only
- Do NOT declare failure if improvement is partial — any improvement validates the approach
**Recommended Agent Profile**:
- **Category**: `deep`
- Reason: Requires careful analysis of log output and comparison between runs
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Sequential (after Task 3)
- **Blocks**: Tasks 5, 6
- **Blocked By**: Task 3
**References**:
**Pattern References**:
- `README.md` — "Ground Plane Refinement" section shows the canonical e2e command
- `docs/icp-depth-bias-diagnosis.md` — Baseline bias measurements (0.038m–0.137m) to compare against
**Data References**:
- `output/extrinsics.json` — Input extrinsics
- `output/depth_data.h5` — Input depth data
**WHY Each Reference Matters**:
- README: Canonical command format for the pipeline
- Diagnosis doc: Baseline numbers — estimated biases should roughly match diagnosed biases
**Acceptance Criteria**:
- [ ] Pipeline completes without errors with `--icp-depth-bias`
- [ ] Estimated biases are logged for each camera
- [ ] Bias magnitudes are plausible (0.01m–0.20m for non-reference cameras)
- [ ] Camera `44435674` shows the largest bias (consistent with diagnosis)
- [ ] ICP fitness scores are ≥ as good as without bias correction
- [ ] `num_cameras_optimized` ≥ 1 (improvement over current 0)
**Agent-Executed QA Scenarios:**
```
Scenario: E2E with bias correction enabled
Tool: Bash
Preconditions: Tasks 1-3 complete, output/extrinsics.json and output/depth_data.h5 exist
Steps:
1. uv run refine_ground_plane.py \
--input-extrinsics output/extrinsics.json \
--input-depth output/depth_data.h5 \
--output-extrinsics output/extrinsics_bias_corrected.json \
--icp --icp-region hybrid --icp-depth-bias --debug 2>&1 | tee /tmp/bias_on.log
2. Assert: exit code 0
3. grep "Depth bias for" /tmp/bias_on.log → Assert: at least 3 camera biases logged
4. grep "num_cameras_optimized" /tmp/bias_on.log or check output JSON
5. Assert: output file output/extrinsics_bias_corrected.json exists
Expected Result: Pipeline completes, biases estimated, at least partial ICP success
Evidence: /tmp/bias_on.log captured
Scenario: E2E comparison without bias correction
Tool: Bash
Preconditions: Same as above
Steps:
1. uv run refine_ground_plane.py \
--input-extrinsics output/extrinsics.json \
--input-depth output/depth_data.h5 \
--output-extrinsics output/extrinsics_no_bias.json \
--icp --icp-region hybrid --no-icp-depth-bias --debug 2>&1 | tee /tmp/bias_off.log
2. Assert: exit code 0
3. Compare acceptance counts between bias_on.log and bias_off.log
Expected Result: bias_on shows equal or better acceptance than bias_off
Evidence: /tmp/bias_off.log captured
Scenario: Toggle isolation — no-bias matches baseline
Tool: Bash
Preconditions: Current code has been committed before changes
Steps:
1. Compare output/extrinsics_no_bias.json with a pre-change baseline run
2. Assert: Pose matrices match within floating-point tolerance (1e-6)
Expected Result: --no-icp-depth-bias produces identical results to code before this feature
Evidence: Comparison output captured
```
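
The toggle-isolation comparison can be scripted without knowing the exact layout of the extrinsics JSON. The sketch below walks any parsed JSON document, collects every 4x4 numeric nested list it finds, and compares two documents at the 1e-6 tolerance from the scenario. The helper names and the `{"cam_a": {"pose": ...}}` layout in the demo are assumptions for illustration, not the pipeline's actual schema.

```python
import json

import numpy as np


def collect_pose_matrices(node, path=""):
    """Map a JSON path to every 4x4 numeric nested list found under `node`."""
    found = {}
    if isinstance(node, dict):
        for key, child in node.items():
            found.update(collect_pose_matrices(child, f"{path}/{key}"))
    elif isinstance(node, list):
        try:
            arr = np.asarray(node, dtype=float)
        except (TypeError, ValueError):  # ragged or non-numeric list
            arr = None
        if arr is not None and arr.shape == (4, 4):
            found[path] = arr
        else:
            for k, child in enumerate(node):
                found.update(collect_pose_matrices(child, f"{path}/{k}"))
    return found


def poses_match(doc_a, doc_b, atol=1e-6):
    """True when both documents hold the same, non-empty set of matching poses."""
    a, b = collect_pose_matrices(doc_a), collect_pose_matrices(doc_b)
    return bool(a) and a.keys() == b.keys() and all(
        np.allclose(a[k], b[k], rtol=0.0, atol=atol) for k in a
    )


# Assumed layout, used only to exercise the helpers.
baseline = {"cam_a": {"pose": np.eye(4).tolist()}}
same = json.loads(json.dumps(baseline))
moved = json.loads(json.dumps(baseline))
moved["cam_a"]["pose"][0][3] += 1e-3  # 1 mm translation change
```

In practice the two arguments would come from `json.load` on `output/extrinsics_no_bias.json` and on the pre-change baseline file. Poses stored as flat 16-element lists would not be detected by this walker and would need a reshape first.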
**Commit**: NO (validation only — no code changes)
---
- [x] 5. Write tests (`tests/test_depth_bias.py`)
**What to do**:
- Create `tests/test_depth_bias.py` with the following test cases:
**A. Bias Estimation Math Tests:**
- `test_estimate_biases_two_cameras_known_offset`: Create two synthetic cameras with overlapping box point clouds. Camera B's depth is shifted by +0.05m. Assert `estimate_depth_biases` returns β_B ≈ 0.05m (±2mm) and β_ref = 0.0.
- `test_estimate_biases_sign_correctness`: Camera B depth shifted by -0.08m. Assert β_B ≈ -0.08m. Ensures sign convention is correct.
- `test_estimate_biases_four_cameras`: 4 synthetic cameras with known offsets [0, 0.05, 0.12, -0.03]. Assert all recovered within ±3mm.
**B. Global Solve Tests:**
- `test_bias_solve_overdetermined`: 4 cameras, 6 pairwise medians, solve N-1=3 unknowns. Assert least-squares solution matches known biases.
- `test_bias_solve_disconnected_camera`: One camera has no overlap with any other. Assert it gets β=0.
**C. Robustness Tests:**
- `test_estimate_biases_robust_to_outliers`: Inject 25% random outlier correspondences. Assert recovered bias within ±10mm of true value.
- `test_estimate_biases_min_correspondences`: Pair with only 50 correspondences (below 100 threshold). Assert pair is excluded from solve.
**D. Integration Tests:**
- `test_bias_application_preserves_nan`: Depth map with NaN regions. After bias application, NaN regions remain NaN.
- `test_bias_application_clamps_negative`: Depth map with values near 0. After applying negative bias, values ≤ 0 become NaN.
- `test_bias_toggle_off`: With `config.depth_bias=False`, assert `estimate_depth_biases` is not called and depth maps are unmodified (monkeypatch the function and assert not called).
- `test_refine_with_icp_with_bias_synthetic`: Extend existing synthetic ICP test to include a known depth offset, verify that bias correction improves ICP convergence.
**E. Type Safety:**
- `test_types_pass`: Run `basedpyright` and assert no new errors.
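
As a sanity check on the robustness target in group C, the following self-contained test shows why a median aggregate tolerates 25% outliers while a mean does not. It exercises only the aggregation step on synthetic residuals, not `estimate_depth_biases` itself. The test name, noise level, and outlier range are assumptions chosen for illustration.

```python
import numpy as np


def test_median_robust_to_one_sided_outliers():
    rng = np.random.default_rng(0)
    true_bias = 0.05
    # Inlier residuals: true bias plus 2 mm Gaussian noise.
    residuals = true_bias + rng.normal(0.0, 0.002, size=1000)
    # Replace 25% of them with gross outliers, all on the same side.
    outlier_idx = rng.choice(residuals.size, size=250, replace=False)
    residuals[outlier_idx] = rng.uniform(0.2, 0.5, size=250)

    # The median stays within the 10 mm tolerance used in the plan.
    assert abs(np.median(residuals) - true_bias) < 0.010
    # The mean is dragged well outside that tolerance.
    assert abs(np.mean(residuals) - true_bias) > 0.010
```

One-sided outliers are the harder case for a median, so passing here gives some margin over the symmetric random outliers described in `test_estimate_biases_robust_to_outliers`.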
**Must NOT do**:
- Do NOT require real camera data (all synthetic)
- Do NOT require network/hardware access
- Do NOT modify existing tests
- Do NOT create tests that depend on specific floating-point values (use tolerances)
**Recommended Agent Profile**:
- **Category**: `unspecified-high`
- Reason: Many test cases with careful synthetic data construction and assertion design
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 2 (with Task 6)
- **Blocks**: Task 7
- **Blocked By**: Task 4
**References**:
**Test References** (testing patterns to follow):
- `tests/test_icp_registration.py` — Shows how to create synthetic box point clouds with `create_box_pcd()`, mock `unproject_depth_to_points`, build `ICPConfig`, and test `refine_with_icp`. FOLLOW THIS PATTERN EXACTLY for test structure, fixtures, and assertion style.
- `tests/test_depth_refine.py` — Shows how to create constant depth maps (`np.full((H,W), Z)`), mock intrinsics, and test depth-based optimization. Use this pattern for bias application tests.
- `tests/test_ground_plane.py` — Shows FloorPlane fixtures and consensus plane testing patterns.
**API References**:
- `aruco/icp_registration.py:estimate_depth_biases` — Function under test (signature from Task 1)
- `aruco/icp_registration.py:ICPConfig` — Config object to construct for tests
- `aruco/icp_registration.py:ICPMetrics` — Metrics object to verify bias storage
**WHY Each Reference Matters**:
- `test_icp_registration.py`: CRITICAL — synthetic PCD creation patterns. Reuse `create_box_pcd`, `monkeypatch` patterns for `unproject_depth_to_points`
- `test_depth_refine.py`: Constant depth map creation pattern for testing bias application
- `test_ground_plane.py`: FloorPlane fixture construction for test setup
**Acceptance Criteria**:
- [ ] `tests/test_depth_bias.py` exists
- [ ] `uv run pytest tests/test_depth_bias.py -v` — all tests pass
- [ ] At least 10 test functions covering estimation, solve, robustness, integration, and toggle
- [ ] No test requires hardware or network access
**Agent-Executed QA Scenarios:**
```
Scenario: All bias tests pass
Tool: Bash
Preconditions: Tasks 1-4 complete, test file written
Steps:
1. uv run pytest tests/test_depth_bias.py -v
2. Assert: exit code 0
3. Assert: output shows ≥10 passed tests
4. Assert: output shows 0 failed tests
Expected Result: All tests green
Evidence: pytest output captured
Scenario: Full test suite still passes
Tool: Bash
Preconditions: New tests written
Steps:
1. uv run pytest -x -q
2. Assert: exit code 0
3. Assert: no failures
Expected Result: No regressions
Evidence: pytest output captured
```
**Commit**: YES
- Message: `test(icp): add comprehensive depth bias correction tests`
- Files: `tests/test_depth_bias.py`
- Pre-commit: `uv run pytest tests/test_depth_bias.py -v`
---
- [x] 6. Update documentation
**What to do**:
- Update `README.md`:
- Add `--icp-depth-bias` to the Options section under "Ground Plane Refinement"
- Add a brief explanation: "Automatically estimates and corrects per-camera depth biases before ICP registration. Enabled by default when --icp is used."
- Add usage example with bias correction
- Update `docs/icp-depth-bias-diagnosis.md`:
- Add a "Remediation Applied" section documenting that offset correction was implemented
- Record post-correction bias measurements (from Task 4 results)
**Must NOT do**:
- Do NOT create new documentation files
- Do NOT add verbose implementation details to README (keep it user-facing)
**Recommended Agent Profile**:
- **Category**: `quick`
- Reason: Documentation updates — straightforward text edits
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 2 (with Task 5)
- **Blocks**: Task 7
- **Blocked By**: Task 4
**References**:
**Documentation References**:
- `README.md` — "Ground Plane Refinement" section, specifically the Options list starting around the `--icp` description
- `docs/icp-depth-bias-diagnosis.md` — Existing diagnosis document to update with remediation results
**WHY Each Reference Matters**:
- README: Users need to know the new flag exists and what it does
- Diagnosis doc: Closes the loop on the diagnosis by documenting the fix and its effectiveness
**Acceptance Criteria**:
- [ ] README mentions `--icp-depth-bias` flag
- [ ] README has usage example with bias correction
- [ ] Diagnosis doc has "Remediation Applied" section
**Agent-Executed QA Scenarios:**
```
Scenario: README contains new flag documentation
Tool: Bash
Preconditions: Task 6 complete
Steps:
1. grep -c "icp-depth-bias" README.md
2. Assert: count ≥ 2 (flag name + description)
Expected Result: Flag is documented
Evidence: grep output captured
```
**Commit**: YES
- Message: `docs: document per-camera depth bias correction feature`
- Files: `README.md`, `docs/icp-depth-bias-diagnosis.md`
---
- [x] 7. Final verification pass
**What to do**:
- Run full test suite: `uv run pytest -x -v`
- Run type checker: `uv run basedpyright`
- Run e2e one final time with bias correction to confirm nothing regressed:
```bash
uv run refine_ground_plane.py \
--input-extrinsics output/extrinsics.json \
--input-depth output/depth_data.h5 \
--output-extrinsics output/extrinsics_final.json \
--icp --icp-region hybrid --icp-depth-bias
```
- Verify output extrinsics are valid (parseable JSON, 4x4 matrices)
**Must NOT do**:
- Do NOT make any code changes in this task
- Do NOT weaken any tests or gates
**Recommended Agent Profile**:
- **Category**: `quick`
- Reason: Pure verification — running existing commands and checking output
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Sequential (final task)
- **Blocks**: None
- **Blocked By**: Tasks 5, 6
**References**:
- `README.md` — Canonical e2e commands
- `pyproject.toml` — Test configuration
**Acceptance Criteria**:
- [ ] `uv run pytest -x -v` — all tests pass (0 failures)
- [ ] `uv run basedpyright` — no new errors beyond existing baseline
- [ ] E2E pipeline completes without errors
- [ ] Output JSON is valid and contains 4x4 pose matrices
**Agent-Executed QA Scenarios:**
```
Scenario: Full test suite passes
Tool: Bash
Steps:
1. uv run pytest -x -v
2. Assert: exit code 0
3. Count total tests passed
Expected Result: All tests pass
Evidence: pytest output captured
Scenario: Type checker passes
Tool: Bash
Steps:
1. uv run basedpyright 2>&1 | tail -5
2. Assert: no new errors
Expected Result: Clean type check
Evidence: basedpyright output captured
Scenario: E2E final run
Tool: Bash
Steps:
1. uv run refine_ground_plane.py \
--input-extrinsics output/extrinsics.json \
--input-depth output/depth_data.h5 \
--output-extrinsics output/extrinsics_final.json \
--icp --icp-region hybrid --icp-depth-bias
2. Assert: exit code 0
3. python -c "import json; d=json.load(open('output/extrinsics_final.json')); print(len(d))"
4. Assert: valid JSON with camera entries
Expected Result: Pipeline runs clean
Evidence: Output file verified
```
**Commit**: NO (verification only)
---
## Commit Strategy
| After Task | Message | Files | Verification |
|------------|---------|-------|--------------|
| 1 | `feat(icp): add per-camera depth bias estimation function` | `aruco/icp_registration.py` | `uv run basedpyright aruco/icp_registration.py` |
| 2 | `feat(icp): integrate depth bias correction into refine_with_icp pipeline` | `aruco/icp_registration.py` | `uv run basedpyright aruco/icp_registration.py` |
| 3 | `feat(cli): add --icp-depth-bias flag to refine_ground_plane` | `refine_ground_plane.py` | `uv run basedpyright refine_ground_plane.py` |
| 5 | `test(icp): add comprehensive depth bias correction tests` | `tests/test_depth_bias.py` | `uv run pytest tests/test_depth_bias.py -v` |
| 6 | `docs: document per-camera depth bias correction feature` | `README.md`, `docs/icp-depth-bias-diagnosis.md` | `grep "icp-depth-bias" README.md` |
---
## Success Criteria
### Verification Commands
```bash
# All tests pass
uv run pytest -x -v # Expected: 0 failures
# Type check clean
uv run basedpyright # Expected: no new errors
# E2E with bias correction
uv run refine_ground_plane.py \
--input-extrinsics output/extrinsics.json \
--input-depth output/depth_data.h5 \
--output-extrinsics output/extrinsics_final.json \
--icp --icp-region hybrid --icp-depth-bias --debug
# Expected: num_cameras_optimized >= 1, bias values logged
# Toggle isolation
uv run refine_ground_plane.py \
--input-extrinsics output/extrinsics.json \
--input-depth output/depth_data.h5 \
--output-extrinsics output/extrinsics_no_bias.json \
--icp --icp-region hybrid --no-icp-depth-bias
# Expected: identical to pre-feature behavior
```
### Final Checklist
- [x] All "Must Have" present (bias estimation, robust median, reference camera, NaN handling, CLI toggle, logging)
- [x] All "Must NOT Have" absent (no affine model, no persistent bias files, no gate weakening)
- [x] All tests pass (existing + new)
- [x] E2E shows measurable improvement in ICP acceptance
- [x] Documentation updated
@@ -65,11 +65,11 @@ Replace the floor-band-only ICP pipeline with a configurable region selection sy
- Updated `README.md`: new flags documented
### Definition of Done
- [ ] `uv run refine_ground_plane.py --help` shows `--icp-region`, `--icp-global-init`, `--icp-min-overlap`, `--icp-band-height`
- [ ] `uv run pytest -x -vv` → all tests pass (existing + new)
- [ ] `uv run basedpyright aruco/icp_registration.py refine_ground_plane.py` → 0 errors
- [ ] `--icp-region floor` produces identical output to current behavior (regression)
- [ ] `--icp-region hybrid` produces ≥ as many converged pairs as floor on test data
- [x] `uv run refine_ground_plane.py --help` shows `--icp-region`, `--icp-global-init`, `--icp-min-overlap`, `--icp-band-height`
- [x] `uv run pytest -x -vv` → all tests pass (existing + new)
- [x] `uv run basedpyright aruco/icp_registration.py refine_ground_plane.py` → 0 errors
- [x] `--icp-region floor` produces identical output to current behavior (regression)
- [x] `--icp-region hybrid` produces ≥ as many converged pairs as floor on test data
### Must Have
- Region selection: `floor`, `hybrid`, `full` modes