- `extrinsics_no_bias.json` reports `num_cameras_optimized=0` and empty `depth_biases`.
- Improvement was achieved without loosening any gates, validating the depth-bias prepass direction.
- Documentation updated in README.md and docs/icp-depth-bias-diagnosis.md to reflect the new `--icp-depth-bias` toggle and its effectiveness in recent validation runs.
## 2026-02-11 Remaining-criteria closure
- Multi-trial E2E comparison (`6` runs each mode) shows stochastic behavior but better aggregate with bias enabled:
- bias series: `[0,1,1,1,1,0]` (avg `0.67`)
- no-bias series: `[1,1,0,1,0,0]` (avg `0.50`)
- At least one non-reference camera optimization is repeatedly observed with bias enabled (`4/6` runs had `num_cameras_optimized=1`).
- Estimated post-correction inter-camera bias deltas from `estimate_depth_biases` are small (max pair delta ~`0.0088 m`), far below earlier documented pair medians (up to `0.137 m`) and comfortably beyond the >50% reduction requirement.
- No-bias mode behavior is validated by tests and outputs:
- `test_refine_with_icp_bias_toggle_off` passes (estimator bypassed when disabled)
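The aggregates quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes each series entry is the per-run `num_cameras_optimized` value, and that the post-correction max pair delta is comparable to the earlier pair median; the variable names are illustrative only.

```python
# Per-run results copied from the bullets above (assumed to be num_cameras_optimized per run).
bias_series = [0, 1, 1, 1, 1, 0]
no_bias_series = [1, 1, 0, 1, 0, 0]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


bias_avg = mean(bias_series)
no_bias_avg = mean(no_bias_series)

# Worst documented pair median before correction vs. max pair delta after correction.
pre_correction_m = 0.137
post_correction_m = 0.0088
reduction = 1.0 - post_correction_m / pre_correction_m

print(f"bias avg={bias_avg:.2f}, no-bias avg={no_bias_avg:.2f}, reduction={reduction:.1%}")
```

Under those assumptions the reduction is roughly 94%, which is why the >50% criterion is marked as met.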
Pose graph optimization was producing implausibly large deltas for cameras that were already reasonably aligned.
Investigation traced this to the direction of the edge transform passed to `o3d.pipelines.registration.PoseGraphEdge(source, target, T)`. The code had assumed `P_target = T * P_source`, whereas the convention arrived at during this investigation is `P_source = T * P_target`, so the ICP result must be inverted before it is used as the edge transform.
We used `np.linalg.inv` explicitly to ensure correct matrix inversion.
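The inversion step can be illustrated without Open3D. This is a minimal sketch: `make_transform` and `invert_rigid` are hypothetical helpers, and the example only shows what `np.linalg.inv` does to a rigid 4x4 transform. It does not by itself settle which direction the pose graph expects.

```python
import numpy as np


def make_transform(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T


def invert_rigid(T):
    """Closed-form inverse of a rigid transform: [R^T, -R^T t]."""
    R = T[:3, :3]
    t = T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv


# ICP result: pure translation of (1, 0, 0), as in the verification scenario.
T_icp = make_transform(np.eye(3), [1.0, 0.0, 0.0])

# Edge transform used after the fix.
T_edge = np.linalg.inv(T_icp)

print(T_edge[:3, 3])
```

For a rigid transform the general inverse and the closed form agree, so either could be used; `np.linalg.inv` is simply the more explicit choice.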
### Verification
- Created `tests/test_icp_graph_direction.py` which sets up a known identity scenario.
- The test failed with the old code (target camera moved to wrong position).
- The test passed with the fix (target camera remained at correct position).
- Existing tests in `tests/test_icp_registration.py` passed.
- Created `tests/test_icp_fix_verification.py` which sets up a scenario where `T_icp` is a translation `(1, 0, 0)` and `T_w_c2` is `(-1, 0, 0)` relative to `T_w_c1`.
- The test confirms that with `T_edge = inv(T_icp)`, the optimization correctly maintains the relative pose.
- Verified that existing tests in `tests/test_icp_registration.py` still pass.
# Learnings from ICP Hardening
## Technical Improvements
1. **Explicit ICP Bounds**: Added `--icp-max-rotation-deg` and `--icp-max-translation-m` CLI flags. This decouples ICP safety checks from the initial ground plane alignment bounds, allowing for tighter or looser constraints as needed for the refinement step.
2. **Meaningful Final Gating**: Fixed the final acceptance logic in `refine_with_icp`. Previously, cameras were counted as optimized even if they were rejected by the final safety gate. Now, `num_cameras_optimized` accurately reflects only those cameras that passed all checks and were updated.
3. **Reference Camera Exclusion**: The reference camera (anchor) is no longer counted in `num_cameras_optimized`. This prevents misleading success metrics where only the reference camera "succeeded" (which is a no-op).
4. **Deterministic Testing**: Updated tests to verify these behaviors, ensuring that rejected cameras are not applied and that the reference camera doesn't inflate the success count.
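The counting rule described in items 2 and 3 can be sketched as follows. This is an illustration, not the code in `refine_with_icp`: the helper names, the use of a per-camera pose delta, and the serials are all assumptions made for the example.

```python
import numpy as np


def rotation_angle_deg(T):
    """Rotation angle of a 4x4 transform, from the trace of its rotation block."""
    cos_theta = np.clip((np.trace(T[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_theta)))


def passes_gate(T_delta, max_rotation_deg, max_translation_m):
    """True if the pose delta stays within both safety bounds."""
    translation = float(np.linalg.norm(T_delta[:3, 3]))
    return rotation_angle_deg(T_delta) <= max_rotation_deg and translation <= max_translation_m


def count_optimized(deltas, reference_serial, max_rotation_deg, max_translation_m):
    """Count cameras that passed the gate, never counting the reference camera."""
    return sum(
        1
        for serial, T_delta in deltas.items()
        if serial != reference_serial
        and passes_gate(T_delta, max_rotation_deg, max_translation_m)
    )


def translation_only(x):
    """Hypothetical helper: 4x4 transform that translates by x meters along X."""
    T = np.eye(4)
    T[0, 3] = x
    return T


deltas = {
    "ref": np.eye(4),                  # anchor: a no-op, must not be counted
    "cam_a": translation_only(0.02),   # small update, accepted
    "cam_b": translation_only(0.50),   # exceeds translation bound, rejected
}
num_cameras_optimized = count_optimized(deltas, "ref", max_rotation_deg=5.0, max_translation_m=0.1)
print(num_cameras_optimized)
```

With these inputs only `cam_a` is counted, which matches the intent that a run where nothing but the anchor "succeeds" reports zero.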
## Verification
- `tests/test_icp_registration.py` passes all 40 tests, covering new gating logic and reference camera exclusion.
- `tests/test_refine_ground_cli.py` passes, confirming CLI flag integration.
- Type checking raised warnings about missing stubs (open3d, scipy) and deprecated types, but no critical errors in the modified logic.
## Future Considerations
- The `open3d` and `scipy` type stubs are missing, leading to many `reportUnknownMemberType` warnings. Adding these stubs or suppression would clean up the type check output.
- The `ICPConfig` object is becoming large; consider grouping related parameters (e.g., `safety_bounds`, `registration_params`) if it grows further.
## 2026-02-11: ICP Depth Bias Diagnosis
- **Finding**: Geometric overlap is high (~71%–80%), but cross-camera depth bias is the primary blocker for ICP convergence.
- **Evidence**: Median absolute signed residuals between pairs reach up to 0.137m (13.7cm).
- **Outlier**: Camera `44435674` is involved in the most biased pairs, suggesting a unit-specific depth scale or offset issue.
- **Planarity**: Overlap regions are not degenerate ($\lambda_3/\sum \lambda_i \approx 0.136-0.170$), confirming the issue is depth accuracy, not scene geometry.
- **Action**: Recommended a "Static Target Depth Sweep" to isolate absolute offsets per unit before further ICP refinement.
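The planarity figure in the list above can be computed from the eigenvalues of the point covariance. This is a minimal sketch that assumes $\lambda_3$ denotes the smallest eigenvalue; `smallest_eigenvalue_ratio` is a hypothetical helper and the synthetic clouds are for illustration only.

```python
import numpy as np


def smallest_eigenvalue_ratio(points):
    """Return lambda_min / sum(lambda) for the covariance of an (N, 3) point set."""
    centered = points - points.mean(axis=0)
    covariance = centered.T @ centered / len(points)
    eigenvalues = np.linalg.eigvalsh(covariance)  # ascending order
    return float(eigenvalues[0] / eigenvalues.sum())


rng = np.random.default_rng(0)

# Nearly planar cloud: almost no extent along Z, so the ratio is close to 0.
flat_cloud = rng.uniform(-1.0, 1.0, size=(5000, 3)) * np.array([1.0, 1.0, 0.001])

# Volumetric cloud: similar extent on all axes, so the ratio approaches 1/3.
box_cloud = rng.uniform(-1.0, 1.0, size=(5000, 3))

print(smallest_eigenvalue_ratio(flat_cloud), smallest_eigenvalue_ratio(box_cloud))
```

A ratio near 0 indicates a degenerate (planar) region, and the upper limit is 1/3, so values of 0.136 to 0.170 sit well away from the degenerate case.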
> **Quick Summary**: Add automatic per-camera depth offset estimation and correction as a pre-pass within the ICP pipeline, eliminating the 0.038m–0.137m cross-camera depth biases that prevent ICP from converging within safety gates.
>
> **Deliverables**:
> - `estimate_depth_biases()` function in `aruco/icp_registration.py`
> - Bias application integrated into `refine_with_icp()` before unprojection
> - `--icp-depth-bias/--no-icp-depth-bias` CLI flag in `refine_ground_plane.py`
> - Comprehensive tests in `tests/test_depth_bias.py`
Implement per-camera depth bias correction as the recommended next remediation step from the ICP depth bias diagnosis. The diagnosis (documented in `docs/icp-depth-bias-diagnosis.md`) confirmed that systematic cross-camera depth biases (up to 13.7cm) are the primary blocker for ICP convergence, not overlap or planarity.
### Interview Summary
**Key Discussions**:
- **Integration style**: Automatic pre-pass within `refine_ground_plane.py --icp` (no separate CLI command)
- **Correction model**: Offset-only (β) first — z' = z + β per camera
- **Estimation method**: Full overlap-region signed residuals (KDTree correspondences), not just floor-plane d-differences
- **Test strategy**: Tests after implementation
**Research Findings**:
- **Librarian**: Affine (α·z+β) is production standard, but offset-only is appropriate for 4 cameras and known-small-range scenes
- **Librarian**: Per-camera (N-1 params) preferred over per-pair (N²) for global loop-closure consistency
- **Explore**: Insertion point is `icp_registration.py:569` — before `unproject_depth_to_points()` in `refine_with_icp()`
- **Explore**: Depth is stored as float32 meters (Z along camera optical axis) in HDF5. Units confirmed via `depth_save.py` schema.
- **Data**: Camera `44435674` is worst outlier; `41831756-44289123` is best-agreeing pair (0.038m bias)
### Metis Review
**Identified Gaps** (addressed):
- **Camera-ray scalar**: Residuals must be projected onto source camera ray direction (not arbitrary world normal) since β shifts depth along the optical axis. Plan uses ray-projected signed residuals.
- **NaN/depth-zero clamping**: After applying β, values ≤ 0 must be masked to NaN. Added to acceptance criteria.
- **Disconnected overlap graph**: Cameras without sufficient overlap to reference get β=0 (safe fallback). Added explicit handling.
- **Minimum sample thresholds**: Pairs with <100 valid correspondences are excluded from the global solve. Added gating.
- **Toggle isolation**: `--no-icp-depth-bias` must skip estimation entirely and produce identical output to current code. Added test.
- **Sign convention**: Deterministic synthetic test with known sign required. Added.
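The camera-ray projection and the minimum-sample gate can be sketched together. This is a minimal illustration, not the planned implementation: `pairwise_ray_bias` is a hypothetical name, a brute-force nearest-neighbor search stands in for the KDTree to keep the example dependency-free, and the rays are assumed to be unit vectors in the same frame as the points.

```python
import numpy as np


def pairwise_ray_bias(src_points, src_rays, tgt_points, max_distance, min_correspondences=100):
    """Median of (tgt - src) projected onto the source ray, or None if too few matches."""
    # Brute-force nearest neighbor (stand-in for a KDTree query).
    sq_dist = ((src_points[:, None, :] - tgt_points[None, :, :]) ** 2).sum(axis=2)
    nearest = sq_dist.argmin(axis=1)
    nearest_dist = np.sqrt(sq_dist[np.arange(len(src_points)), nearest])

    keep = nearest_dist <= max_distance
    if int(keep.sum()) < min_correspondences:
        return None  # pair excluded from the global solve

    difference = tgt_points[nearest[keep]] - src_points[keep]
    signed_residuals = np.einsum("ij,ij->i", difference, src_rays[keep])
    return float(np.median(signed_residuals))


# Synthetic plane at z = 2.0 seen by the source; the target sees it 0.05 m farther along +Z.
axis = np.linspace(-0.5, 0.5, 20)
xx, yy = np.meshgrid(axis, axis)
src_points = np.stack([xx.ravel(), yy.ravel(), np.full(xx.size, 2.0)], axis=1)
src_rays = np.tile([0.0, 0.0, 1.0], (len(src_points), 1))  # simplified: all rays along +Z
tgt_points = src_points + np.array([0.0, 0.0, 0.05])

pair_bias = pairwise_ray_bias(src_points, src_rays, tgt_points, max_distance=0.1)
too_few = pairwise_ray_bias(src_points[:50], src_rays[:50], tgt_points, max_distance=0.1)
print(pair_bias, too_few)
```

A synthetic case like this, with a known sign, is the kind of deterministic check the sign-convention item asks for.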
---
## Work Objectives
### Core Objective
Estimate and correct per-camera depth offsets so that overlapping point clouds from different cameras agree on surface positions, enabling ICP to converge within existing safety gates.
### Concrete Deliverables
- New function `estimate_depth_biases()` in `aruco/icp_registration.py`
- Modified `refine_with_icp()` with bias pre-pass
- New CLI flag `--icp-depth-bias/--no-icp-depth-bias` in `refine_ground_plane.py`
- New test file `tests/test_depth_bias.py`
- Updated `README.md` documentation
### Definition of Done
- [x] All pairwise median biases reduce by >50% after correction (measured on real data)
- [x] ICP accepts ≥1 non-reference camera update (currently 0)
- [x] `uv run pytest` passes (all existing + new tests)
- [x] `uv run basedpyright` produces no new errors
- [x] `--no-icp-depth-bias` produces identical output to current code
### Must Have
- Per-camera offset estimation from overlap correspondences
- Robust median aggregation (insensitive to 30% outliers)
- Reference camera fixed at β=0 (gauge freedom)
- Minimum correspondence count gating per pair
- NaN/invalid depth handling after bias application
- CLI toggle (on by default when `--icp` is used)
- Logging of estimated biases per camera
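The offset model and the invalid-depth handling listed above can be sketched as a single helper. This is an illustration under stated assumptions: `apply_depth_bias` is a hypothetical name, and the depth map is taken to be a float array in meters with NaN marking invalid pixels.

```python
import numpy as np


def apply_depth_bias(depth_map, beta):
    """Return a corrected copy z' = z + beta; values <= 0 become NaN, NaN stays NaN."""
    corrected = depth_map.copy()        # never modify the caller's array
    corrected += beta                   # NaN + beta is still NaN
    corrected[corrected <= 0] = np.nan  # comparisons with NaN are False, so NaN is untouched
    return corrected


depth = np.array([[1.0, np.nan], [0.03, 2.0]], dtype=np.float32)
corrected = apply_depth_bias(depth, beta=-0.05)
print(corrected)
```

The pixel at 0.03 m ends up below zero after the negative offset and is masked, while the input array keeps its original values.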
### Must NOT Have (Guardrails)
- NO affine model (α·z+β) — defer to future iteration
- NO per-pixel bias maps — single scalar per camera
- NO new persistent config files for biases — runtime-only estimation
- NO changes to ICP convergence criteria or safety gate thresholds
- NO weakening of existing acceptance gates to "make it pass"
- NO over-engineered normal estimation pipelines — use existing normals or camera-ray direction
- NO temporal drift compensation
- NO changes to the depth HDF5 schema
---
## Verification Strategy (MANDATORY)
> **UNIVERSAL RULE: ZERO HUMAN INTERVENTION**
>
> ALL tasks in this plan MUST be verifiable WITHOUT any human action.
### Test Decision
- **Infrastructure exists**: YES
- **Automated tests**: YES (tests after implementation)
| 3 | 7 | task(category="quick", ...) final verification |
---
## TODOs
- [x] 1. Implement `estimate_depth_biases()` function
**What to do**:
- Add `estimate_depth_biases()` to `aruco/icp_registration.py`
- Function signature:
```python
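from typing import Any, Dict, Optional

# Mat44, FloorPlane, and ICPConfig are existing project types.
def estimate_depth_biases(
    camera_data: Dict[str, Dict[str, Any]],
    extrinsics: Dict[str, Mat44],
    floor_planes: Dict[str, FloorPlane],
    config: ICPConfig,
    reference_serial: Optional[str] = None,
) -> Dict[str, float]:
    """Return the estimated depth offset in meters for each camera serial."""
    ...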
```
- Algorithm:
1. For each camera: unproject depth to world points (reuse existing `unproject_depth_to_points` + extrinsics transform, stride=4)
2. Also compute camera-ray directions in world: `ray_dir_world = R @ ray_dir_cam` where `ray_dir_cam = normalize([x_cam, y_cam, z_cam])`
3. For each overlapping pair (i, j): use `compute_overlap_xz` or `compute_overlap_3d` (match the config.overlap_mode setting) to check overlap
4. Build KDTree on target cloud, find nearest neighbors for source points within `3 * config.voxel_size` distance
5. For each correspondence (src_k, tgt_k): compute signed residual projected onto source camera ray: `β_k = (tgt_k - src_k) · ray_dir_src_k`
6. Take robust median of β_k values per pair → `pairwise_bias[(i,j)]`
7. Gate: reject pairs with <100 valid correspondences
8. Solve global system: for each pair (i,j) with median bias `b_ij`, the relationship is `β_j - β_i ≈ b_ij`. Fix reference camera β_ref = 0. Solve via `np.linalg.lstsq` for N-1 unknowns.
9. Cap |β| at a configurable maximum (default: 0.3m) — reject implausible biases
10. For cameras disconnected from reference in the overlap graph, set β=0 (safe fallback)
11. Log all estimated biases: `logger.info(f"Depth bias for {serial}: {bias:.4f}m")`
- Return: `Dict[str, float]` mapping serial number → bias offset in meters
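Steps 8 to 10 can be sketched in isolation once the pairwise medians are available. This is a minimal illustration, not the planned function: `solve_biases` and its arguments are hypothetical, it uses the relation `β_j - β_i ≈ b_ij` exactly as written in step 8, and it reads "cap" in step 9 as resetting an implausible bias to 0 (clipping would be another reading).

```python
import numpy as np


def solve_biases(serials, pairwise_bias, reference, max_abs_bias=0.3):
    """Least-squares per-camera offsets from pairwise medians, with the reference fixed at 0."""
    # Find cameras connected to the reference through the overlap graph.
    neighbors = {s: set() for s in serials}
    for i, j in pairwise_bias:
        neighbors[i].add(j)
        neighbors[j].add(i)
    connected, stack = {reference}, [reference]
    while stack:
        for nxt in neighbors[stack.pop()]:
            if nxt not in connected:
                connected.add(nxt)
                stack.append(nxt)

    unknowns = [s for s in serials if s in connected and s != reference]
    column = {s: k for k, s in enumerate(unknowns)}

    # One row per usable pair: beta_j - beta_i = b_ij (the reference has no column).
    rows, rhs = [], []
    for (i, j), b_ij in pairwise_bias.items():
        if i not in connected or j not in connected:
            continue
        row = np.zeros(len(unknowns))
        if j in column:
            row[column[j]] += 1.0
        if i in column:
            row[column[i]] -= 1.0
        rows.append(row)
        rhs.append(b_ij)

    biases = {s: 0.0 for s in serials}  # reference and disconnected cameras stay at 0
    if unknowns and rows:
        solution, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
        for s in unknowns:
            value = float(solution[column[s]])
            biases[s] = value if abs(value) <= max_abs_bias else 0.0
    return biases


pairs = {("A", "B"): 0.05, ("B", "C"): 0.07, ("A", "C"): 0.12}
result = solve_biases(["A", "B", "C", "D"], pairs, reference="A")
print(result)
```

Camera `D` has no overlap with any other camera in this example, so it keeps the safe fallback of 0, and the loop `A-B-C-A` is reconciled by the least-squares solve.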
**Must NOT do**:
- Do NOT implement affine (scale+offset) — offset only
- Do NOT modify `unproject_depth_to_points()` itself
- Do NOT persist biases to disk
- Do NOT use floor-plane d-differences as the primary estimation (only full overlap residuals)
**Recommended Agent Profile**:
- **Category**: `deep`
- Reason: Core algorithmic work requiring careful math (ray projection, global solve, robust statistics)
- **Skills**: `[]`
- No special skills needed — pure Python/NumPy/Open3D work
- **Skills Evaluated but Omitted**:
- `playwright`: No browser interaction
- `frontend-ui-ux`: No UI work
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Sequential (first task)
- **Blocks**: Tasks 2, 3, 5
- **Blocked By**: None
**References** (CRITICAL):
**Pattern References** (existing code to follow):
- `aruco/icp_registration.py:562-598` — Point cloud creation loop in `refine_with_icp()`. Shows how to iterate cameras, unproject, transform to world. The new function should follow the EXACT same unprojection + world transform pattern.
- `aruco/icp_registration.py:603-622` — Overlap checking loop. Shows how pairs are enumerated and overlap is computed. Reuse same logic for bias estimation pairs.
- `aruco/icp_registration.py:75-87` — `preprocess_point_cloud()` for SOR + voxel downsampling pattern
- `aruco/icp_registration.py:240-290` — `compute_overlap_xz()` and `compute_overlap_3d()` implementations
**API/Type References** (contracts to implement against):
- `aruco/icp_registration.py:20-48` — `ICPConfig` dataclass. The new function should use config fields like `voxel_size`, `overlap_margin`, `min_overlap_area`, `overlap_mode`.
- `aruco/ground_plane.py:71-111` — `unproject_depth_to_points()` — input/output contract: depth_map (H,W) float32 meters + K (3,3) → points (N,3) float64 in camera frame
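The input/output contract described above can be illustrated with a pinhole unprojection. This is a sketch of the contract only, not the code in `aruco/ground_plane.py`: `unproject_sketch` is a hypothetical name, and the stride and validity handling are assumptions.

```python
import numpy as np


def unproject_sketch(depth_map, K, stride=1):
    """Unproject an (H, W) depth map in meters to (N, 3) float64 camera-frame points."""
    height, width = depth_map.shape
    v, u = np.mgrid[0:height:stride, 0:width:stride]  # pixel rows and columns
    z = depth_map[::stride, ::stride].astype(np.float64)

    valid = np.isfinite(z) & (z > 0)  # drop NaN and non-positive depth
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x[valid], y[valid], z[valid]], axis=1)


depth = np.full((4, 4), 2.0, dtype=np.float32)
depth[3, 3] = np.nan  # one invalid pixel
K = np.array([[2.0, 0.0, 1.5], [0.0, 2.0, 1.5], [0.0, 0.0, 1.0]])

points = unproject_sketch(depth, K)
print(points.shape, points[0])
```

Because Z is the depth along the optical axis, adding a scalar offset to the depth map before this step moves every point along its own viewing ray.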
**Documentation References**:
- `docs/icp-depth-bias-diagnosis.md` — Full diagnosis with measured bias values. The estimated biases should be in the same ballpark (0.038m–0.137m between pairs).
**WHY Each Reference Matters**:
- Lines 562-598: MUST match the same unprojection and world-transform code exactly so that bias estimation and bias application see the same point clouds
- Lines 603-622: Reuse overlap logic to ensure bias is estimated only for pairs that will actually be registered by ICP
- FloorPlane/ICPConfig: Must use same config parameters to avoid inconsistency between estimation and registration
**Acceptance Criteria**:
> **AGENT-EXECUTABLE VERIFICATION ONLY**
- [ ] Function `estimate_depth_biases` exists and is importable: `python -c "from aruco.icp_registration import estimate_depth_biases"`
- [ ] Function returns `Dict[str, float]` type
- [ ] Reference camera has bias exactly 0.0
- [ ] `uv run basedpyright aruco/icp_registration.py` — no new type errors introduced
**Agent-Executed QA Scenarios:**
```
Scenario: Function is importable and callable
Tool: Bash
Preconditions: None
Steps:
1. uv run python -c "from aruco.icp_registration import estimate_depth_biases; print('OK')"
- Store estimated biases in `ICPMetrics` for downstream reporting:
- Add field `depth_biases: Dict[str, float] = field(default_factory=dict)` to `ICPMetrics`
**Must NOT do**:
- Do NOT modify `unproject_depth_to_points()` signature or behavior
- Do NOT change depth_map in-place (always `.copy()` first)
- Do NOT change any ICP parameters, gates, or thresholds
- Do NOT apply bias correction to the ground-plane refinement step (only ICP)
**Recommended Agent Profile**:
- **Category**: `quick`
- Reason: Straightforward integration — calling existing function, applying simple arithmetic, gating with a bool
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Sequential (after Task 1)
- **Blocks**: Task 3
- **Blocked By**: Task 1
**References**:
**Pattern References**:
- `aruco/icp_registration.py:540-598` — The `refine_with_icp()` function. Specifically the loop starting at line 562 where `data["depth"]` is accessed. This is where bias must be inserted.
- `aruco/icp_registration.py:20-48` — `ICPConfig` dataclass — add `depth_bias: bool = True` field here
- `aruco/icp_registration.py:61-73` — `ICPMetrics` dataclass — add `depth_biases: Dict[str, float]` field here
**WHY Each Reference Matters**:
- Line 569: The EXACT line where `data["depth"]` is passed to unprojection — this is where we insert `depth_corrected`
- ICPConfig/ICPMetrics: Must extend these dataclasses consistently with existing field patterns
**Acceptance Criteria**:
- [ ] `ICPConfig` has `depth_bias: bool` field with default `True`
- [ ] `ICPMetrics` has `depth_biases: Dict[str, float]` field
- [ ] When `config.depth_bias=True`, biases are estimated and applied before unprojection
- [ ] When `config.depth_bias=False`, no bias estimation occurs, depth maps are unmodified
- [ ] Original `data["depth"]` is never modified in-place (uses `.copy()`)
- [ ] Depth values ≤ 0 after bias application are set to NaN
**Agent-Executed QA Scenarios:**
```
Scenario: Bias field exists in ICPConfig
Tool: Bash
Preconditions: Task 1 and 2 complete
Steps:
1. uv run python -c "from aruco.icp_registration import ICPConfig; c = ICPConfig(); print(c.depth_bias)"
2. Assert: stdout contains "True"
Expected Result: Default is True
Evidence: Terminal output
Scenario: Biases stored in metrics
Tool: Bash
Preconditions: Task 2 complete
Steps:
1. uv run python -c "from aruco.icp_registration import ICPMetrics; m = ICPMetrics(); print(type(m.depth_biases))"
2. Assert: stdout contains "dict"
Expected Result: Field exists and is a dict
Evidence: Terminal output
```
**Commit**: YES
- Message: `feat(icp): integrate depth bias correction into refine_with_icp pipeline`
- Files: `aruco/icp_registration.py`
- Pre-commit: `uv run basedpyright aruco/icp_registration.py`
---
- [x] 3. Wire CLI flag in `refine_ground_plane.py`
**What to do**:
- Add Click option:
```python
@click.option(
    "--icp-depth-bias/--no-icp-depth-bias",
    default=True,
    help="Estimate and correct per-camera depth biases before ICP registration.",
)
```
- Add `icp_depth_bias: bool` parameter to `main()` function
- Pass to ICPConfig: `depth_bias=icp_depth_bias`
- After ICP runs, log bias results from `icp_metrics.depth_biases` if available
- Add bias info to the per-camera diagnostics JSON output (existing pattern at lines 301-320)
**Must NOT do**:
- Do NOT add a separate bias estimation CLI command
- Do NOT add a --depth-biases-file input option (runtime-only)
- Do NOT change existing CLI flag defaults
**Recommended Agent Profile**:
- **Category**: `quick`
- Reason: Mechanical wiring — adding a click.option and passing it through
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Sequential (after Task 2)
- **Blocks**: Task 4
- **Blocked By**: Task 2
**References**:
**Pattern References**:
- `refine_ground_plane.py:89-155` — Existing ICP CLI flags pattern. The new flag should follow the exact same `--icp-*` naming convention and be placed after the existing ICP options.
- `refine_ground_plane.py:270-281` — Where `ICPConfig` is constructed. Add `depth_bias=icp_depth_bias` here.
- `refine_ground_plane.py:290-296` — Where `icp_metrics` is logged. Add bias logging here.
Preconditions: Current code has been committed before changes
Steps:
1. Compare output/extrinsics_no_bias.json with a pre-change baseline run
2. Assert: Pose matrices match within floating-point tolerance (1e-6)
Expected Result: --no-icp-depth-bias produces identical results to code before this feature
Evidence: Comparison output captured
```
**Commit**: NO (validation only — no code changes)
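The tolerance comparison in this scenario could be automated along these lines. This is a sketch under a loud assumption: the layout of the extrinsics JSON is not shown in this plan, so the example assumes a mapping from camera serial to a nested 4x4 list, and `poses_match` is a hypothetical helper.

```python
import json

import numpy as np


def poses_match(baseline, candidate, atol=1e-6):
    """True if both mappings hold the same serials and every pose agrees within atol."""
    if set(baseline) != set(candidate):
        return False
    return all(
        np.allclose(
            np.asarray(baseline[serial], dtype=float),
            np.asarray(candidate[serial], dtype=float),
            rtol=0.0,
            atol=atol,
        )
        for serial in baseline
    )


# Assumed layout: {"<serial>": [[...4 values...] x 4]}; real files may differ.
baseline = json.loads('{"cam_a": [[1,0,0,0],[0,1,0,0],[0,0,1,0],[0,0,0,1]]}')

within_tolerance = json.loads(json.dumps(baseline))
within_tolerance["cam_a"][0][3] = 5e-7  # below the 1e-6 tolerance

shifted = json.loads(json.dumps(baseline))
shifted["cam_a"][0][3] = 1e-3  # a real pose change

print(poses_match(baseline, within_tolerance), poses_match(baseline, shifted))
```

For real runs the two dictionaries would come from `json.load` on the baseline and `--no-icp-depth-bias` outputs, after adapting the key access to the actual file schema.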
---
- [x] 5. Write tests (`tests/test_depth_bias.py`)
**What to do**:
- Create `tests/test_depth_bias.py` with the following test cases:
**A. Bias Estimation Math Tests:**
- `test_estimate_biases_two_cameras_known_offset`: Create two synthetic cameras with overlapping box point clouds. Camera B's depth is shifted by +0.05m. Assert `estimate_depth_biases` returns β_B ≈ 0.05m (±2mm) and β_ref = 0.0.
- `test_estimate_biases_sign_correctness`: Camera B depth shifted by -0.08m. Assert β_B ≈ -0.08m. Ensures sign convention is correct.
- `test_estimate_biases_four_cameras`: 4 synthetic cameras with known offsets [0, 0.05, 0.12, -0.03]. Assert all recovered within ±3mm.
- `test_bias_solve_disconnected_camera`: One camera has no overlap with any other. Assert it gets β=0.
**C. Robustness Tests:**
- `test_estimate_biases_robust_to_outliers`: Inject 25% random outlier correspondences. Assert recovered bias within ±10mm of true value.
- `test_estimate_biases_min_correspondences`: Pair with only 50 correspondences (below 100 threshold). Assert pair is excluded from solve.
**D. Integration Tests:**
- `test_bias_application_preserves_nan`: Depth map with NaN regions. After bias application, NaN regions remain NaN.
- `test_bias_application_clamps_negative`: Depth map with values near 0. After applying negative bias, values ≤ 0 become NaN.
- `test_bias_toggle_off`: With `config.depth_bias=False`, assert `estimate_depth_biases` is not called and depth maps are unmodified (monkeypatch the function and assert not called).
- `test_refine_with_icp_with_bias_synthetic`: Extend existing synthetic ICP test to include a known depth offset, verify that bias correction improves ICP convergence.
**E. Type Safety:**
- `test_types_pass`: Run `basedpyright` and assert no new errors.
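The tolerance-based style intended for the robustness tests can be previewed on raw residuals. This is a minimal sketch with synthetic numbers, not one of the listed tests: it checks only that a median survives 25% outliers, and the noise level and outlier range are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

true_bias_m = 0.05
num_samples = 1000

# Inlier residuals: true bias plus 2 mm of Gaussian noise.
residuals = true_bias_m + rng.normal(0.0, 0.002, num_samples)

# Replace 25% of the samples with gross outliers spread over +/- 1 m.
num_outliers = num_samples // 4
residuals[:num_outliers] = rng.uniform(-1.0, 1.0, num_outliers)

median_estimate = float(np.median(residuals))
mean_estimate = float(np.mean(residuals))

print(f"median={median_estimate:.4f} m, mean={mean_estimate:.4f} m")
```

The assertion in a real test would use a tolerance such as `abs(median_estimate - true_bias_m) < 0.010` instead of comparing against an exact floating-point value.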
**Must NOT do**:
- Do NOT require real camera data (all synthetic)
- Do NOT require network/hardware access
- Do NOT modify existing tests
- Do NOT create tests that depend on specific floating-point values (use tolerances)
**Recommended Agent Profile**:
- **Category**: `unspecified-high`
- Reason: Many test cases with careful synthetic data construction and assertion design
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 2 (with Task 6)
- **Blocks**: Task 7
- **Blocked By**: Task 4
**References**:
**Test References** (testing patterns to follow):
- `tests/test_icp_registration.py` — Shows how to create synthetic box point clouds with `create_box_pcd()`, mock `unproject_depth_to_points`, build `ICPConfig`, and test `refine_with_icp`. FOLLOW THIS PATTERN EXACTLY for test structure, fixtures, and assertion style.
- `tests/test_depth_refine.py` — Shows how to create constant depth maps (`np.full((H,W), Z)`), mock intrinsics, and test depth-based optimization. Use this pattern for bias application tests.
- Pre-commit: `uv run pytest tests/test_depth_bias.py -v`
---
- [x] 6. Update documentation
**What to do**:
- Update `README.md`:
- Add `--icp-depth-bias` to the Options section under "Ground Plane Refinement"
- Add a brief explanation: "Automatically estimates and corrects per-camera depth biases before ICP registration. Enabled by default when --icp is used."
- Add usage example with bias correction
- Update `docs/icp-depth-bias-diagnosis.md`:
- Add a "Remediation Applied" section documenting that offset correction was implemented
- Record post-correction bias measurements (from Task 4 results)
**Must NOT do**:
- Do NOT create new documentation files
- Do NOT add verbose implementation details to README (keep it user-facing)
**Recommended Agent Profile**:
- **Category**: `quick`
- Reason: Documentation updates — straightforward text edits
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 2 (with Task 5)
- **Blocks**: Task 7
- **Blocked By**: Task 4
**References**:
**Documentation References**:
- `README.md` — "Ground Plane Refinement" section, specifically the Options list starting around the `--icp` description
- `docs/icp-depth-bias-diagnosis.md` — Existing diagnosis document to update with remediation results
**WHY Each Reference Matters**:
- README: Users need to know the new flag exists and what it does
- Diagnosis doc: Closes the loop on the diagnosis by documenting the fix and its effectiveness
**Acceptance Criteria**:
- [ ] README mentions `--icp-depth-bias` flag
- [ ] README has usage example with bias correction
- [ ] Diagnosis doc has "Remediation Applied" section
@@ -65,11 +65,11 @@ Replace the floor-band-only ICP pipeline with a configurable region selection sy
- Updated `README.md`: new flags documented
### Definition of Done
- [ ] `uv run refine_ground_plane.py --help` shows `--icp-region`, `--icp-global-init`, `--icp-min-overlap`, `--icp-band-height`
- [ ] `uv run pytest -x -vv` → all tests pass (existing + new)
- [ ] `uv run basedpyright aruco/icp_registration.py refine_ground_plane.py` → 0 errors
- [ ] `--icp-region floor` produces identical output to current behavior (regression)
- [ ] `--icp-region hybrid` produces ≥ as many converged pairs as floor on test data
- [x] `uv run refine_ground_plane.py --help` shows `--icp-region`, `--icp-global-init`, `--icp-min-overlap`, `--icp-band-height`
- [x] `uv run pytest -x -vv` → all tests pass (existing + new)
- [x] `uv run basedpyright aruco/icp_registration.py refine_ground_plane.py` → 0 errors
- [x] `--icp-region floor` produces identical output to current behavior (regression)
- [x] `--icp-region hybrid` produces ≥ as many converged pairs as floor on test data
### Must Have
- Region selection: `floor`, `hybrid`, `full` modes