feat(calibration): robust depth refinement pipeline with diagnostics and benchmarking

2026-02-07 05:51:07 +00:00
parent ead3796cdb
commit dad1f2a69f
17 changed files with 1876 additions and 261 deletions
@@ -12,6 +12,8 @@ The script calibrates camera extrinsics using ArUco markers detected in SVO reco
 - `--auto-align`: Enables automatic ground plane alignment (opt-in).
 - `--verify-depth`: Enables depth-based verification of computed poses.
 - `--refine-depth`: Enables optimization of poses using depth data (requires `--verify-depth`).
+- `--use-confidence-weights`: Uses the ZED depth confidence map to weight residuals during optimization.
+- `--benchmark-matrix`: Runs a comparison of baseline vs. robust refinement configurations.
 - `--max-samples`: Limits the number of processed samples for fast iteration.
 - `--debug`: Enables verbose debug logging (default is INFO).

@@ -63,13 +65,35 @@ This workflow uses the ZED camera's depth map to verify and improve the ArUco-ba
 ### 2. Refinement (`--refine-depth`)
 - **Trigger**: Runs only if verification is enabled and enough valid depth points (>4) are found.
 - **Process**:
-    - Uses `scipy.optimize.minimize` (L-BFGS-B) to adjust the 6-DOF pose parameters (rotation vector + translation vector).
-    - **Objective Function**: Minimizes the squared difference between computed depth and measured depth for all visible marker corners.
+    - Uses `scipy.optimize.least_squares` with a robust loss function (`soft_l1`) to reduce the influence of depth outliers.
+    - **Objective Function**: Minimizes a robust loss over the residuals between computed depth and measured depth at all visible marker corners.
+    - **Confidence Weighting** (`--use-confidence-weights`): If enabled, residuals are weighted by the ZED confidence map (higher confidence = higher weight).
    - **Constraints**: Bounded optimization to prevent drifting too far from the initial ArUco pose (default: ±5 degrees, ±5cm).
 - **Output**:
    - Refined pose replaces the original pose in the JSON output.
    - Improvement stats (delta rotation, delta translation, RMSE reduction) added under `refine_depth`.
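
To illustrate why a robust loss helps here, the sketch below compares the plain least-squares cost with the `soft_l1` cost on a few depth residuals that include one outlier. The loss follows the definition in the SciPy documentation, `rho(z) = 2 * (sqrt(1 + z) - 1)`. The residual values, the `f_scale` of 0.05 m, and the confidence-to-weight mapping are assumptions made for illustration and are not taken from the pipeline's code.

```python
import math

def soft_l1_cost(residuals, f_scale=1.0):
    """Cost 0.5 * sum(C^2 * rho((r / C)^2)) with rho(z) = 2 * (sqrt(1 + z) - 1)."""
    total = 0.0
    for r in residuals:
        z = (r / f_scale) ** 2
        total += f_scale ** 2 * 2.0 * (math.sqrt(1.0 + z) - 1.0)
    return 0.5 * total

def linear_cost(residuals):
    """Plain least-squares cost: 0.5 * sum(r^2)."""
    return 0.5 * sum(r * r for r in residuals)

def confidence_weights(confidences, max_value=100.0):
    """Map confidence values to weights clamped to [0, 1].

    Assumption: larger values mean more trustworthy depth. Invert the
    mapping if the confidence map uses the opposite convention.
    """
    return [min(max(c / max_value, 0.0), 1.0) for c in confidences]

# Hypothetical depth residuals in meters; the last corner is an outlier.
residuals = [0.01, -0.02, 0.015, 1.5]
confidences = [95.0, 90.0, 92.0, 10.0]

# One possible weighting scheme: scale each residual before applying the loss.
weights = confidence_weights(confidences)
weighted = [w * r for w, r in zip(weights, residuals)]

# The outlier's cost grows quadratically under the linear loss, but only
# roughly linearly under soft_l1 once |r| is well above f_scale.
outlier_linear = linear_cost([1.5])
outlier_robust = soft_l1_cost([1.5], f_scale=0.05)
```

For residuals much smaller than `f_scale` the two costs nearly coincide, so inliers are treated almost as in ordinary least squares while large residuals are damped.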

+### 3. Best Frame Selection
+When multiple frames are available, the system scores them to pick the best candidate for verification/refinement:
+- **Criteria**:
+    - Number of detected markers (primary factor).
+    - Reprojection error (lower is better).
+    - Valid depth ratio (percentage of marker corners with valid depth data).
+    - Depth confidence (if available).
+- **Benefit**: Ensures refinement uses high-quality data rather than just the last valid frame.
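
A minimal sketch of how such a score could be computed is shown below. The `FrameStats` fields, the `frame_score` function, and all weights are hypothetical. They only mirror the criteria listed above, with marker count weighted most heavily, and do not reproduce the scoring used in the script.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FrameStats:
    """Per-frame quality metrics (field names are hypothetical)."""
    frame_index: int
    marker_count: int
    reprojection_error: float                 # lower is better
    valid_depth_ratio: float                  # in [0, 1], higher is better
    depth_confidence: Optional[float] = None  # in [0, 1], if available

def frame_score(stats: FrameStats) -> float:
    """Combine the criteria into one score; the weights are illustrative."""
    score = 100.0 * stats.marker_count        # primary factor, weighted heavily
    score -= 5.0 * stats.reprojection_error   # penalize poor reprojection
    score += 10.0 * stats.valid_depth_ratio   # reward usable depth samples
    if stats.depth_confidence is not None:
        score += 5.0 * stats.depth_confidence
    return score

def select_best_frame(frames):
    """Return the highest-scoring frame, or None if there are no frames."""
    return max(frames, key=frame_score, default=None)

frames = [
    FrameStats(0, marker_count=2, reprojection_error=0.4, valid_depth_ratio=1.0),
    FrameStats(1, marker_count=4, reprojection_error=1.2, valid_depth_ratio=0.75),
    FrameStats(2, marker_count=4, reprojection_error=0.6, valid_depth_ratio=0.9,
               depth_confidence=0.8),
]
best = select_best_frame(frames)
```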
+
+## Benchmark Matrix (`--benchmark-matrix`)
+
+This mode runs a comparative analysis of different refinement configurations on the same data to evaluate improvements. It compares:
+1. **Baseline**: Linear loss (plain least squares), no confidence weighting.
+2. **Robust**: Soft-L1 loss, no confidence weighting.
+3. **Robust + Confidence**: Soft-L1 loss with confidence-weighted residuals.
+4. **Robust + Confidence + Best Frame**: Soft-L1 loss with confidence-weighted residuals, run on the highest-scoring frame.
+
+**Output:**
+- Prints a summary table for each camera showing RMSE improvement and iteration counts.
+- Adds a `benchmark` object to the JSON output containing detailed stats for each configuration.
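
The four configurations can be thought of as a small table of settings. The sketch below is one hypothetical way to represent that matrix and format a per-camera summary. `RefineConfig`, `BENCHMARK_MATRIX`, and `format_summary` are illustrative names, not identifiers from the script. The loss strings are the names accepted by `scipy.optimize.least_squares`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RefineConfig:
    """One row of the benchmark matrix (hypothetical structure)."""
    name: str
    loss: str             # loss name passed to scipy.optimize.least_squares
    use_confidence: bool  # weight residuals by depth confidence
    use_best_frame: bool  # refine on the highest-scoring frame

BENCHMARK_MATRIX = [
    RefineConfig("baseline", "linear", False, False),
    RefineConfig("robust", "soft_l1", False, False),
    RefineConfig("robust+confidence", "soft_l1", True, False),
    RefineConfig("robust+confidence+best-frame", "soft_l1", True, True),
]

def format_summary(rows):
    """Format rows of (config name, rmse_before, rmse_after, iterations)."""
    lines = [f"{'config':<30} {'rmse_before':>11} {'rmse_after':>10} {'iters':>5}"]
    for name, before, after, iters in rows:
        lines.append(f"{name:<30} {before:>11.4f} {after:>10.4f} {iters:>5d}")
    return "\n".join(lines)
```

Keeping the configurations in one list means every camera is evaluated against the same settings, which keeps the comparison fair.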
+
 ## Fast Iteration (`--max-samples`)

 For development or quick checks, processing thousands of frames is unnecessary.
@@ -78,7 +102,7 @@ For development or quick checks, processing thousands of frames is unnecessary.

 ## Example Workflow

-**Full Run with Alignment and Refinement:**
+**Full Run with Alignment and Robust Refinement:**
 ```bash
 uv run calibrate_extrinsics.py \
  --svo output/recording.svo \
@@ -88,9 +112,19 @@ uv run calibrate_extrinsics.py \
  --ground-marker-id 21 \
  --verify-depth \
  --refine-depth \
+  --use-confidence-weights \
  --output output/calibrated.json
 ```

+**Benchmark Run:**
+```bash
+uv run calibrate_extrinsics.py \
+  --svo output/recording.svo \
+  --markers aruco/markers/box.parquet \
+  --benchmark-matrix \
+  --max-samples 100
+```
+
 **Fast Debug Run:**
 ```bash
 uv run calibrate_extrinsics.py \
@@ -104,89 +138,18 @@ uv run calibrate_extrinsics.py \

 ## Known Unexpected Behavior / Troubleshooting

-### Depth Refinement Failure (Unit Mismatch)
+### Resolved: Depth Refinement Failure (Unit Mismatch)

-**Symptoms:**
+*Note: This issue has been resolved in the latest version by enforcing explicit meter units in the SVO reader and removing ambiguous manual conversions.*
+
+**Previous Symptoms:**
 - `depth_verify` reports extremely large RMSE values (e.g., > 1000).
 - `refine_depth` reports `success: false`, `iterations: 0`, and near-zero improvement.
- The optimization fails to converge or produces nonsensical results.

-**Root Cause:**
-The ZED SDK `retrieve_measure(sl.MEASURE.DEPTH)` returns depth values in the unit defined by `InitParameters.coordinate_units`. The default is **MILLIMETERS**. However, the calibration system (extrinsics, marker geometry) operates in **METERS**.
+**Resolution:**
+The system now explicitly sets `InitParameters.coordinate_units = sl.UNIT.METER` when opening SVO files, ensuring consistent units across the pipeline.
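
The RMSE thresholds from the earlier diagnostic check (below 0.5 is healthy, above 100 suggests millimeters) can still serve as a regression guard in case the unit setting changes again. The helper below is a hypothetical sketch of such a guard. The function name and the intermediate "inconclusive" label are assumptions added for illustration.

```python
def classify_depth_rmse(rmse: float) -> str:
    """Classify a depth_verify RMSE that is expected to be in meters."""
    if rmse > 100.0:
        # Values this large suggest depth is still in millimeters.
        return "unit-mismatch"
    if rmse < 0.5:
        return "healthy"
    # Between the two documented thresholds; needs manual inspection.
    return "inconclusive"
```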

-This scale mismatch (factor of 1000) causes the residuals in the optimization objective function to be massive, breaking the numerical stability of the L-BFGS-B solver.
-
-**Mitigation:**
-The `SVOReader` class in `aruco/svo_sync.py` explicitly converts the retrieved depth map to meters:
-```python
-# aruco/svo_sync.py
-return depth_data / 1000.0
-```
-This ensures that all geometric math downstream remains consistent in meters.
-
-**Diagnostic Check:**
-If you suspect a unit mismatch, check the `depth_verify` RMSE in the output JSON.
- **Healthy:** RMSE < 0.5 (meters)
- **Mismatch:** RMSE > 100 (likely millimeters)
-
-*Note: Confidence filtering (`--depth-confidence-threshold`) is orthogonal to this issue. A unit mismatch affects all valid pixels regardless of confidence.*
-
-## Findings Summary (2026-02-07)
-
-This section summarizes the latest deep investigation across local code, outputs, and external docs.
-
-### Confirmed Facts
-
-1. **Marker geometry parquet is in meters**
-   - `aruco/markers/standard_box_markers_600mm.parquet` stores values around `0.3` (meters), not `300` (millimeters).
-   - `docs/marker-parquet-format.md` also documents meter-scale coordinates.
-
-2. **Depth unit contract is still fragile**
-   - ZED defaults to millimeters unless `InitParameters.coordinate_units` is explicitly set.
-   - Current reader path converts depth by dividing by `1000.0` in `aruco/svo_sync.py`.
-   - This works only if incoming depth is truly millimeters. It can become fragile if unit config changes elsewhere.
-
-3. **Observed runtime behavior still indicates refinement instability**
-   - Existing outputs (for example `output/aligned_refined_extrinsics*.json`) show very large `depth_verify.rmse`, often `refine_depth.success: false`, `iterations: 0`, and negligible improvement.
-   - This indicates that refinement quality is currently limited beyond the original mm↔m mismatch narrative.
-
-4. **Current refinement objective is not robust enough**
-   - Objective is plain squared depth residuals + simple regularization.
-   - It does **not** currently include robust loss (Huber/Soft-L1), confidence weighting in the objective, or strong convergence diagnostics.
-
-### Likely Contributors to Poor Refinement
-
- Depth outliers are not sufficiently down-weighted in optimization.
- Confidence map is used for verification filtering, but not as residual weights in the optimizer objective.
- Representative frame choice uses the latest valid frame, not necessarily the best-quality frame.
- Optimizer diagnostics are limited, making it hard to distinguish "real convergence" from "stuck at initialization".
-
-### Recommended Implementation Order (for next session)
-
-1. **Unit hardening (P0)**
-   - Explicitly set `init_params.coordinate_units = sl.UNIT.METER` in SVO reader.
-   - Remove or guard manual `/1000.0` conversion to avoid double-scaling risk.
-   - Add depth sanity logs (min/median/max sampled depth) under `--debug`.
-
-2. **Robust objective (P0)**
-   - Replace MSE-only residual with Huber (or Soft-L1) in meters.
-   - Add confidence-weighted depth residuals in objective function.
-   - Split translation/rotation regularization coefficients.
-
-3. **Frame quality selection (P1)**
-   - Replace "latest valid frame" with best-frame scoring:
-     - marker count (higher better)
-     - median reprojection error (lower better)
-     - valid depth ratio (higher better)
-
-4. **Diagnostics and acceptance gates (P1)**
-   - Log optimizer termination reason, gradient/step behavior, and effective valid points.
-   - Treat tiny RMSE changes as "no effective refinement" even if optimizer returns.
-
-5. **Benchmark matrix (P1)**
-   - Compare baseline vs robust loss vs robust+confidence vs robust+confidence+best-frame.
-   - Report per-camera pre/post RMSE, iteration count, and success/failure reason.
-
-### Practical note
-
-The previous troubleshooting section correctly explains one important failure mode (unit mismatch), but current evidence shows that **robust objective design and frame quality control** are now the primary bottlenecks for meaningful depth refinement gains.
+### Optimization Stalls
+If `refine_depth` shows `success: false` but `nfev` (the number of function evaluations) is high, the optimizer may have hit a flat region or a local minimum.
+- **Check**: Look at `termination_message` in the JSON output.
+- **Fix**: Try enabling `--use-confidence-weights`, or check whether the initial ArUco pose is too far off (reprojection error > 2.0).
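
The check above can be automated with a small triage helper. This sketch assumes the `refine_depth` object exposes `success`, `nfev`, and `termination_message` as described. The `high_nfev` threshold of 50 and the sample values are hypothetical.

```python
import json

def diagnose_refinement(refine: dict, high_nfev: int = 50) -> str:
    """Summarize a refine_depth result (key layout assumed from the text)."""
    if refine.get("success", False):
        return "converged"
    nfev = refine.get("nfev", 0)
    message = refine.get("termination_message", "<no termination_message>")
    if nfev >= high_nfev:
        # Many evaluations without success: likely a flat region or local minimum.
        return f"stalled after {nfev} evaluations: {message}"
    return f"failed early after {nfev} evaluations: {message}"

# Hypothetical fragment of the JSON output for one camera.
sample = json.loads(
    '{"success": false, "nfev": 120, "termination_message": "example message"}'
)
summary = diagnose_refinement(sample)
```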