# Calibrate Extrinsics Workflow

This document explains the workflow for `calibrate_extrinsics.py`, focusing on ground plane alignment (`--auto-align`) and depth-based refinement (`--verify-depth`, `--refine-depth`).

## CLI Overview

The script calibrates camera extrinsics using ArUco markers detected in SVO recordings.

**Key Options:**
- `--svo`: Path to SVO file(s) or directory containing them.
- `--markers`: Path to the marker configuration parquet file.
- `--auto-align`: Enables automatic ground plane alignment (opt-in).
- `--verify-depth`: Enables depth-based verification of computed poses.
- `--refine-depth`: Enables optimization of poses using depth data (requires `--verify-depth`).
- `--use-confidence-weights`: Uses ZED depth confidence map to weight residuals in optimization.
- `--benchmark-matrix`: Runs a comparison of baseline vs. robust refinement configurations.
- `--max-samples`: Limits the number of processed samples for fast iteration.
- `--debug`: Enables verbose debug logging (default is INFO).

## Ground Plane Alignment (`--auto-align`)

When `--auto-align` is enabled, the script attempts to align the global coordinate system such that a specific face of the marker object becomes the ground plane (XZ plane, normal pointing +Y).

**Prerequisites:**
- The marker parquet file MUST contain `name` and `ids` columns defining which markers belong to which face (e.g., "top", "bottom", "front").
- If this metadata is missing, alignment is skipped with a warning.

**Decision Flow:**
The script selects the ground face using the following precedence:

1. **Explicit Face (`--ground-face`)**:
   - If you provide `--ground-face="bottom"`, the script looks up the markers for "bottom" in the loaded map.
   - It computes the average normal of those markers and aligns it to the global up vector.

2. **Marker ID Mapping (`--ground-marker-id`)**:
   - If you provide `--ground-marker-id=21`, the script finds which face contains marker 21 (e.g., "bottom").
   - It then proceeds as if `--ground-face="bottom"` was specified.

3. **Heuristic Detection (Fallback)**:
   - If neither option is provided, the script analyzes all visible markers.
   - It computes the normal for every defined face.
   - It selects the face whose normal is most aligned with the camera's "down" direction (assuming the camera is roughly upright).

**Logging:**
The script logs the selected decision path for debugging:
- `Mapped ground-marker-id 21 to face 'bottom' (markers=[21])`
- `Using explicit ground face 'bottom' (markers=[21])`
- `Heuristically detected ground face 'bottom' (markers=[21])`

## Depth Verification & Refinement

This workflow uses the ZED camera's depth map to verify and improve the ArUco-based pose estimation.

### 1. Verification (`--verify-depth`)

- **Input**: The computed extrinsic pose ($T_{world\_from\_cam}$) and the known 3D world coordinates of the marker corners.
- **Process**:
  1. Projects marker corners into the camera frame using the computed pose.
  2. Samples the ZED depth map at these projected 2D locations (using a 5x5 median filter for robustness).
  3. Compares the *measured* depth (ZED) with the *computed* depth (distance from camera center to projected corner).
- **Output**:
  - RMSE (Root Mean Square Error) of the depth residuals.
  - Number of valid points (where depth was available and finite).
  - Added to JSON output under `depth_verify`.

### 2. Refinement (`--refine-depth`)

- **Trigger**: Runs only if verification is enabled and enough valid depth points (>4) are found.
- **Process**:
  - Uses `scipy.optimize.least_squares` with a robust loss function (`soft_l1`) to handle outliers.
  - **Objective Function**: Minimizes the robust residual between computed depth and measured depth for all visible marker corners.
  - **Confidence Weighting** (`--use-confidence-weights`): If enabled, residuals are weighted by the ZED confidence map (higher confidence = higher weight).
  - **Constraints**: Bounded optimization to prevent drifting too far from the initial ArUco pose (default: ±5 degrees, ±5cm).
- **Output**:
  - Refined pose replaces the original pose in the JSON output.
  - Improvement stats (delta rotation, delta translation, RMSE reduction) added under `refine_depth`.

### 3. Best Frame Selection

When multiple frames are available, the system scores them to pick the best candidate for verification/refinement:
- **Criteria**:
  - Number of detected markers (primary factor).
  - Reprojection error (lower is better).
  - Valid depth ratio (percentage of marker corners with valid depth data).
  - Depth confidence (if available).
- **Benefit**: Ensures refinement uses high-quality data rather than just the last valid frame.

## Benchmark Matrix (`--benchmark-matrix`)

This mode runs a comparative analysis of different refinement configurations on the same data to evaluate improvements. It compares:

1. **Baseline**: Linear loss (MSE), no confidence weighting.
2. **Robust**: Soft-L1 loss, no confidence weighting.
3. **Robust + Confidence**: Soft-L1 loss with confidence-weighted residuals.
4. **Robust + Confidence + Best Frame**: All of the above, using the highest-scored frame.

**Output:**
- Prints a summary table for each camera showing RMSE improvement and iteration counts.
- Adds a `benchmark` object to the JSON output containing detailed stats for each configuration.

## Fast Iteration (`--max-samples`)

For development or quick checks, processing thousands of frames is unnecessary.
- Use `--max-samples N` to stop after `N` valid samples (frames where markers were detected).
- Example: `--max-samples 1` will process the first valid frame, run alignment/refinement, save the result, and exit.

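
When iterating on a single frame, it can help to reproduce the depth verification step by hand. The following is a minimal sketch of that step as described under Verification, not the script's actual code: the helper names `sample_depth_median` and `verify_depth`, the pinhole intrinsics matrix `K`, and the use of camera-frame Z as the computed depth are all assumptions made for illustration.

```python
import numpy as np


def sample_depth_median(depth_map, u, v, half=2):
    """Median of the finite depths in a (2*half+1)^2 window centred on pixel (u, v)."""
    h, w = depth_map.shape
    col, row = int(round(u)), int(round(v))
    if not (0 <= col < w and 0 <= row < h):
        return float("nan")
    window = depth_map[max(row - half, 0):row + half + 1,
                       max(col - half, 0):col + half + 1]
    finite = window[np.isfinite(window)]
    return float(np.median(finite)) if finite.size else float("nan")


def verify_depth(T_world_from_cam, K, corners_world, depth_map):
    """Return (rmse, n_valid) for measured-vs-computed depth at the marker corners."""
    # Invert the pose to map world points into the camera frame.
    T_cam_from_world = np.linalg.inv(T_world_from_cam)
    homog = np.hstack([corners_world, np.ones((len(corners_world), 1))])
    corners_cam = (T_cam_from_world @ homog.T).T[:, :3]

    residuals = []
    for x, y, z in corners_cam:
        if z <= 0:  # behind the camera, cannot be projected
            continue
        # Pinhole projection (assumes an undistorted image).
        u = K[0, 0] * x / z + K[0, 2]
        v = K[1, 1] * y / z + K[1, 2]
        measured = sample_depth_median(depth_map, u, v)
        if np.isfinite(measured):
            residuals.append(measured - z)

    if not residuals:
        return float("nan"), 0
    return float(np.sqrt(np.mean(np.square(residuals)))), len(residuals)


# Synthetic check: a flat 2.0 m depth map, with corners placed 5 cm behind it.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
depth_map = np.full((480, 640), 2.0)
corners_world = np.array([[0.0, 0.0, 2.05], [0.1, 0.1, 2.05],
                          [-0.1, 0.05, 2.05], [0.05, -0.1, 2.05]])
rmse, n_valid = verify_depth(np.eye(4), K, corners_world, depth_map)
print(f"rmse={rmse:.3f} m, valid points={n_valid}")
```

With all values in meters, an RMSE in the thousands from a check like this would point to a unit problem rather than a bad pose.
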
## Example Workflow

**Full Run with Alignment and Robust Refinement:**

```bash
uv run calibrate_extrinsics.py \
  --svo output/recording.svo \
  --markers aruco/markers/box.parquet \
  --aruco-dictionary DICT_APRILTAG_36h11 \
  --auto-align \
  --ground-marker-id 21 \
  --verify-depth \
  --refine-depth \
  --use-confidence-weights \
  --output output/calibrated.json
```

**Benchmark Run:**

```bash
uv run calibrate_extrinsics.py \
  --svo output/recording.svo \
  --markers aruco/markers/box.parquet \
  --benchmark-matrix \
  --max-samples 100
```

**Fast Debug Run:**

```bash
uv run calibrate_extrinsics.py \
  --svo output/ \
  --markers aruco/markers/box.parquet \
  --auto-align \
  --max-samples 1 \
  --debug \
  --no-preview
```

## Known Unexpected Behavior / Troubleshooting

### Resolved: Depth Refinement Failure (Unit Mismatch)

*Note: This issue has been resolved in the latest version by enforcing explicit meter units in the SVO reader and removing ambiguous manual conversions.*

**Previous Symptoms:**
- `depth_verify` reports extremely large RMSE values (e.g., > 1000).
- `refine_depth` reports `success: false`, `iterations: 0`, and near-zero improvement.

**Resolution:**
The system now explicitly sets `InitParameters.coordinate_units = sl.UNIT.METER` when opening SVO files, ensuring consistent units across the pipeline.

### Optimization Stalls

If `refine_depth` shows `success: false` but `nfev` (evaluations) is high, the optimizer may have hit a flat region or local minimum.
- **Check**: Look at `termination_message` in the JSON output.
- **Fix**: Try enabling `--use-confidence-weights` or checking if the initial ArUco pose is too far off (reprojection error > 2.0).
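
To see where diagnostics such as `success`, `nfev`, and a termination message can come from, the sketch below runs a bounded `scipy.optimize.least_squares` fit with the `soft_l1` loss on synthetic data. It is an illustration under stated assumptions, not the script's implementation: the 6-parameter pose delta, the `depth_residuals` helper, the `f_scale` value, and the mapping of SciPy's result fields onto the `refine_depth` keys are all hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation


def depth_residuals(delta, corners_cam, measured_depth):
    """Computed minus measured depth after applying a small pose correction.

    delta = [rx, ry, rz, tx, ty, tz]: rotation vector (rad) and translation (m).
    Depth is taken as the camera-frame Z coordinate (an assumption).
    """
    moved = Rotation.from_rotvec(delta[:3]).apply(corners_cam) + delta[3:]
    return moved[:, 2] - measured_depth


# Synthetic stand-in data: 12 marker corners in the camera frame.
rng = np.random.default_rng(0)
corners_cam = rng.uniform([-0.3, -0.3, 1.0], [0.3, 0.3, 1.5], size=(12, 3))
measured_depth = corners_cam[:, 2] + 0.02  # simulated 2 cm depth offset
measured_depth[0] += 0.5                   # one gross outlier

# Bounds matching the documented defaults: +/-5 degrees, +/-5 cm.
limit = np.array([np.deg2rad(5.0)] * 3 + [0.05] * 3)

result = least_squares(
    depth_residuals,
    x0=np.zeros(6),
    bounds=(-limit, limit),
    loss="soft_l1",
    f_scale=0.1,  # hypothetical residual scale in meters
    args=(corners_cam, measured_depth),
)

# Hypothetical mapping of SciPy's result onto the fields discussed above.
report = {
    "success": bool(result.success),
    "nfev": int(result.nfev),
    "termination_message": result.message,
    "status": int(result.status),
}
print(report)
```

In SciPy, a `status` of 0 means the evaluation limit was reached, while positive values indicate that one of the `gtol`, `ftol`, or `xtol` conditions was met, so a high `nfev` combined with `status` 0 suggests the fit ran out of budget rather than converged.
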