Calibrate Extrinsics Workflow

This document explains the workflow for calibrate_extrinsics.py, focusing on ground plane alignment (--auto-align) and depth-based refinement (--verify-depth, --refine-depth).

CLI Overview

The script calibrates camera extrinsics using ArUco markers detected in SVO recordings.

Key Options:

  • --svo: Path to SVO file(s) or directory containing them.
  • --markers: Path to the marker configuration parquet file.
  • --auto-align: Enables automatic ground plane alignment (opt-in).
  • --verify-depth: Enables depth-based verification of computed poses.
  • --refine-depth: Enables optimization of poses using depth data (requires --verify-depth).
  • --use-confidence-weights: Uses ZED depth confidence map to weight residuals in optimization.
  • --benchmark-matrix: Runs a comparison of baseline vs. robust refinement configurations.
  • --max-samples: Limits the number of processed samples for fast iteration.
  • --debug: Enables verbose debug logging (default is INFO).

Ground Plane Alignment (--auto-align)

When --auto-align is enabled, the script attempts to align the global coordinate system such that a specific face of the marker object becomes the ground plane (XZ plane, normal pointing +Y).
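
The alignment itself comes down to finding a rotation that maps the chosen face normal onto +Y. The following is a minimal sketch of that step using Rodrigues' formula, not the script's actual code; the function name `rotation_aligning` is hypothetical.

```python
import numpy as np

def rotation_aligning(a, b):
    """Return the 3x3 rotation matrix that rotates direction a onto direction b."""
    a = np.asarray(a, dtype=float) / np.linalg.norm(a)
    b = np.asarray(b, dtype=float) / np.linalg.norm(b)
    v = np.cross(a, b)       # rotation axis scaled by sin(angle)
    c = float(np.dot(a, b))  # cos(angle)
    if np.isclose(c, 1.0):   # already aligned
        return np.eye(3)
    if np.isclose(c, -1.0):  # opposite: rotate 180 degrees about an axis perpendicular to a
        axis = np.cross(a, [1.0, 0.0, 0.0])
        if np.linalg.norm(axis) < 1e-8:
            axis = np.cross(a, [0.0, 0.0, 1.0])
        axis /= np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])  # skew-symmetric cross-product matrix
    return np.eye(3) + vx + vx @ vx / (1.0 + c)
```

Given the averaged face normal `n` in world coordinates, `R = rotation_aligning(n, [0, 1, 0])` yields a rotation that can be applied to every world-frame pose so the face lies in the XZ plane.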

Prerequisites:

  • The marker parquet file MUST contain name and ids columns defining which markers belong to which face (e.g., "top", "bottom", "front").
  • If this metadata is missing, alignment is skipped with a warning.

Decision Flow: The script selects the ground face using the following precedence:

  1. Explicit Face (--ground-face):

    • If you provide --ground-face="bottom", the script looks up the markers for "bottom" in the loaded map.
    • It computes the average normal of those markers and aligns it to the global up vector.
  2. Marker ID Mapping (--ground-marker-id):

    • If you provide --ground-marker-id=21, the script finds which face contains marker 21 (e.g., "bottom").
    • It then proceeds as if --ground-face="bottom" were specified.
  3. Heuristic Detection (Fallback):

    • If neither option is provided, the script analyzes all visible markers.
    • It computes the normal for every defined face.
    • It selects the face whose normal is most aligned with the camera's "down" direction (assuming the camera is roughly upright).
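
The precedence above can be sketched as a single function. This is an illustration under assumptions: `select_ground_face`, its arguments, and the returned decision labels are hypothetical, face normals are assumed to be expressed in the camera frame, and raising an error on an unknown face or marker ID is a choice made here, not documented behavior.

```python
import numpy as np

def select_ground_face(face_markers, face_normals, camera_down,
                       ground_face=None, ground_marker_id=None):
    """Return (face_name, decision_path) following the documented precedence.

    face_markers: dict mapping face name -> list of marker IDs
    face_normals: dict mapping face name -> normal vector (3,) in the camera frame
    camera_down:  "down" direction in the camera frame (assumed, e.g. +Y in image convention)
    """
    # 1. Explicit face wins.
    if ground_face is not None:
        if ground_face not in face_markers:
            raise KeyError(f"unknown face {ground_face!r}")
        return ground_face, "explicit"
    # 2. Otherwise map a marker ID to the face that contains it.
    if ground_marker_id is not None:
        for name, ids in face_markers.items():
            if ground_marker_id in ids:
                return name, "mapped"
        raise KeyError(f"marker {ground_marker_id} is not assigned to any face")
    # 3. Fallback: pick the face whose normal is most aligned with "down".
    down = np.asarray(camera_down, dtype=float)
    down = down / np.linalg.norm(down)
    scores = {}
    for name, normal in face_normals.items():
        n = np.asarray(normal, dtype=float)
        scores[name] = float(np.dot(n / np.linalg.norm(n), down))
    return max(scores, key=scores.get), "heuristic"
```

The second element of the return value corresponds to the decision path that the script reports in its log.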

Logging: The script logs the selected decision path for debugging:

  • Mapped ground-marker-id 21 to face 'bottom' (markers=[21])
  • Using explicit ground face 'bottom' (markers=[21])
  • Heuristically detected ground face 'bottom' (markers=[21])

Depth Verification & Refinement

This workflow uses the ZED camera's depth map to verify and improve the ArUco-based pose estimation.

1. Verification (--verify-depth)

  • Input: The computed extrinsic pose (T_world_from_cam) and the known 3D world coordinates of the marker corners.
  • Process:
    1. Transforms the marker corners into the camera frame using the computed pose and projects them onto the image.
    2. Samples the ZED depth map at these projected 2D locations (using a 5x5 median filter for robustness).
    3. Compares the measured depth (ZED) with the computed depth of each corner (its distance from the camera center).
  • Output:
    • RMSE (Root Mean Square Error) of the depth residuals.
    • Number of valid points (where depth was available and finite).
    • Added to JSON output under depth_verify.
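
The verification steps can be sketched roughly as follows. The function names, the pinhole intrinsics matrix `K`, and the output keys `rmse` and `n_valid` are assumptions for illustration. The sketch also compares against the camera-frame Z coordinate, which matches a Z-depth map; if the depth data stores Euclidean range instead, the point norm would replace `Z`.

```python
import numpy as np

def sample_depth_median(depth, u, v, half=2):
    """Median of finite, positive depths in a (2*half+1)^2 window; NaN if none."""
    h, w = depth.shape
    ui, vi = int(round(u)), int(round(v))
    if not (0 <= ui < w and 0 <= vi < h):
        return np.nan
    patch = depth[max(vi - half, 0):vi + half + 1, max(ui - half, 0):ui + half + 1]
    valid = patch[np.isfinite(patch) & (patch > 0)]
    return float(np.median(valid)) if valid.size else np.nan

def verify_depth(T_world_from_cam, corners_world, K, depth):
    """Compare measured depth with depth predicted from the pose (all in meters)."""
    T_cam_from_world = np.linalg.inv(T_world_from_cam)
    pts = corners_world @ T_cam_from_world[:3, :3].T + T_cam_from_world[:3, 3]
    residuals = []
    for X, Y, Z in pts:
        if Z <= 0:  # corner behind the camera
            continue
        u = K[0, 0] * X / Z + K[0, 2]
        v = K[1, 1] * Y / Z + K[1, 2]
        measured = sample_depth_median(depth, u, v)
        if np.isfinite(measured):
            residuals.append(measured - Z)
    r = np.asarray(residuals)
    rmse = float(np.sqrt(np.mean(r ** 2))) if r.size else float("nan")
    return {"rmse": rmse, "n_valid": int(r.size)}
```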

2. Refinement (--refine-depth)

  • Trigger: Runs only if verification is enabled and enough valid depth points (>4) are found.
  • Process:
    • Uses scipy.optimize.least_squares with a robust loss function (soft_l1) to handle outliers.
    • Objective Function: Minimizes the robust residual between computed depth and measured depth for all visible marker corners.
    • Confidence Weighting (--use-confidence-weights): If enabled, residuals are weighted by the ZED confidence map (higher confidence = higher weight).
    • Constraints: Bounded optimization to prevent drifting too far from the initial ArUco pose (default: ±5 degrees, ±5 cm).
  • Output:
    • Refined pose replaces the original pose in the JSON output.
    • Improvement stats (delta rotation, delta translation, RMSE reduction) added under refine_depth.
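
A bounded, robust refinement along these lines might look like the sketch below. Several things are assumptions rather than the script's actual design: the pose correction is parametrized as a rotation vector plus translation applied in the initial camera frame, the ±5 degree limit is applied per rotation-vector component, `f_scale=0.1` is an arbitrary choice, and `measure_depth` is a hypothetical callable that returns the measured depth at a pixel.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def refine_pose(corners_cam0, K, measure_depth,
                max_rot_deg=5.0, max_trans_m=0.05, weights=None):
    """Refine a small pose correction [rotvec, translation] against measured depth.

    corners_cam0:  (N, 3) marker corners in the initial camera frame, in meters
    measure_depth: callable (u, v) -> measured depth in meters, or NaN if invalid
    weights:       optional (N,) per-corner confidence weights (higher = more trusted)
    """
    # Weighting the residual by sqrt(w) weights the squared residual by w.
    w = np.ones(len(corners_cam0)) if weights is None else np.sqrt(weights)

    def residuals(x):
        pts = Rotation.from_rotvec(x[:3]).apply(corners_cam0) + x[3:]
        u = K[0, 0] * pts[:, 0] / pts[:, 2] + K[0, 2]
        v = K[1, 1] * pts[:, 1] / pts[:, 2] + K[1, 2]
        measured = np.array([measure_depth(ui, vi) for ui, vi in zip(u, v)])
        r = pts[:, 2] - measured
        r[~np.isfinite(r)] = 0.0  # corners without valid depth contribute nothing
        return w * r

    # Box bounds keep the solution close to the initial ArUco pose.
    lim = np.r_[np.full(3, np.deg2rad(max_rot_deg)), np.full(3, max_trans_m)]
    return least_squares(residuals, np.zeros(6), bounds=(-lim, lim),
                         loss="soft_l1", f_scale=0.1)
```

The returned object exposes `x` (the correction), `fun` (final residuals), `nfev`, `success`, and `message`, from which delta rotation, delta translation, and RMSE reduction can be derived.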

3. Best Frame Selection

When multiple frames are available, the system scores them to pick the best candidate for verification/refinement:

  • Criteria:
    • Number of detected markers (primary factor).
    • Reprojection error (lower is better).
    • Valid depth ratio (percentage of marker corners with valid depth data).
    • Depth confidence (if available).
  • Benefit: Ensures refinement uses high-quality data rather than just the last valid frame.
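
One way to combine these criteria is a weighted score in which the marker count dominates. The sketch below is illustrative only: the `FrameStats` fields, the weights, and the assumption that confidence is normalized to 0..1 with higher meaning better are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FrameStats:
    n_markers: int
    reproj_error_px: float
    valid_depth_ratio: float                 # 0..1
    mean_confidence: Optional[float] = None  # 0..1, higher is better (assumed)

def score_frame(s: FrameStats) -> float:
    """Higher is better; the marker count is weighted to dominate the other terms."""
    score = 100.0 * s.n_markers
    score -= 10.0 * s.reproj_error_px
    score += 20.0 * s.valid_depth_ratio
    if s.mean_confidence is not None:
        score += 10.0 * s.mean_confidence
    return score

def pick_best_frame(frames):
    """Return the highest-scoring frame from a non-empty list."""
    return max(frames, key=score_frame)
```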

Benchmark Matrix (--benchmark-matrix)

This mode runs a comparative analysis of different refinement configurations on the same data to evaluate improvements. It compares:

  1. Baseline: Linear loss (ordinary least squares), no confidence weighting.
  2. Robust: Soft-L1 loss, no confidence weighting.
  3. Robust + Confidence: Soft-L1 loss with confidence-weighted residuals.
  4. Robust + Confidence + Best Frame: All of the above, using the highest-scored frame.

Output:

  • Prints a summary table for each camera showing RMSE improvement and iteration counts.
  • Adds a benchmark object to the JSON output containing detailed stats for each configuration.
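
The four configurations can be represented as plain data and run through the same refinement entry point. This is a sketch with hypothetical names (`RefineConfig`, `run_benchmark`, and the keys of the result rows); the `refine` callable stands in for whatever performs the actual refinement.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RefineConfig:
    name: str
    loss: str              # loss name as accepted by scipy.optimize.least_squares
    use_confidence: bool
    use_best_frame: bool

BENCHMARK_MATRIX = [
    RefineConfig("baseline", "linear", False, False),
    RefineConfig("robust", "soft_l1", False, False),
    RefineConfig("robust+confidence", "soft_l1", True, False),
    RefineConfig("robust+confidence+best-frame", "soft_l1", True, True),
]

def run_benchmark(refine, configs=BENCHMARK_MATRIX):
    """Call refine(config) -> (rmse_before, rmse_after, iterations) for each config."""
    rows = {}
    for cfg in configs:
        before, after, iterations = refine(cfg)
        rows[cfg.name] = {
            "rmse_before": before,
            "rmse_after": after,
            "improvement": before - after,
            "iterations": iterations,
        }
    return rows
```

Because every configuration sees the same input data, differences in `improvement` can be attributed to the loss, the weighting, or the frame choice.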

Fast Iteration (--max-samples)

For development or quick checks, processing thousands of frames is unnecessary.

  • Use --max-samples N to stop after N valid samples (frames where markers were detected).
  • Example: --max-samples 1 will process the first valid frame, run alignment/refinement, save the result, and exit.
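
The early-stop behavior can be pictured as a loop that counts only valid samples. In this sketch, `collect_samples` and the `detect` callable are hypothetical stand-ins for the script's frame loop and marker detection.

```python
def collect_samples(frames, detect, max_samples=None):
    """Gather detections, stopping once max_samples valid samples have been collected."""
    samples = []
    for frame in frames:
        detection = detect(frame)
        if detection is None:  # no markers found: does not count toward the limit
            continue
        samples.append(detection)
        if max_samples is not None and len(samples) >= max_samples:
            break              # skip the remaining frames entirely
    return samples
```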

Example Workflow

Full Run with Alignment and Robust Refinement:

uv run calibrate_extrinsics.py \
  --svo output/recording.svo \
  --markers aruco/markers/box.parquet \
  --aruco-dictionary DICT_APRILTAG_36h11 \
  --auto-align \
  --ground-marker-id 21 \
  --verify-depth \
  --refine-depth \
  --use-confidence-weights \
  --output output/calibrated.json

Benchmark Run:

uv run calibrate_extrinsics.py \
  --svo output/recording.svo \
  --markers aruco/markers/box.parquet \
  --benchmark-matrix \
  --max-samples 100

Fast Debug Run:

uv run calibrate_extrinsics.py \
  --svo output/ \
  --markers aruco/markers/box.parquet \
  --auto-align \
  --max-samples 1 \
  --debug \
  --no-preview

Known Unexpected Behavior / Troubleshooting

Resolved: Depth Refinement Failure (Unit Mismatch)

Note: This issue has been resolved in the latest version by enforcing explicit meter units in the SVO reader and removing ambiguous manual conversions.

Previous Symptoms:

  • depth_verify reports extremely large RMSE values (e.g., > 1000).
  • refine_depth reports success: false, iterations: 0, and near-zero improvement.

Resolution: The system now explicitly sets InitParameters.coordinate_units = sl.UNIT.METER when opening SVO files, ensuring consistent units across the pipeline.
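
If similar symptoms reappear, a quick sanity check is to compare measured and computed depths and see whether they differ by a power of ten. The helper below is a hypothetical diagnostic, not part of the script.

```python
import numpy as np

def guess_depth_unit_scale(measured, computed):
    """Return the nearest power-of-ten ratio measured/computed, or None if no valid pairs.

    A result of 1.0 suggests consistent units; 1000.0 suggests the measured
    depth is in millimeters while the computed depth is in meters.
    """
    measured = np.asarray(measured, dtype=float)
    computed = np.asarray(computed, dtype=float)
    ok = np.isfinite(measured) & np.isfinite(computed) & (measured > 0) & (computed > 0)
    if not ok.any():
        return None
    ratio = np.median(measured[ok] / computed[ok])
    return float(10.0 ** np.round(np.log10(ratio)))
```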

Optimization Stalls

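If refine_depth shows success: false but nfev (function evaluations) is high, the optimizer probably ran out of evaluations before meeting a convergence tolerance, typically because it is crawling through a flat region of the cost surface.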

  • Check: Look at termination_message in the JSON output.
  • Fix: Try enabling --use-confidence-weights, or check whether the initial ArUco pose is too far off (reprojection error > 2.0).
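
These checks can be scripted against the JSON output. The sketch below assumes the refine_depth entry is a flat object with the fields named in this section; the function name and the `high_nfev` threshold are hypothetical.

```python
def diagnose_refinement(refine_depth, high_nfev=50):
    """Classify a refine_depth entry and return a short hint string."""
    if refine_depth.get("success", False):
        return "converged"
    nfev = refine_depth.get("nfev", 0)
    if refine_depth.get("iterations", 0) == 0 and nfev <= 1:
        # Matches the unit-mismatch symptoms: the optimizer never made progress.
        return "did not start: check depth units and valid point count"
    if nfev >= high_nfev:
        message = refine_depth.get("termination_message", "<none>")
        return f"stalled after {nfev} evaluations: {message}"
    return "failed early: inspect termination_message"
```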