9c861105f7
- Add comprehensive work plan for ArUco-based multi-camera calibration - Add recording_multi.py for multi-camera SVO recording - Add streaming_receiver.py for network streaming - Add svo_playback.py for synchronized playback - Add zed_network_utils.py for camera configuration - Add AGENTS.md with project context
Draft: ArUco-Based Multi-Camera Extrinsic Calibration from SVO
Requirements (confirmed)
Goal
Create a CLI tool that reads synchronized SVO recordings from multiple ZED cameras, detects ArUco markers on a 3D calibration box, computes camera extrinsics relative to the marker world origin, and outputs accurate pose matrices to replace the inaccurate ones in inside_network.json.
Calibration Target
- Type: 3D box with 6 diamond board faces
- Object points: defined in `aruco/output/standard_box_markers.parquet`
- Marker dictionary: `DICT_4X4_50` (from existing code)
- Minimum markers per frame: 4+ (one diamond face's worth)
Input
- Multiple SVO2 files (one per camera)
- Frame sampling: Fixed interval + quality filter
- Timestamp-aligned playback (using the existing `svo_playback.py` pattern)
Output
- New JSON file with calibrated extrinsics
- Format: similar to `inside_network.json` but with an accurate `pose` field
- Reference frame: the marker is the world origin (all cameras expressed relative to the ArUco box)
Workflow
- CLI with preview: command-line driven, but shows a visualization of detected markers
- Example: `uv run calibrate_extrinsics.py --svos *.svo2 --interval 30 --output calibrated.json`
Technical Decisions
Intrinsics Source
- Use ZED SDK's pre-calibrated intrinsics from `cam.get_camera_information().camera_configuration.calibration_parameters.left_cam`
- Properties: `fx`, `fy`, `cx`, `cy`, `disto`
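The intrinsics above map directly into the 3x3 pinhole matrix that `solvePnP` expects; a minimal sketch (since the rectified LEFT view is used, the distortion terms are zeros and `disto` can be ignored):

```python
import numpy as np

def intrinsics_to_K(fx: float, fy: float, cx: float, cy: float) -> np.ndarray:
    """Build the 3x3 pinhole camera matrix from the ZED-provided intrinsics."""
    return np.array([
        [fx, 0.0, cx],
        [0.0, fy, cy],
        [0.0, 0.0, 1.0],
    ])
```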
Pose Estimation
- Use `cv2.solvePnP` with the `SOLVEPNP_SQPNP` flag (from existing code)
- Consider `solvePnPRansac` for per-frame robustness
Outlier Handling (Two-stage)
- Per-frame rejection: Reject frames with high reprojection error (threshold ~2-5 pixels)
- RANSAC on pose set: After collecting all valid poses, use RANSAC-style consensus
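A sketch of the stage-1 filter, assuming undistorted (rectified) image points and the pose given as a 3x3 rotation matrix plus translation; the 3 px default is an assumed middle value from the ~2-5 px range above:

```python
import numpy as np

def reprojection_error(obj_pts, img_pts, R, t, K):
    """Mean reprojection error in pixels under a zero-distortion pinhole
    model (the rectified LEFT view needs no distortion terms)."""
    cam = obj_pts @ R.T + t          # (N, 3) points in the camera frame
    uv = cam @ K.T                   # apply intrinsics
    uv = uv[:, :2] / uv[:, 2:3]      # perspective divide
    return float(np.linalg.norm(uv - img_pts, axis=1).mean())

def keep_frame(err_px: float, threshold: float = 3.0) -> bool:
    """Stage 1: reject frames whose reprojection error exceeds the
    threshold (assumed default of 3 px, within the plan's 2-5 px range)."""
    return err_px < threshold
```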
Pose Averaging
- Rotation: use `scipy.spatial.transform.Rotation.mean()` for the geodesic mean
- Translation: use the median or a weighted mean with MAD-based outlier rejection
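The averaging scheme above can be sketched with SciPy; the `mad_k = 3.0` rejection band is an assumed default, not specified in the plan:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def average_poses(rotations, translations, mad_k=3.0):
    """Fuse per-frame poses: Rotation.mean() for the rotations, and a
    MAD-filtered mean for the translations (assumed mad_k band of 3)."""
    R_mean = Rotation.from_matrix(np.asarray(rotations)).mean()
    t = np.asarray(translations)
    med = np.median(t, axis=0)
    mad = np.median(np.abs(t - med), axis=0) + 1e-12  # avoid zero band
    inliers = np.all(np.abs(t - med) <= mad_k * mad, axis=1)
    t_mean = t[inliers].mean(axis=0)
    return R_mean.as_matrix(), t_mean
```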
Math: Camera-to-World Transform
Each camera sees the marker → `T_cam_marker` (camera-to-marker)
World origin = marker, so the camera pose in world is `T_world_cam = inv(T_cam_marker)`
For camera i: `T_world_cam_i = inv(T_cam_i_marker)`
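Since `T_cam_marker` is rigid, the inverse has a closed form that avoids a general 4x4 matrix inverse; a small sketch:

```python
import numpy as np

def invert_rigid(T: np.ndarray) -> np.ndarray:
    """Invert a 4x4 rigid transform in closed form:
    inv([R t; 0 1]) = [R.T  -R.T @ t; 0 1]."""
    R, t = T[:3, :3], T[:3, 3]
    Ti = np.eye(4)
    Ti[:3, :3] = R.T
    Ti[:3, 3] = -R.T @ t
    return Ti
```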
Research Findings
From Librarian (Multi-camera calibration)
- Relative transform: `T_BA = T_BM @ inv(T_AM)`
- Board-based detection improves robustness to occlusion
- Use `refineDetectedMarkers` for corner accuracy
- Handle missing views by only computing poses when enough markers are visible
From Librarian (Robust averaging)
- Use `scipy.spatial.transform.Rotation.mean(weights=...)` for rotation averaging
- Median/MAD on translation for outlier rejection
- RANSAC over the pose set with rotation-angle and translation-distance thresholds
- Practical thresholds: rotation > 2-5°; the translation threshold depends on scene scale
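A minimal sketch of the pose-set consensus: each collected pose serves as a hypothesis, and the largest set of poses agreeing with one hypothesis is kept. The default thresholds (3°, 2 cm) are assumed values within the ranges suggested above:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def pose_consensus(rotations, translations, rot_deg=3.0, trans_m=0.02):
    """RANSAC-style consensus over a collected pose set. Returns a boolean
    inlier mask; thresholds are assumed defaults (plan: rotation 2-5 deg,
    translation scale-dependent)."""
    Rs = Rotation.from_matrix(np.asarray(rotations))
    ts = np.asarray(translations)
    best = np.zeros(len(ts), dtype=bool)
    for i in range(len(ts)):
        # geodesic angle between every rotation and hypothesis i
        ang = (Rs * Rs[i].inv()).magnitude()            # radians, shape (N,)
        dist = np.linalg.norm(ts - ts[i], axis=1)
        inliers = (np.degrees(ang) <= rot_deg) & (dist <= trans_m)
        if inliers.sum() > best.sum():
            best = inliers
    return best
```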
Existing Codebase Patterns
- `find_extrinsic_object.py`: ArUco detection + solvePnP pattern
- `svo_playback.py`: multi-SVO sync via timestamp alignment
- `aruco_box.py`: diamond board geometry generation
Open Questions
- None remaining
Metis Gap Analysis (Addressed)
Critical Gaps Resolved:
- World frame: as defined in `standard_box_markers.parquet` (origin at the box coordinate system)
- Image stream: use the rectified LEFT view (no distortion coefficients needed)
- Transform convention: match the `inside_network.json` format, which appears to be T_world_from_cam (the camera pose in world)
- Format: space-separated 4x4 matrix, row-major
- Sync tolerance: Moderate (<33ms, 1 frame at 30fps)
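Serializing the space-separated, row-major 4x4 format might look like the sketch below; the top-level JSON layout and the `pose` key name are assumptions, since `inside_network.json`'s exact schema isn't reproduced here:

```python
import json
import numpy as np

def pose_to_string(T: np.ndarray) -> str:
    """Flatten a 4x4 pose, row-major, into a space-separated string."""
    return " ".join(f"{v:.9g}" for v in np.asarray(T).reshape(-1))

def write_extrinsics(path, poses):
    """Write {camera_id: 4x4 pose} to JSON. The per-camera 'pose' key
    mirrors inside_network.json by assumption."""
    data = {cam_id: {"pose": pose_to_string(T)} for cam_id, T in poses.items()}
    with open(path, "w") as f:
        json.dump(data, f, indent=2)
```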
Guardrails Added:
- Validate parquet schema early (require marker_id, corners with X,Y,Z in meters)
- Use reprojection error as primary quality metric
- Require ≥4 markers with sufficient 3D spread (not just coplanar)
- Whitelist only expected marker IDs (from parquet)
- Add self-check mode with quantitative quality report
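The "sufficient 3D spread" guardrail can be checked from the singular values of the centered object points: the smallest singular value measures extent along the thinnest axis, so near-zero means near-coplanar. The 1 cm threshold is an assumed value:

```python
import numpy as np

def has_3d_spread(obj_pts, min_thickness_m: float = 0.01) -> bool:
    """True if the detected markers' 3D points are not (near-)coplanar.
    The smallest singular value of the centered cloud, normalized to an
    RMS extent, is compared to an assumed 1 cm thickness threshold."""
    pts = np.asarray(obj_pts, dtype=float)
    if len(pts) < 4:
        return False  # plan requires >= 4 markers per frame
    centered = pts - pts.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return bool(s[-1] / np.sqrt(len(pts)) > min_thickness_m)
```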
Scope Boundaries
INCLUDE
- SVO file loading with timestamp sync
- ArUco detection on left camera image
- Pose estimation using solvePnP
- Per-frame quality filtering (reprojection error)
- Multi-frame pose averaging with outlier rejection
- JSON output with 4x4 pose matrices
- Preview visualization showing detected markers and axes
- CLI interface with click
EXCLUDE
- Right camera processing (use left only for simplicity)
- Intrinsic calibration (use pre-calibrated from ZED SDK)
- Modifying `inside_network.json` in place
- GUI-based frame selection
- Bundle adjustment refinement
- Depth-based verification