feat: add aruco-svo-calibration plan and utils scripts
- Add comprehensive work plan for ArUco-based multi-camera calibration
- Add recording_multi.py for multi-camera SVO recording
- Add streaming_receiver.py for network streaming
- Add svo_playback.py for synchronized playback
- Add zed_network_utils.py for camera configuration
- Add AGENTS.md with project context

# Draft: ArUco-Based Multi-Camera Extrinsic Calibration from SVO

## Requirements (confirmed)

### Goal

Create a CLI tool that reads synchronized SVO recordings from multiple ZED cameras, detects ArUco markers on a 3D calibration box, computes camera extrinsics relative to the marker world origin, and outputs accurate pose matrices to replace the inaccurate ones in `inside_network.json`.

### Calibration Target

- **Type**: 3D box with 6 diamond board faces
- **Object points**: Defined in `aruco/output/standard_box_markers.parquet`
- **Marker dictionary**: `DICT_4X4_50` (from existing code)
- **Minimum markers per frame**: 4+ (one diamond face's worth)

### Input

- Multiple SVO2 files (one per camera)
- Frame sampling: Fixed interval + quality filter
- Timestamp-aligned playback (using existing `svo_playback.py` pattern)
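
The timestamp alignment step can be sketched in plain Python. The function name and greedy nearest-neighbour strategy below are illustrative, not the existing `svo_playback.py` implementation:

```python
import bisect

def align_frames(timestamps_by_cam, tolerance_ns=33_000_000):
    """Match each reference-camera timestamp to the nearest timestamp
    of every other camera; drop frame sets outside the tolerance.

    timestamps_by_cam: one sorted list of nanosecond timestamps per
    camera (camera 0 is the reference).
    Returns a list of tuples, one frame index per camera.
    """
    ref = timestamps_by_cam[0]
    matched = []
    for i, t in enumerate(ref):
        indices = [i]
        ok = True
        for other in timestamps_by_cam[1:]:
            j = bisect.bisect_left(other, t)
            # nearest of the two neighbours around the insertion point
            best = min(
                (k for k in (j - 1, j) if 0 <= k < len(other)),
                key=lambda k: abs(other[k] - t),
            )
            if abs(other[best] - t) > tolerance_ns:
                ok = False
                break
            indices.append(best)
        if ok:
            matched.append(tuple(indices))
    return matched
```

The default tolerance corresponds to one frame period at 30 fps (33 ms), matching the sync tolerance chosen later in this plan.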

### Output

- **New JSON file** with calibrated extrinsics
- Format: Similar to `inside_network.json` but with accurate `pose` field
- Reference frame: **Marker is world origin** (all cameras expressed relative to ArUco box)
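
A sketch of the pose serialization, assuming the same space-separated row-major 4x4 layout as the `pose` field in `inside_network.json` (helper names are hypothetical):

```python
import numpy as np

def pose_to_string(T):
    """Flatten a 4x4 pose into a space-separated, row-major string."""
    return " ".join(f"{v:.9g}" for v in np.asarray(T).reshape(16))

def pose_from_string(s):
    """Parse the space-separated string back into a 4x4 matrix."""
    return np.array(s.split(), dtype=float).reshape(4, 4)
```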

### Workflow

- **CLI with preview**: Command-line driven but shows visualization of detected markers
- Example: `uv run calibrate_extrinsics.py --svos *.svo2 --interval 30 --output calibrated.json`
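
A possible click skeleton for this interface. Option names follow the example command; note that a click option takes one value per occurrence, so `--svos` is repeated per camera rather than globbed, and the pipeline body is stubbed:

```python
import click

@click.command()
@click.option("--svos", "svo_paths", multiple=True, required=True,
              help="SVO2 file; pass the option once per camera.")
@click.option("--interval", default=30, show_default=True,
              help="Sample every Nth frame.")
@click.option("--output", default="calibrated.json", show_default=True,
              help="Destination JSON file.")
@click.option("--preview/--no-preview", default=True,
              help="Show detected markers during processing.")
def calibrate(svo_paths, interval, output, preview):
    """Calibrate multi-camera extrinsics from synchronized SVO recordings."""
    click.echo(f"{len(svo_paths)} SVO file(s), interval={interval} -> {output}")
    # load SVOs, detect markers, run solvePnP, average poses, write JSON

if __name__ == "__main__":
    calibrate()
```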

## Technical Decisions

### Intrinsics Source

- Use ZED SDK's pre-calibrated intrinsics from `cam.get_camera_information().camera_configuration.calibration_parameters.left_cam`
- Properties: `fx, fy, cx, cy, disto`
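
Those properties map directly onto the 3x3 pinhole matrix that solvePnP consumes; since the rectified LEFT view is used, the distortion vector can be passed as zeros. A sketch (helper name hypothetical):

```python
import numpy as np

def intrinsics_to_K(fx, fy, cx, cy):
    """Assemble the 3x3 pinhole camera matrix from ZED intrinsics."""
    return np.array([[fx, 0.0, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])
```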

### Pose Estimation

- Use `cv2.solvePnP` with `SOLVEPNP_SQPNP` flag (from existing code)
- Consider `solvePnPRansac` for per-frame robustness

### Outlier Handling (Two-stage)

1. **Per-frame rejection**: Reject frames with high reprojection error (threshold ~2-5 pixels)
2. **RANSAC on pose set**: After collecting all valid poses, use RANSAC-style consensus
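
Stage 1 can be sketched without OpenCV, assuming the rectified LEFT view (zero distortion); the threshold value and function names are illustrative:

```python
import numpy as np

def reprojection_error(obj_pts, img_pts, R, t, K):
    """RMS pixel error of marker points reprojected with pose (R, t),
    assuming an undistorted (rectified) image."""
    cam = obj_pts @ R.T + t             # marker frame -> camera frame
    proj = cam @ K.T
    proj = proj[:, :2] / proj[:, 2:3]   # perspective divide
    return float(np.sqrt(np.mean(np.sum((proj - img_pts) ** 2, axis=1))))

def keep_frame(err_px, threshold_px=3.0):
    """Stage 1: drop frames whose reprojection error is too high."""
    return err_px <= threshold_px
```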

### Pose Averaging

- **Rotation**: Use `scipy.spatial.transform.Rotation.mean()` for geodesic mean
- **Translation**: Use median or weighted mean with MAD-based outlier rejection
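
The two bullets combine into one averaging step; a sketch assuming `scipy` (the function name and MAD multiplier are illustrative):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def average_poses(rotations, translations, mad_k=3.0):
    """Geodesic-mean rotation plus MAD-filtered mean translation.

    rotations: a scipy Rotation holding N samples; translations: (N, 3).
    The MAD filter is applied to translations only.
    """
    R_mean = rotations.mean()
    t = np.asarray(translations, dtype=float)
    dist = np.linalg.norm(t - np.median(t, axis=0), axis=1)
    mad = np.median(np.abs(dist - np.median(dist)))
    # robust inlier rule; max(..., eps) handles the all-identical case
    keep = dist - np.median(dist) <= mad_k * max(mad, 1e-9)
    return R_mean, t[keep].mean(axis=0)
```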

### Math: Camera-to-World Transform

Each camera sees the marker, giving `T_cam_marker`: the transform that maps marker-frame points into that camera's frame. The marker defines the world origin, so the camera pose in world coordinates is the inverse: `T_world_cam = inv(T_cam_marker)`.

For camera i: `T_world_cam_i = inv(T_cam_i_marker)`
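
A sketch of the inversion, using the closed form for rigid transforms rather than a general matrix inverse (helper names hypothetical):

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack rotation matrix R and translation t into a 4x4 transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.asarray(t).ravel()
    return T

def invert_transform(T):
    """Closed-form inverse of a rigid transform:
    inv([R t; 0 1]) = [R.T  -R.T @ t; 0 1]."""
    R, t = T[:3, :3], T[:3, 3]
    Ti = np.eye(4)
    Ti[:3, :3] = R.T
    Ti[:3, 3] = -R.T @ t
    return Ti

# camera pose in the marker-defined world:
# T_world_cam = invert_transform(T_cam_marker)
```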

## Research Findings

### From Librarian (Multi-camera calibration)

- Relative transform: `T_BA = T_BM @ inv(T_AM)`
- Board-based detection improves robustness to occlusion
- Use `refineDetectedMarkers` for corner accuracy
- Handle missing views by computing poses only when enough markers are visible
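
The first bullet can be checked numerically: with `T_AM` and `T_BM` as marker-to-camera transforms for cameras A and B, `T_BA` maps A-frame coordinates into the B frame (sketch):

```python
import numpy as np

def relative_transform(T_AM, T_BM):
    """Camera-A-to-camera-B transform from each camera's marker pose:
    T_BA = T_BM @ inv(T_AM), so x_B = T_BA @ x_A for homogeneous x."""
    return T_BM @ np.linalg.inv(T_AM)
```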

### From Librarian (Robust averaging)

- Use `scipy.spatial.transform.Rotation.mean(weights=...)` for rotation averaging
- Median/MAD on translation for outlier rejection
- RANSAC over the pose set with rotation-angle and translation-distance thresholds
- Practical thresholds: reject rotations deviating by more than ~2-5°; the translation threshold depends on scene scale
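
The last two bullets suggest a consensus pass over the collected poses. A sketch that scores every sample as the candidate, exhaustive rather than randomly sampled, which is affordable for a small pose set (names and thresholds illustrative):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def ransac_pose_consensus(rotations, translations,
                          rot_thresh_deg=5.0, trans_thresh=0.05):
    """Return the largest inlier mask over the pose set. A sample is
    an inlier of a candidate if its rotation lies within
    rot_thresh_deg (geodesic angle) and its translation within
    trans_thresh of the candidate's."""
    t = np.asarray(translations, dtype=float)
    best = None
    for i in range(len(t)):
        rel = rotations[i].inv() * rotations      # rotation residuals
        ang = np.degrees(rel.magnitude())         # geodesic angles
        dist = np.linalg.norm(t - t[i], axis=1)
        inliers = (ang <= rot_thresh_deg) & (dist <= trans_thresh)
        if best is None or inliers.sum() > best.sum():
            best = inliers
    return best
```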

### Existing Codebase Patterns

- `find_extrinsic_object.py`: ArUco detection + solvePnP pattern
- `svo_playback.py`: Multi-SVO sync via timestamp alignment
- `aruco_box.py`: Diamond board geometry generation

## Open Questions

- None remaining

## Metis Gap Analysis (Addressed)

### Critical Gaps Resolved:

1. **World frame**: As defined in `standard_box_markers.parquet` (origin at the box coordinate system)
2. **Image stream**: Use the rectified LEFT view (no distortion coefficients needed)
3. **Transform convention**: Match the `inside_network.json` format, which appears to be `T_world_from_cam` (camera pose in world)
   - Format: space-separated 4x4 matrix, row-major
4. **Sync tolerance**: Moderate (<33 ms, i.e. one frame at 30 fps)

### Guardrails Added:

- Validate parquet schema early (require `marker_id`, corners with X,Y,Z in meters)
- Use reprojection error as primary quality metric
- Require ≥4 markers with sufficient 3D spread (not just coplanar)
- Whitelist only expected marker IDs (from parquet)
- Add self-check mode with quantitative quality report
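
The first, third, and fourth guardrails can be sketched together. The column names (`marker_id`, `x`, `y`, `z`) are an assumed flat schema, not the actual parquet layout:

```python
import numpy as np
import pandas as pd

REQUIRED_COLS = {"marker_id", "x", "y", "z"}  # assumed schema

def validate_object_points(df, expected_ids):
    """Early guardrails on the object-point table: schema check,
    marker-ID whitelist, and enough non-coplanar 3D spread for a
    stable PnP. Returns the smallest singular value of the centered
    point cloud (~ its thickness; near zero means coplanar)."""
    missing = REQUIRED_COLS - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    bad = set(df["marker_id"]) - set(expected_ids)
    if bad:
        raise ValueError(f"unexpected marker ids: {sorted(bad)}")
    pts = df[["x", "y", "z"]].to_numpy(dtype=float)
    sv = np.linalg.svd(pts - pts.mean(axis=0), compute_uv=False)
    return float(sv[-1])
```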

## Scope Boundaries

### INCLUDE

- SVO file loading with timestamp sync
- ArUco detection on left camera image
- Pose estimation using solvePnP
- Per-frame quality filtering (reprojection error)
- Multi-frame pose averaging with outlier rejection
- JSON output with 4x4 pose matrices
- Preview visualization showing detected markers and axes
- CLI interface with click

### EXCLUDE

- Right camera processing (use left only for simplicity)
- Intrinsic calibration (use pre-calibrated from ZED SDK)
- Modifying `inside_network.json` in-place
- GUI-based frame selection
- Bundle adjustment refinement
- Depth-based verification