# Depth Alignment
Exported ZED MCAP files can carry RGB video and depth at different raster sizes.
For the current kindergarten `zed4` exports, the common pair is:
- video: `1920x1200`
- depth: `960x512`
RGB and depth therefore do not share an aspect ratio (1920/1200 = 1.6 versus 960/512 = 1.875). The files stay alignable because the exporter writes two separate calibration topics:
- `/{label}/calibration` for video
- `/{label}/depth_calibration` for depth
See [mcap_layout.md](./mcap_layout.md) for the topic contract.
## What The Mapping Means
The correct way to align depth onto RGB is to use the two calibration matrices, not to assume matching pixel grids.
For the same camera, with zero distortion and identity rectification, the mapping reduces to a 2D affine transform:
```text
u_rgb = (fx_rgb / fx_depth) * u_depth + (cx_rgb - (fx_rgb / fx_depth) * cx_depth)
v_rgb = (fy_rgb / fy_depth) * v_depth + (cy_rgb - (fy_rgb / fy_depth) * cy_depth)
```
and the inverse:
```text
u_depth = (fx_depth / fx_rgb) * u_rgb + (cx_depth - (fx_depth / fx_rgb) * cx_rgb)
v_depth = (fy_depth / fy_rgb) * v_rgb + (cy_depth - (fy_depth / fy_rgb) * cy_rgb)
```
For the sampled kindergarten `zed4` files, those offsets are effectively zero, so the mapping becomes an anisotropic resize:
```text
u_rgb ~= 2.0 * u_depth
v_rgb ~= 2.34375 * v_depth
```
This is why the practical overlay behavior is a stretch, not a crop.
It is still better to derive the mapping from the two calibration topics than to hardcode `2.0` and `2.34375`, because the exact calibration can vary by camera and export settings.
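For this particular export, the calibration-derived scales happen to equal the raw resolution ratios, which makes a cheap sanity check against the calibration-derived values:

```python
# Raster sizes from the sampled kindergarten zed4 exports.
video_w, video_h = 1920, 1200
depth_w, depth_h = 960, 512

# With centered principal points and zero offsets, the affine scales
# reduce to plain resolution ratios.
print(video_w / depth_w)  # 2.0
print(video_h / depth_h)  # 2.34375
```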
## Helper Script
Use the alignment helper to inspect the calibration pair and optionally export an example overlay:
```bash
uv run --extra viewer python scripts/mcap_depth_alignment.py \
  /workspaces/data/kindergarten/bar/2026-03-18T11-59-41/2026-03-18T11-59-41_zed4.mcap \
  --camera-label zed4
```
To export example images:
```bash
uv run --extra viewer python scripts/mcap_depth_alignment.py \
  /workspaces/data/kindergarten/bar/2026-03-18T11-59-41/2026-03-18T11-59-41_zed4.mcap \
  --camera-label zed4 \
  --frame-index 400 \
  --output-dir /tmp/zed4_alignment_demo
```
That command writes:
- `rgb_frame.png`
- `depth_native_colorized.png`
- `depth_aligned_to_rgb_colorized.png`
- `depth_overlay_on_rgb.png`
- `rgb_aligned_to_depth.png`
## What The Helper Actually Does
The script:
1. reads `/{label}/calibration` and `/{label}/depth_calibration`
2. computes the affine mapping implied by the two intrinsic matrices
3. decodes one RGB frame and one depth frame from the MCAP
4. warps depth into RGB space with `cv2.warpAffine`
5. optionally warps RGB into depth space with the inverse mapping
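Step 4 can be sketched without OpenCV. The numpy-only warp below mirrors in spirit what `cv2.warpAffine` does with the 2x3 matrix `[[sx, 0, tx], [0, sy, ty]]`; it uses nearest-neighbor sampling (via floor, for a clean integer demo) because interpolating depth values across object edges would blend foreground and background distances:

```python
import numpy as np

def warp_depth_to_rgb(depth, sx, sy, tx, ty, out_w, out_h, fill=0.0):
    """Nearest-neighbor warp of a depth raster into the RGB pixel grid."""
    # For each output (RGB-space) pixel, locate the source depth pixel
    # via the inverse affine mapping.
    u_rgb, v_rgb = np.meshgrid(np.arange(out_w), np.arange(out_h))
    u_depth = np.floor((u_rgb - tx) / sx).astype(int)
    v_depth = np.floor((v_rgb - ty) / sy).astype(int)
    valid = (
        (u_depth >= 0) & (u_depth < depth.shape[1])
        & (v_depth >= 0) & (v_depth < depth.shape[0])
    )
    out = np.full((out_h, out_w), fill, dtype=depth.dtype)
    out[valid] = depth[v_depth[valid], u_depth[valid]]
    return out

# Tiny demo: a 2x2 depth raster stretched onto a 4x4 grid with the
# scales rounded to (2.0, 2.0) so the result is easy to check by eye.
demo = warp_depth_to_rgb(np.array([[1.0, 2.0], [3.0, 4.0]]),
                         sx=2.0, sy=2.0, tx=0.0, ty=0.0,
                         out_w=4, out_h=4)
print(demo)
# [[1. 1. 2. 2.]
#  [1. 1. 2. 2.]
#  [3. 3. 4. 4.]
#  [3. 3. 4. 4.]]
```

Note the overlay is produced by stretching, not cropping, exactly as described above; out-of-bounds output pixels simply keep the `fill` value.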
For the current exported ZED MCAP contract, this affine path is both correct and the simplest option.
If a future export starts carrying non-zero distortion or non-identity rectification, consumers should switch from this affine shortcut to a full camera-model reprojection path.