feat: add batch MCAP export tooling for ZED segments

Add a Python batch wrapper around zed_svo_to_mcap for multi-camera
segment exports. The new script supports dataset discovery, repeated
segment-dir inputs, CSV-driven ordering, skip/probe/report flows, dry-run,
and CUDA environment passthrough so kindergarten-style datasets can be
converted into one bundled MCAP per segment.

Extend zed_svo_to_mcap so bundled multi-camera mode accepts --end-frame
with synced-group semantics. In this mode the value is interpreted as the
last emitted synced frame-group index from the common synced start, while
--start-frame remains unsupported.

Vendor a minimal pose-config TOML and a sample segments CSV into this repo
so the MCAP workflow is self-contained. Update the README to document the
batch MCAP flow, use portable placeholders instead of machine-specific
absolute paths, and describe the expected dataset layout explicitly.
2026-03-20 09:19:56 +00:00
parent 8d9bd1b815
commit 1691274e85
5 changed files with 920 additions and 13 deletions
+101 -11
@@ -65,7 +65,7 @@ The repo also includes an offline conversion tool for the left ZED color stream:
```bash
CUDA_VISIBLE_DEVICES=GPU-9cc7b26e-90d4-0c49-4d4c-060e528ffba6 \
./build/bin/zed_svo_to_mp4 \
--input /workspaces/data/kindergarten/bar/2026-03-18T11-59-41/2026-03-18T11-59-41_zed1.svo2 \
--input <SVO_INPUT> \
--encoder-device auto \
--preset balanced \
--quality 20 \
@@ -83,11 +83,40 @@ Python dependencies for the batch wrapper are managed with `uv`:
uv sync
```
Expected multi-camera dataset layout:
```text
<DATASET_ROOT>/
├── svo2_segments_sorted.csv
├── bar/
│   └── 2026-03-18T11-59-41/
│       ├── 2026-03-18T11-59-41_zed1.svo2
│       ├── 2026-03-18T11-59-41_zed2.svo2
│       ├── 2026-03-18T11-59-41_zed3.svo2
│       └── 2026-03-18T11-59-41_zed4.svo2
└── jump/
    └── experiment/
        └── 1/
            └── 2026-03-18T11-26-23/
                ├── 2026-03-18T11-26-23_zed1.svo2
                ├── 2026-03-18T11-26-23_zed2.svo2
                ├── 2026-03-18T11-26-23_zed3.svo2
                └── 2026-03-18T11-26-23_zed4.svo2
```
Placeholders used below:
- `<DATASET_ROOT>`: dataset root containing multi-camera segment directories
- `<SEGMENT_DIR>`: one multi-camera segment directory containing `*_zedN.svo` or `*_zedN.svo2`
- `<SEGMENT_DIR_A>`, `<SEGMENT_DIR_B>`: explicit segment directories
- `<SEGMENTS_CSV>`: CSV file with a `segment_dir` column, for example `config/svo2_segments_sorted.sample.csv`
- `<SVO_INPUT>`: one single-camera `.svo` or `.svo2` file
- `<POSE_CONFIG>`: TOML file such as `config/zed_pose_config.toml`
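As an illustration of the `*_zedN.svo` / `*_zedN.svo2` naming convention, the sketch below collects the camera recordings in one segment directory. It is not taken from the batch scripts: `CAMERA_FILE` and `find_camera_files` are hypothetical names, and letting `.svo2` win over `.svo` for the same camera is an assumption.

```python
import re
from pathlib import Path

# Hypothetical pattern for "<prefix>_zed<N>.svo" or "<prefix>_zed<N>.svo2".
CAMERA_FILE = re.compile(r"^.+_zed(\d+)\.svo2?$")


def find_camera_files(segment_dir: Path) -> dict[int, Path]:
    """Map camera number N to its recording inside one segment directory."""
    cameras: dict[int, Path] = {}
    # Sorted order places "x_zed1.svo" before "x_zed1.svo2", so when both
    # exist the .svo2 file overwrites the .svo entry (an assumed preference).
    for path in sorted(segment_dir.iterdir()):
        match = CAMERA_FILE.match(path.name)
        if match and path.is_file():
            cameras[int(match.group(1))] = path
    return cameras
```

A directory would then count as a multi-camera segment when `find_camera_files` returns more than one entry.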
Use the wrapper to recurse through a folder, run `zed_svo_to_mp4` on every matched `.svo2`, and show one aggregate tqdm progress bar:
```bash
uv run python scripts/zed_batch_svo_to_mp4.py \
/workspaces/data/kindergarten/bar \
<DATASET_ROOT>/bar \
--pattern '*.svo2' \
--recursive \
--jobs 2 \
@@ -105,7 +134,7 @@ Use the grid converter to merge four synced ZED recordings into a 2x2 CCTV-style
```bash
./build/bin/zed_svo_grid_to_mp4 \
--segment-dir /workspaces/data/kindergarten/bar/2026-03-18T11-59-41 \
--segment-dir <SEGMENT_DIR> \
--encoder-device auto \
--codec h265 \
--duration-seconds 2
@@ -117,7 +146,7 @@ Use the batch wrapper to run `zed_svo_grid_to_mp4` over many segment directories
```bash
uv run python scripts/zed_batch_svo_grid_to_mp4.py \
/workspaces/data/kindergarten \
<DATASET_ROOT> \
--recursive \
--jobs 2 \
--encoder-device auto \
@@ -128,8 +157,8 @@ You can also provide the exact segments to convert:
```bash
uv run python scripts/zed_batch_svo_grid_to_mp4.py \
--segment-dir /workspaces/data/kindergarten/jump/external/recording/2026-03-18T11-23-22 \
--segment-dir /workspaces/data/kindergarten/jump/experiment/1/2026-03-18T11-26-23 \
--segment-dir <SEGMENT_DIR_A> \
--segment-dir <SEGMENT_DIR_B> \
--jobs 2
```
@@ -137,7 +166,7 @@ Or preserve a precomputed CSV ordering:
```bash
uv run python scripts/zed_batch_svo_grid_to_mp4.py \
--segments-csv /workspaces/data/kindergarten/svo2_segments_sorted.csv \
--segments-csv <SEGMENTS_CSV> \
--jobs 2 \
--duration-seconds 2
```
@@ -148,7 +177,7 @@ When you suspect a previous run left behind partial MP4 files, opt into `ffprobe
```bash
uv run python scripts/zed_batch_svo_grid_to_mp4.py \
/workspaces/data/kindergarten \
<DATASET_ROOT> \
--probe-existing \
--jobs 2
```
@@ -157,7 +186,7 @@ Use `--report-existing` to audit existing outputs without launching conversions.
```bash
uv run python scripts/zed_batch_svo_grid_to_mp4.py \
/workspaces/data/kindergarten \
<DATASET_ROOT> \
--report-existing
```
@@ -165,7 +194,7 @@ Use `--dry-run` to preview what the batch wrapper would convert after applying s
```bash
uv run python scripts/zed_batch_svo_grid_to_mp4.py \
/workspaces/data/kindergarten \
<DATASET_ROOT> \
--probe-existing \
--dry-run
```
@@ -174,7 +203,7 @@ uv run python scripts/zed_batch_svo_grid_to_mp4.py \
The `--segments-csv` input expects a header row with at least a `segment_dir` column. Extra columns are allowed and ignored by the batch wrapper. `segment_dir` values may be absolute paths or paths relative to the CSV file's parent directory. Use `--csv-root` to override that base directory.
Repeated rows for the same `segment_dir` are allowed; the wrapper converts each unique segment once, preserving the first-seen CSV order. This makes `/workspaces/data/kindergarten/svo2_segments_sorted.csv` a valid input even though it stores one row per camera file:
Repeated rows for the same `segment_dir` are allowed; the wrapper converts each unique segment once, preserving the first-seen CSV order. The repo includes a small example at `config/svo2_segments_sorted.sample.csv`:
```csv
timestamp,activity,group_path,segment_dir,camera,relative_path
@@ -182,6 +211,67 @@ timestamp,activity,group_path,segment_dir,camera,relative_path
2026-03-18T11-23-22,jump,jump/external/recording,jump/external/recording/2026-03-18T11-23-22,zed2,jump/external/recording/2026-03-18T11-23-22/2026-03-18T11-23-22_zed2.svo2
```
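The CSV rules above (required `segment_dir` column, paths relative to the CSV's parent directory, de-duplication in first-seen order) could be implemented roughly as follows. This is a minimal sketch, not the wrapper's actual code: `read_segment_dirs` is a hypothetical helper, and its `csv_root` parameter only mirrors what `--csv-root` is described as doing.

```python
import csv
from pathlib import Path
from typing import Optional


def read_segment_dirs(csv_path: Path, csv_root: Optional[Path] = None) -> list[Path]:
    """Return unique segment directories in first-seen CSV order."""
    # Relative segment_dir values resolve against the CSV's parent directory
    # unless an explicit root is given.
    base = csv_root if csv_root is not None else csv_path.parent
    seen: set[Path] = set()
    ordered: list[Path] = []
    with csv_path.open(newline="") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or "segment_dir" not in reader.fieldnames:
            raise ValueError(f"{csv_path} has no 'segment_dir' column")
        for row in reader:
            raw = (row["segment_dir"] or "").strip()
            if not raw:
                continue  # ignore rows without a segment directory
            path = Path(raw)
            if not path.is_absolute():
                path = base / path
            if path not in seen:  # one row per camera collapses to one segment
                seen.add(path)
                ordered.append(path)
    return ordered
```

Extra columns such as `camera` or `relative_path` are simply never read, which matches the "allowed and ignored" behavior described above.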
### Batch ZED Segments To MCAP
Use the wrapper to recurse through a dataset root, run `zed_svo_to_mcap --segment-dir` on every matched multi-camera segment, and show one aggregate tqdm progress bar:
```bash
uv run python scripts/zed_batch_svo_to_mcap.py \
<DATASET_ROOT> \
--recursive \
--jobs 2 \
--cuda-visible-devices GPU-9cc7b26e-90d4-0c49-4d4c-060e528ffba6 \
--end-frame 29
```
You can also preserve the precomputed kindergarten CSV ordering:
```bash
uv run python scripts/zed_batch_svo_to_mcap.py \
--segments-csv <SEGMENTS_CSV> \
--jobs 2 \
--end-frame 29
```
Enable per-camera pose export when the segment has valid tracking:
```bash
uv run python scripts/zed_batch_svo_to_mcap.py \
--segment-dir <SEGMENT_DIR> \
--with-pose \
--pose-config <POSE_CONFIG>
```
The batch MCAP wrapper writes `<segment>/<segment>.mcap` by default, skips existing outputs unless told otherwise, and returns a nonzero exit code if any segment fails.
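To make the default output path, skip behavior, and exit code concrete, here is a small sketch under stated assumptions. None of these names come from the script: `default_mcap_path`, `segments_to_convert`, and `batch_exit_code` are hypothetical, and the `overwrite` parameter stands in for whatever option disables skipping.

```python
from pathlib import Path


def default_mcap_path(segment_dir: Path) -> Path:
    """Return <segment>/<segment>.mcap for a segment directory."""
    return segment_dir / f"{segment_dir.name}.mcap"


def segments_to_convert(segment_dirs: list[Path], overwrite: bool = False) -> list[Path]:
    """Keep segments whose default output is missing, or all when overwriting.

    `overwrite` is a hypothetical stand-in, not a documented flag.
    """
    return [
        seg for seg in segment_dirs
        if overwrite or not default_mcap_path(seg).exists()
    ]


def batch_exit_code(failed_segments: list[Path]) -> int:
    """Nonzero when at least one segment failed, zero otherwise."""
    return 1 if failed_segments else 0
```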
The repo includes a minimal pose config at `config/zed_pose_config.toml` so MCAP conversion does not depend on a separate `cv-mmap` checkout.
In bundled multi-camera mode, `--end-frame` is the index of the last synced frame group to emit, counted from the common synced start; `--start-frame` remains unsupported in this mode.
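The arithmetic behind `--end-frame` can be sketched as below. This assumes zero-based, inclusive indexing (so `--end-frame 29` would emit 30 synced groups) and assumes the value is clamped to the groups actually available. Neither detail is confirmed here, and `emitted_group_indices` is a hypothetical helper, not part of `zed_svo_to_mcap`.

```python
from typing import Optional


def emitted_group_indices(total_synced_groups: int, end_frame: Optional[int]) -> range:
    """Indices of synced frame groups emitted from the common synced start.

    Assumes zero-based indices and an inclusive end_frame; None means "all".
    """
    if end_frame is None:
        return range(total_synced_groups)
    if end_frame < 0:
        raise ValueError("end_frame must be non-negative")
    # Inclusive upper bound, clamped to the groups that actually exist.
    return range(min(total_synced_groups, end_frame + 1))
```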
Use `--probe-existing` to validate existing MCAPs before skipping them. Invalid outputs are treated as missing and requeued:
```bash
uv run python scripts/zed_batch_svo_to_mcap.py \
<DATASET_ROOT> \
--probe-existing \
--jobs 2
```
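One cheap way to spot a truncated MCAP is to check the magic bytes that the MCAP format places at both the start and the end of a file. The sketch below shows only that check. It is an assumption that a probe could work this way, not a description of what `--probe-existing` actually does, and `looks_like_complete_mcap` is a hypothetical name.

```python
import os
from pathlib import Path

# MCAP files begin and end with this 8-byte magic sequence.
MCAP_MAGIC = b"\x89MCAP0\r\n"


def looks_like_complete_mcap(path: Path) -> bool:
    """Return True when the file starts and ends with the MCAP magic bytes."""
    try:
        if path.stat().st_size < 2 * len(MCAP_MAGIC):
            return False
        with path.open("rb") as handle:
            head = handle.read(len(MCAP_MAGIC))
            handle.seek(-len(MCAP_MAGIC), os.SEEK_END)
            tail = handle.read(len(MCAP_MAGIC))
    except OSError:
        # Missing or unreadable outputs are treated the same as invalid ones.
        return False
    return head == MCAP_MAGIC and tail == MCAP_MAGIC
```

An output that fails this check would be treated as missing and queued again. A stricter probe would also parse the footer and summary records rather than trusting the magic bytes alone.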
Use `--report-existing` to audit existing MCAPs without launching conversions:
```bash
uv run python scripts/zed_batch_svo_to_mcap.py \
<DATASET_ROOT> \
--report-existing
```
Use `--dry-run` to preview what would be converted after applying skip or probe logic:
```bash
uv run python scripts/zed_batch_svo_to_mcap.py \
--segments-csv <SEGMENTS_CSV> \
--probe-existing \
--dry-run
```
### Mandatory Acceptance (Standalone)
Run the full mandatory acceptance suite. This executes the complete protocol/codec matrix without requiring external servers.