feat(zed): add DuckDB segment timestamp indexer

Add a new mcap_video_bounds helper binary plus a zed_segment_time_index.py CLI that builds and queries an embedded DuckDB index for bundled ZED segment recordings.

The index stores segment folders, MCAP paths, video time bounds, durations, camera labels, and dataset metadata, and reuses the existing recursive multi-camera segment discovery logic so nested kindergarten layouts are indexed correctly.

Infer a dataset default timezone by comparing folder names against MCAP timestamps, and make point queries precision-aware so second-level folder timestamps like 2026-03-18T12-00-23 resolve to the matching segment instead of missing it because of subsecond start offsets.

Verification:
- uv add 'duckdb>=1.0'
- cmake --build build --target mcap_video_bounds
- uv run python -m unittest tests.test_zed_segment_time_index
- uv run python scripts/zed_segment_time_index.py build /workspaces/data/kindergarten --jobs 8
- uv run python scripts/zed_segment_time_index.py query /workspaces/data/kindergarten --at 2026-03-18T12-00-23
2026-03-23 09:35:54 +00:00
parent a0b9c95d5b
commit e3a423433e
7 changed files with 1185 additions and 0 deletions
+27
@@ -365,7 +365,34 @@ set_target_properties(mcap_replay_tester PROPERTIES
OUTPUT_NAME "mcap_replay_tester"
RUNTIME_OUTPUT_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/bin")
add_executable(mcap_video_bounds src/tools/mcap_video_bounds.cpp)
target_include_directories(mcap_video_bounds
PRIVATE
"${CMAKE_CURRENT_LIST_DIR}/include"
"${CMAKE_CURRENT_BINARY_DIR}")
target_link_libraries(mcap_video_bounds
PRIVATE
CLI11::CLI11
cvmmap_streamer_foxglove_proto
cvmmap_streamer_mcap_runtime
mcap::mcap
PkgConfig::ZSTD
PkgConfig::LZ4)
if (TARGET spdlog::spdlog)
target_link_libraries(mcap_video_bounds PRIVATE spdlog::spdlog)
elseif (TARGET spdlog)
target_link_libraries(mcap_video_bounds PRIVATE spdlog)
endif()
target_link_libraries(mcap_video_bounds PRIVATE cvmmap_streamer_protobuf)
if (TARGET PkgConfig::PROTOBUF_PKG)
target_link_libraries(mcap_video_bounds PRIVATE PkgConfig::PROTOBUF_PKG)
endif()
set_target_properties(mcap_video_bounds PROPERTIES
OUTPUT_NAME "mcap_video_bounds"
RUNTIME_OUTPUT_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/bin")
set(CVMMAP_STREAMER_INSTALL_TARGETS cvmmap_streamer)
list(APPEND CVMMAP_STREAMER_INSTALL_TARGETS mcap_video_bounds)
if (CVMMAP_HAS_ZED_SDK)
add_library(
+97
@@ -0,0 +1,97 @@
# ZED Segment Time Index
`scripts/zed_segment_time_index.py` builds and queries an embedded DuckDB index for bundled ZED segment folders.
Default artifact name:
```text
<DATASET_ROOT>/segment_time_index.duckdb
```
Primary commands:
```bash
uv run python scripts/zed_segment_time_index.py build <DATASET_ROOT>
uv run python scripts/zed_segment_time_index.py query <DATASET_ROOT> --at 2026-03-18T12-00-23
uv run python scripts/zed_segment_time_index.py query <DATASET_ROOT> --start 2026-03-18T12-00-23 --end 2026-03-18T12-00-30
```
## Data Source Rules
- Segment discovery is recursive and follows the same multi-camera layout assumptions as the batch ZED tooling.
- A directory is considered a valid segment when it contains at least two unique `*_zedN.svo` or `*_zedN.svo2` files and no duplicate camera labels.
- Timing is sourced from the segment MCAP, not from the SVO/SVO2 files.
- An otherwise valid segment is skipped when its directory contains no `.mcap` file, or more than one.
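Under these rules, a hypothetical nested layout (directory and file names here are illustrative, not taken from a real dataset) would be discovered like this:

```text
<DATASET_ROOT>/
  outdoor_play/                        # activity (first path component)
    2026-03-18T12-00-23/               # valid segment: 4 cameras + 1 MCAP
      2026-03-18T12-00-23_zed1.svo2
      2026-03-18T12-00-23_zed2.svo2
      2026-03-18T12-00-23_zed3.svo2
      2026-03-18T12-00-23_zed4.svo2
      2026-03-18T12-00-23.mcap
    2026-03-18T12-05-00/               # skipped: only one camera file
      2026-03-18T12-05-00_zed1.svo2
```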
## MCAP Bounds Extraction
`build/bin/mcap_video_bounds` scans `foxglove.CompressedVideo` messages in one MCAP and emits:
- `start_ns`
- `end_ns`
- `duration_ns`
- `video_message_count`
- `start_iso_utc`
- `end_iso_utc`
The helper prefers the protobuf `CompressedVideo.timestamp` field and falls back to MCAP `logTime` when that field is zero.
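The preference rule is simple enough to sketch in Python; this is an illustrative equivalent of the per-message choice, with the MCAP reading itself elided:

```python
def effective_timestamp_ns(proto_timestamp_ns: int, log_time_ns: int) -> int:
    """Prefer the protobuf CompressedVideo.timestamp; fall back to the
    MCAP record's logTime when the field is unset (zero)."""
    return proto_timestamp_ns if proto_timestamp_ns != 0 else log_time_ns


def fold_bounds(timestamps_ns: list[int]) -> tuple[int, int, int]:
    """Reduce per-message timestamps to (start_ns, end_ns, duration_ns)."""
    start_ns = min(timestamps_ns)
    end_ns = max(timestamps_ns)
    return start_ns, end_ns, end_ns - start_ns
```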
## DuckDB Layout
The database contains two tables: `meta` and `segments`.
### `meta`
Key-value metadata for the index:
- `schema_version`: current schema version, currently `1`
- `dataset_root`: absolute dataset root used when the index was built
- `built_at_utc`: build timestamp in UTC
- `default_timezone`: inferred dataset wall-clock timezone used when querying with `--timezone dataset`
### `segments`
One row per indexed segment.
| Column | Type | Meaning |
|---|---|---|
| `segment_dir` | `VARCHAR` | Absolute path to the segment directory |
| `relative_segment_dir` | `VARCHAR` | Path relative to the dataset root |
| `group_path` | `VARCHAR` | Parent path of the segment within the dataset |
| `activity` | `VARCHAR` | First path component under the dataset root |
| `segment_name` | `VARCHAR` | Segment directory basename |
| `mcap_path` | `VARCHAR` | Absolute MCAP path used for timing |
| `start_ns` | `BIGINT` | Earliest video timestamp in nanoseconds since Unix epoch |
| `end_ns` | `BIGINT` | Latest video timestamp in nanoseconds since Unix epoch |
| `duration_ns` | `BIGINT` | `end_ns - start_ns` |
| `start_iso_utc` | `VARCHAR` | UTC rendering of `start_ns` |
| `end_iso_utc` | `VARCHAR` | UTC rendering of `end_ns` |
| `camera_count` | `INTEGER` | Number of discovered camera inputs in the segment directory |
| `camera_labels` | `VARCHAR` | Comma-separated camera labels, for example `zed1,zed2,zed3,zed4` |
| `video_message_count` | `BIGINT` | Number of `foxglove.CompressedVideo` messages observed in the MCAP |
| `index_source` | `VARCHAR` | Current extractor label, currently `mcap_video_bounds` |
Indexes are created on `start_ns` and `end_ns`.
## Query Semantics
- `--at` performs an overlap lookup, not just an exact nanosecond equality check.
- Query precision follows the precision supplied by the user.
- A second-precision value like `2026-03-18T12-00-23` is treated as the whole second `[12:00:23.000, 12:00:23.999999999]`.
- Integer epochs are widened similarly by their apparent unit:
- 10 digits or fewer: seconds
- 11-13 digits: milliseconds
- 14-16 digits: microseconds
- 17+ digits: nanoseconds
- `--start/--end` returns every segment whose `[start_ns, end_ns]` overlaps the requested interval.
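The widening and overlap rules above can be sketched together; this is an illustrative re-implementation of the semantics, not the script's exact code:

```python
def widen_integer_epoch(raw: int, digit_count: int) -> tuple[int, int]:
    """Widen a bare integer epoch to a [start_ns, end_ns] window by its
    apparent unit (seconds / milliseconds / microseconds / nanoseconds)."""
    if digit_count <= 10:
        scale = 1_000_000_000  # seconds
    elif digit_count <= 13:
        scale = 1_000_000      # milliseconds
    elif digit_count <= 16:
        scale = 1_000          # microseconds
    else:
        scale = 1              # nanoseconds
    base_ns = raw * scale
    return base_ns, base_ns + scale - 1


def overlaps(seg_start_ns: int, seg_end_ns: int,
             q_start_ns: int, q_end_ns: int) -> bool:
    """A segment matches when its [start, end] intersects the query window."""
    return seg_start_ns <= q_end_ns and seg_end_ns >= q_start_ns
```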
## Timezone Behavior
- Query default is `--timezone dataset`.
- `dataset` resolves to the `default_timezone` stored in `meta`.
- If inference is unavailable, the script falls back to `local`.
- Explicit values are also accepted:
- `local`
- `UTC`
- fixed offsets such as `UTC+08:00`
- IANA zone names such as `Asia/Shanghai`
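Fixed offsets such as `UTC+08:00` map directly onto `datetime.timezone` objects; a minimal sketch of that resolution (IANA names would go through `zoneinfo.ZoneInfo` instead, and error handling is omitted):

```python
import datetime as dt


def fixed_offset(name: str) -> dt.timezone:
    """Parse a 'UTC+HH:MM' or 'UTC-HH:MM' string into a fixed-offset tzinfo."""
    sign = 1 if name[3] == "+" else -1
    hours = int(name[4:6])
    minutes = int(name[7:9])
    return dt.timezone(sign * dt.timedelta(hours=hours, minutes=minutes))
```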
+1
@@ -4,6 +4,7 @@ version = "0.0.0"
requires-python = ">=3.10"
dependencies = [
"click>=8.1",
"duckdb>=1.0",
"numpy>=2.2", "numpy>=2.2",
"opencv-python-headless>=4.11", "opencv-python-headless>=4.11",
"progress-table>=3.2", "progress-table>=3.2",
+658
@@ -0,0 +1,658 @@
#!/usr/bin/env python3
from __future__ import annotations
import concurrent.futures
import datetime as dt
import json
import os
import re
import subprocess
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Any
from zoneinfo import ZoneInfo
import click
import duckdb
SCRIPT_PATH = Path(__file__).resolve()
REPO_ROOT = SCRIPT_PATH.parents[1]
DEFAULT_INDEX_NAME = "segment_time_index.duckdb"
INDEX_SCHEMA_VERSION = "1"
SEGMENT_FILE_PATTERN = re.compile(r".*_zed([0-9]+)\.svo2?$", re.IGNORECASE)
FOLDER_TIMESTAMP_PATTERN = re.compile(
r"^(?P<date>\d{4}-\d{2}-\d{2})[T ](?P<hour>\d{2})-(?P<minute>\d{2})-(?P<second>\d{2})(?P<fraction>\.\d+)?(?P<timezone>Z|[+-]\d{2}:\d{2})?$"
)
@dataclass(slots=True, frozen=True)
class SegmentScan:
segment_dir: Path
matched_files: int
camera_labels: tuple[str, ...]
is_valid: bool
reason: str | None = None
@dataclass(slots=True, frozen=True)
class BoundsRow:
segment_dir: Path
relative_segment_dir: str
group_path: str
activity: str
segment_name: str
mcap_path: Path
start_ns: int
end_ns: int
duration_ns: int
start_iso_utc: str
end_iso_utc: str
camera_count: int
camera_labels: str
video_message_count: int
index_source: str
def sorted_camera_labels(labels: set[str]) -> tuple[str, ...]:
return tuple(sorted(labels, key=lambda label: int(label[3:])))
def scan_segment_dir(segment_dir: Path) -> SegmentScan:
if not segment_dir.is_dir():
return SegmentScan(
segment_dir=segment_dir,
matched_files=0,
camera_labels=(),
is_valid=False,
reason=f"segment directory does not exist: {segment_dir}",
)
matched_by_camera: dict[str, list[Path]] = {}
for child in segment_dir.iterdir():
if not child.is_file():
continue
match = SEGMENT_FILE_PATTERN.fullmatch(child.name)
if match is None:
continue
label = f"zed{int(match.group(1))}"
matched_by_camera.setdefault(label, []).append(child)
matched_files = sum(len(paths) for paths in matched_by_camera.values())
camera_labels = sorted_camera_labels(set(matched_by_camera))
duplicate_cameras = [label for label, paths in sorted(matched_by_camera.items()) if len(paths) > 1]
if duplicate_cameras:
return SegmentScan(
segment_dir=segment_dir,
matched_files=matched_files,
camera_labels=camera_labels,
is_valid=False,
reason=f"duplicate camera inputs under {segment_dir}: {', '.join(duplicate_cameras)}",
)
if len(camera_labels) < 2:
return SegmentScan(
segment_dir=segment_dir,
matched_files=matched_files,
camera_labels=camera_labels,
is_valid=False,
reason=f"expected at least 2 camera inputs under {segment_dir}, found {len(camera_labels)}",
)
return SegmentScan(
segment_dir=segment_dir,
matched_files=matched_files,
camera_labels=camera_labels,
is_valid=True,
)
def discover_segment_dirs(root: Path, recursive: bool) -> tuple[list[SegmentScan], list[SegmentScan]]:
if not root.is_dir():
raise click.ClickException(f"input directory does not exist: {root}")
candidate_dirs = {root.resolve()}
iterator = root.rglob("*") if recursive else root.iterdir()
for path in iterator:
if path.is_dir():
candidate_dirs.add(path.resolve())
valid_scans: list[SegmentScan] = []
ignored_partial_scans: list[SegmentScan] = []
for segment_dir in sorted(candidate_dirs):
scan = scan_segment_dir(segment_dir)
if scan.is_valid:
valid_scans.append(scan)
elif scan.matched_files > 0:
ignored_partial_scans.append(scan)
if not valid_scans:
raise click.ClickException(f"no multi-camera segments found under {root}")
return valid_scans, ignored_partial_scans
def locate_binary(name: str, override: Path | None) -> Path:
if override is not None:
candidate = override.expanduser().resolve()
if not candidate.is_file():
raise click.ClickException(f"binary not found: {candidate}")
return candidate
candidates = (
REPO_ROOT / "build" / "bin" / name,
REPO_ROOT / "build" / name,
)
for candidate in candidates:
if candidate.is_file():
return candidate
raise click.ClickException(f"could not find {name} under {REPO_ROOT / 'build'}")
def default_index_path(dataset_root: Path) -> Path:
return dataset_root / DEFAULT_INDEX_NAME
def find_unique_mcap(segment_dir: Path) -> Path | None:
matches = sorted(path for path in segment_dir.iterdir() if path.is_file() and path.suffix.lower() == ".mcap")
if len(matches) == 1:
return matches[0]
return None
def format_ns_iso(ns: int, tzinfo: dt.tzinfo) -> str:
seconds, nanos = divmod(ns, 1_000_000_000)
stamp = dt.datetime.fromtimestamp(seconds, tz=dt.timezone.utc).astimezone(tzinfo)
offset = stamp.strftime("%z")
offset = f"{offset[:3]}:{offset[3:]}" if offset else ""
return f"{stamp.strftime('%Y-%m-%dT%H:%M:%S')}.{nanos:09d}{offset}"
def format_ns_utc(ns: int) -> str:
return format_ns_iso(ns, dt.timezone.utc).replace("+00:00", "Z")
def resolve_timezone(name: str) -> dt.tzinfo:
if name == "local":
local = dt.datetime.now().astimezone().tzinfo
if local is None:
raise click.ClickException("could not resolve local timezone")
return local
if name == "UTC":
return dt.timezone.utc
if name.startswith("UTC") and len(name) == len("UTC+00:00"):
try:
sign = 1 if name[3] == "+" else -1
hours = int(name[4:6])
minutes = int(name[7:9])
except ValueError as exc:
raise click.ClickException(f"invalid fixed UTC offset '{name}'") from exc
return dt.timezone(sign * dt.timedelta(hours=hours, minutes=minutes))
try:
return ZoneInfo(name)
except Exception as exc: # pragma: no cover - defensive wrapper around system tzdb
raise click.ClickException(f"unknown timezone '{name}': {exc}") from exc
def normalize_timestamp_text(value: str) -> str:
match = FOLDER_TIMESTAMP_PATTERN.fullmatch(value)
if match is None:
return value
parts = match.groupdict()
fraction = parts["fraction"] or ""
timezone_text = parts["timezone"] or ""
return f"{parts['date']}T{parts['hour']}:{parts['minute']}:{parts['second']}{fraction}{timezone_text}"
def parse_folder_name_naive(value: str) -> dt.datetime | None:
normalized = normalize_timestamp_text(value)
try:
parsed = dt.datetime.fromisoformat(normalized)
except ValueError:
return None
if parsed.tzinfo is not None:
return None
return parsed
def datetime_to_ns(value: dt.datetime) -> int:
utc_value = value.astimezone(dt.timezone.utc)
return int(utc_value.timestamp()) * 1_000_000_000 + utc_value.microsecond * 1_000
def parse_timestamp_to_ns(value: str, timezone_name: str) -> int:
stripped = value.strip()
if not stripped:
raise click.ClickException("timestamp value is empty")
digit_text = stripped.lstrip("+-")
if digit_text.isdigit():
raw_value = int(stripped)
digits = len(digit_text)
if digits <= 10:
return raw_value * 1_000_000_000
if digits <= 13:
return raw_value * 1_000_000
if digits <= 16:
return raw_value * 1_000
return raw_value
normalized = normalize_timestamp_text(stripped)
if normalized.endswith("Z"):
normalized = normalized[:-1] + "+00:00"
try:
parsed = dt.datetime.fromisoformat(normalized)
except ValueError as exc:
raise click.ClickException(f"invalid timestamp '{value}': {exc}") from exc
if parsed.tzinfo is None:
parsed = parsed.replace(tzinfo=resolve_timezone(timezone_name))
return datetime_to_ns(parsed)
def parse_timestamp_window(value: str, timezone_name: str) -> tuple[int, int]:
stripped = value.strip()
if not stripped:
raise click.ClickException("timestamp value is empty")
digit_text = stripped.lstrip("+-")
if digit_text.isdigit():
base_ns = parse_timestamp_to_ns(stripped, timezone_name)
digits = len(digit_text)
if digits <= 10:
precision_ns = 1_000_000_000
elif digits <= 13:
precision_ns = 1_000_000
elif digits <= 16:
precision_ns = 1_000
else:
precision_ns = 1
return base_ns, base_ns + precision_ns - 1
normalized = normalize_timestamp_text(stripped)
base_ns = parse_timestamp_to_ns(stripped, timezone_name)
fraction_match = re.search(r"\.(\d+)", normalized)
if fraction_match is None:
precision_ns = 1_000_000_000
else:
digits = min(len(fraction_match.group(1)), 9)
precision_ns = 10 ** (9 - digits)
return base_ns, base_ns + precision_ns - 1
def probe_mcap_bounds(bounds_bin: Path, mcap_path: Path) -> dict[str, Any]:
result = subprocess.run(
[str(bounds_bin), str(mcap_path), "--json"],
check=False,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
)
if result.returncode != 0:
stderr = result.stderr.strip() or result.stdout.strip() or f"exit {result.returncode}"
raise RuntimeError(f"{mcap_path}: {stderr}")
try:
return json.loads(result.stdout)
except json.JSONDecodeError as exc:
raise RuntimeError(f"{mcap_path}: failed to parse helper JSON: {exc}") from exc
def build_row(dataset_root: Path, scan: SegmentScan, bounds_bin: Path) -> BoundsRow | None:
mcap_path = find_unique_mcap(scan.segment_dir)
if mcap_path is None:
return None
bounds = probe_mcap_bounds(bounds_bin, mcap_path)
relative_segment_dir = scan.segment_dir.relative_to(dataset_root).as_posix()
parent = Path(relative_segment_dir).parent
group_path = "" if str(parent) == "." else parent.as_posix()
parts = Path(relative_segment_dir).parts
activity = parts[0] if parts else scan.segment_dir.name
start_ns = int(bounds["start_ns"])
end_ns = int(bounds["end_ns"])
return BoundsRow(
segment_dir=scan.segment_dir,
relative_segment_dir=relative_segment_dir,
group_path=group_path,
activity=activity,
segment_name=scan.segment_dir.name,
mcap_path=mcap_path,
start_ns=start_ns,
end_ns=end_ns,
duration_ns=max(0, end_ns - start_ns),
start_iso_utc=str(bounds["start_iso_utc"]),
end_iso_utc=str(bounds["end_iso_utc"]),
camera_count=len(scan.camera_labels),
camera_labels=",".join(scan.camera_labels),
video_message_count=int(bounds["video_message_count"]),
index_source="mcap_video_bounds",
)
def init_db(conn: duckdb.DuckDBPyConnection) -> None:
conn.execute(
"""
CREATE TABLE meta (
key VARCHAR PRIMARY KEY,
value VARCHAR NOT NULL
);
"""
)
conn.execute(
"""
CREATE TABLE segments (
segment_dir VARCHAR PRIMARY KEY,
relative_segment_dir VARCHAR NOT NULL,
group_path VARCHAR NOT NULL,
activity VARCHAR NOT NULL,
segment_name VARCHAR NOT NULL,
mcap_path VARCHAR NOT NULL,
start_ns BIGINT NOT NULL,
end_ns BIGINT NOT NULL,
duration_ns BIGINT NOT NULL,
start_iso_utc VARCHAR NOT NULL,
end_iso_utc VARCHAR NOT NULL,
camera_count INTEGER NOT NULL,
camera_labels VARCHAR NOT NULL,
video_message_count BIGINT NOT NULL,
index_source VARCHAR NOT NULL
);
"""
)
conn.execute("CREATE INDEX segments_start_ns_idx ON segments(start_ns);")
conn.execute("CREATE INDEX segments_end_ns_idx ON segments(end_ns);")
def write_index(index_path: Path, dataset_root: Path, rows: list[BoundsRow]) -> None:
index_path.parent.mkdir(parents=True, exist_ok=True)
with tempfile.NamedTemporaryFile(prefix=f"{index_path.name}.", suffix=".tmp", dir=index_path.parent, delete=False) as handle:
temp_path = Path(handle.name)
temp_path.unlink(missing_ok=True)
inferred_timezone = infer_dataset_timezone(rows)
try:
conn = duckdb.connect(str(temp_path))
try:
init_db(conn)
conn.executemany(
"INSERT INTO meta (key, value) VALUES (?, ?)",
[
("schema_version", INDEX_SCHEMA_VERSION),
("dataset_root", str(dataset_root)),
("built_at_utc", dt.datetime.now(dt.timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
("default_timezone", inferred_timezone),
],
)
conn.executemany(
"""
INSERT INTO segments (
segment_dir,
relative_segment_dir,
group_path,
activity,
segment_name,
mcap_path,
start_ns,
end_ns,
duration_ns,
start_iso_utc,
end_iso_utc,
camera_count,
camera_labels,
video_message_count,
index_source
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
""",
[
(
str(row.segment_dir),
row.relative_segment_dir,
row.group_path,
row.activity,
row.segment_name,
str(row.mcap_path),
row.start_ns,
row.end_ns,
row.duration_ns,
row.start_iso_utc,
row.end_iso_utc,
row.camera_count,
row.camera_labels,
row.video_message_count,
row.index_source,
)
for row in rows
],
)
finally:
conn.close()
temp_path.replace(index_path)
except Exception:
temp_path.unlink(missing_ok=True)
raise
def infer_dataset_timezone(rows: list[BoundsRow]) -> str:
offset_counts: dict[int, int] = {}
for row in rows:
folder_time = parse_folder_name_naive(row.segment_name)
if folder_time is None:
continue
actual_utc = dt.datetime.fromtimestamp(row.start_ns / 1_000_000_000, tz=dt.timezone.utc).replace(tzinfo=None)
offset_minutes = round((folder_time - actual_utc).total_seconds() / 60.0)
offset_counts[offset_minutes] = offset_counts.get(offset_minutes, 0) + 1
if not offset_counts:
return "local"
minutes = max(offset_counts.items(), key=lambda item: item[1])[0]
if minutes == 0:
return "UTC"
sign = "+" if minutes >= 0 else "-"
absolute_minutes = abs(minutes)
hours, mins = divmod(absolute_minutes, 60)
return f"UTC{sign}{hours:02d}:{mins:02d}"
def require_query_window(at: str | None, start: str | None, end: str | None, timezone_name: str) -> tuple[int, int]:
if at is not None and (start is not None or end is not None):
raise click.ClickException("use either --at or --start/--end, not both")
if at is not None:
return parse_timestamp_window(at, timezone_name)
if start is None or end is None:
raise click.ClickException("provide --at or both --start and --end")
start_ns = parse_timestamp_to_ns(start, timezone_name)
end_ns = parse_timestamp_to_ns(end, timezone_name)
if start_ns > end_ns:
raise click.ClickException("query start must be before or equal to query end")
return start_ns, end_ns
def load_meta(conn: duckdb.DuckDBPyConnection) -> dict[str, str]:
rows = conn.execute("SELECT key, value FROM meta").fetchall()
return {str(key): str(value) for key, value in rows}
def format_duration(duration_ns: int) -> str:
return f"{duration_ns / 1_000_000_000:.3f}s"
@click.group()
def cli() -> None:
"""Build and query a DuckDB index of bundled ZED segment timestamps."""
@cli.command()
@click.argument("dataset_root", type=click.Path(path_type=Path, file_okay=False))
@click.option("--index", "index_path", type=click.Path(path_type=Path, dir_okay=False))
@click.option("--recursive/--no-recursive", default=True, show_default=True)
@click.option("--jobs", type=click.IntRange(min=1), default=min(8, os.cpu_count() or 1), show_default=True)
@click.option("--bounds-bin", type=click.Path(path_type=Path, dir_okay=False))
def build(dataset_root: Path, index_path: Path | None, recursive: bool, jobs: int, bounds_bin: Path | None) -> None:
"""Build or replace the embedded DuckDB time index for DATASET_ROOT."""
dataset_root = dataset_root.expanduser().resolve()
index_path = (index_path or default_index_path(dataset_root)).expanduser().resolve()
bounds_binary = locate_binary("mcap_video_bounds", bounds_bin)
valid_scans, ignored_partial_scans = discover_segment_dirs(dataset_root, recursive)
click.echo(
f"discovered {len(valid_scans)} valid segment directories under {dataset_root}",
err=True,
)
if ignored_partial_scans:
click.echo(f"ignored {len(ignored_partial_scans)} partial segment directories", err=True)
rows: list[BoundsRow] = []
skipped_missing_mcap: list[Path] = []
errors: list[str] = []
with concurrent.futures.ThreadPoolExecutor(max_workers=jobs) as executor:
future_to_scan: dict[concurrent.futures.Future[BoundsRow | None], SegmentScan] = {
executor.submit(build_row, dataset_root, scan, bounds_binary): scan for scan in valid_scans
}
for future in concurrent.futures.as_completed(future_to_scan):
scan = future_to_scan[future]
try:
row = future.result()
except Exception as exc:
errors.append(f"{scan.segment_dir}: {exc}")
continue
if row is None:
skipped_missing_mcap.append(scan.segment_dir)
continue
rows.append(row)
rows.sort(key=lambda row: (row.start_ns, row.segment_dir.as_posix()))
if skipped_missing_mcap:
click.echo(f"skipped {len(skipped_missing_mcap)} segments with missing or ambiguous MCAP files", err=True)
if errors:
for error in errors:
click.echo(f"error: {error}", err=True)
raise click.ClickException(f"failed to probe {len(errors)} segment(s)")
if not rows:
raise click.ClickException("no indexable MCAP segments were found")
write_index(index_path, dataset_root, rows)
click.echo(
f"wrote {len(rows)} segments to {index_path} (skipped_missing_mcap={len(skipped_missing_mcap)})",
err=True,
)
@cli.command()
@click.argument("dataset_root", type=click.Path(path_type=Path, file_okay=False))
@click.option("--index", "index_path", type=click.Path(path_type=Path, dir_okay=False))
@click.option("--at")
@click.option("--start")
@click.option("--end")
@click.option("--json", "as_json", is_flag=True)
@click.option("--timezone", "timezone_name", default="dataset", show_default=True)
def query(
dataset_root: Path,
index_path: Path | None,
at: str | None,
start: str | None,
end: str | None,
as_json: bool,
timezone_name: str,
) -> None:
"""Query the embedded time index for matching segment folders."""
dataset_root = dataset_root.expanduser().resolve()
index_path = (index_path or default_index_path(dataset_root)).expanduser().resolve()
if not index_path.is_file():
raise click.ClickException(f"index not found: {index_path}")
conn = duckdb.connect(str(index_path), read_only=True)
try:
meta = load_meta(conn)
indexed_root = Path(meta.get("dataset_root", "")).expanduser().resolve()
if indexed_root != dataset_root:
raise click.ClickException(
f"index root mismatch: index was built for {indexed_root}, not {dataset_root}"
)
effective_timezone_name = meta.get("default_timezone", "local") if timezone_name == "dataset" else timezone_name
query_start_ns, query_end_ns = require_query_window(at, start, end, effective_timezone_name)
display_timezone = resolve_timezone(effective_timezone_name)
result_rows = conn.execute(
"""
SELECT
segment_dir,
relative_segment_dir,
group_path,
activity,
segment_name,
mcap_path,
start_ns,
end_ns,
duration_ns,
start_iso_utc,
end_iso_utc,
camera_count,
camera_labels,
video_message_count,
index_source
FROM segments
WHERE start_ns <= ? AND end_ns >= ?
ORDER BY start_ns, segment_dir
""",
[query_end_ns, query_start_ns],
).fetchall()
finally:
conn.close()
payload = [
{
"segment_dir": row[0],
"relative_segment_dir": row[1],
"group_path": row[2],
"activity": row[3],
"segment_name": row[4],
"mcap_path": row[5],
"start_ns": row[6],
"end_ns": row[7],
"duration_ns": row[8],
"start_iso_utc": row[9],
"end_iso_utc": row[10],
"camera_count": row[11],
"camera_labels": row[12].split(",") if row[12] else [],
"video_message_count": row[13],
"index_source": row[14],
"start_display": format_ns_iso(row[6], display_timezone),
"end_display": format_ns_iso(row[7], display_timezone),
}
for row in result_rows
]
if as_json:
click.echo(json.dumps(payload, indent=2, ensure_ascii=False))
return
if not payload:
click.echo("no matching segments")
return
click.echo(f"matched {len(payload)} segment(s)")
for row in payload:
click.echo(
" | ".join(
(
row["start_display"],
row["end_display"],
format_duration(int(row["duration_ns"])),
row["segment_dir"],
row["mcap_path"],
)
)
)
if __name__ == "__main__":
cli()
+219
@@ -0,0 +1,219 @@
#include <CLI/CLI.hpp>
#include <spdlog/spdlog.h>
#include <foxglove/CompressedVideo.pb.h>
#include <mcap/reader.hpp>
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <iomanip>
#include <iostream>
#include <limits>
#include <sstream>
#include <string>
namespace {
enum class ToolExitCode : int {
Success = 0,
UsageError = 2,
OpenError = 3,
SchemaError = 4,
ParseError = 5,
EmptyError = 6,
};
struct Config {
std::string input_path{};
bool json{false};
};
struct BoundsSummary {
std::uint64_t start_ns{std::numeric_limits<std::uint64_t>::max()};
std::uint64_t end_ns{0};
std::uint64_t message_count{0};
};
[[nodiscard]]
constexpr int exit_code(const ToolExitCode code) {
return static_cast<int>(code);
}
[[nodiscard]]
std::uint64_t proto_timestamp_ns(const google::protobuf::Timestamp &timestamp) {
return static_cast<std::uint64_t>(timestamp.seconds()) * 1000000000ull + static_cast<std::uint64_t>(timestamp.nanos());
}
[[nodiscard]]
std::string json_escape(const std::string &input) {
std::ostringstream output;
for (const unsigned char ch : input) {
switch (ch) {
case '\\':
output << "\\\\";
break;
case '"':
output << "\\\"";
break;
case '\b':
output << "\\b";
break;
case '\f':
output << "\\f";
break;
case '\n':
output << "\\n";
break;
case '\r':
output << "\\r";
break;
case '\t':
output << "\\t";
break;
default:
if (ch < 0x20) {
output << "\\u" << std::hex << std::setw(4) << std::setfill('0') << static_cast<int>(ch) << std::dec;
} else {
output << static_cast<char>(ch);
}
break;
}
}
return output.str();
}
[[nodiscard]]
std::string format_iso_utc(const std::uint64_t timestamp_ns) {
const auto seconds = static_cast<std::time_t>(timestamp_ns / 1000000000ull);
const auto nanos = timestamp_ns % 1000000000ull;
std::tm tm{};
#if defined(_WIN32)
gmtime_s(&tm, &seconds);
#else
gmtime_r(&seconds, &tm);
#endif
std::ostringstream output;
output << std::put_time(&tm, "%Y-%m-%dT%H:%M:%S") << '.' << std::setw(9) << std::setfill('0') << nanos << 'Z';
return output.str();
}
[[nodiscard]]
bool is_video_message(const auto &view) {
if (view.channel == nullptr || view.schema == nullptr) {
return false;
}
return view.schema->encoding == "protobuf" &&
view.schema->name == "foxglove.CompressedVideo" &&
view.channel->messageEncoding == "protobuf";
}
[[nodiscard]]
BoundsSummary collect_bounds(const Config &config, ToolExitCode &error_code) {
mcap::McapReader reader{};
const auto open_status = reader.open(config.input_path);
if (!open_status.ok()) {
spdlog::error("failed to open MCAP file '{}': {}", config.input_path, open_status.message);
error_code = ToolExitCode::OpenError;
return {};
}
BoundsSummary summary{};
auto messages = reader.readMessages();
for (auto it = messages.begin(); it != messages.end(); ++it) {
if (it->channel == nullptr) {
spdlog::error("MCAP message missing channel metadata");
reader.close();
error_code = ToolExitCode::SchemaError;
return {};
}
if (it->schema == nullptr) {
continue;
}
if (!is_video_message(*it)) {
continue;
}
foxglove::CompressedVideo message{};
if (!message.ParseFromArray(it->message.data, static_cast<int>(it->message.dataSize))) {
spdlog::error("failed to parse foxglove.CompressedVideo payload from '{}'", config.input_path);
reader.close();
error_code = ToolExitCode::ParseError;
return {};
}
auto timestamp_ns = proto_timestamp_ns(message.timestamp());
if (timestamp_ns == 0) {
timestamp_ns = it->message.logTime;
}
summary.start_ns = std::min(summary.start_ns, timestamp_ns);
summary.end_ns = std::max(summary.end_ns, timestamp_ns);
summary.message_count += 1;
}
reader.close();
if (summary.message_count == 0) {
spdlog::error("no foxglove.CompressedVideo messages found in '{}'", config.input_path);
error_code = ToolExitCode::EmptyError;
return {};
}
error_code = ToolExitCode::Success;
return summary;
}
void print_json(const Config &config, const BoundsSummary &summary) {
std::cout
<< '{'
<< "\"input_path\":\"" << json_escape(config.input_path) << "\","
<< "\"start_ns\":" << summary.start_ns << ','
<< "\"end_ns\":" << summary.end_ns << ','
<< "\"duration_ns\":" << (summary.end_ns - summary.start_ns) << ','
<< "\"video_message_count\":" << summary.message_count << ','
<< "\"start_iso_utc\":\"" << format_iso_utc(summary.start_ns) << "\","
<< "\"end_iso_utc\":\"" << format_iso_utc(summary.end_ns) << "\""
<< "}\n";
}
void print_text(const Config &config, const BoundsSummary &summary) {
std::cout
<< config.input_path << '\t'
<< summary.start_ns << '\t'
<< summary.end_ns << '\t'
<< summary.message_count << '\t'
<< format_iso_utc(summary.start_ns) << '\t'
<< format_iso_utc(summary.end_ns)
<< '\n';
}
} // namespace
int main(int argc, char **argv) {
Config config{};
CLI::App app{"mcap_video_bounds - emit bundled video timestamp bounds from an MCAP"};
app.add_option("input", config.input_path, "Input MCAP path")->required();
app.add_flag("--json", config.json, "Emit a JSON object instead of tab-separated text");
try {
app.parse(argc, argv);
} catch (const CLI::ParseError &e) {
return app.exit(e);
}
auto error_code = ToolExitCode::Success;
const auto summary = collect_bounds(config, error_code);
if (error_code != ToolExitCode::Success) {
return exit_code(error_code);
}
if (config.json) {
print_json(config, summary);
} else {
print_text(config, summary);
}
return exit_code(ToolExitCode::Success);
}
+139
@@ -0,0 +1,139 @@
from __future__ import annotations

import datetime as dt
import tempfile
import unittest
from pathlib import Path

import duckdb

from scripts.zed_segment_time_index import (
    BoundsRow,
    format_ns_iso,
    infer_dataset_timezone,
    parse_timestamp_to_ns,
    parse_timestamp_window,
    require_query_window,
    scan_segment_dir,
    write_index,
)


class TimestampParseTests(unittest.TestCase):
    def test_parse_folder_style_timestamp(self) -> None:
        actual = parse_timestamp_to_ns("2026-03-18T12-00-23", "UTC")
        expected = parse_timestamp_to_ns("2026-03-18T12:00:23+00:00", "UTC")
        self.assertEqual(actual, expected)

    def test_parse_integer_epoch_milliseconds(self) -> None:
        self.assertEqual(parse_timestamp_to_ns("1710000000123", "UTC"), 1710000000123 * 1_000_000)

    def test_parse_timestamp_window_for_second_precision_text(self) -> None:
        start_ns, end_ns = parse_timestamp_window("2026-03-18T12-00-23", "UTC")
        self.assertEqual(end_ns - start_ns, 999_999_999)

    def test_require_query_window_rejects_mixed_modes(self) -> None:
        with self.assertRaises(Exception):
            require_query_window("1", "2", "3", "UTC")

    def test_format_ns_iso_utc(self) -> None:
        rendered = format_ns_iso(1_710_000_000_123_000_000, dt.timezone.utc)
        self.assertTrue(rendered.startswith("2024-03-09T16:00:00.123000000"))


class SegmentDiscoveryTests(unittest.TestCase):
    def test_scan_segment_dir_accepts_multicamera_dir(self) -> None:
        with tempfile.TemporaryDirectory() as tmp:
            segment_dir = Path(tmp)
            for label in ("zed1", "zed2", "zed3", "zed4"):
                (segment_dir / f"2026-03-18T12-00-23_{label}.svo2").write_bytes(b"")
            scan = scan_segment_dir(segment_dir)
            self.assertTrue(scan.is_valid)
            self.assertEqual(scan.camera_labels, ("zed1", "zed2", "zed3", "zed4"))

    def test_scan_segment_dir_rejects_partial_dir(self) -> None:
        with tempfile.TemporaryDirectory() as tmp:
            segment_dir = Path(tmp)
            (segment_dir / "2026-03-18T12-00-23_zed1.svo2").write_bytes(b"")
            scan = scan_segment_dir(segment_dir)
            self.assertFalse(scan.is_valid)


class DuckDbIndexTests(unittest.TestCase):
    def test_infer_dataset_timezone_from_folder_names(self) -> None:
        row = BoundsRow(
            segment_dir=Path("/tmp/bar/2026-03-18T11-59-41"),
            relative_segment_dir="bar/2026-03-18T11-59-41",
            group_path="bar",
            activity="bar",
            segment_name="2026-03-18T11-59-41",
            mcap_path=Path("/tmp/bar/2026-03-18T11-59-41/2026-03-18T11-59-41.mcap"),
            start_ns=1_773_806_381_201_081_000,
            end_ns=1_773_806_392_268_226_000,
            duration_ns=11_067_145_000,
            start_iso_utc="2026-03-18T03:59:41.201081000Z",
            end_iso_utc="2026-03-18T03:59:52.268226000Z",
            camera_count=4,
            camera_labels="zed1,zed2,zed3,zed4",
            video_message_count=1330,
            index_source="mcap_video_bounds",
        )
        self.assertEqual(infer_dataset_timezone([row]), "UTC+08:00")

    def test_write_index_and_query_overlap(self) -> None:
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp) / "dataset"
            root.mkdir()
            index_path = root / "segment_time_index.duckdb"
            rows = [
                BoundsRow(
                    segment_dir=root / "bar" / "2026-03-18T12-00-23",
                    relative_segment_dir="bar/2026-03-18T12-00-23",
                    group_path="bar",
                    activity="bar",
                    segment_name="2026-03-18T12-00-23",
                    mcap_path=root / "bar" / "2026-03-18T12-00-23" / "2026-03-18T12-00-23.mcap",
                    start_ns=100,
                    end_ns=200,
                    duration_ns=100,
                    start_iso_utc="1970-01-01T00:00:00.000000100Z",
                    end_iso_utc="1970-01-01T00:00:00.000000200Z",
                    camera_count=4,
                    camera_labels="zed1,zed2,zed3,zed4",
                    video_message_count=1330,
                    index_source="mcap_video_bounds",
                ),
                BoundsRow(
                    segment_dir=root / "run" / "2026-03-18T12-01-00",
                    relative_segment_dir="run/2026-03-18T12-01-00",
                    group_path="run",
                    activity="run",
                    segment_name="2026-03-18T12-01-00",
                    mcap_path=root / "run" / "2026-03-18T12-01-00" / "2026-03-18T12-01-00.mcap",
                    start_ns=250,
                    end_ns=400,
                    duration_ns=150,
                    start_iso_utc="1970-01-01T00:00:00.000000250Z",
                    end_iso_utc="1970-01-01T00:00:00.000000400Z",
                    camera_count=4,
                    camera_labels="zed1,zed2,zed3,zed4",
                    video_message_count=1400,
                    index_source="mcap_video_bounds",
                ),
            ]
            write_index(index_path, root, rows)
            conn = duckdb.connect(str(index_path), read_only=True)
            try:
                matches = conn.execute(
                    "SELECT relative_segment_dir FROM segments WHERE start_ns <= ? AND end_ns >= ? ORDER BY start_ns",
                    [300, 180],
                ).fetchall()
                self.assertEqual(matches, [("bar/2026-03-18T12-00-23",), ("run/2026-03-18T12-01-00",)])
            finally:
                conn.close()


if __name__ == "__main__":
    unittest.main()
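The second-precision window behavior that `test_parse_timestamp_window_for_second_precision_text` exercises can be sketched as follows. This is a hypothetical re-implementation (`parse_timestamp_window_sketch` is not the real helper) that assumes UTC input and ignores the CLI's timezone handling; the point is that a folder-style name like `2026-03-18T12-00-23` carries only second precision, so a point query must cover the entire second:

```python
import datetime as dt


def parse_timestamp_window_sketch(text: str) -> tuple[int, int]:
    # Folder names encode second precision only ("%H-%M-%S" with dashes),
    # so expand the point into a [second_start, second_end] window in ns.
    moment = dt.datetime.strptime(text, "%Y-%m-%dT%H-%M-%S").replace(tzinfo=dt.timezone.utc)
    start_ns = int(moment.timestamp()) * 1_000_000_000
    # Closed window: the last representable nanosecond of that same second,
    # so a segment whose start_ns has a subsecond offset still overlaps.
    return start_ns, start_ns + 999_999_999
```

Querying with an overlap predicate (`start_ns <= window_end AND end_ns >= window_start`, as in the test above) then matches a segment that actually started at, say, `12:00:23.201`, which a naive exact-nanosecond comparison would miss.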
Generated
+44
View File
@@ -33,6 +33,7 @@ version = "0.0.0"
source = { virtual = "." }
dependencies = [
    { name = "click" },
    { name = "duckdb" },
    { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" },
    { name = "numpy", version = "2.4.3", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" },
    { name = "opencv-python-headless" },
@@ -44,6 +45,7 @@ dependencies = [
[package.metadata]
requires-dist = [
    { name = "click", specifier = ">=8.1" },
    { name = "duckdb", specifier = ">=1.0" },
    { name = "numpy", specifier = ">=2.2" },
    { name = "opencv-python-headless", specifier = ">=4.11" },
    { name = "progress-table", specifier = ">=3.2" },
@@ -51,6 +53,48 @@ requires-dist = [
{ name = "zstandard", specifier = ">=0.23" }, { name = "zstandard", specifier = ">=0.23" },
] ]
[[package]]
name = "duckdb"
version = "1.5.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/ee/11/e05a7eb73a373d523e45d83c261025e02bc31ebf868e6282c30c4d02cc59/duckdb-1.5.0.tar.gz", hash = "sha256:f974b61b1c375888ee62bc3125c60ac11c4e45e4457dd1bb31a8f8d3cf277edd", size = 17981141, upload-time = "2026-03-09T12:50:26.372Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/e0/5d/8fa129bbd604d0e91aa9a0a407e7d2acc559b6024c3f887868fd7a13871d/duckdb-1.5.0-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:47fbb1c053a627a91fa71ec883951561317f14a82df891c00dcace435e8fea78", size = 30012348, upload-time = "2026-03-09T12:48:39.133Z" },
{ url = "https://files.pythonhosted.org/packages/0c/31/db320641a262a897755e634d16838c98d5ca7dc91f4e096e104e244a3a01/duckdb-1.5.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:2b546a30a6ac020165a86ab3abac553255a6e8244d5437d17859a6aa338611aa", size = 15940515, upload-time = "2026-03-09T12:48:41.905Z" },
{ url = "https://files.pythonhosted.org/packages/0b/45/5725684794fbabf54d8dbae5247685799a6bf8e1e930ebff3a76a726772c/duckdb-1.5.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:122396041c0acb78e66d7dc7d36c55f03f67fe6ad012155c132d82739722e381", size = 14193724, upload-time = "2026-03-09T12:48:44.105Z" },
{ url = "https://files.pythonhosted.org/packages/27/68/f110c66b43e27191d7e53d3587e118568b73d66f23cb9bd6c7e0a560fd6d/duckdb-1.5.0-cp310-cp310-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4a2cd73d50ea2c2bf618a4b7d22fe7c4115a1c9083d35654a0d5d421620ed999", size = 19218777, upload-time = "2026-03-09T12:48:46.399Z" },
{ url = "https://files.pythonhosted.org/packages/ec/9d/46affc9257377cbc865e494650312a7a08a56e85aa8d702eb297bec430b7/duckdb-1.5.0-cp310-cp310-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:63a8ea3b060a881c90d1c1b9454abed3daf95b6160c39bbb9506fee3a9711730", size = 21311205, upload-time = "2026-03-09T12:48:48.895Z" },
{ url = "https://files.pythonhosted.org/packages/3b/34/dac03ab7340989cda258655387959c88342ea3b44949751391267bcbc830/duckdb-1.5.0-cp310-cp310-win_amd64.whl", hash = "sha256:238d576ae1dda441f8c79ed1370c5ccf863e4a5d59ca2563f9c96cd26b2188ac", size = 13043217, upload-time = "2026-03-09T12:48:51.262Z" },
{ url = "https://files.pythonhosted.org/packages/01/0c/0282b10a1c96810606b916b8d58a03f2131bd3ede14d2851f58b0b860e7c/duckdb-1.5.0-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:3298bd17cf0bb5f342fb51a4edc9aadacae882feb2b04161a03eb93271c70c86", size = 30014615, upload-time = "2026-03-09T12:48:54.061Z" },
{ url = "https://files.pythonhosted.org/packages/71/e8/cbbc920078a794f24f63017fc55c9cbdb17d6fb94d3973f479b2d9f2983d/duckdb-1.5.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:13f94c49ca389731c439524248e05007fb1a86cd26f1e38f706abc261069cd41", size = 15940493, upload-time = "2026-03-09T12:48:57.85Z" },
{ url = "https://files.pythonhosted.org/packages/31/b6/6cae794d5856259b0060f79d5db71c7fdba043950eaa6a9d72b0bad16095/duckdb-1.5.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:ab9d597b1e8668466f1c164d0ea07eaf0ebb516950f5a2e794b0f52c81ff3b16", size = 14194663, upload-time = "2026-03-09T12:49:00.416Z" },
{ url = "https://files.pythonhosted.org/packages/82/07/aba3887658b93a36ce702dd00ca6a6422de3d14c7ee3a4b4c03ea20a99c0/duckdb-1.5.0-cp311-cp311-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:a43f8289b11c0b50d13f96ab03210489d37652f3fd7911dc8eab04d61b049da2", size = 19220501, upload-time = "2026-03-09T12:49:03.431Z" },
{ url = "https://files.pythonhosted.org/packages/fc/a2/723e6df48754e468fa50d7878eb860906c975eafe317c4134a8482ca220e/duckdb-1.5.0-cp311-cp311-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4f514e796a116c5de070e99974e42d0b8c2e6c303386790e58408c481150d417", size = 21316142, upload-time = "2026-03-09T12:49:06.223Z" },
{ url = "https://files.pythonhosted.org/packages/03/af/4dcbdf8f2349ed0b054c254ec59bc362ce6ddf603af35f770124c0984686/duckdb-1.5.0-cp311-cp311-win_amd64.whl", hash = "sha256:cf503ba2c753d97c76beb111e74572fef8803265b974af2dca67bba1de4176d2", size = 13043445, upload-time = "2026-03-09T12:49:08.892Z" },
{ url = "https://files.pythonhosted.org/packages/60/5e/1bb7e75a63bf3dc49bc5a2cd27a65ffeef151f52a32db980983516f2d9f6/duckdb-1.5.0-cp311-cp311-win_arm64.whl", hash = "sha256:a1156e91e4e47f0e7d9c9404e559a1d71b372cd61790a407d65eb26948ae8298", size = 13883145, upload-time = "2026-03-09T12:49:11.566Z" },
{ url = "https://files.pythonhosted.org/packages/43/73/120e673e48ae25aaf689044c25ef51b0ea1d088563c9a2532612aea18e0a/duckdb-1.5.0-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:9ea988d1d5c8737720d1b2852fd70e4d9e83b1601b8896a1d6d31df5e6afc7dd", size = 30057869, upload-time = "2026-03-09T12:49:14.65Z" },
{ url = "https://files.pythonhosted.org/packages/21/e9/61143471958d36d3f3e764cb4cd43330be208ddbff1c78d3310b9ee67fe8/duckdb-1.5.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:cb786d5472afc16cc3c7355eb2007172538311d6f0cc6f6a0859e84a60220375", size = 15963092, upload-time = "2026-03-09T12:49:17.478Z" },
{ url = "https://files.pythonhosted.org/packages/4f/71/76e37c9a599ad89dd944e6cbb3e6a8ad196944a421758e83adea507637b6/duckdb-1.5.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:dc92b238f4122800a7592e99134124cc9048c50f766c37a0778dd2637f5cbe59", size = 14220562, upload-time = "2026-03-09T12:49:23.518Z" },
{ url = "https://files.pythonhosted.org/packages/db/b8/de1831656d5d13173e27c79c7259c8b9a7bdc314fdc8920604838ea4c46d/duckdb-1.5.0-cp312-cp312-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:1b74cb205c21d3696d8f8b88adca401e1063d6e6f57c1c4f56a243610b086e30", size = 19245329, upload-time = "2026-03-09T12:49:26.307Z" },
{ url = "https://files.pythonhosted.org/packages/1f/8d/33d349a3bcbd3e9b7b4e904c19d5b97f058c4c20791b89a8d6323bb93dce/duckdb-1.5.0-cp312-cp312-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:6e56c19ffd1ffe3642fa89639e71e2e00ab0cf107b62fe16e88030acaebcbde6", size = 21348041, upload-time = "2026-03-09T12:49:30.283Z" },
{ url = "https://files.pythonhosted.org/packages/e2/ec/591a4cad582fae04bc8f8b4a435eceaaaf3838cf0ca771daae16a3c2995b/duckdb-1.5.0-cp312-cp312-win_amd64.whl", hash = "sha256:86525e565ec0c43420106fd34ba2c739a54c01814d476c7fed3007c9ed6efd86", size = 13053781, upload-time = "2026-03-09T12:49:33.574Z" },
{ url = "https://files.pythonhosted.org/packages/db/62/42e0a13f9919173bec121c0ff702406e1cdd91d8084c3e0b3412508c3891/duckdb-1.5.0-cp312-cp312-win_arm64.whl", hash = "sha256:5faeebc178c986a7bfa68868a023001137a95a1110bf09b7356442a4eae0f7e7", size = 13862906, upload-time = "2026-03-09T12:49:36.598Z" },
{ url = "https://files.pythonhosted.org/packages/35/5d/af5501221f42e4e3662c047ecec4dcd0761229fceeba3c67ad4d9d8741df/duckdb-1.5.0-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:11dd05b827846c87f0ae2f67b9ae1d60985882a7c08ce855379e4a08d5be0e1d", size = 30057396, upload-time = "2026-03-09T12:49:39.95Z" },
{ url = "https://files.pythonhosted.org/packages/43/bd/a278d73fedbd3783bf9aedb09cad4171fe8e55bd522952a84f6849522eb6/duckdb-1.5.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:5ad8d9c91b7c280ab6811f59deff554b845706c20baa28c4e8f80a95690b252b", size = 15962700, upload-time = "2026-03-09T12:49:43.504Z" },
{ url = "https://files.pythonhosted.org/packages/76/fc/c916e928606946209c20fb50898dabf120241fb528a244e2bd8cde1bd9e2/duckdb-1.5.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:0ee4dabe03ed810d64d93927e0fd18cd137060b81ee75dcaeaaff32cbc816656", size = 14220272, upload-time = "2026-03-09T12:49:46.867Z" },
{ url = "https://files.pythonhosted.org/packages/53/07/1390e69db922423b2e111e32ed342b3e8fad0a31c144db70681ea1ba4d56/duckdb-1.5.0-cp313-cp313-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:9409ed1184b363ddea239609c5926f5148ee412b8d9e5ffa617718d755d942f6", size = 19244401, upload-time = "2026-03-09T12:49:49.865Z" },
{ url = "https://files.pythonhosted.org/packages/54/13/b58d718415cde993823a54952ea511d2612302f1d2bc220549d0cef752a4/duckdb-1.5.0-cp313-cp313-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:1df8c4f9c853a45f3ec1e79ed7fe1957a203e5ec893bbbb853e727eb93e0090f", size = 21345827, upload-time = "2026-03-09T12:49:52.977Z" },
{ url = "https://files.pythonhosted.org/packages/e0/96/4460429651e371eb5ff745a4790e7fa0509c7a58c71fc4f0f893404c9646/duckdb-1.5.0-cp313-cp313-win_amd64.whl", hash = "sha256:9a3d3dfa2d8bc74008ce3ad9564761ae23505a9e4282f6a36df29bd87249620b", size = 13053101, upload-time = "2026-03-09T12:49:56.134Z" },
{ url = "https://files.pythonhosted.org/packages/ba/54/6d5b805113214b830fa3c267bb3383fb8febaa30760d0162ef59aadb110a/duckdb-1.5.0-cp313-cp313-win_arm64.whl", hash = "sha256:2deebcbafd9d39c04f31ec968f4dd7cee832c021e10d96b32ab0752453e247c8", size = 13865071, upload-time = "2026-03-09T12:49:59.282Z" },
{ url = "https://files.pythonhosted.org/packages/66/9f/dd806d4e8ecd99006eb240068f34e1054533da1857ad06ac726305cd102d/duckdb-1.5.0-cp314-cp314-macosx_10_15_universal2.whl", hash = "sha256:d4b618de670cd2271dd7b3397508c7b3c62d8ea70c592c755643211a6f9154fa", size = 30065704, upload-time = "2026-03-09T12:50:02.671Z" },
{ url = "https://files.pythonhosted.org/packages/79/c2/7b7b8a5c65d5535c88a513e267b5e6d7a55ab3e9b67e4ddd474454653268/duckdb-1.5.0-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:065ae50cb185bac4b904287df72e6b4801b3bee2ad85679576dd712b8ba07021", size = 15964883, upload-time = "2026-03-09T12:50:06.343Z" },
{ url = "https://files.pythonhosted.org/packages/23/c5/9a52a2cdb228b8d8d191a603254364d929274d9cc7d285beada8f7daa712/duckdb-1.5.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:6be5e48e287a24d98306ce9dd55093c3b105a8fbd8a2e7a45e13df34bf081985", size = 14221498, upload-time = "2026-03-09T12:50:10.567Z" },
{ url = "https://files.pythonhosted.org/packages/b8/68/646045cb97982702a8a143dc2e45f3bdcb79fbe2d559a98d74b8c160e5e2/duckdb-1.5.0-cp314-cp314-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:a5ee41a0bf793882f02192ce105b9a113c3e8c505a27c7ef9437d7b756317113", size = 19249787, upload-time = "2026-03-09T12:50:13.524Z" },
{ url = "https://files.pythonhosted.org/packages/15/1b/5abf0c7f38febb3b4a231c784223fceccfd3f2bfd957699d786f46e41ce6/duckdb-1.5.0-cp314-cp314-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:f8e42aaf3cd217417c5dc9ff522dc3939d18b25a6fe5f846348277e831e6f59c", size = 21351583, upload-time = "2026-03-09T12:50:16.701Z" },
{ url = "https://files.pythonhosted.org/packages/93/a4/a90f2901cc0a1ce7ca4f0564b8492b9dbfe048a6395b27933d46ae9be473/duckdb-1.5.0-cp314-cp314-win_amd64.whl", hash = "sha256:11ae50aaeda2145b50294ee0247e4f11fb9448b3cc3d2aea1cfc456637dfb977", size = 13575130, upload-time = "2026-03-09T12:50:19.716Z" },
{ url = "https://files.pythonhosted.org/packages/64/aa/f14dd5e241ec80d9f9d82196ca65e0c53badfc8a7a619d5497c5626657ad/duckdb-1.5.0-cp314-cp314-win_arm64.whl", hash = "sha256:d6d2858c734d1a7e7a1b6e9b8403b3fce26dfefb4e0a2479c420fba6cd36db36", size = 14341879, upload-time = "2026-03-09T12:50:22.347Z" },
]
[[package]]
name = "numpy"
version = "2.2.6"