feat!: reorganize detection and tracking pipeline
Refactor the package into common, schema, detection, and tracking namespaces and move dataset-specific ActualTest utilities into tests/support.

Add a pluggable detection stack with typed protocols, pydantic-settings config, loguru-based runner logging, cvmmap and headless video sources, NATS and parquet sinks, and a structured coco-wholebody133 payload path.

Teach tracking replay loading to consume parquet detection directories directly, preserve empty frames, and keep the video-to-parquet-to-tracking workflow usable for offline E2E runs.

Vendor the local mmcv and xtcocotools wheels under Git LFS, update uv sources/lock state, and refresh the mmcv build so mmcv.ops loads successfully with the current torch+cu130 environment.
## Install
```bash
uv sync --group dev
```
## Run
```bash
uv run pose-tracking-exp run_tracking data/scene.json data/replay.jsonl
```
`scene.json` may declare camera extrinsics in either format: raw OpenCV extrinsics or RPT camera pose.

The loader normalizes both to OpenCV extrinsics for reprojection and converts to RPT pose only when building the triangulation config.
If you already have an older hand-authored scene file that stored RPT camera pose directly, set `extrinsic_format` explicitly to `rpt_camera_pose`.
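As a rough illustration, a scene file declaring the format explicitly might look like the following. Only `extrinsic_format` and the value `rpt_camera_pose` come from this README; every other field name and value is an invented placeholder, not the actual schema:

```json
{
  "extrinsic_format": "rpt_camera_pose",
  "cameras": {
    "camera0": {
      "rotation": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
      "position": [0.0, 0.0, 0.0]
    }
  }
}
```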
## Convert cvmmap Pose Payload Records

```bash
uv run pose-tracking-exp convert-cvmmap-pose input.jsonl output.jsonl
```
The current cvmmap `.pose` wire format is fixed to `COCO-WholeBody-133` keypoints.
That is a transport compatibility constraint, not a tracker limitation: the tracker-side normalizer accepts both `coco17` and `coco_wholebody133`, because the first 17 body joints share the standard COCO ordering.

References:

- https://mmpose.readthedocs.io/en/latest/dataset_zoo/2d_wholebody_keypoint.html
- https://github.com/jin-s13/COCO-WholeBody
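Because the first 17 WholeBody joints follow the standard COCO ordering, a `coco17` consumer can simply truncate a 133-point detection. A minimal sketch; the helper name is illustrative, not the package's actual normalizer API:

```python
# COCO-WholeBody-133 keeps the 17 COCO body joints first, followed by
# feet, face, and hand keypoints, so truncation recovers plain coco17.

COCO17_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

def wholebody133_to_coco17(keypoints):
    """keypoints: list of (x, y, score) triples, length 133."""
    if len(keypoints) != 133:
        raise ValueError(f"expected 133 keypoints, got {len(keypoints)}")
    return keypoints[:17]

dummy = [(float(i), float(i), 1.0) for i in range(133)]
assert len(wholebody133_to_coco17(dummy)) == len(COCO17_NAMES)
```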
## Run Detection
```bash
uv sync --group dev --group detection
uv run pose-tracking-exp run_detection --config detection.toml camera0 camera1
uv run pose-tracking-exp run_detection --source video --output-dir data/detections --config detection.toml cam0=/data/cam0.mp4 cam1=/data/cam1.mp4
```
The embedded 2D detection module is organized as a swappable shim:

- `FrameSource`: where images come from
- `PoseShim`: object detection + pose estimation backend
- `PoseSink`: where structured detections are published or stored

The default backend is `yolo_rtmpose`, and the heavy runtime dependencies live in the optional `detection` dependency group.
Checkpoint paths are explicit config fields; the code does not hardcode local checkpoint locations.
The only inferred path is the MMPose config path, which is resolved relative to the installed `mmpose` package when `pose_config_path` is omitted.
For offline video runs, the default sink is parquet and writes one `*_detected.parquet` file per source. `run_tracking` can consume that directory directly as replay input.
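The commit message describes these seams as typed protocols. A minimal sketch of that shape; the three protocol names come from the list above, while the method names, signatures, and driver function are illustrative guesses, not the package's actual API:

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Protocol

@dataclass
class Frame:
    camera: str
    index: int

@dataclass
class PoseResult:
    camera: str
    index: int
    keypoints: List[tuple] = field(default_factory=list)

class FrameSource(Protocol):
    """Where images come from (cvmmap stream, headless video file, ...)."""
    def frames(self) -> Iterator[Frame]: ...

class PoseShim(Protocol):
    """Object detection + pose estimation backend (e.g. yolo_rtmpose)."""
    def infer(self, frame: Frame) -> PoseResult: ...

class PoseSink(Protocol):
    """Where structured detections go (NATS, parquet, ...)."""
    def publish(self, result: PoseResult) -> None: ...

def run_pipeline(source: FrameSource, shim: PoseShim, sink: PoseSink) -> int:
    """Drive every frame through the shim into the sink; returns frame count."""
    n = 0
    for frame in source.frames():
        sink.publish(shim.infer(frame))
        n += 1
    return n

# Tiny in-memory implementations to show the seams composing:
class ListSource:
    def __init__(self, frames): self._frames = frames
    def frames(self): return iter(self._frames)

class NullShim:
    def infer(self, frame): return PoseResult(frame.camera, frame.index)

class ListSink:
    def __init__(self): self.results = []
    def publish(self, result): self.results.append(result)

sink = ListSink()
assert run_pipeline(ListSource([Frame("cam0", 0), Frame("cam0", 1)]), NullShim(), sink) == 2
```

Structural typing via `Protocol` means a new backend or sink only has to match the method shape; it never imports or subclasses the shim module.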
Example `detection.toml`:

```toml
instances = ["camera0", "camera1"]
device = "cuda"
yolo_checkpoint = "/path/to/yolo_checkpoint.pt"
pose_checkpoint = "/path/to/coco_wholebody_pose_checkpoint.pth"
```
## Actual Test Helper

```bash
uv run --group dev --group detection python -m tests.support.actual_test /mnt/hddl/data/ActualTest_WeiHua --segment Segment_2 --frame-start 1100 --max-frames 120
```
`actual_test` is a test/support helper, not part of the public installed CLI surface.
It keeps the union of per-camera frame indices and fills missing camera rows with empty detections, so later 2-camera stretches are still usable instead of being dropped by a 4-camera intersection.
## Actual Test Calibration Caveat

`ActualTest_WeiHua/camera_params.parquet` appears to store raw OpenCV extrinsics from the ChArUco pipeline, not camera poses. The tracker now converts those values before calling `RapidPoseTriangulation`, because RPT expects camera centers and camera-to-world rotation.
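The conversion involved is the standard one: OpenCV extrinsics map world points into the camera frame as `x_cam = R @ x_world + t`, so the camera center is `-R^T t` and the camera-to-world rotation is `R^T`. A dependency-free sketch under that convention; the function names are illustrative, not the tracker's actual code:

```python
def transpose3(r):
    """Transpose of a 3x3 matrix given as nested lists."""
    return [[r[j][i] for j in range(3)] for i in range(3)]

def matvec3(r, v):
    """3x3 matrix times a length-3 vector."""
    return [sum(r[i][j] * v[j] for j in range(3)) for i in range(3)]

def opencv_extrinsics_to_camera_pose(r, t):
    """r, t: world-to-camera rotation (3x3) and translation (3,),
    i.e. x_cam = R @ x_world + t.

    Returns (camera_center, cam_to_world_rotation), the RPT-style pose
    quantities: center = -R^T t, rotation = R^T."""
    r_c2w = transpose3(r)
    center = [-c for c in matvec3(r_c2w, t)]
    return center, r_c2w

# A camera sitting at world position (0, 0, 5) with identity orientation:
r = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, -5.0]  # x_cam = x_world - (0, 0, 5)
center, r_c2w = opencv_extrinsics_to_camera_pose(r, t)
assert center == [0.0, 0.0, 5.0]
```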