# pose_tracking_exp

Offline multiview body tracking experiments built around:

- `RapidPoseTriangulation` for geometric birth proposals
- a typed replay format for recorded per-camera detections
- a recursive active/lost tracker with fixed bone lengths
## Install

```sh
uv sync --group dev
```
## Run

```sh
uv run pose-tracking-exp run_tracking data/scene.json data/replay.jsonl
```
`scene.json` may declare camera extrinsics in either format:

- `opencv_world_to_camera`: the OpenCV `solvePnP`/`cv2.projectPoints` convention. This is the default.
- `rpt_camera_pose`: camera pose in world coordinates, which is what `RapidPoseTriangulation` expects internally.

The loader normalizes both to OpenCV extrinsics for reprojection and converts to RPT pose only when building the triangulation config.

If you already have an older hand-authored scene file that stored RPT camera pose directly, set `extrinsic_format` explicitly to `rpt_camera_pose`.
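A minimal sketch of where the selector might sit in `scene.json`. Apart from `extrinsic_format` itself and its two documented values, every field name below (`cameras`, `name`, `K`, `R`, `t`) is an illustrative placeholder, not the repo's actual schema:

```json
{
  "extrinsic_format": "opencv_world_to_camera",
  "cameras": [
    {
      "name": "camera0",
      "K": [[900.0, 0.0, 640.0], [0.0, 900.0, 360.0], [0.0, 0.0, 1.0]],
      "R": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
      "t": [0.0, 0.0, 2.5]
    }
  ]
}
```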
## Convert cvmmap Pose Payload Records

```sh
uv run pose-tracking-exp convert-cvmmap-pose input.jsonl output.jsonl
```
The current cvmmap `.pose` wire format is fixed to COCO-WholeBody-133 keypoints.
That is a transport compatibility constraint, not a tracker limitation: the tracker-side normalizer accepts both `coco17` and `coco_wholebody133`, because the first 17 body joints share the standard COCO ordering.
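That shared prefix is why normalization can be a simple slice. A minimal sketch, assuming `(N, 3)` arrays of `[x, y, score]` rows; the function and constant names here are illustrative, not the repo's actual normalizer API:

```python
import numpy as np

COCO_BODY_JOINTS = 17  # coco17 and coco_wholebody133 share these first rows

def body_keypoints(kpts: np.ndarray) -> np.ndarray:
    """Reduce an (N, 3) [x, y, score] array to the 17 COCO body joints.

    Accepts coco17 (N=17) or coco_wholebody133 (N=133); wholebody's first
    17 rows follow the standard COCO body ordering, so slicing is enough.
    """
    if kpts.shape[0] not in (17, 133):
        raise ValueError(f"unexpected keypoint count: {kpts.shape[0]}")
    return kpts[:COCO_BODY_JOINTS]
```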
References:
- https://mmpose.readthedocs.io/en/latest/dataset_zoo/2d_wholebody_keypoint.html
- https://github.com/jin-s13/COCO-WholeBody
## Run Detection

```sh
uv sync --group dev --group detection
uv run patch-mmdet-version-gate
uv run pose-tracking-exp run_detection --config detection.toml camera0 camera1
uv run pose-tracking-exp run_detection --source video --output-dir data/detections --config detection.toml cam0=/data/cam0.mp4 cam1=/data/cam1.mp4
```
The embedded 2D detection module is organized as a swappable shim:

- `FrameSource`: where images come from
- `PoseShim`: object detection + pose estimation backend
- `PoseSink`: where structured detections are published or stored
The default backend is `yolo_rtmpose`, and the heavy runtime dependencies live in the optional `detection` dependency group.
Checkpoint paths are explicit config fields; the code does not hardcode local checkpoint locations.
The only inferred path is the MMPose config path, which is resolved relative to the installed mmpose package when `pose_config_path` is omitted.
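The three seams above can be pictured as small interfaces plus a driver loop. This is a hedged sketch only — the repo's actual types, method names, and detection structures are not shown here:

```python
from typing import Iterator, Protocol, Tuple

Image = object      # stand-in for an ndarray frame
Detections = list   # stand-in for the repo's structured detection type

class FrameSource(Protocol):
    """Where images come from (live cameras, video files, ...)."""
    def frames(self) -> Iterator[Tuple[str, int, Image]]:
        """Yield (source_name, frame_index, image) triples."""
        ...

class PoseShim(Protocol):
    """Object detection + pose estimation backend (default: yolo_rtmpose)."""
    def infer(self, image: Image) -> Detections: ...

class PoseSink(Protocol):
    """Where structured detections are published or stored."""
    def write(self, source: str, frame_index: int,
              detections: Detections) -> None: ...

def run_pipeline(source: FrameSource, shim: PoseShim, sink: PoseSink) -> None:
    # The driver only touches the three interfaces, so each seam is swappable.
    for name, index, image in source.frames():
        sink.write(name, index, shim.infer(image))
```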
For offline video runs, the default sink is parquet and writes one `*_detected.parquet` file per source. `run_tracking` can consume that directory directly as replay input.
`uv run patch-mmdet-version-gate` is an idempotent local-environment patch for the current mmdet compatibility assert against the rebuilt mmcv wheel. Re-run it after `uv sync` if the environment is recreated.
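A rough sketch of the technique, not the repo's entrypoint: assuming the gate is a single-statement `assert (<condition>), <message>` referencing `mmcv_version` (real mmdet versions format this across several lines, so the actual patch needs a more robust match, and the target module must already import `warnings`), an idempotent text rewrite could look like:

```python
import re

MARKER = "# mmcv version gate relaxed to a warning"

def relax_version_gate(source: str) -> str:
    """Rewrite a one-line `assert (<cond>), <msg>` mmcv gate into a warning.

    Idempotent: a marker comment left by a previous run short-circuits the
    rewrite, so re-running after `uv sync` is a no-op on patched sources.
    """
    if MARKER in source:
        return source  # already patched

    pattern = re.compile(r"^([ \t]*)assert (\(.*mmcv_version.*\)), (.+)$", re.M)

    def to_warning(m: re.Match) -> str:
        indent, cond, msg = m.group(1), m.group(2), m.group(3)
        return (
            f"{indent}{MARKER}\n"
            f"{indent}if not {cond}:\n"
            f"{indent}    warnings.warn({msg})"
        )

    return pattern.sub(to_warning, source, count=1)
```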
Example `detection.toml`:

```toml
instances = ["camera0", "camera1"]
device = "cuda"
yolo_checkpoint = "/path/to/yolo_checkpoint.pt"
pose_checkpoint = "/path/to/coco_wholebody_pose_checkpoint.pth"
```
## Actual Test Helper

```sh
uv run --group dev --group detection python -m tests.support.actual_test /mnt/hddl/data/ActualTest_WeiHua --segment Segment_2 --frame-start 1100 --max-frames 120
```

`actual_test` is a test/support helper, not part of the public installed CLI surface.
It keeps the union of per-camera frame indices and fills missing camera rows with empty detections, so later 2-camera stretches are still usable instead of being dropped by a 4-camera intersection.
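That union-and-fill behavior can be sketched as follows; this is illustrative only, not the helper's actual code or types:

```python
def union_frames(
    per_camera: dict[str, dict[int, list]],
) -> dict[int, dict[str, list]]:
    """Index detections by the union of frame indices across cameras.

    Cameras missing a frame get an empty detection list, so a stretch where
    only 2 of 4 cameras saw the frame survives instead of being dropped by
    an intersection over all cameras.
    """
    all_frames = sorted(set().union(*(d.keys() for d in per_camera.values())))
    return {
        f: {cam: dets.get(f, []) for cam, dets in per_camera.items()}
        for f in all_frames
    }
```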
## Actual Test Calibration Caveat

`ActualTest_WeiHua/camera_params.parquet` appears to store raw OpenCV extrinsics from the ChArUco pipeline, not camera poses. The tracker now converts those values before calling `RapidPoseTriangulation`, because RPT expects camera centers and camera-to-world rotation.
In repo terms:

- OpenCV reprojection keeps `R`, `T`, and `rvec` as world-to-camera extrinsics.
- RPT export uses the derived camera pose `pose_R = R^T` and `pose_T = -R^T t`.
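The conversion checks out in a few lines of numpy (a standalone sanity check, not the repo's code): with world-to-camera extrinsics `R`, `t`, the camera center is `-R^T t` and the camera-to-world rotation is `R^T`, and both representations project points identically.

```python
import numpy as np

rng = np.random.default_rng(0)

# World-to-camera extrinsics (OpenCV convention): x_cam = R @ x_world + t.
# Build a valid rotation via QR decomposition.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
R = Q * np.sign(np.linalg.det(Q))  # force det(R) = +1
t = rng.normal(size=3)

# Derived camera pose (RPT convention): camera-to-world rotation and center.
pose_R = R.T
pose_T = -R.T @ t  # camera center in world coordinates

# The extrinsics map the camera center to the camera-frame origin.
assert np.allclose(R @ pose_T + t, 0.0)

# A world point lands at the same camera-frame coordinates via either form.
x_world = rng.normal(size=3)
x_cam = R @ x_world + t
assert np.allclose(pose_R.T @ (x_world - pose_T), x_cam)
```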
There is still one upstream caveat: the ParaJumping calibration notebook averages `rvec` samples component-wise before writing the parquet. That is a rough approximation for rotations and can introduce bias even when the convention is handled correctly.
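A quick numpy illustration of the failure mode (not the notebook's code): two samples of essentially the same ~180° rotation about z, taken at +179° and −179°, average component-wise to the identity rotation rather than anything near 180°.

```python
import numpy as np

# Two noisy samples of (nearly) the same rotation: ~180 deg about z.
# As rotations they differ by only 2 deg, but their rvecs point opposite ways.
rvec_a = np.array([0.0, 0.0, np.deg2rad(179.0)])
rvec_b = np.array([0.0, 0.0, np.deg2rad(-179.0)])

# Component-wise mean, as the calibration notebook does.
rvec_mean = (rvec_a + rvec_b) / 2.0

# The rotation angle of an axis-angle vector is its norm: the "average"
# collapses to the identity, ~180 deg away from both samples.
angle_mean_deg = np.rad2deg(np.linalg.norm(rvec_mean))
assert angle_mean_deg == 0.0
```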