docs(readme): rewrite repo status and fork context

Rewrite the top-level README to describe the current Python-first package, the RGB-D pipeline ported from SimpleDepthPose, and the main differences from upstream RapidPoseTriangulation.
2026-03-26 13:14:06 +08:00
parent ed721729fd
commit 502a90761b
+146 -40
@@ -1,10 +1,11 @@
# RapidPoseTriangulation
Fast triangulation of multiple persons from multiple camera views. \
A general overview can be found in the paper [RapidPoseTriangulation: Multi-view Multi-person Whole-body Human Pose Triangulation in a Millisecond](https://arxiv.org/pdf/2503.21692).
Fast multi-view multi-person pose reconstruction, packaged as a Python-first C++ library.
This repository started from [RapidPoseTriangulation](https://arxiv.org/pdf/2503.21692) and has since been refactored into a slimmer library-focused fork. The current repo keeps the triangulation core, exposes it through `nanobind`, and adds an RGB-D reconstruction path inspired by [SimpleDepthPose](https://arxiv.org/pdf/2501.18478).
<div align="center">
<img src="media/2d-k.jpg" alt="2D detections"" width="65%"/>
<img src="media/2d-k.jpg" alt="2D detections" width="65%"/>
<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</b>
<img src="media/3d-p.jpg" alt="3D detections" width="30%"/>
<br>
@@ -12,58 +13,151 @@ A general overview can be found in the paper [RapidPoseTriangulation: Multi-view
<img src="media/2d-p.jpg" alt="3D to 2D projection" width="95%"/>
</div>
<br>
## What This Repository Is Now
## Build
- A packaged library named `rapid-pose-triangulation` with Python bindings under `rpt`
- A C++ core built with `scikit-build-core` and `nanobind`
- A triangulation library for calibrated multi-view 2D detections
- An RGB-D reconstruction helper layer that samples aligned depth, applies joint offsets, lifts poses into world coordinates, and merges per-view proposals
- Clone this project:
Current package status:
```bash
git clone https://gitlab.com/Percipiote/RapidPoseTriangulation.git
cd RapidPoseTriangulation/
```
- Python `>=3.10`
- NumPy runtime dependency
- Current version: `0.2.0`
- Enable GPU-access for docker building:
## Current Capabilities
- Install _nvidia_ container tools: [Link](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
The public Python API exposed by `rpt` currently includes:
- Run `sudo nano /etc/docker/daemon.json` and add:
- Camera/config helpers: `make_camera`, `convert_cameras`, `make_triangulation_config`
- Input preparation: `pack_poses_2d`
- Triangulation: `triangulate_poses`, `triangulate_debug`, `triangulate_with_report`
- Tracking/debug helpers: `filter_pairs_with_previous_poses`
- RGB-D helpers: `sample_depth_for_poses`, `apply_depth_offsets`, `lift_depth_poses_to_world`, `merge_rgbd_views`, `reconstruct_rgbd`
```json
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    },
    "default-runtime": "nvidia"
}
```
At a high level there are now two supported reconstruction paths:
- Restart docker: `sudo systemctl restart docker`
1. Multi-view RGB triangulation from calibrated 2D detections
2. Multi-view RGB-D reconstruction from calibrated 2D detections plus aligned depth images
- Build docker container:
## Installation And Development Workflow
```bash
docker build --progress=plain -t rapidposetriangulation .
./run_container.sh
```
Clone the repo and use `uv` for local development:
- Build triangulator:
```bash
git clone https://git.weihua-iot.cn/crosstyan/RapidPoseTriangulation.git
cd RapidPoseTriangulation
uv sync --group dev
```
```bash
cd /RapidPoseTriangulation/
uv sync --group dev
uv run pytest tests/test_interface.py
uv build
```
Run the test suite:
<br>
```bash
uv run pytest -q
```
## Citation
Build source and wheel artifacts:
Please cite [RapidPoseTriangulation](https://arxiv.org/pdf/2503.21692) if you found it helpful for your research or business.
```bash
uv build
```
`run_container.sh` is still present in the repo, but it is a leftover helper script rather than the primary or best-supported development workflow.
## Python API Overview
Typical triangulation flow:
```python
import numpy as np
import rpt
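# raw_cameras, joint_names, and views are placeholders for your own
# calibration and 2D detection data.
cameras = rpt.convert_cameras(raw_cameras)
config = rpt.make_triangulation_config(
    cameras,
    roomparams=np.asarray([[5.6, 6.4, 2.4], [0.0, -0.5, 1.2]], dtype=np.float32),
    joint_names=joint_names,
)

# Pack the per-view 2D detections, then triangulate them into 3D poses.
poses_2d, person_counts = rpt.pack_poses_2d(views, joint_count=len(joint_names))
poses_3d = rpt.triangulate_poses(poses_2d, person_counts, config)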
```
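For readers new to triangulation, the following is a minimal NumPy sketch of the textbook linear (DLT) method for recovering one 3D joint from several calibrated views. It is an illustration only and not the algorithm used inside `rpt`, which is described in the paper. The function name `triangulate_point_dlt` and the assumption that each camera is given as a 3x4 projection matrix are hypothetical.

```python
import numpy as np


def triangulate_point_dlt(projections, points_2d):
    """Triangulate one 3D point from N >= 2 views with linear DLT.

    projections: (N, 3, 4) projection matrices P = K [R | t] (assumed layout).
    points_2d:   (N, 2) pixel coordinates of the same joint in each view.
    """
    projections = np.asarray(projections, dtype=np.float64)
    points_2d = np.asarray(points_2d, dtype=np.float64)

    # Each view contributes two linear constraints on the homogeneous point X:
    #   u * (P[2] @ X) - P[0] @ X = 0
    #   v * (P[2] @ X) - P[1] @ X = 0
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)

    # The least-squares solution is the right singular vector that belongs
    # to the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

In a multi-person setting this per-joint step is only one part of the problem; associating detections across views is a separate step that the sketch does not cover.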
Typical RGB-D flow:
```python
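# Reuses `config`, `views`, and `joint_names` from the triangulation example;
# depth_images holds the aligned depth images.
poses_2d, person_counts = rpt.pack_poses_2d(views, joint_count=len(joint_names))
poses_3d = rpt.reconstruct_rgbd(
    poses_2d,
    person_counts,
    depth_images,
    config,
    use_depth_offsets=True,
)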
```
The lower-level RGB-D helpers are also available if you want to inspect or customize the intermediate steps:
- `sample_depth_for_poses`: sample aligned depth around visible 2D joints
- `apply_depth_offsets`: add per-joint offsets derived from SimpleDepthPose
- `lift_depth_poses_to_world`: convert `[u, v, d, score]` joints into world-space `[x, y, z, score]`
- `merge_rgbd_views`: merge per-view world-space pose proposals into final poses
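To make the lifting step concrete, here is a minimal NumPy sketch of back-projecting `[u, v, d, score]` joints through a pinhole camera into world space. It is not the `rpt` implementation. The function name `lift_uvd_to_world` is hypothetical, and the sketch assumes that `d` is depth along the optical axis, that lens distortion is ignored, and that the extrinsics map world to camera as `X_cam = R @ X_world + t`; the conventions used by `rpt` may differ.

```python
import numpy as np


def lift_uvd_to_world(joints_uvd, K, R, t):
    """Lift (J, 4) rows of [u, v, d, score] to (J, 4) rows of [x, y, z, score].

    K: (3, 3) pinhole intrinsics. R, t: world-to-camera rotation and
    translation, i.e. X_cam = R @ X_world + t (assumed convention).
    """
    joints_uvd = np.asarray(joints_uvd, dtype=np.float64)
    K = np.asarray(K, dtype=np.float64)
    R = np.asarray(R, dtype=np.float64)
    t = np.asarray(t, dtype=np.float64).reshape(3)

    u, v, d, score = joints_uvd.T
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]

    # Back-project each pixel to a 3D point in the camera frame.
    x_cam = (u - cx) / fx * d
    y_cam = (v - cy) / fy * d
    pts_cam = np.stack([x_cam, y_cam, d], axis=1)

    # Invert the extrinsics: X_world = R^T (X_cam - t).
    # With row vectors this becomes (X_cam - t) @ R.
    pts_world = (pts_cam - t) @ R
    return np.column_stack([pts_world, score])
```

Joints with missing depth would need to be masked out before or after this step; the sketch leaves that to the caller.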
## Ported From SimpleDepthPose
This fork ports the RGB-D fusion path from the sibling `SimpleDepthPose` project into `rpt`.
The ported pieces are:
- Depth sampling around each visible 2D joint, based on the `add_depth` preprocessing flow
- Per-joint depth offsets, matching the SimpleDepthPose body-surface correction idea
- UVD-to-world lifting using the calibrated camera intrinsics/extrinsics
- Multi-view RGB-D pose fusion logic adapted from `PoseFuser`
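As a rough illustration of the depth sampling idea in the list above, the sketch below takes the median of the valid depth values in a small window around a 2D joint. It is not the code ported from SimpleDepthPose. The function name `sample_depth_median`, the window radius, and the convention that a depth of zero marks a missing measurement are all assumptions made for this example.

```python
import numpy as np


def sample_depth_median(depth, u, v, radius=2):
    """Median of the valid depth values in a square window around (u, v).

    depth: (H, W) depth image aligned with the color image.
    Returns 0.0 if the joint is outside the image or no valid depth exists.
    Assumes that a value of 0 marks a missing measurement.
    """
    h, w = depth.shape
    ui, vi = int(round(u)), int(round(v))
    if not (0 <= ui < w and 0 <= vi < h):
        return 0.0

    # Clamp the window to the image borders.
    patch = depth[max(vi - radius, 0): vi + radius + 1,
                  max(ui - radius, 0): ui + radius + 1]
    valid = patch[patch > 0]
    return float(np.median(valid)) if valid.size else 0.0
```

A median is used here because it tolerates a few background or hole pixels inside the window better than a mean would.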
Compared with the original SimpleDepthPose implementation, the port here has been changed to fit a reusable library:
- The workflow is exposed as stateless functions instead of script-driven pipelines
- The fusion logic lives in the `rpt` core instead of a separate wrapper class
- Camera and scene configuration are routed through `TriangulationConfig`
- The RGB-D path is covered by repo tests and packaged with the same Python API as the triangulation path
This repo does not attempt to port the full SimpleDepthPose project. It only ports the RGB-D reconstruction pieces that fit the current library scope.
## Changed Vs Upstream RapidPoseTriangulation
Compared with the upstream repository at `gitlab.com/Percipiote/RapidPoseTriangulation`, this fork has materially changed structure and scope:
- SWIG bindings were replaced with `nanobind`
- The repo was converted into a Python package under `src/rpt`
- The triangulation interface was simplified around immutable cameras and config structs
- The core was reshaped into a more library-oriented, zero-copy style API
- Debug tracing and tracked association reports were added
- Upstream integration layers and extra tooling were removed, including the old `extras/` stack and related deployment/inference wrappers
- An RGB-D reconstruction pipeline was added by porting and adapting parts of SimpleDepthPose
In practice, upstream is closer to a larger project tree with integrations and historical tooling, while this fork is closer to a compact reconstruction library.
## Testing
The repo currently ships Python-facing tests for both triangulation and RGB-D reconstruction:
```bash
uv run pytest tests/test_interface.py
uv run pytest tests/test_rgbd.py
```
Or run everything:
```bash
uv run pytest -q
```
The checked-in sample data under `data/` is used by the triangulation tests.
## Citation And Related Work
Please cite [RapidPoseTriangulation](https://arxiv.org/pdf/2503.21692) if you use the triangulation core:
```bibtex
@article{
@@ -74,3 +168,15 @@ Please cite [RapidPoseTriangulation](https://arxiv.org/pdf/2503.21692) if you fo
year={2025}
}
```
The RGB-D path in this fork is based on ideas ported from [SimpleDepthPose](https://arxiv.org/pdf/2501.18478):
```bibtex
@article{
  simdepose,
  title={{SimpleDepthPose: Fast and Reliable Human Pose Estimation with RGBD-Images}},
  author={Bermuth, Daniel and Poeppel, Alexander and Reif, Wolfgang},
  journal={arXiv preprint arXiv:2501.18478},
  year={2025}
}
```