docs(readme): rewrite repo status and fork context

Rewrite the top-level README to describe the current Python-first package, the RGB-D pipeline ported from SimpleDepthPose, and the main differences from upstream RapidPoseTriangulation.
2026-03-26 13:14:06 +08:00
parent ed721729fd
commit 502a90761b
1 changed file with 146 additions and 40 deletions
@@ -1,10 +1,11 @@
 # RapidPoseTriangulation

-Fast triangulation of multiple persons from multiple camera views. \
-A general overview can be found in the paper [RapidPoseTriangulation: Multi-view Multi-person Whole-body Human Pose Triangulation in a Millisecond](https://arxiv.org/pdf/2503.21692).
+Fast multi-view multi-person pose reconstruction, packaged as a Python-first C++ library.
+
+This repository started from [RapidPoseTriangulation](https://arxiv.org/pdf/2503.21692) and has since been refactored into a slimmer, library-focused fork. It keeps the triangulation core, exposes it through `nanobind`, and adds an RGB-D reconstruction path inspired by [SimpleDepthPose](https://arxiv.org/pdf/2501.18478).

 <div align="center">
-    <img src="media/2d-k.jpg" alt="2D detections"" width="65%"/>
+    <img src="media/2d-k.jpg" alt="2D detections" width="65%"/>
    <b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</b>
    <img src="media/3d-p.jpg" alt="3D detections" width="30%"/>
    <br>
@@ -12,58 +13,151 @@ A general overview can be found in the paper [RapidPoseTriangulation: Multi-view
    <img src="media/2d-p.jpg" alt="3D to 2D projection" width="95%"/>
 </div>

-<br>
+## What This Repository Is Now

-## Build
+- A packaged library named `rapid-pose-triangulation` with Python bindings under `rpt`
+- A C++ core built with `scikit-build-core` and `nanobind`
+- A triangulation library for calibrated multi-view 2D detections
+- An RGB-D reconstruction helper layer that samples aligned depth, applies joint offsets, lifts poses into world coordinates, and merges per-view proposals

- Clone this project:
+Current package status:

-    ```bash
-    git clone https://gitlab.com/Percipiote/RapidPoseTriangulation.git
-    cd RapidPoseTriangulation/
-    ```
+- Python `>=3.10`
+- NumPy runtime dependency
+- Current version: `0.2.0`

- Enable GPU-access for docker building:
+## Current Capabilities

-  - Install _nvidia_ container tools: [Link](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
+The public Python API exposed by `rpt` currently includes:

-  - Run `sudo nano /etc/docker/daemon.json` and add:
+- Camera/config helpers: `make_camera`, `convert_cameras`, `make_triangulation_config`
+- Input preparation: `pack_poses_2d`
+- Triangulation: `triangulate_poses`, `triangulate_debug`, `triangulate_with_report`
+- Tracking/debug helpers: `filter_pairs_with_previous_poses`
+- RGB-D helpers: `sample_depth_for_poses`, `apply_depth_offsets`, `lift_depth_poses_to_world`, `merge_rgbd_views`, `reconstruct_rgbd`

-    ```json
-    {
-      "runtimes": {
-        "nvidia": {
-          "args": [],
-          "path": "nvidia-container-runtime"
-        }
-      },
-      "default-runtime": "nvidia"
-    }
-    ```
+At a high level, there are now two supported reconstruction paths:

-  - Restart docker: `sudo systemctl restart docker`
+1. Multi-view RGB triangulation from calibrated 2D detections
+2. Multi-view RGB-D reconstruction from calibrated 2D detections plus aligned depth images

- Build docker container:
+## Installation And Development Workflow

-  ```bash
-  docker build --progress=plain -t rapidposetriangulation .
-  ./run_container.sh
-  ```
+Clone the repo and use `uv` for local development:

- Build triangulator:
+```bash
+git clone https://git.weihua-iot.cn/crosstyan/RapidPoseTriangulation.git
+cd RapidPoseTriangulation
+uv sync --group dev
+```

-  ```bash
-  cd /RapidPoseTriangulation/
-  uv sync --group dev
-  uv run pytest tests/test_interface.py
-  uv build
-  ```
+Run the test suite:

-<br>
+```bash
+uv run pytest -q
+```

-## Citation
+Build source and wheel artifacts:

-Please cite [RapidPoseTriangulation](https://arxiv.org/pdf/2503.21692) if you found it helpful for your research or business.
+```bash
+uv build
+```
+
+`run_container.sh` is still present in the repo, but it is a leftover helper script rather than the primary or best-supported development workflow.
+
+## Python API Overview
+
+Typical triangulation flow:
+
+```python
+import numpy as np
+import rpt
+
+cameras = rpt.convert_cameras(raw_cameras)  # raw_cameras: caller-supplied calibration data
+config = rpt.make_triangulation_config(
+    cameras,
+    roomparams=np.asarray([[5.6, 6.4, 2.4], [0.0, -0.5, 1.2]], dtype=np.float32),
+    joint_names=joint_names,
+)
+
+poses_2d, person_counts = rpt.pack_poses_2d(views, joint_count=len(joint_names))
+poses_3d = rpt.triangulate_poses(poses_2d, person_counts, config)
+```
+
+Typical RGB-D flow:
+
+```python
+poses_2d, person_counts = rpt.pack_poses_2d(views, joint_count=len(joint_names))
+poses_3d = rpt.reconstruct_rgbd(
+    poses_2d,
+    person_counts,
+    depth_images,
+    config,
+    use_depth_offsets=True,
+)
+```
+
+The lower-level RGB-D helpers are also available if you want to inspect or customize the intermediate steps:
+
+- `sample_depth_for_poses`: sample aligned depth around visible 2D joints
+- `apply_depth_offsets`: add per-joint offsets derived from SimpleDepthPose
+- `lift_depth_poses_to_world`: convert `[u, v, d, score]` joints into world-space `[x, y, z, score]`
+- `merge_rgbd_views`: merge per-view world-space pose proposals into final poses
+
+## Ported From SimpleDepthPose
+
+This fork ports the RGB-D fusion path from the sibling `SimpleDepthPose` project into `rpt`.
+
+The ported pieces are:
+
+- Depth sampling around each visible 2D joint, based on the `add_depth` preprocessing flow
+- Per-joint depth offsets, matching the SimpleDepthPose body-surface correction idea
+- UVD-to-world lifting using the calibrated camera intrinsics/extrinsics
+- Multi-view RGB-D pose fusion logic adapted from `PoseFuser`
+
+Compared with the original SimpleDepthPose implementation, the port has been adapted to fit a reusable library:
+
+- The workflow is exposed as stateless functions instead of script-driven pipelines
+- The fusion logic lives in the `rpt` core instead of a separate wrapper class
+- Camera and scene configuration are routed through `TriangulationConfig`
+- The RGB-D path is covered by repo tests and packaged with the same Python API as the triangulation path
+
+This repo does not attempt to port the full SimpleDepthPose project. It only ports the RGB-D reconstruction pieces that fit the current library scope.
+
+## Changed Vs Upstream RapidPoseTriangulation
+
+Compared with the upstream repository at `gitlab.com/Percipiote/RapidPoseTriangulation`, this fork differs materially in structure and scope:
+
+- SWIG bindings were replaced with `nanobind`
+- The repo was converted into a Python package under `src/rpt`
+- The triangulation interface was simplified around immutable cameras and config structs
+- The core was reshaped into a more library-oriented, zero-copy style API
+- Debug tracing and tracked association reports were added
+- Upstream integration layers and extra tooling were removed, including the old `extras/` stack and related deployment/inference wrappers
+- An RGB-D reconstruction pipeline was added by porting and adapting parts of SimpleDepthPose
+
+In practice, upstream is closer to a larger project tree with integrations and historical tooling, while this fork is closer to a compact reconstruction library.
+
+## Testing
+
+The repo currently ships Python-facing tests for both triangulation and RGB-D reconstruction:
+
+```bash
+uv run pytest tests/test_interface.py
+uv run pytest tests/test_rgbd.py
+```
+
+Or run everything:
+
+```bash
+uv run pytest -q
+```
+
+The checked-in sample data under `data/` is used by the triangulation tests.
+
+## Citation And Related Work
+
+Please cite [RapidPoseTriangulation](https://arxiv.org/pdf/2503.21692) if you use the triangulation core:

 ```bibtex
@article{
@@ -74,3 +168,15 @@ Please cite [RapidPoseTriangulation](https://arxiv.org/pdf/2503.21692) if you fo
  year={2025}
 }
 ```
+
+The RGB-D path in this fork is based on ideas ported from [SimpleDepthPose](https://arxiv.org/pdf/2501.18478):
+
+```bibtex
+@article{
+  simdepose,
+  title={{SimpleDepthPose: Fast and Reliable Human Pose Estimation with RGBD-Images}},
+  author={Bermuth, Daniel and Poeppel, Alexander and Reif, Wolfgang},
+  journal={arXiv preprint arXiv:2501.18478},
+  year={2025}
+}
+```
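
Note on the diff above: the new README describes `lift_depth_poses_to_world` as converting `[u, v, d, score]` joints into world-space `[x, y, z, score]` using the calibrated intrinsics and extrinsics. As a rough illustration of that idea, the NumPy sketch below back-projects joints through an undistorted pinhole model. It is not the `rpt` implementation: the function name `lift_uvd_to_world`, the intrinsics matrix `K`, and the world-to-camera convention `x_cam = R @ x_world + t` are assumptions made for this example, and lens distortion is ignored.

```python
import numpy as np


def lift_uvd_to_world(joints_uvd, K, R, t):
    """Lift [u, v, d, score] joints to world-space [x, y, z, score].

    Hypothetical helper, not part of rpt. Assumes an undistorted pinhole
    camera with 3x3 intrinsics K and world-to-camera extrinsics (R, t),
    i.e. x_cam = R @ x_world + t, with d measured along the camera z-axis.
    """
    joints_uvd = np.asarray(joints_uvd, dtype=np.float64)
    K = np.asarray(K, dtype=np.float64)
    R = np.asarray(R, dtype=np.float64)
    t = np.asarray(t, dtype=np.float64)

    u, v, d, score = joints_uvd.T
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]

    # Back-project each pixel to a 3D point in the camera frame.
    x_cam = (u - cx) / fx * d
    y_cam = (v - cy) / fy * d
    points_cam = np.stack([x_cam, y_cam, d], axis=1)

    # Invert the extrinsics: x_world = R^T @ (x_cam - t), written row-wise.
    points_world = (points_cam - t) @ R
    return np.concatenate([points_world, score[:, None]], axis=1)


# Illustrative values only: one camera at the world origin, two joints.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
joints = np.array([[320.0, 240.0, 2.0, 0.9], [420.0, 240.0, 2.0, 0.8]])
world = lift_uvd_to_world(joints, K, np.eye(3), np.zeros(3))
```

With an identity pose, a joint at the principal point lands on the optical axis at its sampled depth, and the score column is passed through unchanged. A real pipeline would also need to handle missing depth and the extrinsics convention actually used by the calibration data.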