commit ee7e8641f89075936fcfd1bb859757be5f5adc5a
Author: crosstyan
Date:   Tue Dec 9 15:54:50 2025 +0800

    init

diff --git a/.DS_Store b/.DS_Store
new file mode 100644
index 0000000..5496de1
Binary files /dev/null and b/.DS_Store differ
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..c107498
--- /dev/null
+++ b/README.md
@@ -0,0 +1,106 @@
+# Homework 01: Survey & Design of 3D Multi-view Human Pose Estimation System
+
+Survey existing multi-view single-person and multi-view multi-person
+human pose estimation systems/pipelines.
+
+## Reference answer
+
+An AI-generated architecture diagram; it does not represent my actual thinking at this moment, which changes every moment.
+
+```mermaid
+flowchart TD
+    %% =========================
+    %% Multi-view 2D stage
+    %% =========================
+    subgraph VIEWS["Per-view input (cameras 1..N)"]
+        direction LR
+        C1["Cam 1\n2D detections: 133×2 (+conf)"] --> T1["2D latest tracking cache\n(view 1)"]
+        C2["Cam 2\n2D detections: 133×2 (+conf)"] --> T2["2D latest tracking cache\n(view 2)"]
+        C3["Cam 3\n2D detections: 133×2 (+conf)"] --> T3["2D latest tracking cache\n(view 3)"]
+        C4["Cam 4\n2D detections: 133×2 (+conf)"] --> T4["2D latest tracking cache\n(view 4)"]
+    end
+
+    %% =========================
+    %% Cross-view association
+    %% =========================
+    subgraph ASSOC["Cross-view data association (epipolar)"]
+        direction TB
+        EPI["Epipolar constraint\n(Sampson / point-to-epiline)"]:::core
+        CYCLE["Cycle consistency / view-graph pruning"]:::core
+        GROUP["Assemble per-target multi-view observation set\n{view_id → 133×2}"]:::core
+        EPI --> CYCLE --> GROUP
+    end
+
+    T1 --> EPI
+    T2 --> EPI
+    T3 --> EPI
+    T4 --> EPI
+
+    %% =========================
+    %% Geometry / lifting
+    %% =========================
+    subgraph GEOM["3D measurement construction"]
+        direction TB
+        RT["Camera models\nK, [R|t], SO(3)/SE(3)"]:::meta
+        DLT["DLT / triangulation (init)"]:::core
+        NN["Optional NN lifting / completion"]:::core
+        BA["Optional reprojection refinement\n(1–5 iters)"]:::core
+        Y["3D measurement y(t)\nJ×3 positions (+quality / R / cov)"]:::out
+        RT --> DLT --> NN --> BA --> Y
+    end
+
+    GROUP --> DLT
+
+    %% =========================
+    %% Tracking filter + lifecycle
+    %% =========================
+    subgraph FILTER["Tracking filter (per target)"]
+        direction TB
+        GATE["Gating\n(Mahalanobis / per-joint + global)"]:::core
+        IMM["IMM (motion model bank)\n(CV/CA or low/med/high Q)"]:::core
+        PRED["Predict\nΔt, self-propagate"]:::core
+        UPD["Update\nKF (linear)\nstate: [p(3J), v(3J)]"]:::core
+        MISS["Miss handling & track lifecycle\n(tentative → confirmed → deleted)"]:::meta
+
+        GATE --> IMM --> PRED --> UPD --> MISS
+    end
+
+    Y --> GATE
+
+    %% Optional inertial fusion
+    IMU["IMU (optional)"]:::meta --> INERT["EKF/UKF branch (optional)\nwhen augmenting state with orientation"]:::meta --> IMM
+
+    %% =========================
+    %% IK + optional feedback
+    %% =========================
+    subgraph IKSTAGE["IK stage (constraint / anatomy)"]
+        direction TB
+        IK["IK optimization target\n(minimize joint position error,\nadd bone length / joint limits)"]:::core
+        FB["Optional feedback to filter\npseudo-measurement z_IK with large R"]:::meta
+        IK --> FB
+    end
+
+    UPD --> IK
+    FB -.-> GATE
+
+    %% =========================
+    %% SMPL / mesh fitting
+    %% =========================
+    subgraph SMPLSTAGE["SMPL / SMPL-X fitting"]
+        direction TB
+        VP["VPoser / pose prior"]:::core
+        SMPL["SMPL(θ, β, root)\nfit to joints / reprojection"]:::core
+        JR["JR: Joint Regressor\nmesh → joints (loop closure)"]:::core
+        OUT["Outputs\nmesh + joints + pose params"]:::out
+        VP --> SMPL --> JR --> OUT
+        JR -. residual / reproject .-> SMPL
+    end
+
+    IK --> SMPL
+
+    classDef core fill:#0b1020,stroke:#5eead4,color:#e5e7eb,stroke-width:1.2px;
+    classDef meta fill:#111827,stroke:#93c5fd,color:#e5e7eb,stroke-dasharray: 4 3;
+    classDef out fill:#052e2b,stroke:#34d399,color:#ecfeff,stroke-width:1.4px;
+```
+
+If the diagram does not render correctly, see [fig/fig.svg](fig/fig.svg).
diff --git a/build.md b/build.md
new file mode 100644
index 0000000..70758f0
--- /dev/null
+++ b/build.md
@@ -0,0 +1 @@
+mmdc -i fig.mermaid -o fig.svg
\ No newline at end of file
diff --git a/fig.mermaid b/fig.mermaid
new file mode 100644
index 0000000..8783fd8
--- /dev/null
+++ b/fig.mermaid
@@ -0,0 +1,93 @@
+flowchart TD
+    %% =========================
+    %% Multi-view 2D stage
+    %% =========================
+    subgraph VIEWS["Per-view input (cameras 1..N)"]
+        direction LR
+        C1["Cam 1\n2D detections: 133×2 (+conf)"] --> T1["2D latest tracking cache\n(view 1)"]
+        C2["Cam 2\n2D detections: 133×2 (+conf)"] --> T2["2D latest tracking cache\n(view 2)"]
+        C3["Cam 3\n2D detections: 133×2 (+conf)"] --> T3["2D latest tracking cache\n(view 3)"]
+        C4["Cam 4\n2D detections: 133×2 (+conf)"] --> T4["2D latest tracking cache\n(view 4)"]
+    end
+
+    %% =========================
+    %% Cross-view association
+    %% =========================
+    subgraph ASSOC["Cross-view data association (epipolar)"]
+        direction TB
+        EPI["Epipolar constraint\n(Sampson / point-to-epiline)"]:::core
+        CYCLE["Cycle consistency / view-graph pruning"]:::core
+        GROUP["Assemble per-target multi-view observation set\n{view_id → 133×2}"]:::core
+        EPI --> CYCLE --> GROUP
+    end
+
+    T1 --> EPI
+    T2 --> EPI
+    T3 --> EPI
+    T4 --> EPI
+
+    %% =========================
+    %% Geometry / lifting
+    %% =========================
+    subgraph GEOM["3D measurement construction"]
+        direction TB
+        RT["Camera models\nK, [R|t], SO(3)/SE(3)"]:::meta
+        DLT["DLT / triangulation (init)"]:::core
+        NN["Optional NN lifting / completion"]:::core
+        BA["Optional reprojection refinement\n(1–5 iters)"]:::core
+        Y["3D measurement y(t)\nJ×3 positions (+quality / R / cov)"]:::out
+        RT --> DLT --> NN --> BA --> Y
+    end
+
+    GROUP --> DLT
+
+    %% =========================
+    %% Tracking filter + lifecycle
+    %% =========================
+    subgraph FILTER["Tracking filter (per target)"]
+        direction TB
+        GATE["Gating\n(Mahalanobis / per-joint + global)"]:::core
+        IMM["IMM (motion model bank)\n(CV/CA or low/med/high Q)"]:::core
+        PRED["Predict\nΔt, self-propagate"]:::core
+        UPD["Update\nKF (linear)\nstate: [p(3J), v(3J)]"]:::core
+        MISS["Miss handling & track lifecycle\n(tentative → confirmed → deleted)"]:::meta
+
+        GATE --> IMM --> PRED --> UPD --> MISS
+    end
+
+    Y --> GATE
+
+    %% Optional inertial fusion
+    IMU["IMU (optional)"]:::meta --> INERT["EKF/UKF branch (optional)\nwhen augmenting state with orientation"]:::meta --> IMM
+
+    %% =========================
+    %% IK + optional feedback
+    %% =========================
+    subgraph IKSTAGE["IK stage (constraint / anatomy)"]
+        direction TB
+        IK["IK optimization target\n(minimize joint position error,\nadd bone length / joint limits)"]:::core
+        FB["Optional feedback to filter\npseudo-measurement z_IK with large R"]:::meta
+        IK --> FB
+    end
+
+    UPD --> IK
+    FB -.-> GATE
+
+    %% =========================
+    %% SMPL / mesh fitting
+    %% =========================
+    subgraph SMPLSTAGE["SMPL / SMPL-X fitting"]
+        direction TB
+        VP["VPoser / pose prior"]:::core
+        SMPL["SMPL(θ, β, root)\nfit to joints / reprojection"]:::core
+        JR["JR: Joint Regressor\nmesh → joints (loop closure)"]:::core
+        OUT["Outputs\nmesh + joints + pose params"]:::out
+        VP --> SMPL --> JR --> OUT
+        JR -. residual / reproject .-> SMPL
+    end
+
+    IK --> SMPL
+
+    classDef core fill:#0b1020,stroke:#5eead4,color:#e5e7eb,stroke-width:1.2px;
+    classDef meta fill:#111827,stroke:#93c5fd,color:#e5e7eb,stroke-dasharray: 4 3;
+    classDef out fill:#052e2b,stroke:#34d399,color:#ecfeff,stroke-width:1.4px;
\ No newline at end of file
diff --git a/fig/.DS_Store b/fig/.DS_Store
new file mode 100644
index 0000000..5008ddf
Binary files /dev/null and b/fig/.DS_Store differ
diff --git a/fig/fig.svg b/fig/fig.svg
new file mode 100644
index 0000000..a9a2523
--- /dev/null
+++ b/fig/fig.svg
@@ -0,0 +1 @@
+

[fig/fig.svg: single-line SVG rendering of fig.mermaid; its text labels repeat the subgraph titles and node labels of the flowchart in fig.mermaid]

\ No newline at end of file
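
The "Epipolar constraint (Sampson / point-to-epiline)" node in the flowchart names two residuals that could score a candidate cross-view match. The following is a minimal sketch of both, not the pipeline's actual implementation. It assumes a fundamental matrix `F` with the convention `x2^T F x1 = 0`, and it uses the common squared first-order form of the Sampson distance. The helper names and the example matrix `F_RECTIFIED` (an idealized rectified pair with identity intrinsics and a pure x-axis baseline) are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def to_homogeneous(p: Sequence[float]) -> Vec3:
    """Lift a 2D pixel/normalized coordinate (x, y) to (x, y, 1)."""
    return (p[0], p[1], 1.0)


def mat_vec(m: Sequence[Sequence[float]], v: Sequence[float]) -> Vec3:
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))


def transpose(m: Sequence[Sequence[float]]) -> List[List[float]]:
    """Transpose a 3x3 matrix."""
    return [[m[c][r] for c in range(3)] for r in range(3)]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def epiline_distance(F, x1: Sequence[float], x2: Sequence[float]) -> float:
    """Distance in view 2 from x2 to the epipolar line l2 = F @ x1."""
    line = mat_vec(F, to_homogeneous(x1))
    norm = math.hypot(line[0], line[1])
    if norm == 0.0:
        raise ValueError("degenerate epipolar line")
    return abs(dot(to_homogeneous(x2), line)) / norm


def sampson_distance(F, x1: Sequence[float], x2: Sequence[float]) -> float:
    """Squared first-order (Sampson) approximation of the geometric error."""
    h1, h2 = to_homogeneous(x1), to_homogeneous(x2)
    f_x1 = mat_vec(F, h1)              # epipolar line in view 2
    ft_x2 = mat_vec(transpose(F), h2)  # epipolar line in view 1
    denom = f_x1[0] ** 2 + f_x1[1] ** 2 + ft_x2[0] ** 2 + ft_x2[1] ** 2
    if denom == 0.0:
        raise ValueError("degenerate configuration")
    return dot(h2, f_x1) ** 2 / denom


# Hypothetical example: rectified pair, so matching points share the same row.
F_RECTIFIED = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]
```

With `F_RECTIFIED`, two points on the same image row give a zero residual, and the point-to-epiline distance reduces to the row difference. In a multi-joint setting, one plausible design is to average a per-joint residual, weighted by detection confidence, before thresholding. That aggregation is an assumption here and is not specified by the diagram.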