Add comprehensive documentation on coordinate system conversions in note.md. Explain the necessity and order of transformations for world and camera coordinates, clarifying the distinction between Z-up to Y-up and OpenCV to OpenGL conversions.

Refactor preProcessExtrinsic function to handle both Matrix4 and array inputs for extrinsic parameters. Add error handling for invalid inputs and update transformation sequence for camera-to-world conversion.
2025-03-25 10:22:26 +08:00 · 2025-03-25 10:17:08 +08:00
2 changed files with 73 additions and 13 deletions
--- a/note.md
+++ b/note.md
@@ -107,6 +107,56 @@ The actual sequence is:

 Thank you for catching this! It's crucial to understand the correct order of operations, especially when dealing with coordinate system transformations.

+Let me break this down more clearly:
+
+## Why do we need these conversions?
+
+1. **World Coordinate Conversion (Z-up to Y-up)**:
+   - Your original data uses a Z-up coordinate system (Z is "up" in the real world)
+   - Three.js uses a Y-up coordinate system (Y is "up" in Three.js)
+   - We need to convert between these two world coordinate systems
+
+2. **Camera Coordinate Conversion (OpenCV to OpenGL)**:
+   - Camera coordinate systems define how the camera "sees" the world
+   - OpenCV uses a right-handed camera frame with X right, Y down, and Z pointing forward from the camera
+   - OpenGL/Three.js uses a right-handed camera frame with X right, Y up, and the camera looking down its negative Z axis
+   - We need to convert between these camera conventions
+
+## What's the difference?
+
+- **World conversion** (Z-up to Y-up) affects how we represent points in the world
+- **Camera conversion** (OpenCV to OpenGL) affects how the camera interprets what it sees
+
+These are completely separate transforms that operate on different coordinate spaces:
+- One operates on the world before the camera sees it
+- The other operates on the camera's view of the world
+
+## Why this specific order?
+
+The order matters because we're dealing with a transformation pipeline:
+
+1. First, we want to convert the world from Z-up to Y-up (`worldCvt`)
+   - This makes the world match Three.js's expectations
+
+2. Then, apply the camera's extrinsic matrix (`Rt`)
+   - This transforms world points into the camera's coordinate system
+   - At this point, we're in OpenCV camera space
+
+3. Finally, convert from OpenCV to OpenGL camera conventions (`cameraCvt`)
+   - This converts the camera space to match Three.js's expectations
+
+If we changed the order:
+- Applying camera conversion before the camera transformation would incorrectly transform world points
+- Applying world conversion after the camera transformation would transform points that are already in camera space
+
+Think of it as a pipeline:
+1. Fix the world coordinates (world conversion)
+2. View the world through the camera (camera extrinsic)
+3. Adjust how the camera interprets what it sees (camera conversion)
+
+Because matrix multiplication is associative, the whole pipeline can be collapsed into a single matrix. It is not commutative, though, so the factors must still be multiplied in the order above to get the correct result.
+
+
 ---

 [Matrix4.multiply](https://threejs.org/docs/#api/en/math/Matrix4.multiply)
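
Review note: the ordering argument in the note can be checked numerically. The sketch below is only an illustration under stated assumptions. It uses flat row-major arrays instead of `THREE.Matrix4`, and the three matrix values are guesses at what `Z_UP_TO_Y_UP`, an extrinsic, and `CV_TO_GL_MAT` might look like (a -90° rotation about X, a pure translation, and a Y/Z flip); the repository's actual constants are not shown in this diff. The helpers `mul` and `applyTo` are hypothetical names. Since `Matrix4.multiply` post-multiplies, the chain in `preProcessExtrinsic` builds `cameraCvt * Rt * worldCvt`, and the rightmost factor acts on a point first.

```typescript
// Flat row-major 4x4 matrix (an assumption of this sketch, not the repo's type).
type Mat4 = number[];
type Vec3 = [number, number, number];

// Returns a * b. Applied to a column vector, b acts first, then a.
function mul(a: Mat4, b: Mat4): Mat4 {
  const out: Mat4 = new Array<number>(16).fill(0);
  for (let r = 0; r < 4; r++) {
    for (let c = 0; c < 4; c++) {
      for (let k = 0; k < 4; k++) {
        out[r * 4 + c] += a[r * 4 + k] * b[k * 4 + c];
      }
    }
  }
  return out;
}

// Applies an affine 4x4 matrix to a 3D point (w is taken to be 1).
function applyTo(m: Mat4, [x, y, z]: Vec3): Vec3 {
  return [
    m[0] * x + m[1] * y + m[2] * z + m[3],
    m[4] * x + m[5] * y + m[6] * z + m[7],
    m[8] * x + m[9] * y + m[10] * z + m[11],
  ];
}

// Assumed values, for illustration only.
const worldCvt: Mat4 = [1, 0, 0, 0, 0, 0, 1, 0, 0, -1, 0, 0, 0, 0, 0, 1]; // (x, y, z) -> (x, z, -y)
const Rt: Mat4 = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 5, 0, 0, 0, 1]; // translate by (0, 0, 5)
const cameraCvt: Mat4 = [1, 0, 0, 0, 0, -1, 0, 0, 0, 0, -1, 0, 0, 0, 0, 1]; // flip Y and Z

const p: Vec3 = [1, 2, 3];

// Step by step: world conversion, then extrinsic, then camera conversion.
const stepwise = applyTo(cameraCvt, applyTo(Rt, applyTo(worldCvt, p)));

// Same pipeline collapsed into one matrix (associativity).
const composed = applyTo(mul(mul(cameraCvt, Rt), worldCvt), p);

// Same factors in the opposite order (non-commutativity).
const swapped = applyTo(mul(mul(worldCvt, Rt), cameraCvt), p);

console.log(stepwise, composed, swapped);
```

With these assumed values, the step-by-step and collapsed results agree, while the swapped order lands on a different point, which is the behaviour the note describes.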
--- a/src/App.tsx
+++ b/src/App.tsx
@@ -119,12 +119,14 @@ const Scene = () => {
    let Rt: THREE.Matrix4
    if (extrinsic instanceof THREE.Matrix4) {
      Rt = extrinsic
-    } else {
+    } else if (Array.isArray(extrinsic)) {
+      console.assert(extrinsic.length === 16, "extrinsic must be a 4x4 matrix")
      Rt = new THREE.Matrix4()
      // @ts-expect-error 16 elements
      Rt.set(...extrinsic)
+    } else {
+      throw new Error("extrinsic must be a 4x4 matrix or an array of 16 elements")
    }
-    console.assert(Rt.elements.length === 16, "extrinsic must be a 4x4 matrix")
    const font = new FontLoader().parse(HelvetikerRegular)
    const camera = new THREE.PerspectiveCamera(fov ?? 60, aspect ?? 4 / 3, near ?? DEFAULT_NEAR, far ?? DEFAULT_FAR)
    const helper = <cameraHelper args={[camera]} />
@@ -161,17 +163,25 @@ const Scene = () => {
    )
  }

-  const preProcessExtrinsic = (extrinsic: number[]) => {
-    // Create the initial world-to-camera transform
-    const worldToCamera = new THREE.Matrix4()
+  const preProcessExtrinsic = (extrinsic: number[] | THREE.Matrix4) => {
+    let Rt: THREE.Matrix4
+    if (extrinsic instanceof THREE.Matrix4) {
+      Rt = extrinsic
+    } else if (Array.isArray(extrinsic)) {
+      console.assert(extrinsic.length === 16, "extrinsic must be a 4x4 matrix")
+      Rt = new THREE.Matrix4()
      // @ts-expect-error 16 elements
-    worldToCamera.set(...extrinsic)
+      Rt.set(...extrinsic)
+    } else {
+      throw new Error("extrinsic must be a 4x4 matrix or an array of 16 elements")
+    }

-    // Convert from Z-up to Y-up first (this affects world coordinates)
-    const worldZupToYup = Z_UP_TO_Y_UP.clone()

    // Then handle OpenCV to OpenGL camera convention
-    const cameraConversion = CV_TO_GL_MAT.clone()
+    const cameraCvt = CV_TO_GL_MAT.clone()
+
+    // Convert from Z-up to Y-up first (this affects world coordinates)
+    const worldCvt = Z_UP_TO_Y_UP.clone()

    // Final transformation:
    // 1. Convert world from Z-up to Y-up
@@ -179,9 +189,9 @@ const Scene = () => {
    // 3. Convert camera coordinates from OpenCV to OpenGL
    const final = new THREE.Matrix4()
    final
-      .multiply(cameraConversion)
-      .multiply(worldToCamera)
-      .multiply(worldZupToYup)
+      .multiply(cameraCvt)
+      .multiply(Rt)
+      .multiply(worldCvt)

    // Invert to get the camera-to-world transform
    final.invert()
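
Review note: `console.assert` only logs when its condition is false and does not throw, so in both array branches above a wrong-length array would still reach `Rt.set(...)`. The same normalization block also appears twice in this diff. One possible way to address both is a small shared validator that throws. The sketch below is a suggestion, not code from the commit: `Mat4Tuple` and `toMat4Tuple` are hypothetical names, and the finite-number check is an extra assumption about what counts as a valid extrinsic.

```typescript
// Hypothetical 16-element tuple type for a flat 4x4 matrix.
type Mat4Tuple = [
  number, number, number, number,
  number, number, number, number,
  number, number, number, number,
  number, number, number, number
];

// Validates an unknown value and throws on failure, so a malformed
// extrinsic is rejected before any matrix is built from it.
function toMat4Tuple(value: unknown): Mat4Tuple {
  if (!Array.isArray(value)) {
    throw new TypeError("extrinsic must be an array of 16 numbers");
  }
  if (value.length !== 16) {
    throw new RangeError(`extrinsic must have 16 elements, got ${value.length}`);
  }
  if (!value.every((v) => typeof v === "number" && Number.isFinite(v))) {
    throw new TypeError("extrinsic must contain only finite numbers");
  }
  return value as Mat4Tuple;
}

// Example: a row-major identity matrix passes validation.
const identity = toMat4Tuple([1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]);
console.log(identity.length);
```

If something like this were adopted, the array branch could likely become `Rt.set(...toMat4Tuple(extrinsic))`, and because the spread value is then a fixed-length tuple, the `@ts-expect-error` comment may no longer be needed. Note that `Matrix4.set` takes its arguments in row-major order.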
Author	SHA1	Message	Date
crosstyan	6406dbd19f	Add comprehensive documentation on coordinate system conversions in note.md. Explain the necessity and order of transformations for world and camera coordinates, clarifying the distinction between Z-up to Y-up and OpenCV to OpenGL conversions.	2025-03-25 10:22:26 +08:00
crosstyan	9fb7235d93	Refactor preProcessExtrinsic function to handle both Matrix4 and array inputs for extrinsic parameters. Add error handling for invalid inputs and update transformation sequence for camera-to-world conversion.	2025-03-25 10:17:08 +08:00