Compare commits

...

2 Commits

2 changed files with 73 additions and 13 deletions

note.md
@@ -107,6 +107,56 @@ The actual sequence is:
Thank you for catching this! It's crucial to understand the correct order of operations, especially when dealing with coordinate system transformations.
Let me break this down more clearly:
## Why do we need these conversions?
1. **World Coordinate Conversion (Z-up to Y-up)**:
- Your original data uses a Z-up coordinate system (Z is "up" in the real world)
- Three.js uses a Y-up coordinate system (Y is "up" in Three.js)
- We need to convert between these two world coordinate systems
2. **Camera Coordinate Conversion (OpenCV to OpenGL)**:
- Camera coordinate systems define how the camera "sees" the world
- OpenCV uses a right-handed system where Z points forward from the camera
- OpenGL/Three.js uses a system where Z points backward from the camera
- We need to convert between these camera conventions
## What's the difference?
- **World conversion** (Z-up to Y-up) affects how we represent points in the world
- **Camera conversion** (OpenCV to OpenGL) affects how the camera interprets what it sees
These are completely separate transforms that operate on different coordinate spaces:
- One operates on the world before the camera sees it
- The other operates on the camera's view of the world
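To make the two conversions concrete, they can be written out as explicit 4x4 matrices. The values below are the conventional choices (a -90° rotation about X for Z-up → Y-up, and a Y/Z sign flip for OpenCV → OpenGL); they are assumptions about what `Z_UP_TO_Y_UP` and `CV_TO_GL_MAT` contain, not values taken from this repo:

```typescript
// Hypothetical row-major values for the two conversion matrices.
// Z_UP_TO_Y_UP: rotate -90 degrees about X, so the world's +Z ("up") becomes +Y.
const Z_UP_TO_Y_UP = [
  1, 0, 0, 0,
  0, 0, 1, 0,
  0, -1, 0, 0,
  0, 0, 0, 1,
];
// CV_TO_GL: flip Y and Z, going from OpenCV's camera frame (y down, z forward)
// to OpenGL's camera frame (y up, z backward).
const CV_TO_GL = [
  1, 0, 0, 0,
  0, -1, 0, 0,
  0, 0, -1, 0,
  0, 0, 0, 1,
];

// Apply a row-major 4x4 matrix to a point (w = 1), returning [x, y, z].
const apply = (m: number[], [x, y, z]: number[]): number[] => [
  m[0] * x + m[1] * y + m[2] * z + m[3],
  m[4] * x + m[5] * y + m[6] * z + m[7],
  m[8] * x + m[9] * y + m[10] * z + m[11],
];

// "Up" in the Z-up world maps to "up" in the Y-up world:
console.log(apply(Z_UP_TO_Y_UP, [0, 0, 1])); // -> [0, 1, 0]
// A point in front of an OpenCV camera (+Z forward) lands on the
// OpenGL camera's forward axis (-Z):
console.log(apply(CV_TO_GL, [0, 0, 1])); // -> [0, 0, -1]
```

Note that one matrix re-expresses world points while the other re-expresses camera-space points; they never act on the same space.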
## Why this specific order?
The order matters because we're dealing with a transformation pipeline:
1. First, we want to convert the world from Z-up to Y-up (`worldCvt`)
- This makes the world match Three.js's expectations
2. Then, apply the camera's extrinsic matrix (`Rt`)
- This transforms world points into the camera's coordinate system
- At this point, we're in OpenCV camera space
3. Finally, convert from OpenCV to OpenGL camera conventions (`cameraCvt`)
- This converts the camera space to match Three.js's expectations
If we changed the order:
- Applying camera conversion before the camera transformation would incorrectly transform world points
- Applying world conversion after the camera transformation would transform points that are already in camera space
Think of it as a pipeline:
1. Fix the world coordinates (world conversion)
2. View the world through the camera (camera extrinsic)
3. Adjust how the camera interprets what it sees (camera conversion)
The fact that matrix multiplication is associative means we can compute this entire pipeline as a single matrix operation, but the conceptual order still matters for getting the correct result.
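The associativity point can be sketched without three.js, using plain row-major arrays. The `worldCvt`, `Rt`, and `cameraCvt` values below are hypothetical stand-ins chosen only to show that pre-composing the pipeline into one matrix gives the same result as applying the three steps one at a time:

```typescript
type Mat4 = number[]; // 16 numbers, row-major

// Multiply two row-major 4x4 matrices.
const matMul = (a: Mat4, b: Mat4): Mat4 => {
  const out = new Array(16).fill(0);
  for (let r = 0; r < 4; r++)
    for (let c = 0; c < 4; c++)
      for (let k = 0; k < 4; k++)
        out[4 * r + c] += a[4 * r + k] * b[4 * k + c];
  return out;
};

// Apply a row-major 4x4 matrix to a point (w = 1), returning [x, y, z].
const applyToPoint = (m: Mat4, [x, y, z]: number[]): number[] => [
  m[0] * x + m[1] * y + m[2] * z + m[3],
  m[4] * x + m[5] * y + m[6] * z + m[7],
  m[8] * x + m[9] * y + m[10] * z + m[11],
];

// Hypothetical pipeline matrices:
const worldCvt: Mat4 = [1,0,0,0, 0,0,1,0, 0,-1,0,0, 0,0,0,1];   // Z-up -> Y-up
const Rt: Mat4 = [1,0,0,0, 0,1,0,0, 0,0,1,-5, 0,0,0,1];         // extrinsic: camera 5 units back
const cameraCvt: Mat4 = [1,0,0,0, 0,-1,0,0, 0,0,-1,0, 0,0,0,1]; // OpenCV -> OpenGL

const p = [2, 3, 4]; // a world point in the original Z-up frame

// Step by step: fix the world, view it through the camera, fix the camera.
const stepwise = applyToPoint(cameraCvt, applyToPoint(Rt, applyToPoint(worldCvt, p)));

// One shot: pre-compose the whole pipeline into a single matrix.
const pipeline = matMul(cameraCvt, matMul(Rt, worldCvt));
const oneShot = applyToPoint(pipeline, p);
// Associativity guarantees stepwise and oneShot are the same point.
```

The composed `pipeline` matrix is what `final` ends up holding in the code below (before the inversion); the conceptual right-to-left order of the factors is what the prose above is arguing for.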
---
[Matrix4.multiply](https://threejs.org/docs/#api/en/math/Matrix4.multiply)


@@ -119,12 +119,14 @@ const Scene = () => {
   let Rt: THREE.Matrix4
   if (extrinsic instanceof THREE.Matrix4) {
     Rt = extrinsic
-  } else {
+  } else if (Array.isArray(extrinsic)) {
+    console.assert(extrinsic.length === 16, "extrinsic must be a 4x4 matrix")
     Rt = new THREE.Matrix4()
     // @ts-expect-error 16 elements
     Rt.set(...extrinsic)
+  } else {
+    throw new Error("extrinsic must be a 4x4 matrix or an array of 16 elements")
   }
-  console.assert(Rt.elements.length === 16, "extrinsic must be a 4x4 matrix")
   const font = new FontLoader().parse(HelvetikerRegular)
   const camera = new THREE.PerspectiveCamera(fov ?? 60, aspect ?? 4 / 3, near ?? DEFAULT_NEAR, far ?? DEFAULT_FAR)
   const helper = <cameraHelper args={[camera]} />
@ -161,17 +163,25 @@ const Scene = () => {
) )
} }
const preProcessExtrinsic = (extrinsic: number[]) => { const preProcessExtrinsic = (extrinsic: number[] | THREE.Matrix4) => {
// Create the initial world-to-camera transform let Rt: THREE.Matrix4
const worldToCamera = new THREE.Matrix4() if (extrinsic instanceof THREE.Matrix4) {
// @ts-expect-error 16 elements Rt = extrinsic
worldToCamera.set(...extrinsic) } else if (Array.isArray(extrinsic)) {
console.assert(extrinsic.length === 16, "extrinsic must be a 4x4 matrix")
Rt = new THREE.Matrix4()
// @ts-expect-error 16 elements
Rt.set(...extrinsic)
} else {
throw new Error("extrinsic must be a 4x4 matrix or an array of 16 elements")
}
// Convert from Z-up to Y-up first (this affects world coordinates)
const worldZupToYup = Z_UP_TO_Y_UP.clone()
// Then handle OpenCV to OpenGL camera convention // Then handle OpenCV to OpenGL camera convention
const cameraConversion = CV_TO_GL_MAT.clone() const cameraCvt = CV_TO_GL_MAT.clone()
// Convert from Z-up to Y-up first (this affects world coordinates)
const worldCvt = Z_UP_TO_Y_UP.clone()
// Final transformation: // Final transformation:
// 1. Convert world from Z-up to Y-up // 1. Convert world from Z-up to Y-up
@@ -179,9 +189,9 @@ const Scene = () => {
   // 3. Convert camera coordinates from OpenCV to OpenGL
   const final = new THREE.Matrix4()
   final
-    .multiply(cameraConversion)
-    .multiply(worldToCamera)
-    .multiply(worldZupToYup)
+    .multiply(cameraCvt)
+    .multiply(Rt)
+    .multiply(worldCvt)
   // Invert to get the camera-to-world transform
   final.invert()