forked from HQU-gxy/camera-extrinsic-play
Add comprehensive documentation on coordinate system conversions in note.md. Explain the necessity and order of transformations for world and camera coordinates, clarifying the distinction between Z-up to Y-up and OpenCV to OpenGL conversions.
note.md | 50
@@ -107,6 +107,56 @@ The actual sequence is:
Thank you for catching this! It's crucial to understand the correct order of operations, especially when dealing with coordinate system transformations.

Let me break this down more clearly:
## Why do we need these conversions?

1. **World Coordinate Conversion (Z-up to Y-up)**:
   - Your original data uses a Z-up coordinate system (Z is "up" in the real world)
   - Three.js uses a Y-up coordinate system (Y is "up" in Three.js)
   - We need to convert between these two world coordinate systems
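
In matrix form, this world fix is just a constant change of basis. Here is a minimal sketch using plain JavaScript arrays rather than Three.js's `Matrix4` (row-major layout, column-vector convention); the specific mapping (x, y, z) → (x, z, -y), i.e. a -90° rotation about X, and the name `worldCvt` are assumptions, not the only valid choice.

```javascript
// Sketch: Z-up -> Y-up world conversion as a 4x4 matrix (row-major,
// column-vector convention). Assumed mapping: (x, y, z) -> (x, z, -y),
// i.e. a -90 degree rotation about the X axis.
const worldCvt = [
  [1,  0, 0, 0],
  [0,  0, 1, 0], // new Y ("up" in Three.js) is the old Z ("up" in the data)
  [0, -1, 0, 0], // new Z is the old -Y
  [0,  0, 0, 1],
];

// Apply a row-major 4x4 matrix to a homogeneous point [x, y, z, w].
function applyMat4(m, v) {
  return m.map((row) => row.reduce((sum, e, i) => sum + e * v[i], 0));
}

// The Z-up "up" direction maps onto Three.js's Y axis:
console.log(applyMat4(worldCvt, [0, 0, 1, 1])); // -> [ 0, 1, 0, 1 ]
```
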
2. **Camera Coordinate Conversion (OpenCV to OpenGL)**:
   - Camera coordinate systems define how the camera "sees" the world
   - OpenCV uses a right-handed system where +X points right, +Y points down, and +Z points forward from the camera
   - OpenGL/Three.js uses a convention where +Y points up and the camera looks down -Z, so +Z points backward from the camera
   - We need to convert between these camera conventions
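
That flip can be written as a single diagonal matrix. A minimal sketch with plain arrays rather than Three.js's `Matrix4`; the name `cameraCvt` follows this note, and expressing the OpenCV-to-OpenGL flip as diag(1, -1, -1, 1) is the usual trick:

```javascript
// Sketch: OpenCV -> OpenGL camera-space conversion. Flipping Y (down -> up)
// and Z (forward -> backward) is the diagonal matrix diag(1, -1, -1, 1).
const cameraCvt = [
  [1,  0,  0, 0],
  [0, -1,  0, 0], // OpenCV +Y (down) becomes OpenGL +Y (up)
  [0,  0, -1, 0], // OpenCV +Z (forward) becomes OpenGL -Z (camera looks down -Z)
  [0,  0,  0, 1],
];

// Apply a row-major 4x4 matrix to a homogeneous point [x, y, z, w].
const transformPoint = (m, v) =>
  m.map((row) => row.reduce((sum, e, i) => sum + e * v[i], 0));

// A point in front of and below the OpenCV camera ends up in front of
// (negative Z) and below (negative Y) the OpenGL camera:
console.log(transformPoint(cameraCvt, [0, 1, 5, 1])); // -> [ 0, -1, -5, 1 ]
```
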
## What's the difference?

- **World conversion** (Z-up to Y-up) affects how we represent points in the world
- **Camera conversion** (OpenCV to OpenGL) affects how the camera interprets what it sees

These are completely separate transforms that operate on different coordinate spaces:

- One operates on the world before the camera sees it
- The other operates on the camera's view of the world
## Why this specific order?

The order matters because we're dealing with a transformation pipeline:

1. First, convert the world from Z-up to Y-up (`worldCvt`)
   - This makes the world match Three.js's expectations
2. Then, apply the camera's extrinsic matrix (`Rt`)
   - This transforms world points into the camera's coordinate system
   - At this point, we're in OpenCV camera space
3. Finally, convert from OpenCV to OpenGL camera conventions (`cameraCvt`)
   - This converts the camera space to match Three.js's expectations

If we changed the order:

- Applying the camera conversion before the extrinsic matrix would apply a camera-space correction to points that are still in world space
- Applying the world conversion after the extrinsic matrix would apply a world-space correction to points that are already in camera space

Think of it as a pipeline:

1. Fix the world coordinates (world conversion)
2. View the world through the camera (camera extrinsic)
3. Adjust how the camera interprets what it sees (camera conversion)

The fact that matrix multiplication is associative means we can compute this entire pipeline as a single matrix operation, but the conceptual order still matters for getting the correct result.

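
To sketch that associativity point concretely, here is the whole pipeline in plain JavaScript arrays (row-major, column-vector convention, so the combined matrix is `cameraCvt * Rt * worldCvt` with the first-applied transform rightmost). The specific `worldCvt`/`cameraCvt` matrices (a -90° X rotation and diag(1, -1, -1, 1)) are assumed conventions, and `Rt` is a made-up extrinsic (a pure translation) standing in for a real calibration result:

```javascript
// Sketch: the three-step pipeline collapsed into one matrix, using plain
// row-major 4x4 arrays and column vectors. worldCvt (Z-up -> Y-up) and
// cameraCvt (OpenCV -> OpenGL) are assumed conventions; Rt is a made-up
// extrinsic (pure translation along Z) just for illustration.
const worldCvt  = [[1, 0, 0, 0], [0, 0, 1, 0], [0, -1, 0, 0], [0, 0, 0, 1]];
const cameraCvt = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]];
const Rt        = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]];

// Matrix * vector and matrix * matrix for row-major 4x4 arrays.
const apply = (m, v) => m.map((row) => row.reduce((s, e, i) => s + e * v[i], 0));
const mul = (a, b) =>
  a.map((row) => b[0].map((_, j) => row.reduce((s, aik, k) => s + aik * b[k][j], 0)));

const p = [2, 3, 4, 1]; // a world point in the original Z-up space

// Step by step: fix the world, view through the camera, fix the camera.
const stepwise = apply(cameraCvt, apply(Rt, apply(worldCvt, p)));

// One shot: associativity lets us pre-multiply everything into one matrix.
// Note the order: the first-applied transform (worldCvt) sits rightmost.
const combined = mul(cameraCvt, mul(Rt, worldCvt));
const oneShot = apply(combined, p);

console.log(stepwise, oneShot); // both are [ 2, -4, 2, 1 ]
```

In Three.js the same composition can be expressed by chaining `Matrix4`'s multiply methods.
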

---

[Matrix4.multiply](https://threejs.org/docs/#api/en/math/Matrix4.multiply)