diff --git a/note.md b/note.md
index c821d20..d430191 100644
--- a/note.md
+++ b/note.md
@@ -107,6 +107,56 @@ The actual sequence is:
 
 Thank you for catching this! It's crucial to understand the correct order of operations, especially when dealing with coordinate system transformations.
 
+Let me break this down more clearly:
+
+## Why do we need these conversions?
+
+1. **World Coordinate Conversion (Z-up to Y-up)**:
+   - Your original data uses a Z-up coordinate system (Z is "up" in the real world)
+   - Three.js uses a Y-up coordinate system (Y is "up" in Three.js)
+   - We need to convert between these two world coordinate systems
+
+2. **Camera Coordinate Conversion (OpenCV to OpenGL)**:
+   - Camera coordinate systems define how the camera "sees" the world
+   - OpenCV uses a right-handed system where Z points forward from the camera and Y points down
+   - OpenGL/Three.js also uses a right-handed system, but Z points backward from the camera and Y points up
+   - We need to convert between these camera conventions
+
+## What's the difference?
+
+- **World conversion** (Z-up to Y-up) affects how we represent points in the world
+- **Camera conversion** (OpenCV to OpenGL) affects how the camera interprets what it sees
+
+These are separate transforms that operate on different coordinate spaces:
+- One operates on the world before the camera sees it
+- The other operates on the camera's view of the world
+
+## Why this specific order?
+
+The order matters because we're dealing with a transformation pipeline:
+
+1. First, we want to convert the world from Z-up to Y-up (`worldCvt`)
+   - This makes the world match Three.js's expectations
+
+2. Then, apply the camera's extrinsic matrix (`Rt`)
+   - This transforms world points into the camera's coordinate system
+   - At this point, we're in OpenCV camera space
+
+3. Finally, convert from OpenCV to OpenGL camera conventions (`cameraCvt`)
+   - This converts the camera space to match Three.js's expectations
+
+If we changed the order:
+- Applying the camera conversion before `Rt` would flip the axes of world points that are not yet in camera space
+- Applying the world conversion after `Rt` would re-orient points that are already in camera space
+
+Think of it as a pipeline:
+1. Fix the world coordinates (world conversion)
+2. View the world through the camera (camera extrinsic)
+3. Adjust how the camera interprets what it sees (camera conversion)
+
+Because matrix multiplication is associative, we can collapse the whole pipeline into a single matrix. Because it is not commutative, the order of the factors still determines the result; acting on column vectors, the first step sits rightmost: `cameraCvt * Rt * worldCvt`.
+
+
 ---
 
 [Matrix4.multiply](https://threejs.org/docs/#api/en/math/Matrix4.multiply)
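
The three-step pipeline described above can be sketched in plain JavaScript, without Three.js, to show that applying the steps one at a time gives the same point as applying one pre-composed matrix, and that swapping the factors does not. The concrete matrices are assumptions, not taken from the note: `worldCvt` is taken to be a -90° rotation about X (one common Z-up to Y-up mapping), `Rt` is a made-up extrinsic with identity rotation and a translation of 5 along Z, and the helpers `matMul` and `transformPoint` are hypothetical names. Only `cameraCvt` (flip Y and Z) follows directly from the OpenCV and OpenGL camera conventions.

```javascript
// Minimal 4x4 helpers. Matrices are row-major nested arrays and act on
// column vectors, i.e. p' = M * p.
function matMul(a, b) {
  const out = [];
  for (let i = 0; i < 4; i++) {
    out.push([]);
    for (let j = 0; j < 4; j++) {
      let sum = 0;
      for (let k = 0; k < 4; k++) sum += a[i][k] * b[k][j];
      out[i].push(sum);
    }
  }
  return out;
}

// Applies m to the 3D point p (homogeneous w = 1) and returns [x, y, z].
function transformPoint(m, p) {
  const v = [p[0], p[1], p[2], 1];
  return [0, 1, 2].map(
    (i) => m[i][0] * v[0] + m[i][1] * v[1] + m[i][2] * v[2] + m[i][3] * v[3]
  );
}

// ASSUMPTION: Z-up to Y-up as a -90 degree rotation about X: (x, y, z) -> (x, z, -y).
const worldCvt = [
  [1, 0, 0, 0],
  [0, 0, 1, 0],
  [0, -1, 0, 0],
  [0, 0, 0, 1],
];

// HYPOTHETICAL extrinsic: identity rotation, translation (0, 0, 5).
const Rt = [
  [1, 0, 0, 0],
  [0, 1, 0, 0],
  [0, 0, 1, 5],
  [0, 0, 0, 1],
];

// OpenCV camera (Y down, Z forward) to OpenGL camera (Y up, Z backward): flip Y and Z.
const cameraCvt = [
  [1, 0, 0, 0],
  [0, -1, 0, 0],
  [0, 0, -1, 0],
  [0, 0, 0, 1],
];

const p = [1, 2, 3];

// Step by step, in pipeline order: worldCvt, then Rt, then cameraCvt.
const stepwise = transformPoint(
  cameraCvt,
  transformPoint(Rt, transformPoint(worldCvt, p))
);

// Same pipeline as one matrix; the first step is the rightmost factor.
const composed = matMul(cameraCvt, matMul(Rt, worldCvt));
const viaComposed = transformPoint(composed, p);

// Same three factors in reversed order: a different transform.
const swapped = matMul(worldCvt, matMul(Rt, cameraCvt));
const viaSwapped = transformPoint(swapped, p);

console.log(stepwise);    // [ 1, -3, -3 ]
console.log(viaComposed); // [ 1, -3, -3 ]
console.log(viaSwapped);  // [ 1, 2, 2 ]
```

With Three.js itself, `Matrix4.multiply` post-multiplies, so if the three matrices were `Matrix4` instances the same composite could presumably be built as `cameraCvt.clone().multiply(Rt).multiply(worldCvt)`. Which frame `Rt` expects as input depends on how `worldCvt` is defined earlier in the note, so this sketch illustrates only the mechanics of composition and ordering.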