0

From a mathematical perspective, cameras convert 3D shapes into 2D shapes in photos. Suppose we consider a 3D coordinate system X-Y-Z whose origin is the camera (or its lens, or something like that) and choose the axis directions like this:

enter image description here

Where Blue: X, Green: Y, Red: Z

And say this camera gives us an image with a 2D coordinate system X'-Y' with its origin at the middle:

enter image description here

How is it possible to get a general equation that converts the 3D location of every point in the X-Y-Z coordinate system into the X'-Y' coordinate system? Of course the reverse is not simply possible (it is not simply possible to reconstruct 3D objects from a 2D image). Hope it makes sense...

Gerry Myerson
  • 179,216
epsi1on
  • 183
  • 1
    I don't think this question is on-topic here. You may be able to find an answer over at [maths.se], but please check the scope of on-topic questions at their help center. –  May 29 '15 at 10:40

3 Answers

4

The general equations for how a camera converts a 3D point (x,y,z), specified in the camera's coordinate system (where z is the optical axis), into a 2D point (u,v) are:

u = -fx/z
v = -fy/z

where f is the focal length of the lens. To get meaningful 2D coordinates you may have to multiply by another constant for the sensor size/resolution; you can roll this up into a single value of "f" that represents all linear scaling factors.
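As a rough sketch of the two equations above, here is how they might look in Python. The function name `project_point` and the lens/sensor numbers are made up for illustration; the only thing taken from the answer is u = -fx/z, v = -fy/z with all linear scaling rolled into one "f".

```python
def project_point(x, y, z, f=1.0):
    """Project a camera-frame point (x, y, z) to image coordinates (u, v).

    Uses u = -f*x/z and v = -f*y/z, with z along the optical axis.
    `f` is assumed to already include any sensor-size/resolution scaling.
    """
    if z == 0:
        # A point in the plane of the lens has no finite projection.
        raise ValueError("z must be non-zero")
    return (-f * x / z, -f * y / z)


# Hypothetical numbers: a 50 mm lens and a sensor with 200 pixels per mm,
# rolled into a single scale factor expressed in pixels.
f_pixels = 50.0 * 200.0

# A point 10 units in front of the camera, slightly off the optical axis.
u, v = project_point(0.5, -0.25, 10.0, f=f_pixels)
print(u, v)  # -500.0 250.0
```

The minus signs reflect that the image formed behind the lens is flipped; if you prefer an upright image, drop them.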

3

What you want to calculate is called a projection. You can check here for the complete article about 3D projection onto (in general) a 2D plane.

P.S. IMHO, in the first picture Z should be blue, because the projection is onto the X/Y plane and Z is usually used for depth (from the sensor's point of view).

1

To convert 3D coordinates in the X-Y-Z coordinate system to the 2D coordinates in the X'-Y' coordinate system as seen in the image, you'll need to perform a projection. The type of projection used in cameras is typically a perspective projection, which mimics how light rays from the 3D world converge to form a 2D image on the camera's image sensor.

The equations for perspective projection are as follows:

Let's assume the 3D point has coordinates (X, Y, Z).

The coordinates in the 2D image (X', Y') can be calculated as follows:

X' = (f * X) / Z
Y' = (f * Y) / Z

Where:

(X', Y') are the coordinates in the 2D image.
(X, Y, Z) are the coordinates in the 3D world.
'f' is the focal length of the camera. It's a constant that determines the perspective effect. The focal length is usually known for a given camera.

These equations are simplifications and assume that the camera's image sensor is at a specific distance from the lens (which is often the focal length). The key insight here is that the 3D coordinates are divided by the Z-coordinate (depth) to account for perspective. This gives you the 2D coordinates in the image.
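The effect of dividing by Z can be checked numerically. The following is only a sketch under the simplified model above; `perspective_project` is a hypothetical helper name, f defaults to 1, and points are assumed to lie in front of the camera (Z > 0).

```python
def perspective_project(X, Y, Z, f=1.0):
    """Return (X', Y') = (f*X/Z, f*Y/Z) for a point in front of the camera."""
    if Z <= 0:
        raise ValueError("Z must be positive (point in front of the camera)")
    return (f * X / Z, f * Y / Z)


# Two points on the same ray through the origin, one twice as far away,
# land on the same image point. This is why depth cannot be recovered
# from a single image.
near = perspective_project(1.0, 2.0, 4.0)
far = perspective_project(2.0, 4.0, 8.0)
print(near, far)  # (0.25, 0.5) (0.25, 0.5)

# The same sideways offset at twice the depth appears half as large.
a = perspective_project(1.0, 0.0, 2.0)
b = perspective_project(1.0, 0.0, 4.0)
print(a[0], b[0])  # 0.5 0.25
```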

Note that this is a simplified explanation. Real-world camera projections can involve more complex transformations and corrections to account for lens distortions, aspect ratios, and sensor characteristics. Nonetheless, the equations above provide a basic understanding of the conversion from 3D to 2D coordinates in a perspective projection.