Mathematics of photography

Question

From a mathematical perspective, a camera converts 3D shapes into 2D shapes in a photo. Suppose we have a 3D coordinate system X-Y-Z whose origin is the camera (or its lens, or something similar), with the axes oriented like this:

enter image description here

Where Blue: X, Green: Y, Red: Z

And suppose this camera gives us an image with a 2D coordinate system X'-Y' whose origin is at the center:

enter image description here

How can we get a general equation that converts the 3D location of any point in the X-Y-Z coordinate system into the X'-Y' coordinate system? Of course, the reverse is not straightforward (a 3D object cannot simply be reconstructed from a 2D image). I hope this makes sense...

I don't think this question is on-topic here. You may be able to find an answer over at [maths.se], but please check the scope of on-topic questions at their help center. — , May 29 '15 at 10:40

score 4 · Accepted Answer · answered May 29 '15 at 10:46

The general equations for how a camera converts a 3D point (x,y,z), specified in the camera's coordinate system (where z is the optical axis), into a 2D point (u,v) are:

u = -fx/z
v = -fy/z

where f is the focal length of the lens. To get meaningful 2D coordinates you may have to multiply by another constant for the sensor size/resolution; you can roll this up into a single value of "f" that represents all linear scaling factors.
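As a rough illustration, the two equations above can be sketched in Python. The function name `project_point`, the `scale` parameter (standing in for the extra sensor size/resolution constant), and the check that z must be positive for a point in front of the camera are all assumptions made for this sketch, not part of the answer itself.

```python
def project_point(x, y, z, f, scale=1.0):
    """Project a camera-frame 3D point (x, y, z) to image coordinates (u, v).

    Implements u = -f*x/z and v = -f*y/z from the answer above.
    `scale` is a hypothetical extra linear factor (e.g. pixels per unit
    length on the sensor); it could equally be folded into f.
    """
    # Assumption: points in front of the camera have z > 0.
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    u = -f * x / z * scale
    v = -f * y / z * scale
    return u, v


# Example: a 50 mm lens (f = 0.05 m) and a point 2 m in front of the camera.
u, v = project_point(0.2, 0.1, 2.0, f=0.05)
print(u, v)  # both negative: the image is inverted relative to the scene
```

The minus signs mean the image is flipped in both directions, as it is on a real sensor behind the lens. Passing a value for `scale` (or pre-multiplying `f`) converts the result from length units on the sensor to pixels.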

Also, lens curvature, which can get quite complex with all the layers of glass. — Octopus, May 29 '15 at 21:07

score 3 · Answer 2 · answered May 29 '15 at 10:04

What you want to calculate is called a projection. You can check here for the complete article about 3D projection onto (in general) 2D.

P.S. And IMHO in the first picture Z should be blue, because the projection is onto the X/Y plane and Z is usually used for depth (from the sensor's point of view).

score 1 · Answer 3 · answered Nov 02 '23 at 06:14

To convert 3D coordinates in the X-Y-Z coordinate system to the 2D coordinates in the X'-Y' coordinate system as seen in the image, you'll need to perform a projection. The type of projection used in cameras is typically a perspective projection, which mimics how light rays from the 3D world converge to form a 2D image on the camera's image sensor.

The equations for perspective projection are as follows:

Let's assume the 3D point has coordinates (X, Y, Z).

The coordinates in the 2D image (X', Y') can be calculated as follows:

X' = (f * X) / Z
Y' = (f * Y) / Z

Where:

(X', Y') are the coordinates in the 2D image.
(X, Y, Z) are the coordinates in the 3D world.
'f' is the focal length of the camera. It's a constant that determines the perspective effect. The focal length is usually known for a given camera.

These equations are simplifications and assume that the camera's image sensor is at a specific distance from the lens (which is often the focal length). The key insight here is that the 3D coordinates are divided by the Z-coordinate (depth) to account for perspective. This gives you the 2D coordinates in the image.
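A small Python sketch may make the division by depth more concrete. It uses the equations from this answer as written (no minus sign, which corresponds to picturing the image plane in front of the camera); the function name `project` and the sample points are made up for illustration. It also shows why the reverse direction mentioned in the question is not possible from a single image: every point along the same ray through the camera lands on the same image coordinates.

```python
def project(X, Y, Z, f):
    """Perspective projection: X' = f*X/Z, Y' = f*Y/Z."""
    return f * X / Z, f * Y / Z


f = 1.0  # hypothetical focal length, in the same units as X, Y, Z

near = (1.0, 2.0, 4.0)
# Same direction from the camera, three times as far away.
far = tuple(3.0 * c for c in near)

print(project(*near, f))
print(project(*far, f))
# Both calls give the same (X', Y'), so depth cannot be recovered
# from the image coordinates alone.
```

Scaling (X, Y, Z) by any positive factor cancels in the division by Z, which is exactly the information a photo loses.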

Note that this is a simplified explanation. Real-world camera projections can involve more complex transformations and corrections to account for lens distortions, aspect ratios, and sensor characteristics. Nonetheless, the equations above provide a basic understanding of the conversion from 3D to 2D coordinates in a perspective projection.
