2. Image Transformation Flashcards

1
Q

What is an image?

A

An image is a 2D projection of the 3D world.

2
Q

How do you model a perspective projection with a pinhole camera?

A

Assuming a virtual image:

Important: THE PINHOLE IS NOT INFINITELY SMALL, so one point of an object projects onto a small area, not a single point.

3
Q

What are homogeneous coordinates and why do we need them? How do we convert them back to non-homogeneous coords?

A
  • If we project 3D into 2D (assuming a pinhole camera), we do something like:

(x, y, z) -> (x’, y’) where x’ = (f’ * x) / z and y’ = (f’ * y) / z

The division by z makes this a non-linear transformation, and we want it linear (easier computation, just matrix multiplications). To do that, we use homogeneous vectors: we append a 1 as an extra coordinate, which increases the dimension by one.

To convert back, divide by the last coordinate (the third one for an image point) and drop it.
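
The conversion above can be sketched in a few lines of plain Python. This is only an illustration: the function name `project_pinhole`, the focal length and the sample point are made up for the example, not taken from the cards.

```python
def project_pinhole(point, f):
    """Project a 3D point (x, y, z) with focal length f via homogeneous coords."""
    x, y, z = point
    X = [x, y, z, 1.0]                      # homogeneous 3D point (append a 1)
    # 3x4 matrix: a pure matrix multiplication, no division yet
    M = [[f,   0.0, 0.0, 0.0],
         [0.0, f,   0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0]]
    xh = [sum(m * v for m, v in zip(row, X)) for row in M]   # (f*x, f*y, z)
    # back to non-homogeneous coords: divide by the third coordinate
    return (xh[0] / xh[2], xh[1] / xh[2])


# Assumed example values: f = 5, point (2, 4, 10)
image_point = project_pinhole((2.0, 4.0, 10.0), f=5.0)
print(image_point)
```

The non-linear part (the division by z) only happens once, at the very end; everything before it is a matrix product.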

4
Q

What process do we follow to project an object onto a 2D plane using a pinhole camera?

A

It is a two step process:

Extrinsic camera transformation takes world into camera coordinates.
Intrinsic camera transformation describes the image formation process.

5
Q

Explain the extrinsic transformation in detail:
- what is it?
- Idea behind it
- Formulas

A

It is a transformation of coordinates from the world frame into the camera frame. The idea: first shift so that the camera center becomes the origin, then rotate so that the world axes are aligned with the camera axes.

XC = R(XW - c)
- XC - coords of the object in the camera frame
- XW - coords of the object in the world frame
- c - coords of the camera center in the world frame
- R - rotation matrix
- this is in non-homogeneous coords!

Now, to do it in homogeneous coords:
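
A minimal sketch of the homogeneous version, assuming the usual block form [R, -Rc; 0, 1] that follows from expanding XC = R(XW - c) = R XW - R c. The rotation angle, camera center and world point are hypothetical values chosen for the example.

```python
import math


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Hypothetical camera: rotated 90 degrees about the z-axis, centered at c
theta = math.pi / 2
R = [[math.cos(theta), -math.sin(theta), 0.0],
     [math.sin(theta),  math.cos(theta), 0.0],
     [0.0,              0.0,             1.0]]
c = [1.0, 2.0, 3.0]
X_w = [2.0, 2.0, 5.0]

# Non-homogeneous form: XC = R (XW - c)
X_c = matvec(R, [xw - ci for xw, ci in zip(X_w, c)])

# Homogeneous form: one 4x4 matrix with R on the left and t = -R c on the right
t = [-v for v in matvec(R, c)]
T = [R[0] + [t[0]],
     R[1] + [t[1]],
     R[2] + [t[2]],
     [0.0, 0.0, 0.0, 1.0]]
X_c_h = matvec(T, X_w + [1.0])   # first three entries match X_c, last entry is 1

print(X_c, X_c_h)
```

Both forms give the same camera coordinates; the homogeneous one just packs the shift and the rotation into a single matrix multiplication.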

6
Q

How many parameters does the extrinsic transformation need? Which ones?

A

6 parameters needed

3 for position (x, y, z) of the camera (c)
3 for the rotation (R) of the camera (one per axis of the 3-axis system)

7
Q

What’s the difference between perspective projection and orthogonal projection (in the matrix)?

A

Orthogonal: the 1 in the last row is in the fourth column, so z gets omitted (no division by z).
Perspective: the 1 in the last row is in the third column, so the last homogeneous coordinate is z and we divide by z.

8
Q

Why is image data alone sometimes not enough to understand the world?

A
  • low resolution
  • sensor noise
  • …
9
Q

What is the principal axis?

A

It is the line through the camera center that is perpendicular to the image plane (right angle).

10
Q

What is the normalized camera coordinate system?

A

It is a coordinate system whose origin is the camera center and whose z-axis is the principal axis (the line perpendicular to the image plane).

11
Q

What is a principal point?

A

It is the point p where the principal axis intersects the image plane. It is also the origin of the normalized coordinate system.

12
Q

What is the difference between camera coordinate system and image coordinate system?

A

The camera coordinate system has its origin (center) at the principal point, while the image coordinate system has its origin at a corner (bottom left or upper left; think of how frontend code usually handles image coords).

13
Q

How to account for Principal Point offset in the Calibration Matrix?

A

With the plain pinhole camera, the transformation from 3D to 2D multiplied (X, Y, Z, 1) by the K matrix (diag(f, f, 1) with a column of 0s appended). The problem is that we need to account for the principal point offset, so that our origin is in the corner and not in the center. How do we do that?

We modify our K matrix (calibration matrix) and add these offsets in the third column. Sure, the homogeneous result then contains the offsets as Z * p, but Z cancels out when we divide by Z (going from homogeneous to non-homogeneous coords).
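
A small sketch of why the Z cancels, assuming the offsets sit in the third column of K. The names `px` and `py` for the two offset components and all numeric values are assumptions made for the example.

```python
def project_with_offset(point, f, px, py):
    """Project (X, Y, Z) using K = [[f, 0, px], [0, f, py], [0, 0, 1]] plus a 0 column."""
    X, Y, Z = point
    Xh = [X, Y, Z, 1.0]
    K0 = [[f,   0.0, px,  0.0],
          [0.0, f,   py,  0.0],
          [0.0, 0.0, 1.0, 0.0]]
    # homogeneous result: (f*X + Z*px, f*Y + Z*py, Z)
    xh = [sum(k * v for k, v in zip(row, Xh)) for row in K0]
    # dividing by Z turns Z*px into px and Z*py into py
    return (xh[0] / xh[2], xh[1] / xh[2])


# Assumed values: f = 5, offset (3, 4), point (2, 4, 10)
shifted = project_with_offset((2.0, 4.0, 10.0), f=5.0, px=3.0, py=4.0)
print(shifted)
```

Compared with the offset-free projection of the same point, the result is simply moved by (px, py).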

14
Q

How do camera and world frame relate (+ formula)?

A

The camera frame is the world frame shifted to the camera center c and then rotated by R (same relation as in card 5):

XC = R(XW - c)

15
Q

How do we solve a problem of units (meter -> pixel)?

A

By multiplying our calibration matrix by diag(Mx, My, 1), where Mx and My are the pixel magnification factors (pixels per meter). The size of one pixel is then 1/Mx by 1/My.
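
As a rough sketch of the unit conversion, assuming a calibration matrix in meters with focal length `f` and principal point offset (`px`, `py`). All numeric values below are invented for illustration.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


f = 0.005                      # focal length in meters (assumed value)
px, py = 0.002, 0.0015         # principal point offset in meters (assumed)
Mx, My = 100000.0, 100000.0    # pixels per meter in x and y (assumed)

K_meters = [[f,   0.0, px],
            [0.0, f,   py],
            [0.0, 0.0, 1.0]]
S = [[Mx,  0.0, 0.0],
     [0.0, My,  0.0],
     [0.0, 0.0, 1.0]]

# diag(Mx, My, 1) * K: focal length and offsets are now expressed in pixels
K_pixels = matmul(S, K_meters)
pixel_size = (1.0 / Mx, 1.0 / My)   # width and height of one pixel in meters

print(K_pixels, pixel_size)
```

With these numbers the focal length becomes roughly 500 pixels and the principal point lands near pixel (200, 150).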

16
Q

What is a calibration matrix?

A

It is a 3x3 matrix that contains intrinsic parameters:
- principal point coords
- focal length
- pixel magnification factors
- skew (not in formula) for non-rectangular pixels

17
Q

Explain the perspective projection pipeline (with the formula)

A

XI - coords of the object in the image (2D)

1st step: Extrinsic, convert from world coords into camera coords using rotation and translation params
2nd step: Intrinsic, convert from camera coords into image coords using calibration matrix, appending 0s in the 4th column.

18
Q

What is a projection matrix? Explain it

A

The projection matrix combines the extrinsic and intrinsic transformations into one 3x4 matrix that transforms world coordinates into image coordinates (homogeneous).

It first takes the calibration matrix K, appends 0s in the 4th column, and then multiplies with the R-t matrix (rows [R t] and [0 1]). This can be simplified into P.
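
The composition can be sketched end to end as follows. This assumes t = -R c (from XC = R(XW - c)) and uses made-up values for K, R, c and the world point; R is the identity here only to keep the numbers easy to check by hand.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# Intrinsics (assumed values, already in pixels)
K = [[500.0, 0.0,   200.0],
     [0.0,   500.0, 150.0],
     [0.0,   0.0,   1.0]]
K0 = [row + [0.0] for row in K]            # append 0s as the 4th column -> 3x4

# Extrinsics (assumed): no rotation, camera center c = (1, 2, 0)
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
c = [1.0, 2.0, 0.0]
t = [-sum(r * ci for r, ci in zip(row, c)) for row in R]   # t = -R c
Rt = [R[0] + [t[0]], R[1] + [t[1]], R[2] + [t[2]], [0.0, 0.0, 0.0, 1.0]]

P = matmul(K0, Rt)                         # 3x4 projection matrix

X_w = [[2.0], [4.0], [10.0], [1.0]]        # homogeneous world point (column)
xh = [row[0] for row in matmul(P, X_w)]    # homogeneous image point
pixel = (xh[0] / xh[2], xh[1] / xh[2])     # divide by the third coordinate

print(P, pixel)
```

Once P is built, projecting any world point is a single 3x4 multiplication followed by the division.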

19
Q

What is an orthographic projection?

A
  • It is the projection we get when an object is infinitely far away from the pinhole, so the rays are parallel and fall orthogonally onto the image plane (as if there were no pinhole). Depth can be omitted, as well as the focal length.
20
Q

What is the projection matrix of the parallel projection?

A

1 0 0 0
0 1 0 0
0 0 0 1

This transforms (x, y, z, 1) into (x, y, 1) = (x, y).

Z is just omitted. Pay attention to the 1 in the last row! If the 1 is in the third column instead (perspective projection), then we need to divide by z to get non-homogeneous coords.
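
To see the effect of where the 1 sits, here is a short comparison of the two matrices on the same point. The helper name `apply_projection` and the sample point are assumptions for the example.

```python
def apply_projection(M, X):
    """Apply a 3x4 matrix to a homogeneous 3D point, then dehomogenize."""
    xh = [sum(m * v for m, v in zip(row, X)) for row in M]
    return (xh[0] / xh[2], xh[1] / xh[2])


PARALLEL = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]      # 1 in the fourth column: divide by 1

PERSPECTIVE = [[1.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0]]   # 1 in the third column: divide by z

X = [2.0, 4.0, 10.0, 1.0]              # assumed sample point (x, y, z, 1)

parallel_pt = apply_projection(PARALLEL, X)        # z is ignored
perspective_pt = apply_projection(PERSPECTIVE, X)  # x and y are scaled by 1/z

print(parallel_pt, perspective_pt)
```

The parallel result keeps x and y unchanged, while the perspective result shrinks them with depth.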