w3 w gemini Flashcards

1
Q

What is a coordinate transformation (in the context of 3D space)?

A

A change of basis between different reference frames.

2
Q

What is another term for a coordinate transformation?

A

A change of basis.

3
Q

Define a reference frame (or coordinate system) in 3D space.

A

An arbitrary set of orthogonal axes (X, Y, Z) used to measure the position and orientation of points or items.

4
Q

What is Visual SLAM (Simultaneous Localisation And Mapping)?

A

A process used by robots and drones to orient themselves in the real world and create a map of their surroundings.

5
Q

In Visual SLAM, what coordinate system does the map typically start with?

A

The robot base’s coordinate system.

6
Q

In Visual SLAM, what transformation does the robot perform for each new image?

A

Transforms detected scene points from the camera coordinate system to the robot’s, and then to the world coordinate system.
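
The camera-to-robot-to-world chain can be sketched as two rigid transforms applied in sequence. This is a minimal sketch: the poses, the `transform` helper and all variable names are hypothetical, and it assumes each transform has the form p' = R p + t.

```python
def transform(R, t, p):
    """Apply p' = R p + t for a 3x3 rotation R and 3-vectors t and p."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

# Hypothetical pose of the camera in the robot frame:
# mounted 0.5 units above the robot base, with no rotation.
R_robot_cam = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t_robot_cam = [0.0, 0.0, 0.5]

# Hypothetical pose of the robot in the world frame:
# rotated 90 degrees about Z and located at (2, 1, 0).
R_world_robot = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
t_world_robot = [2.0, 1.0, 0.0]

p_cam = [1.0, 0.0, 0.0]                                      # point seen by the camera
p_robot = transform(R_robot_cam, t_robot_cam, p_cam)         # [1.0, 0.0, 0.5]
p_world = transform(R_world_robot, t_world_robot, p_robot)   # [2.0, 2.0, 0.5]
```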

7
Q

In multi-view pose estimation, what do we estimate from each camera?

A

The complete 2D body pose (e.g., skeleton with 2D joints) for each person.

8
Q

In multi-view pose estimation, how do we obtain an estimate of the 3D body pose?

A

By converting the 2D estimate into an accurate 3D pose with respect to the camera’s reference frame.

9
Q

In multi-view pose estimation, what can be chosen as the world reference frame?

A

The reference frame of one of the cameras, or any arbitrary origin.

10
Q

What is a convolution mask (or kernel)?

A

A small matrix used in image processing to perform operations like blurring, sharpening, or edge detection.

11
Q

What is the purpose of padding an image with zeros before convolution?

A

To produce an output image of the same size as the input image.

12
Q

What is a rotation matrix in the context of coordinate transformations?

A

A 3x3 matrix that describes the rotation needed to align the orientation of one frame to another.

13
Q

What is a translation vector in the context of coordinate transformations?

A

A 3x1 vector that describes the translation needed to align the origin of one frame to another.

14
Q

Explain the concept of ‘extrinsic’ and ‘intrinsic’ rotations (briefly).

A

Rotations around world axes (extrinsic) vs. rotations around the object’s local axes (intrinsic).
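
One way to see the difference is through the order in which the rotation matrices are multiplied. The sketch below assumes column vectors and active rotations (a convention not stated in the card): each new extrinsic rotation is multiplied on the left, each new intrinsic rotation on the right. The helper names are illustrative.

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(R, v):
    # Round to hide floating-point noise from cos(pi/2).
    return [round(sum(R[i][j] * v[j] for j in range(3)), 6) for i in range(3)]

a = math.pi / 2
# Extrinsic: rotate about world X, then about world Z (new rotation on the left).
R_extrinsic = matmul(rot_z(a), rot_x(a))
# Intrinsic: rotate about local X, then about the new local Z (new rotation on the right).
R_intrinsic = matmul(rot_x(a), rot_z(a))

v = [0, 1, 0]
v_extrinsic = apply(R_extrinsic, v)  # ends up along +Z
v_intrinsic = apply(R_intrinsic, v)  # ends up along -X
```

The same two 90-degree rotations give different results, which is why a rotation sequence is ambiguous unless it says which kind of axes it uses.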

15
Q

What is aliasing in the context of image down-sampling?

A

Distortion or misrepresentation of the image caused by insufficient sampling; it is the characteristic problem of down-sampling.

16
Q

How is aliasing avoided when down-sampling images for an image pyramid?

A

By smoothing the image (convolving with a Gaussian mask) prior to down-sampling.
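
A 1D toy example shows the effect. It is only a sketch: the [1, 2, 1]/4 binomial kernel stands in for a small Gaussian mask, and the border handling (replicating the edge sample) is an assumption.

```python
# A signal at the highest representable frequency: 0, 1, 0, 1, ...
signal = [0, 1] * 8

# Naive down-sampling keeps every second sample and sees only the zeros,
# so the alternating pattern is misrepresented as a constant signal.
naive = signal[::2]

def smooth(x):
    """Smooth with a [1, 2, 1]/4 kernel, replicating the border samples."""
    n = len(x)
    return [(x[max(i - 1, 0)] + 2 * x[i] + x[min(i + 1, n - 1)]) / 4
            for i in range(n)]

# Smoothing first removes the high frequency, so the down-sampled result
# reflects the true average intensity (0.5) away from the border.
smoothed = smooth(signal)[::2]
```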

17
Q

What is a Gaussian image pyramid?

A

A multiscale representation of an image at different resolutions obtained by iteratively convolving with a Gaussian filter and down-sampling.
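
The smooth-then-down-sample loop can be sketched as follows. The function names are hypothetical, the separable [1, 2, 1]/4 kernel is a small stand-in for a Gaussian mask, and down-sampling by a factor of 2 per level is an assumption.

```python
def blur(img):
    """Separable [1, 2, 1]/4 smoothing with replicated borders."""
    h, w = len(img), len(img[0])
    rows = [[(r[max(x - 1, 0)] + 2 * r[x] + r[min(x + 1, w - 1)]) / 4
             for x in range(w)] for r in img]
    return [[(rows[max(y - 1, 0)][x] + 2 * rows[y][x] + rows[min(y + 1, h - 1)][x]) / 4
             for x in range(w)] for y in range(h)]

def downsample(img):
    """Keep every second pixel in each row and column."""
    return [row[::2] for row in img[::2]]

def gaussian_pyramid(img, levels):
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(blur(pyramid[-1])))
    return pyramid

# 8x8 checkerboard test image.
image = [[float((x + y) % 2) for x in range(8)] for y in range(8)]
pyramid = gaussian_pyramid(image, 3)
sizes = [(len(level), len(level[0])) for level in pyramid]  # 8x8, 4x4, 2x2
```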

18
Q

What is a Laplacian image pyramid?

A

A multiscale representation of an image highlighting intensity discontinuities, obtained by subtracting a Gaussian-smoothed image from the previous level and down-sampling.

19
Q

Why is it computationally cheaper to keep the mask size fixed and vary the image size in multiscale feature analysis?

A

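Because the number of multiplications required for convolution is significantly reduced: shrinking the image reduces the number of pixels while the cost per pixel stays fixed, whereas enlarging the mask increases the cost at every pixel.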

20
Q

Give an example of the computational advantage of varying image size over varying mask size in multiscale analysis.

A

Convolving a 100x100 image with a 6x6 mask requires about 100 × 100 × 36 = 360,000 multiplications, whereas convolving a 50x50 image with a 3x3 mask requires about 50 × 50 × 9 = 22,500, roughly 16 times fewer.

21
Q

What is the formula for a 2D Gaussian?

A

G(x,y) = (1 / (2πσ²)) * exp(-(x² + y²) / (2σ²))
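
The formula can be sampled on an integer grid to build a convolution mask. In this sketch the function name `gaussian_mask` is hypothetical, and re-normalising the weights so that they sum to 1 (to compensate for truncating the Gaussian) is an assumption.

```python
import math

def gaussian_mask(size, sigma):
    """Sample G(x, y) on a size x size grid centred on the origin."""
    half = size // 2  # assumes an odd size
    mask = [[math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]
    # Re-normalise so the truncated mask still sums to 1.
    total = sum(sum(row) for row in mask)
    return [[v / total for v in row] for row in mask]

mask = gaussian_mask(5, 1.0)
centre = mask[2][2]
```

The largest weight sits at the centre and the weights fall off smoothly and symmetrically with distance.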

22
Q

What are the four main categories of image features that can produce intensity-level discontinuities?

A

Depth discontinuities, Orientation discontinuities, Reflectance discontinuities, Illumination discontinuities.

23
Q

What is the mathematical operation for convolution?

A

I′(i, j) = Σ_{k,l} I(i+k, j+l) · H(−k, −l)
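
The sum can be written out directly as nested loops. This is a sketch rather than an efficient implementation: it assumes a square mask with odd side length stored so that `mask[r + a][r + b]` holds H(a, b), and it treats pixels outside the image as zeros so that the output has the same size as the input.

```python
def convolve(image, mask):
    """Direct 2D convolution with zero padding (same-size output)."""
    h, w = len(image), len(image[0])
    r = len(mask) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for k in range(-r, r + 1):
                for l in range(-r, r + 1):
                    y, x = i + k, j + l
                    if 0 <= y < h and 0 <= x < w:      # outside pixels count as 0
                        acc += image[y][x] * mask[r - k][r - l]  # H(-k, -l)
            out[i][j] = acc
    return out

# A single bright pixel in the middle of a dark 5x5 image.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0
mask = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
result = convolve(image, mask)
centre_block = [row[1:4] for row in result[1:4]]
```

Convolving with an isolated point reproduces the mask, unflipped, centred on that point.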

24
Q

What is the purpose of convolving an image with a Laplacian mask?

A

To detect edges and regions of rapid intensity change.

25
Q

What is a Laplacian of Gaussian (LoG) mask?

A

A convolution mask created by combining a Laplacian mask with a Gaussian mask.

26
Q

Why is combining a Laplacian mask with a Gaussian mask advantageous for edge detection?

A

The Gaussian smooths noise, making the Laplacian less sensitive to it while still detecting edges.

27
Q

What is a Difference of Gaussians (DoG) mask and how is it related to the LoG mask?

A

It is a mask that approximates the LoG mask, formed as the difference between two Gaussian masks with different standard deviations. Convolving with it is equivalent to subtracting two Gaussian-blurred versions of the same image.
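
A DoG mask can be sketched by subtracting two sampled Gaussians. The mask size, the two σ values, the sign convention (narrow minus wide) and the helper name are all assumptions made for illustration.

```python
import math

def gaussian_mask(size, sigma):
    """Sampled 2D Gaussian, normalised so the weights sum to 1."""
    half = size // 2
    mask = [[math.exp(-(x * x + y * y) / (2 * sigma ** 2))
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]
    total = sum(sum(row) for row in mask)
    return [[v / total for v in row] for row in mask]

narrow = gaussian_mask(7, 1.0)
wide = gaussian_mask(7, 2.0)

# Difference of Gaussians: positive centre, negative surround.
dog = [[n - w for n, w in zip(n_row, w_row)] for n_row, w_row in zip(narrow, wide)]
dog_sum = sum(sum(row) for row in dog)
```

Because both Gaussians sum to 1, the difference sums to approximately 0, as expected of a differencing mask.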

28
Q

What are the two main approaches for performing multiscale feature analysis?

A

1) Keep image size fixed and vary mask size. 2) Keep mask size fixed and vary image size.

29
Q

What is aliasing in the context of image processing?

A

The distortion or misrepresentation of high-frequency information when sampling at a rate too low to capture it accurately.

30
Q

How is aliasing typically avoided when down-sampling images?

A

By applying a low-pass filter (like a Gaussian filter) before down-sampling to remove high-frequency components.

31
Q

What is a Gaussian image pyramid?

A

A series of images created by repeatedly smoothing and down-sampling an original image.

32
Q

What is a Laplacian image pyramid?

A

A series of images created by taking the difference between adjacent levels of a Gaussian pyramid, capturing the details lost during smoothing.

33
Q

Explain the concept of ‘separable masks’ in convolution.

A

A 2D convolution mask is separable if it can be expressed as the convolution of two 1D vectors (a row and a column vector).

34
Q

Why are separable convolutions more efficient?

A

They reduce the number of multiplications required for convolution: an n×n mask needs n² multiplications per pixel, whereas two 1D passes need only 2n.
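
As a small illustration (the particular kernel is an assumption, chosen because its weights are exact in binary), the 3x3 smoothing mask below is the product of a column vector and a row vector, so it can be applied as two 1D passes.

```python
# 1D smoothing kernel; the 2D mask is its outer product with itself.
row = [0.25, 0.5, 0.25]
mask2d = [[a * b for b in row] for a in row]
# mask2d == [[0.0625, 0.125, 0.0625],
#            [0.125,  0.25,  0.125 ],
#            [0.0625, 0.125, 0.0625]]

n = len(row)
per_pixel_2d = n * n         # 9 multiplications with the full 2D mask
per_pixel_separable = 2 * n  # 6 multiplications with two 1D passes
```

The saving grows with mask size: for a 7x7 mask it is 49 versus 14 multiplications per pixel.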

35
Q

What does it mean to view a mask as a point-spread function?

A

When convolving a mask with an isolated point, the output is the mask itself, shifted to the point’s location.

36
Q

What does it mean to view a mask as a template?

A

Convolution output is highest when the image region under the mask closely resembles the (rotated) mask.

37
Q

What are the two main categories of mask weight values?

A

Smoothing masks and differencing masks.

38
Q

What are the characteristics of weights in a smoothing mask?

A

All positive, and sum to 1.

39
Q

What are the characteristics of weights in a differencing mask?

A

Contain positive and negative values, and sum to 0.
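
The effect of the two weight patterns can be checked on single image patches. The masks and patches below are illustrative choices, and `response` is a hypothetical helper that computes the weighted sum at one mask position.

```python
def response(patch, mask):
    """Weighted sum of a patch with a mask at a single position."""
    return sum(p * m for p_row, m_row in zip(patch, mask)
               for p, m in zip(p_row, m_row))

box = [[1 / 9] * 3 for _ in range(3)]          # smoothing: positive weights, sum 1
diff = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]    # differencing: mixed signs, sum 0

flat = [[10.0] * 3 for _ in range(3)]            # uniform region
step = [[0.0, 0.0, 10.0] for _ in range(3)]      # vertical intensity step

flat_smoothed = response(flat, box)   # about 10: average intensity is preserved
flat_diff = response(flat, diff)      # 0: no response in a uniform region
step_diff = response(step, diff)      # 30: strong response at the step
```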

40
Q

What is spatial frequency in an image?

A

The rate of change of image intensity over space. High spatial frequency corresponds to fine details, low to coarse features.

41
Q

What is a box mask (in image smoothing)?

A

A simple smoothing mask in which all weights are equal (e.g., 1/9 for a 3x3 mask). Also known as a mean filter.

42
Q

Why is a Gaussian mask generally better for smoothing than a box mask?

A

It has a smoother fall-off, giving more weight to nearby pixels and reducing artifacts.

43
Q

What is the effect of increasing the standard deviation (σ) of a Gaussian smoothing mask?

A

More blurring, more high frequencies suppressed, noise is more effectively suppressed.

44
Q

What are Gaussian derivative masks used for?

A

Edge detection, by emphasizing vertical and horizontal structures.

45
Q

What is the first step in the Canny edge detector?

A

Convolution with derivatives of Gaussian masks.

46
Q

What is the second step in the Canny edge detector?

A

Calculation of the magnitude and direction of the gradient.
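
For a single pixel, this step amounts to the following arithmetic. The values of `gx` and `gy` are made up; they stand for the responses of the horizontal and vertical derivative-of-Gaussian masks at that pixel.

```python
import math

gx, gy = 3.0, 4.0  # hypothetical derivative responses at one pixel

magnitude = math.hypot(gx, gy)                      # sqrt(gx^2 + gy^2) = 5.0
direction_deg = math.degrees(math.atan2(gy, gx))    # about 53.13 degrees
```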

47
Q

What is the third step in the Canny edge detector?

A

Non-maximum suppression to thin out edges.

48
Q

What is the fourth step in the Canny edge detector?

A

Recursive hysteresis thresholding to link edges.
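
Hysteresis thresholding can be sketched as a flood fill: pixels above a high threshold seed the edges, and pixels above a low threshold are kept only if they connect to a seed. The sketch below uses an explicit stack instead of recursion, 8-connectivity and made-up threshold values; all of these are assumptions.

```python
def hysteresis(mag, low, high):
    """Keep strong pixels, plus weak pixels connected to a strong one."""
    h, w = len(mag), len(mag[0])
    edges = [[False] * w for _ in range(h)]
    stack = [(y, x) for y in range(h) for x in range(w) if mag[y][x] >= high]
    for y, x in stack:
        edges[y][x] = True
    while stack:
        y, x = stack.pop()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edges[ny][nx] and mag[ny][nx] >= low):
                    edges[ny][nx] = True
                    stack.append((ny, nx))
    return edges

# Gradient magnitudes: one strong pixel (9), a chain of weak pixels (5)
# attached to it, and one isolated weak pixel on the right.
mag = [
    [0, 0, 0, 0, 0],
    [9, 5, 5, 0, 5],
    [0, 0, 0, 0, 0],
]
edges = hysteresis(mag, low=4, high=8)
```

The weak pixels linked to the strong one survive, while the isolated weak pixel is discarded.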

49
Q

What is the main drawback of edge detection algorithms?

A

Results are highly dependent on parameter values, and not all detected edges are meaningful.

50
Q

What is the general algorithm for feature detection using masks?

A

Convolve the image with a mask, choose a threshold, and detect the feature where the convolved image exceeds the threshold.
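
The three steps can be sketched on a single image row. The mask, the threshold and the decision to compute responses only where the mask fits entirely inside the row are illustrative assumptions.

```python
def convolve_valid(signal, mask):
    """1D convolution, computed only where the mask fits inside the signal."""
    r = len(mask) // 2
    return [sum(signal[i + k] * mask[r - k] for k in range(-r, r + 1))
            for i in range(r, len(signal) - r)]

row = [10, 10, 10, 50, 50, 50]   # one image row containing a step edge
mask = [1, 0, -1]                # convolution flips it: signal[i+1] - signal[i-1]
threshold = 20

responses = convolve_valid(row, mask)  # [0, 40, 40, 0]
offset = len(mask) // 2                # responses[0] corresponds to row[1]
detections = [i + offset for i, v in enumerate(responses) if abs(v) > threshold]
```

The feature is reported at positions 2 and 3, on either side of the step.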

51
Q

What is the core idea behind template matching across scales?

A

Convolution can find locations where the image matches a template, but the scale of the feature may be unknown.

52
Q

What are the two main approaches for finding image features with invariance to scale?

A

Apply filters of different sizes, or apply a fixed-size filter to images at different scales (image pyramid).

53
Q

What is the common name for applying filters of a fixed size to an image presented at different sizes?

A

Image Pyramid.

54
Q

What is down-sampling?

A

Decreasing the size (resolution) of an image, for example by keeping only every second pixel in each row and column.

55
Q

What is a common problem encountered during down-sampling?

A

Aliasing.

56
Q

How can the problem of aliasing be mitigated during down-sampling?

A

By smoothing the image before down-sampling.

57
Q

What is the core idea behind Gaussian Pyramid creation?

A

Repeatedly smooth and down-sample the image.

58
Q

What is the relationship between the standard deviation of Gaussians used in successive levels of a Gaussian Pyramid?

A

Smoothing with σ and then smoothing again with σ is equivalent to smoothing once with √2·σ, because the variances of successive Gaussians add.
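
This relationship can be checked numerically by convolving a sampled 1D Gaussian with itself and measuring the standard deviation of the result. The helper names, σ = 2 and the truncation radius are assumptions.

```python
import math

def gaussian_1d(sigma, radius):
    g = [math.exp(-x * x / (2 * sigma ** 2)) for x in range(-radius, radius + 1)]
    total = sum(g)
    return [v / total for v in g]

def convolve_full(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out

def std(weights):
    """Standard deviation of a normalised set of weights."""
    centre = (len(weights) - 1) / 2
    mean = sum((i - centre) * v for i, v in enumerate(weights))
    return math.sqrt(sum((i - centre - mean) ** 2 * v for i, v in enumerate(weights)))

sigma = 2.0
g = gaussian_1d(sigma, 20)
twice = convolve_full(g, g)   # mask equivalent to smoothing twice with sigma
combined_sigma = std(twice)   # close to sqrt(2) * sigma, about 2.83
```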

59
Q

What is a key property of the Laplacian of Gaussian (LoG) mask regarding intensity discontinuities?

A

It detects intensity discontinuities (e.g., edges) at all orientations.

60
Q

How is a Laplacian image pyramid created from a Gaussian pyramid?

A

By subtracting from each level of the Gaussian pyramid the next higher (coarser) level, after upsampling that coarser level to the same size.
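
A minimal sketch of this construction, assuming each Laplacian level is a Gaussian level minus the upsampled next coarser level. The [1, 2, 1]/4 blur, nearest-neighbour upsampling, the factor of 2 per level and all function names are assumptions.

```python
def blur(img):
    """Separable [1, 2, 1]/4 smoothing with replicated borders."""
    h, w = len(img), len(img[0])
    rows = [[(r[max(x - 1, 0)] + 2 * r[x] + r[min(x + 1, w - 1)]) / 4
             for x in range(w)] for r in img]
    return [[(rows[max(y - 1, 0)][x] + 2 * rows[y][x] + rows[min(y + 1, h - 1)][x]) / 4
             for x in range(w)] for y in range(h)]

def downsample(img):
    return [row[::2] for row in img[::2]]

def upsample(img):
    """Nearest-neighbour upsampling by a factor of 2."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.extend([wide, list(wide)])
    return out

def combine(a, b, sign):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def laplacian_pyramid(img, levels):
    gauss = [img]
    for _ in range(levels - 1):
        gauss.append(downsample(blur(gauss[-1])))
    lap = [combine(gauss[i], upsample(gauss[i + 1]), -1) for i in range(levels - 1)]
    lap.append(gauss[-1])  # keep the coarsest Gaussian level as the last entry
    return lap

def reconstruct(lap):
    img = lap[-1]
    for level in reversed(lap[:-1]):
        img = combine(level, upsample(img), +1)
    return img

image = [[float(x * y) for x in range(8)] for y in range(8)]
lap = laplacian_pyramid(image, 3)
rebuilt = reconstruct(lap)
```

Keeping the coarsest Gaussian level alongside the difference images means the original image can be rebuilt by upsampling and adding the levels back in.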

61
Q

What are the key steps involved in linear filtering?

A

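Choosing a mask (a set of weights) and convolving it with the image, so that each output pixel is a weighted sum of the input pixels in its neighbourhood.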

62
Q

Why is edge detection important in computer vision?

A

It is important for subsequent analysis and can be sufficient for recognizing the content of an image.

63
Q

What are the main steps in edge detection?

A

Filtering the image, which involves convolving it with a suitable mask.

64
Q

What are the advantages of using Gaussian derivative masks for edge detection compared to the DoG mask?

A

They tend to produce more robust results.

65
Q

What is the main idea behind multi-scale feature detection?

A

To find features that are invariant to scale.

66
Q

What are the two main types of image pyramids?

A

Gaussian image pyramids and Laplacian image pyramids.

67
Q

LoG mask

A

-1/8 -1/8 -1/8
-1/8 1 -1/8
-1/8 -1/8 -1/8
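
The behaviour of this mask can be checked on two small patches. The patches and the `response` helper are made up for illustration; the mask is symmetric, so flipping it for convolution makes no difference.

```python
log_mask = [
    [-1 / 8, -1 / 8, -1 / 8],
    [-1 / 8, 1, -1 / 8],
    [-1 / 8, -1 / 8, -1 / 8],
]

def response(patch, mask):
    """Weighted sum of a 3x3 patch with the mask at one position."""
    return sum(p * m for p_row, m_row in zip(patch, mask)
               for p, m in zip(p_row, m_row))

weights_sum = sum(sum(row) for row in log_mask)   # 0: a differencing mask

flat = [[7.0] * 3 for _ in range(3)]              # uniform region
spot = [[0.0, 0.0, 0.0], [0.0, 8.0, 0.0], [0.0, 0.0, 0.0]]  # isolated bright pixel

flat_response = response(flat, log_mask)   # 0.0: no response without a discontinuity
spot_response = response(spot, log_mask)   # 8.0: centre differs from its surround
```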