Past Papers Flashcards

1
Q

Equation for depth of a 3d point visible from 2 cameras

A

Z = (f * B)/(x_L - x_R) where f is the focal length of the cameras (assuming it is the same for both) and B is the distance between them
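A quick numeric check of this relation; all values below are made up purely for illustration:

```python
# Depth from binocular disparity: Z = f * B / (x_L - x_R)
# All values are hypothetical, just to illustrate the relation and units.
f = 0.05                      # focal length: 50 mm (same for both cameras)
B = 0.10                      # baseline: 10 cm between the cameras
x_L, x_R = 0.0012, 0.0002     # x coordinate of the matched point in each image (m)

disparity = x_L - x_R         # 0.001 m
Z = f * B / disparity         # 0.05 * 0.10 / 0.001 = 5.0 m
print(Z)                      # 5.0
```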

2
Q

Define image processing

A

Signal processing applied to an image, with another image as the resulting output

3
Q

Define the epipolar constraint

A

The search space for the correspondence problem can be reduced to a line (the epipolar line) by using the geometric relationship between the camera positions for the two images
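One common algebraic form of the constraint, assuming calibrated cameras and the essential matrix E (not mentioned on the card):

```latex
% For corresponding points p_L, p_R in (homogeneous) camera coordinates:
p_R^{\top} E \, p_L = 0
% so a point p_L in the left image restricts its match in the right image
% to the epipolar line l_R = E p_L.
```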

4
Q

What is a hyper column

A

A region of V1 that contains neurons covering the full range of RF types for a single spatial location

5
Q

What is V1

A

Primary visual cortex
Performs initial low-level processing on incoming information (from the LGN)

6
Q

What is a forward problem

A

One where we know the causes and want to predict the outcome

7
Q

What is an inverse problem.

A

One where we know the outcomes and want to infer the causes

8
Q

List 5 things that an object recognition algorithm should be insensitive to

A

Illumination
Occlusion
Viewpoint (orientation, scale, translations)
Non-rigid deformation
Within-category variations

9
Q

Draw a cross-sectional diagram of how a lens forms an image of a point

A
10
Q

Equation of projection (pinhole camera)

A
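The answer is missing from the card; the standard perspective projection equations for a pinhole camera with focal length f are:

```latex
x = f\,\frac{X}{Z}, \qquad y = f\,\frac{Y}{Z}
% (X, Y, Z): scene point in camera coordinates; (x, y): image-plane coordinates.
% Some conventions include a minus sign because the pinhole image is inverted.
```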
11
Q

When asked about depth in relation to time

A

Where Vx is the velocity of the camera and ẋ is the velocity of the image point
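The equation itself is missing from the card; for a camera translating sideways with velocity Vx, a plausible reconstruction using the quantities named above is the motion-parallax relation (up to sign):

```latex
\dot{x} = \frac{f\,V_x}{Z} \quad\Rightarrow\quad Z = \frac{f\,V_x}{\dot{x}}
```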

12
Q

Steps of canny edge detector

A

1) Convolve the image with a DoG mask
2) Compute the gradient magnitude and direction at each pixel
3) Non-maximum suppression: suppress any pixel that has a neighbour, perpendicular to the direction of the edge, with a higher gradient magnitude
4) Hysteresis thresholding (a sketch using OpenCV follows below)
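A minimal sketch using OpenCV, whose cv2.Canny performs the gradient computation, non-maximum suppression and hysteresis internally; the file name, blur size and thresholds are arbitrary choices:

```python
import cv2

# Load a greyscale image (hypothetical file name)
gray = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

# Optional smoothing (the card folds this into the DoG mask)
gray = cv2.GaussianBlur(gray, (5, 5), 1.4)

# Hysteresis thresholds: edges below 50 are discarded, edges above 150 are kept,
# and weak edges connected to strong ones survive.
edges = cv2.Canny(gray, 50, 150)

cv2.imwrite("edges.png", edges)
```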

13
Q

Math model used to simulate receptive fields of cortical simple cells

A

Gabor(x, y) =
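The formula is cut off on the card; one standard parametrisation of the 2D Gabor function (the exact form used in the course may differ):

```latex
\mathrm{Gabor}(x, y) =
  \exp\!\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)
  \cos\!\left(\frac{2\pi x'}{\lambda} + \psi\right),
\qquad
x' = x\cos\theta + y\sin\theta,\quad
y' = -x\sin\theta + y\cos\theta
% theta: orientation, lambda: wavelength (spatial frequency), psi: phase,
% sigma: width of the Gaussian envelope, gamma: aspect ratio.
```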

14
Q

How do we simulate complex cells

A

Repeating the convolution with a Gabor mask but varying orientation, phase, spatial frequency, etc.

15
Q

Describe the role of mid level vision

A

Appropriately group together image elements,
and segment them from other image elements

16
Q

Starting difference between region growing and region merging

A

Merging: each pixel begins with a unique label; uses the final marking; the region mean is used for comparison
Growing: each pixel begins unlabelled; does not use the final marking; individual neighbouring pixels are used for comparison

17
Q

What is the correspondence problem

A

The problem of finding the same 3d point or location in 2 (or more) images

18
Q

For coplanar cameras, if the baseline distance (B) between the cameras increases, how does this affect the accuracy of measuring the depth of a 3d point

A

This increases disparity which increases accuracy

19
Q

In a feature based solution to the correspondence problem, explain what is meant by descriptor and detector

A

A detector is a method used to locate points of interest or image features

A descriptor is a vector for identified points/features to be used for comparison with potential matches

20
Q

What is RANSAC

A

RANdom SAmple Consensus
1) Randomly sample the minimum number of data points needed to fit the model
2) Fit the model to this sample
3) Test all other data points against this fitted model
4) Count the number of inliers (the consensus set)
5) Repeat 1-4 for N trials and choose the parameters that fit best overall; the best fit is the one with the highest support, i.e. the largest consensus set (a sketch follows below)
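A minimal sketch of those steps for fitting a 2D line y = m*x + c; the function name, trial count and inlier threshold are arbitrary choices, not from the card:

```python
import random
import numpy as np

def ransac_line(points, n_trials=100, inlier_thresh=1.0):
    """RANSAC fit of a line y = m*x + c to an (N, 2) NumPy array of points.
    Returns the (m, c) with the largest consensus set, and its support."""
    best_params, best_support = None, -1
    for _ in range(n_trials):
        # 1) randomly sample the minimum number of points (2 for a line)
        (x1, y1), (x2, y2) = random.sample(list(points), 2)
        if x1 == x2:
            continue                            # degenerate (vertical) sample
        # 2) fit the model to this sample
        m = (y2 - y1) / (x2 - x1)
        c = y1 - m * x1
        # 3) test all other points against the fitted model
        residuals = np.abs(points[:, 1] - (m * points[:, 0] + c))
        # 4) count the inliers (consensus set)
        support = int(np.sum(residuals < inlier_thresh))
        # 5) keep the parameters with the highest support over the N trials
        if support > best_support:
            best_params, best_support = (m, c), support
    return best_params, best_support
```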

21
Q

Cross correlation formula for vectors a and b

A
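The formula is missing from the card; cross-correlation of two vectors is normally just their dot product:

```latex
\mathrm{CC}(a, b) = \sum_{i} a_i\, b_i = a \cdot b
```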
22
Q

Correlation coefficient formula for vectors a and b

A
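Also missing from the card; the usual (Pearson) correlation coefficient, with ā and b̄ the means of a and b:

```latex
r(a, b) = \frac{\sum_i (a_i - \bar{a})(b_i - \bar{b})}
               {\sqrt{\sum_i (a_i - \bar{a})^2}\;\sqrt{\sum_i (b_i - \bar{b})^2}}
```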
23
Q

What is the difference between top down and bottom up

A

Top down: coming from internal knowledge, prior experience
Bottom up: coming from image properties
These approaches are not mutually exclusive

24
Q

Gestalt laws

A

Bottom up factors:
Proximity
Similarity
Closure
Continuity
Common fate
Symmetry
Common region
Connectivity

25
Q

What does the LGN do

A

LGN cells have centre surround RFs
Traditionally viewed as just relaying information from the retina to the cortex
Recent evidence suggests it does more than this

26
Q

On centre off surround is

A

Activates if the centre is brighter than the surround

27
Q

Explain how ganglion cells in the eye and simple cells in v1 work together to detect edges

A

Ganglion cells are centre surround
The retina is over-represented (every area of the retina is part of the receptive field of several ganglion cells)
With several centre surround lined up, their combination shows edges

28
Q

Example of function of complex cells in v1

A

Combine input from several simple cells, so they can detect patterns that are a combination of single-cell detections

29
Q

Describe how lateral connections in v1 can explain some gestalt laws

A

Lateral connections connect together areas of v1 that deal with adjacent areas of the visual field
If adjacent cells detect something similar, eg a line segment, this can stand out more than 2 separate segments
If the segments were end-on, this would be continuity
If they were side by side, this would be similarity

30
Q

What is required of a mask for it to have no effect on intensity

A

All elements in the mask sum to 1

31
Q

Example of difference mask

A

[1, -1]
Positive x direction

32
Q

Laplacian mask is? Used to?

A

A difference mask in every direction
Used to detect discontinuities in intensity in every direction

33
Q

Advantage and disadvantage of the Laplacian mask? How to mitigate the disadvantage?

A

Advantage: good at detecting discontinuities
Disadvantage: perhaps too good; it is most sensitive to a single pixel that stands out from its neighbours, so it amplifies noise, leading to many false positives during edge detection

To mitigate the disadvantage, apply an averaging filter before applying the Laplacian; this is typically done in one step with a Laplacian of Gaussian (convolve a Gaussian with the Laplacian), as sketched below
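A minimal sketch of the mitigation, assuming SciPy is available; gaussian_laplace applies the Laplacian of Gaussian in one step, and sigma is an arbitrary choice:

```python
import numpy as np
from scipy import ndimage

# A common 4-neighbour Laplacian mask (the sign convention varies by course):
laplacian_mask = np.array([[ 0, -1,  0],
                           [-1,  4, -1],
                           [ 0, -1,  0]])

def detect_discontinuities(image, sigma=2.0):
    """Smooth-then-differentiate in one step with a Laplacian of Gaussian,
    which is far less sensitive to single-pixel noise than the raw Laplacian."""
    return ndimage.gaussian_laplace(image.astype(float), sigma=sigma)
```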

34
Q

Splitting and merging pseudocode

A

Splitting:
Start with the whole image as one region
If all pixels in a region are not similar, split it into 4 quadrants
Repeat until all regions are homogeneous
Merging:
Compare each region to its neighbouring regions and merge all that are similar
Continue until no more regions can merge (a sketch of the splitting step follows below)
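A minimal sketch of the splitting half as a quadtree recursion; the homogeneity test (intensity range), threshold and minimum block size are arbitrary choices:

```python
import numpy as np

def quadtree_split(img, r, c, h, w, thresh=20, min_size=2):
    """Recursively split the block img[r:r+h, c:c+w] into quadrants until every
    block is homogeneous (intensity range <= thresh) or too small to split.
    Returns a list of (row, col, height, width) regions; merging would then
    compare neighbouring regions (e.g. their means) and fuse similar ones."""
    block = img[r:r + h, c:c + w]
    if block.max() - block.min() <= thresh or h < 2 * min_size or w < 2 * min_size:
        return [(r, c, h, w)]
    h2, w2 = h // 2, w // 2
    regions = []
    for dr, dc, bh, bw in [(0, 0, h2, w2), (0, w2, h2, w - w2),
                           (h2, 0, h - h2, w2), (h2, w2, h - h2, w - w2)]:
        regions += quadtree_split(img, r + dr, c + dc, bh, bw, thresh, min_size)
    return regions

# usage: regions = quadtree_split(img, 0, 0, img.shape[0], img.shape[1])
```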

35
Q

Disadvantage of splitting and merging

A

It only works well if regions are fairly homogeneous; otherwise many spurious regions are created

36
Q

Explain how a CCD forms an RGB image

A

A CCD is an array of MOS capacitors that accumulate charge in proportion to the intensity of the incident light
Individual diodes are made sensitive to R, G and B by placing a filter between the light source and the diode

37
Q

Relating pixels to camera coordinates

A

Where (Ox, Oy) are the coordinates (in pixels) of the image principal point
α is the magnification factor in the x direction
β is the magnification factor in the y direction
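The equation itself is missing from the card; a common form of the relation consistent with the symbols above, noting that sign conventions vary between textbooks, is:

```latex
x_{\mathrm{pix}} = O_x + \alpha\, x, \qquad
y_{\mathrm{pix}} = O_y + \beta\, y
% (x, y): image-plane (camera) coordinates; (x_pix, y_pix): pixel coordinates.
```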

38
Q

What is a horopter

A

An imaginary surface on which all points have 0 disparity

39
Q

Compare image sampling mechanisms used in eye to camera

A

Camera:
- sensitive to 3 wavelengths (RGB)
- sensing elements occur in fixed ratio across whole image plane
- sampling density is uniform across whole image plane
Eye:
- sensitive to 4 wavelengths RGBW
- sensing elements occur in variable ratios across the image plane (cone density is highest at the fovea, rod density is highest outside the fovea)
- sampling density is non uniform across image plane

40
Q

Explain the difference between ‘view centred’ and ‘object centred’ approaches to object recognition

A

View centred:
3d object is modelled as a set of 2d images of different views of the object
Object centred:
Single 3d model used to describe object

41
Q

If asked to derive the thin lens equation

A

Make sure to use similar triangles
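A sketch of that derivation under the usual thin-lens geometry; the symbols u (object distance), v (image distance), h_o and h_i (object and image heights) are my own labels, not from the card:

```latex
% Similar triangles through the centre of the lens:
\frac{h_i}{h_o} = \frac{v}{u}
% Similar triangles through the image-side focal point:
\frac{h_i}{h_o} = \frac{v - f}{f}
% Equate, cross-multiply, and divide through by uvf:
\frac{v}{u} = \frac{v - f}{f}
\;\Rightarrow\; vf = uv - uf
\;\Rightarrow\; \frac{1}{f} = \frac{1}{u} + \frac{1}{v}
```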

42
Q

How do you change the focus of a camera

A

The focal length of the lens is fixed
Therefore we change the distance between the lens and the image plane

43
Q

∂²/∂y² mask = ?

A

[-1, 2, -1]^T (a column vector: the second-derivative mask in the y direction)

44
Q

When convolving a mask with another mask

A

Flip the smaller mask!!!

45
Q

Steps of agglomerative hierarchical clustering

A
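The answer is missing from the card; the usual formulation of the algorithm is:

1) Start with each data point as its own cluster
2) Compute the distance between every pair of clusters (single link = min, complete link = max, group average = mean distance, as in cards 61-63)
3) Merge the two closest clusters
4) Repeat 2-3 until only one cluster remains, or until a distance threshold / desired number of clusters is reached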
46
Q

Define the aperture problem and suggest how to mitigate it

A

Direction of motion of a small image patch can be ambiguous
In particular, for an edge, the direction of motion is only available perpendicular to the edge
Overcome by using info from multiple sensors or by giving preference to image locations where image structure provides unambiguous information about optic flow (eg corners)

47
Q

Calculating depth when camera is moving along optical axis and given image dimensions/central point

A
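The answer is missing here; card 64 below gives the relation, with the image dimensions/centre needed to express the point's position as a displacement from the image centre:

```latex
Z = \frac{x_1\, V_z}{\dot{x}}, \qquad \dot{x} = \frac{x_2 - x_1}{\Delta t}
% x_1, x_2: the point's displacements from the image centre in the two frames;
% V_z: the camera's speed along the optical axis.
```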
48
Q

2 constraints applied to correspondence problem for video (note limitations)

A

Spatial coherence (assuming neighbours have similar optical flow):
Neighbouring points have similar optical flow
Fails at discontinuities between surfaces at different depths

Small motion (assuming optical flow vectors have small magnitude):
Optical flow vectors tend to have small magnitude
Fails if relative motion is fast or frame rate is low

49
Q

Monocular cues to depth

A

Interposition/Occlusion
Size familiarity
Texture gradients
Linear perspective
Aerial perspective
Shading

50
Q

Normalised cross correlation formula

A
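Missing from the card; one common definition (some variants subtract the means first, which gives the correlation coefficient of card 22):

```latex
\mathrm{NCC}(a, b) = \frac{\sum_i a_i b_i}{\sqrt{\sum_i a_i^2}\;\sqrt{\sum_i b_i^2}}
```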
51
Q

1st derivative masks

A
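Missing from the card; plausible answers, consistent with the difference mask of card 31 (sign conventions vary):

```latex
\frac{\partial I}{\partial x} \approx I * \begin{bmatrix} 1 & -1 \end{bmatrix},
\qquad
\frac{\partial I}{\partial y} \approx I * \begin{bmatrix} 1 \\ -1 \end{bmatrix}
% a centred alternative is (1/2)[-1  0  1] and its transpose
```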
52
Q

2nd derivative masks

A
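Missing from the card; using the same sign convention as card 43:

```latex
\frac{\partial^2 I}{\partial x^2} \approx I * \begin{bmatrix} -1 & 2 & -1 \end{bmatrix},
\qquad
\frac{\partial^2 I}{\partial y^2} \approx I * \begin{bmatrix} -1 \\ 2 \\ -1 \end{bmatrix}
```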
53
Q

Define focus

A

All rays of light from a scene point converge to a single image point

54
Q

Define focal length

A

An intrinsic property of a lens related to its shape; specifically, the distance from the lens at which the optical axis intersects the refracted rays of light that were travelling parallel to the optical axis before passing through the lens

55
Q

Define focal range

A

The range of object locations such that blurring due to the difference between the receptor plane and the focal plane is less than the resolution of the receptor device

56
Q

For a pinhole camera, at what distance should the image plane be placed to bring an object into focus

A

Use the thin lens equation (with moduli)

57
Q

In image formation, what properties of an image are determined by radiometric parameters and what by geometric parameters

A

Radiometric parameters tend to determine intensity/colour:
Illumination, surface reflectance, sensor wavelength properties

Geometric parameters determine where in the image a scene point appears:
Camera position & orientation in space
Camera optics

58
Q

Forming DoG mask for On Centre Off surround

A

Subtract the mask with the larger σ from the mask with the smaller σ

59
Q

Decomposing 11 x 11 mask involves

A

A 1 x 11 row vector and an 11 x 1 column vector
The row vector is convolved with every valid position in the image (11 multiplications + 10 additions per mask placement), and then so too is the column vector

60
Q

How many operations are involved in convolving an image with a DoG

A

A DoG is 2 Gaussians with a subtraction
G = number of operations involved in convolving with one separated Gaussian
D = number of pixels in the image (one subtraction per pixel when one Gaussian-convolved image is subtracted from the other)

Total: 2G + D

61
Q

Single link

A

Min distance

62
Q

Complete link

A

Max distance

63
Q

Group average

A

Average distance

64
Q

Calculating depth from 2 images given image centre

A

Z = (x1 · Vz) / ẋ
where x1 is the original x displacement from the image centre,
Vz is the speed of the camera moving along the z/optical axis,
and ẋ = (x2 - x1)/time

65
Q

What is the sliding window approach to object recognition

A

The sliding window approach applies a classifier (usually a deep NN) to image patches
Tolerance is achieved by training the classifier to recognise the object despite changes in appearance, and by using different shapes and sizes of image patch (a sketch follows below)
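A minimal sketch of the sliding-window loop; the classifier, patch size, stride and decision threshold are placeholders:

```python
import numpy as np

def sliding_window(image, classify, patch_h=64, patch_w=64, stride=16):
    """Run a patch classifier over every window position; `classify` is any
    function mapping a (patch_h, patch_w) array to a score. Returns a list of
    (row, col, score) detections above an arbitrary threshold."""
    detections = []
    H, W = image.shape[:2]
    for r in range(0, H - patch_h + 1, stride):
        for c in range(0, W - patch_w + 1, stride):
            score = classify(image[r:r + patch_h, c:c + patch_w])
            if score > 0.5:          # placeholder decision threshold
                detections.append((r, c, score))
    return detections
```

In practice the same loop is run over rescaled copies of the image (an image pyramid) to gain tolerance to object size.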