Lecture 6 - Object Detection Flashcards

1
Q

What is “Recognition”?

A
  • One-to-many matching: comparing one object against many known objects to find which one it is
2
Q

What is “Verification”?

A
  • One-to-one matching: comparing one object with another to check whether they are the same
3
Q

What is Categorisation?

A

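Deciding which type (class) of object something is. This involves both interclass comparison (between different classes) and intraclass comparison (between objects of the same class).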

4
Q

What is the difference between detection and recognition?

A

Are there faces in this image? (binary decision, no localization)
* Where are faces in this image? (face detection)
Is your person of interest present in this image? (again binary decision)
* Where is your person of interest? (face recognition)
* What are these people doing?(activity or event recognition)
REFER TO SLIDES

5
Q

How does scale of detection work?

A

We can have nested detections
– Detect face
– Detect features such as eye corners, nose tip etc

6
Q

What should the algorithms we design be capable of?

A

– Classifying images or videos
– Detecting and localizing objects in an image
– Estimating semantic and geometrical attributes
– Classifying human activities and events

7
Q

What is semantic vision?

A

In general, semantics concerns the extraction of meaning from data. Semantic vision seeks to understand not only what objects are present in an image but, perhaps even more importantly, the relationship between those objects.

The ability to attribute relationships between objects demonstrates reasoning, an important step towards true “cognition”.
=====
Semantic vision can transform visual images into descriptions of the world, providing a more robust foundation for change tracking.

8
Q

What are the challenges of detection and recognition?

A

Shape and appearance variations, even within a class
Viewpoint Variations
Illumination
Background Clutter
Scale
Occlusion
There can be several of these challenges in one image, or only one

9
Q

Recognition and detection in the world - what works today?

A

Reading license plates, zip codes, checks
Fingerprint recognition
Face detection
Recognition of flat textured objects (CD covers, book covers, etc)
REFER TO DEEPFACE EXAMPLE

10
Q

What is the object recognition pipeline?

A

Similar to supervised learning -> REFER TO SLIDES FOR DIAGRAM

11
Q

What are the two primary characteristics for object recognition?

A

shape and appearance

12
Q

How can shapes be modelled with Principal Component Analysis (PCA)?

A

REFER TO SLIDES 31 - 60
1. Center the data
2. Calculate the covariance matrix
3. Calculate the Eigenvalues
4. Calculate the Eigenvectors
5. Order the eigenvectors
6. Calculate the principal components
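
The six steps above can be sketched with NumPy roughly as follows. This is a minimal illustration, not the version from the slides: the function name `pca` and the toy data are assumptions, and `np.linalg.eigh` returns eigenvalues and eigenvectors together, so steps 3 and 4 happen in one call.

```python
import numpy as np

def pca(X):
    """X has shape (n_samples, n_features). Hypothetical helper for illustration."""
    mean = X.mean(axis=0)
    Xc = X - mean                            # 1. centre the data
    cov = np.cov(Xc, rowvar=False)           # 2. covariance matrix (features x features)
    eigvals, eigvecs = np.linalg.eigh(cov)   # 3-4. eigenvalues and eigenvectors (columns)
    order = np.argsort(eigvals)[::-1]        # 5. order by decreasing eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                    # 6. principal components of each sample
    return mean, eigvals, eigvecs, scores

# Toy data: four points lying exactly on the line y = x.
X = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]])
mean, eigvals, eigvecs, scores = pca(X)
print(eigvals)        # all of the variance falls on the first component
print(eigvecs[:, 0])  # direction of the line, up to sign
```

Because the toy points lie on a line, the second eigenvalue is (numerically) zero, which is the situation in which dropping a component loses no information.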

13
Q

PCA and Eigenfaces

A

REFER TO SLIDES

14
Q

How is Reconstruction using PCA done?

A

A face is reconstructed as the mean face plus a weighted sum of eigenfaces.
Selecting only the top P eigenfaces reduces the dimensionality.
Fewer eigenfaces result in more information loss, and hence less discrimination between faces
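
A rough sketch of this trade-off, assuming NumPy and using random vectors as a stand-in for face images (the data, the helper name `reconstruct` and the sizes are all made up for illustration): each sample is projected onto the top P eigenvectors and rebuilt from those weights, and the reconstruction error shrinks as P grows.

```python
import numpy as np

def reconstruct(X, p):
    """Rebuild each row of X from its top-p principal components (hypothetical helper)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    top = eigvecs[:, np.argsort(eigvals)[::-1][:p]]  # (n_features, p) basis
    weights = Xc @ top                               # p numbers per sample
    return mean + weights @ top.T                    # back in the original space

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                         # 20 fake "faces", 5 "pixels" each
errors = [np.linalg.norm(X - reconstruct(X, p)) for p in range(1, 6)]
print(errors)  # decreases as more components are kept; about zero when all 5 are used
```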

15
Q

What are some issues with PCA?

A

PCA finds directions of maximum variance of the data.
This may not separate classes at all.
Basic PCA is also sensitive to noise and outliers (read other variants e.g. Robust PCA).
Linear Discriminant Analysis (LDA) instead finds the directions that maximise the separation between classes relative to the spread within each class.
Sometimes PCA is followed by LDA to combine the advantages of both.

16
Q

What is a colour histogram?

A

A colour histogram is a type of appearance feature

Colour stays constant under geometric transformations
Colour is a local feature
– It is defined for each pixel
– It is robust to partial occlusion
Idea:
– can use object colours directly for recognition, or
– better – use statistics of object colours

17
Q

What is RGB?

A

Primaries are monochromatic lights
– for camera: Bayer filter pattern (half green, one quarter red and one quarter blue)
– for monitors: the primaries correspond to the 3 types of phosphors

18
Q

What are the 3 colour models?

A
  • RGB (red, green, blue): the most popular colour model for mixing and creating colours
  • CMYK (cyan, magenta, yellow, key): used by commercial printers
  • HSV (hue, saturation, value): used in the colour pickers of graphics software
19
Q

What is the colour space, specifically CIE XYZ?

A

Links physical pure colours (i.e. wavelengths) in the visible electromagnetic spectrum to the physiologically perceived colours of human colour vision.
Primaries 𝑋, 𝑌, and 𝑍 are imaginary, but the matching functions are everywhere positive

20
Q

What is HSV/HSB?

A

HSV - Hue, Saturation, Value (Brightness)
* HSV is closer to how humans perceive colour.
* Describes colours (hue or tint) in terms of their shade (saturation or amount of grey) and their brightness value.
* Nonlinear – reflects topology of colours by coding hue as an angle
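
As a small illustration of hue being coded as an angle, Python's standard `colorsys` module converts RGB values in [0, 1] to HSV, returning hue as a fraction of a full turn. The wrapper `hue_degrees` below is a made-up helper that simply rescales that fraction to degrees.

```python
import colorsys

def hue_degrees(r, g, b):
    """Return (hue in degrees, saturation, value) for RGB components in [0, 1]."""
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return h * 360.0, s, v

bright_red = hue_degrees(1.0, 0.0, 0.0)
dark_red = hue_degrees(0.5, 0.0, 0.0)
green = hue_degrees(0.0, 1.0, 0.0)
print(bright_red, dark_red, green)
```

Bright red and dark red share the same hue and saturation and differ only in value, which is the separation of colour from brightness that makes HSV feel closer to human perception.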

21
Q

What is colour normalisation?

A

One component of the 3D colour space is intensity
– If a colour vector is multiplied by a scalar, the intensity changes but not the colour itself.
– This means colours can be normalized by the intensity, removing the brightness effect which may vary depending on lighting conditions, cameras, and other factors.
– Note: intensity is given by 𝐼 = (𝑅 + 𝐺 + 𝐵)/3
REFER TO SLIDES FOR OTHER FORMULAS OF R G AND B
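
A minimal sketch of the idea, assuming the common chromaticity form r = R/(R+G+B), g = G/(R+G+B), b = B/(R+G+B). The slides may define the normalised components differently, and the function names are made up for illustration.

```python
def intensity(r, g, b):
    """I = (R + G + B) / 3, as in the note above."""
    return (r + g + b) / 3

def normalise(r, g, b):
    """Divide each channel by R + G + B so that only the colour proportions remain.

    Assumed form; black (0, 0, 0) has no defined chromaticity, so it is returned as zeros.
    """
    total = r + g + b
    if total == 0:
        return (0.0, 0.0, 0.0)
    return (r / total, g / total, b / total)

dim = (50, 100, 50)
bright = (100, 200, 100)   # the same colour vector multiplied by 2
print(intensity(*dim), intensity(*bright))   # the intensities differ
print(normalise(*dim), normalise(*bright))   # the normalised colours are identical
```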

22
Q

Object Recognition based on Colour Histograms

A

Objects are identified by matching a colour histogram from an image region with a colour histogram from a sample of the object.
The technique has been shown to be remarkably robust to:
– changes in object’s orientation
– changes of scale of the object
– partial occlusion, and
– changes of viewing position and direction.
REFER TO SLIDES FOR EXAMPLES
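
One way to see where this robustness comes from is that a normalised colour histogram ignores where pixels are and how many there are. The sketch below is illustrative only: the pixel lists, the bin count and the helper name `colour_histogram` are assumptions, not taken from the slides.

```python
from collections import Counter

def colour_histogram(pixels, bins_per_channel=4):
    """Normalised histogram of 8-bit (r, g, b) pixels, stored as {cell: fraction}."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    n = len(pixels)
    return {cell: count / n for cell, count in counts.items()}

patch = [(250, 10, 10), (250, 10, 10), (10, 10, 250), (240, 20, 5)]
rearranged = list(reversed(patch))   # same pixels in different positions
enlarged = patch * 4                 # same colour proportions, four times the pixels

print(colour_histogram(patch))
print(colour_histogram(patch) == colour_histogram(rearranged))
print(colour_histogram(patch) == colour_histogram(enlarged))
```

Real changes in orientation or scale only approximately preserve the colour proportions, so in practice the histograms are compared with a distance measure rather than tested for equality.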

23
Q

What are some comparison measures?

A

Euclidean distance
Chi-Square distance
KL (Kullback–Leibler divergence)/Jeffreys divergence
EMD (Earth Movers Distance)

24
Q

What is Euclidean distance?

A

Motivation of the Euclidean distance:
– Focuses on the differences between the histograms.
– Interpretation: distance in the feature space.
– Range: [0, ∞).
– All cells are weighted equally.
– Not very robust to outliers!
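
For two histograms Q and V with cells q_i and v_i, the Euclidean distance is d(Q, V) = sqrt(sum_i (q_i - v_i)^2). A minimal sketch, with the histograms given as equal-length lists (the example values are made up):

```python
import math

def euclidean_distance(h1, h2):
    """Square root of the summed squared differences; every cell has the same weight."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

q = [0.5, 0.5, 0.0]
v = [0.5, 0.0, 0.5]
print(euclidean_distance(q, q))  # identical histograms are at distance 0
print(euclidean_distance(q, v))
```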

25
Q

What is Chi-Square distance?

A

Motivation of the 𝜒^2 distance:
– Statistical background
– Test if two distributions are different.
– Possible to compute a significance score.
– Range: [0, ∞).
– Cells are not weighted equally!
– More robust to outliers than the Euclidean distance, if the histograms contain enough observations
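
One common form of the distance is chi^2(Q, V) = sum_i (q_i - v_i)^2 / (q_i + v_i). Some texts include an extra factor of 1/2, so the exact form here is an assumption rather than the slides' definition. The sketch skips cells that are empty in both histograms to avoid dividing by zero.

```python
def chi_square_distance(h1, h2):
    """Sum of squared differences, each divided by the combined mass of its cell."""
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:                       # ignore cells that are empty in both
            total += (a - b) ** 2 / (a + b)
    return total

q = [0.5, 0.5, 0.0]
v = [0.5, 0.0, 0.5]
print(chi_square_distance(q, v))

# The same absolute difference (0.1) counts for less in a well-populated cell.
full_cell = chi_square_distance([0.5], [0.4])
sparse_cell = chi_square_distance([0.1], [0.0])
print(full_cell, sparse_cell)
```

The last two lines show what "cells are not weighted equally" means in practice: a difference is judged relative to how much mass the cell holds.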

26
Q

What measure is the best?

A

– It depends on the application
– Euclidean distance is often not robust enough.
– Generally, 𝜒^2 distance gives good performance for histograms
– KL (Kullback–Leibler divergence)/Jeffreys divergence works well sometimes, but is expensive
– EMD (Earth Movers Distance) is the most powerful, but also very expensive.

27
Q

Object Recognition Using Histograms Algorithm

A

REFER TO SLIDES

28
Q

What is the Machine learning framework?

A
  • Training data consists of data samples and the target vectors
  • Learning / Training: Machine takes training data and automatically learns mapping from data samples to target vectors
  • Test data
    – Target vectors are concealed from the machine
    – Machine predicts the target vectors based on previously learned model
    – Accuracy can be evaluated by comparing the predicted vectors to the actual vectors
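
The train / predict / evaluate cycle can be sketched with a deliberately simple model. The nearest-class-mean classifier, the one-dimensional data and all the names below are stand-ins chosen for illustration; any learner fits the same framework.

```python
def train(samples, targets):
    """Learn one mean value per class from the training data."""
    sums = {}
    for x, t in zip(samples, targets):
        total, count = sums.get(t, (0.0, 0))
        sums[t] = (total + x, count + 1)
    return {t: total / count for t, (total, count) in sums.items()}

def predict(model, x):
    """Predict the class whose learned mean is closest to x."""
    return min(model, key=lambda t: abs(model[t] - x))

def accuracy(predicted, actual):
    """Fraction of predictions that match the concealed targets."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

train_x = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
train_t = ["small", "small", "small", "large", "large", "large"]
model = train(train_x, train_t)

test_x = [0.9, 3.1, 1.9, 2.5]
test_t = ["small", "large", "small", "small"]   # hidden from the model during prediction
predicted = [predict(model, x) for x in test_x]
print(predicted, accuracy(predicted, test_t))
```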
29
Q

What is classification?

A

Assign input vector to one of two or more classes
Any decision rule divides input space into decision regions separated by decision boundaries

30
Q

What is the Nearest Neighbour Classifier?

A

Partitioning of feature space for two-category 2D data using 1-nearest-neighbour
* Voronoi diagram is a partition of a plane into regions close to each of a given set of objects.
REFER TO SLIDES FOR FORMULA
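
A minimal 1-nearest-neighbour sketch, assuming Euclidean distance (the slides' formula may use a different distance) and made-up 2D training points. A query receives the label of the closest training point, which is the same as asking which Voronoi cell it falls in.

```python
import math

def nearest_neighbour(train, query):
    """train is a list of (point, label) pairs; returns the label of the closest point."""
    point, label = min(train, key=lambda item: math.dist(item[0], query))
    return label

train = [((0.0, 0.0), "a"), ((1.0, 0.0), "a"), ((5.0, 5.0), "b")]
print(nearest_neighbour(train, (4.0, 4.0)))
print(nearest_neighbour(train, (0.4, 0.2)))
```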

31
Q

What are some practical matters with k-NN?

A

Choosing the value of k
– If too small, sensitive to noise points
– If too large, neighbourhood may include points from other classes
– Solution: cross-validation
===
Can produce counter-intuitive results
– Each feature may have a different scale (e.g. height and weight)
– Solution: normalize each feature to zero mean, unit variance
===
Curse of dimensionality
When the dimensionality increases, the volume of the space increases so fast that the available data become sparse. In order to obtain a reliable result, the amount of data needed often grows exponentially with the dimensionality.
– Solution: no good solution exists so far
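
The scaling problem can be reproduced with a small made-up example: height in metres and weight in kilograms, with labels that depend on height only. The data and helper names are assumptions for illustration, each feature is assumed to vary (non-zero standard deviation), and choosing k by cross-validation is not shown.

```python
import math
from collections import Counter

def standardise(rows):
    """Scale each feature to zero mean and unit variance; returns (scaled, means, stds)."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(columns, means)]
    scaled = [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows]
    return scaled, means, stds

def knn_predict(points, labels, query, k):
    """Majority label among the k training points closest to the query."""
    nearest = sorted(zip(points, labels), key=lambda pl: math.dist(pl[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

points = [(1.50, 60.0), (1.55, 64.0), (1.52, 68.0),
          (1.90, 61.0), (1.95, 65.0), (1.92, 69.0)]
labels = ["short", "short", "short", "tall", "tall", "tall"]
query = (1.91, 60.0)   # clearly a tall person

raw = knn_predict(points, labels, query, k=3)

scaled_points, means, stds = standardise(points)
scaled_query = tuple((v - m) / s for v, m, s in zip(query, means, stds))
scaled = knn_predict(scaled_points, labels, scaled_query, k=3)
print(raw, scaled)
```

On the raw features the weight differences (several kilograms) swamp the height differences (fractions of a metre), so the query is labelled "short". After standardisation both features contribute on a comparable scale and the query is labelled "tall".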

32
Q

Linear and Non-Linear SVM

A

REFER TO SLIDES - Discriminatory and SVM