w9 gemini Flashcards
What is the topic of today’s lecture, according to slide 1?
Object recognition
What are the three levels of computer vision discussed in the recap on slide 1?
Low-level vision, Mid-level vision, High-level vision
Name four topics covered under Mid-level vision in the recap (slide 1).
Segmentation and grouping, Correspondence problem, Stereo and Depth, Video and Motion
What are the three main aspects of object recognition, as defined on slide 2?
Identification, Categorisation, Localisation
Give three examples of methods for performing object recognition (slide 2).
Template matching, Sliding window, Edge matching
What is the goal of object identification, according to slide 3?
To determine the identity of an individual instance of an object.
Give an example of object identification from slide 3.
Distinguishing between two specific individuals (Clinton vs. Bush) or two specific phone models (Samsung Galaxy On8 vs. iPhone 7 Plus).
What is the goal of object categorisation, as described on slide 4?
To determine the category of an object.
Provide an example of object categorisation from slide 4.
Classifying images as belonging to the category ‘Human’ or ‘Chimpanzee’, or ‘Telephone’ or ‘Calculator’.
What is object localisation, according to slide 5?
Determining the presence and/or location of an object in an image.
What is semantic segmentation, as defined on slide 6?
Localisation performed at a sufficiently fine-grained level and for a sufficiently large number of categories, so that the result is a segmentation of the image.
Explain the concept of a category hierarchy in object recognition (slide 7).
Classification can occur at different levels of abstraction, from general categories (like ‘object’) to specific instances (like ‘Rex’).
What are the three levels in the category hierarchy shown on slide 7?
Superordinate level, Basic level, Subordinate level
Why is the ‘basic level’ significant for human object recognition (slide 8)?
Humans are usually fastest at recognising category members at this level, they typically perform basic-level categorisation before identification, and it is the first level understood by children.
List two reasons why the basic level is considered special (slide 9).
It’s the highest level where category members share many common features, and the lowest level where members have features distinct from other categories at the same level.
What are the two main requirements for object recognition systems (slide 10)?
Sensitivity to image differences relevant to distinguishing objects, and insensitivity/tolerance to differences that don’t affect object identity or category.
Give examples of image variations that object recognition systems should be insensitive to (slides 11-13).
Background clutter, occlusion, viewpoint, lighting, non-rigid deformations, within-category variation.
What are the three main components required for object recognition (slide 14)?
Image data, representations of objects, matching techniques.
Describe the ‘off-line’ stage of object recognition procedure (slide 15).
Extracting representations from training examples.
Describe the ‘on-line’ stage of object recognition procedure (slide 15).
Extracting a representation from an input image and matching it against the representations of the training examples to determine the object class or identity.
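Illustrative sketch (not from the slides): the two stages can be written as a pair of functions, one that builds stored representations from training examples and one that matches a new image against them. The flattened-pixel representation, the squared-distance matching rule, and all function names here are assumptions chosen for brevity.

```python
# Hypothetical sketch of the off-line / on-line procedure.
# The representation (flattened pixels) and the matching rule
# (nearest stored example by squared distance) are assumptions.

def extract_representation(image):
    """Flatten a 2D list of pixel intensities into a feature vector."""
    return [pixel for row in image for pixel in row]

def squared_distance(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(labelled_images):
    """Off-line stage: extract and store a representation per training example."""
    return [(extract_representation(img), label) for img, label in labelled_images]

def recognise(image, model):
    """On-line stage: extract a representation and return the closest label."""
    query = extract_representation(image)
    best = min(model, key=lambda entry: squared_distance(query, entry[0]))
    return best[1]

training_set = [
    ([[0, 0], [0, 0]], "dark patch"),
    ([[9, 9], [9, 9]], "bright patch"),
]
model = train(training_set)
print(recognise([[8, 9], [7, 9]], model))  # bright patch
```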
What are the two key aspects in which object recognition methods vary (slide 16)?
Representation used and matching procedure.
What is the representation used in template matching (slide 17)?
An image of the object to be recognized (an array of pixel intensities).
Describe the matching process in template matching (slide 17).
Searching every image region and calculating the similarity between the template and the image region.
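Illustrative sketch (not from the slides): the search over every image region can be written as a sliding window that scores each position. Sum of squared differences is used as the score here only as an example, and `match_template` and `best_position` are hypothetical names; a real system would more likely use an array library than nested Python lists.

```python
# Minimal sliding-window template matching on 2D lists of intensities.
# SSD is an assumed choice of score; lower means more similar.

def ssd(patch, template):
    """Sum of squared differences between two equal-size 2D patches."""
    return sum((p - t) ** 2
               for prow, trow in zip(patch, template)
               for p, t in zip(prow, trow))

def match_template(image, template, dissimilarity=ssd):
    """Return a score map with one value per top-left window position."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    scores = []
    for y in range(ih - th + 1):
        row = []
        for x in range(iw - tw + 1):
            patch = [r[x:x + tw] for r in image[y:y + th]]
            row.append(dissimilarity(patch, template))
        scores.append(row)
    return scores

def best_position(scores):
    """(row, col) of the window with the lowest dissimilarity."""
    positions = [(y, x) for y in range(len(scores)) for x in range(len(scores[0]))]
    return min(positions, key=lambda pos: scores[pos[0]][pos[1]])

image = [
    [0, 0, 0, 0],
    [0, 5, 9, 0],
    [0, 9, 5, 0],
    [0, 0, 0, 0],
]
template = [[5, 9], [9, 5]]
scores = match_template(image, template)
print(best_position(scores))  # (1, 1)
```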
Name three similarity measures that can be maximised in template matching (slide 18).
Cross-correlation, Normalised cross-correlation (NCC), Correlation coefficient.
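Illustrative sketch (not from the slides): the three measures differ in how much they normalise for brightness and contrast. Definitions vary between texts; the ones below follow one common convention (NCC divides by the vector lengths, the correlation coefficient also subtracts the means) and should be treated as assumptions.

```python
import math

def flatten(patch):
    """Turn a 2D list into a flat list of intensities."""
    return [v for row in patch for v in row]

def cross_correlation(patch, template):
    """Sum of element-wise products; grows with overall brightness."""
    p, t = flatten(patch), flatten(template)
    return sum(a * b for a, b in zip(p, t))

def normalised_cross_correlation(patch, template):
    """Cross-correlation divided by the lengths of both vectors."""
    p, t = flatten(patch), flatten(template)
    denom = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in t))
    return cross_correlation(patch, template) / denom if denom else 0.0

def correlation_coefficient(patch, template):
    """NCC computed after subtracting each patch's mean intensity."""
    p, t = flatten(patch), flatten(template)
    mp, mt = sum(p) / len(p), sum(t) / len(t)
    dp = [a - mp for a in p]
    dt = [b - mt for b in t]
    denom = math.sqrt(sum(a * a for a in dp)) * math.sqrt(sum(b * b for b in dt))
    return sum(a * b for a, b in zip(dp, dt)) / denom if denom else 0.0

template = [[1, 2], [3, 4]]
same = [[1, 2], [3, 4]]
brighter = [[11, 12], [13, 14]]     # same pattern plus a constant offset
uniform_bright = [[9, 9], [9, 9]]   # no pattern, just bright

# Plain cross-correlation prefers the bright uniform patch to the exact match.
print(cross_correlation(same, template), cross_correlation(uniform_bright, template))
# NCC ranks the exact match above the uniform patch.
print(normalised_cross_correlation(same, template) >
      normalised_cross_correlation(uniform_bright, template))
# The correlation coefficient ignores the constant brightness offset.
print(round(correlation_coefficient(brighter, template), 6))
```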
Name three similarity measures that can be minimised in template matching (slide 19).
Sum of Squared Differences (SSD), Euclidean distance, Sum of Absolute Differences (SAD).
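Illustrative sketch (not from the slides): the three distance measures can be computed directly from pixel differences, with 0 meaning a perfect match. The function names and the tiny example patches are assumptions.

```python
import math

def differences(patch, template):
    """Element-wise differences between two equal-size 2D patches."""
    return [p - t
            for prow, trow in zip(patch, template)
            for p, t in zip(prow, trow)]

def ssd(patch, template):
    """Sum of Squared Differences."""
    return sum(d * d for d in differences(patch, template))

def euclidean_distance(patch, template):
    """Square root of the SSD."""
    return math.sqrt(ssd(patch, template))

def sad(patch, template):
    """Sum of Absolute Differences."""
    return sum(abs(d) for d in differences(patch, template))

template = [[1, 2], [3, 4]]
patch = [[1, 2], [3, 8]]   # differs from the template in one pixel only

print(ssd(patch, template))                 # 16
print(euclidean_distance(patch, template))  # 4.0
print(sad(patch, template))                 # 4
print(ssd(template, template))              # 0
```

Because the Euclidean distance is a monotonic function of the SSD, both always select the same best-matching position; SAD can rank patches differently because it penalises large differences less heavily.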
How can template matching be used to recognise multiple objects (slide 20)?
By using multiple templates, one for each object.
What is a potential problem with using SAD in template matching, as shown in the example on slide 21?
It can produce peaks in areas that are simply darker, not necessarily a true match.
What is a potential problem with template matching regarding ‘true’ and ‘false’ matches (slide 23)?
It is hard to distinguish true matches from false ones: one must decide what similarity value constitutes a match and how many peaks to consider.
Why can template matching be ineffective if the target object is scaled or rotated (slide 25)?
Because the template must be very similar in appearance to the target object, and scaling or rotation changes that appearance.
What is a common approach to address viewpoint and within-category variation in template matching (slide 26)?
Using multiple templates for each object, representing different viewpoints and variations.
Explain the dilemma regarding the threshold in template matching (slide 27).
A high similarity threshold avoids false matches but may miss true matches, while a low threshold finds more true matches but also accepts more false matches.
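Illustrative sketch (not from the slides): the trade-off can be seen by counting outcomes at several thresholds. The similarity scores and ground-truth labels below are made-up values for demonstration, and `evaluate` is a hypothetical helper.

```python
# Hypothetical similarity scores at candidate locations, paired with
# whether each location really contains the object (invented data).
candidates = [
    (0.95, True),
    (0.80, True),
    (0.75, False),
    (0.60, True),
    (0.55, False),
    (0.30, False),
]

def evaluate(threshold):
    """Return (true matches found, false matches accepted, true matches missed)."""
    accepted = [is_true for score, is_true in candidates if score >= threshold]
    true_found = sum(accepted)
    false_matches = len(accepted) - true_found
    missed = sum(is_true for _, is_true in candidates) - true_found
    return true_found, false_matches, missed

for threshold in (0.9, 0.7, 0.5):
    print(threshold, evaluate(threshold))
# 0.9 (1, 0, 2)
# 0.7 (2, 1, 1)
# 0.5 (3, 2, 0)
```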
Why can template matching be computationally expensive (slide 28)?
Because of the need for many comparisons, especially when dealing with variations in appearance, viewpoints, and scales.
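Illustrative sketch (not from the slides): a rough count of pixel comparisons shows how quickly the cost grows. The image size, template size, and the numbers of objects, viewpoints, and scales are hypothetical, and all templates are assumed to have the same size for simplicity.

```python
def window_positions(image_h, image_w, tmpl_h, tmpl_w):
    """Number of top-left positions at which the template fits in the image."""
    return (image_h - tmpl_h + 1) * (image_w - tmpl_w + 1)

def pixel_comparisons(image_h, image_w, tmpl_h, tmpl_w, n_templates):
    """Pixel-level comparisons needed to slide every template over the image."""
    per_template = window_positions(image_h, image_w, tmpl_h, tmpl_w) * tmpl_h * tmpl_w
    return per_template * n_templates

# Assumed setup: 640x480 image, 32x32 templates,
# 10 objects x 8 viewpoints x 5 scales = 400 templates.
n_templates = 10 * 8 * 5

print(window_positions(480, 640, 32, 32))                # 273441
print(pixel_comparisons(480, 640, 32, 32, n_templates))  # 112001433600
```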
Why is template matching sensitive to occlusion (slide 29)?
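Because the template represents the whole object: if part of the object is occluded, the corresponding template pixels are compared against the occluder rather than the object, which lowers the similarity.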
What is the fundamental issue that makes template matching not robust (slide 30)?
The metric used for comparison is fundamentally not robust to changes in appearance between the template and the image patch.