w10 gemini Flashcards
What is the difference between the “viewer-centred” and “object-centred” approach to object recognition?
In the viewer-centred approach, the 3D object is modelled as a set of 2D images showing different views of the object. In the object-centred approach, a single 3D model is used to describe the object.
What is a geon?
Geons are simple three-dimensional shapes such as spheres, cubes, cylinders, cones, or wedges.
What is a structural description?
A structural description is a representation of an object in terms of its component geons and their relative locations and sizes.
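As a rough illustration of the definition above, a structural description can be held in a small data structure that lists the component geons, their relative sizes, and how they are arranged. The class names, field names, and the "lamp" example below are hypothetical and chosen only for this sketch; they do not come from the course material.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GeonPart:
    name: str             # hypothetical part name, e.g. "base"
    geon: str             # geon type, e.g. "cylinder" or "cone"
    relative_size: float  # size relative to the largest part


@dataclass(frozen=True)
class Relation:
    part: str      # name of one part
    relation: str  # spatial relation, e.g. "on top of"
    other: str     # name of the part it is related to


@dataclass
class StructuralDescription:
    label: str
    parts: list = field(default_factory=list)
    relations: list = field(default_factory=list)


# Hypothetical example: a lamp described as a cone sitting on a cylinder.
lamp = StructuralDescription(
    label="lamp",
    parts=[
        GeonPart("base", "cylinder", 1.0),
        GeonPart("shade", "cone", 0.6),
    ],
    relations=[Relation("shade", "on top of", "base")],
)

print([part.geon for part in lamp.parts])
```

The description names parts and relations but contains no viewpoint, which is what makes this kind of representation object-centred.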
Explain the nearest mean classifier method.
The nearest mean classifier calculates the mean of the feature vectors for all the training examples in each class (the prototype). For a new object, it finds the closest class prototype (using Euclidean distance) and assigns the new object to that class label.
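A minimal sketch of the nearest mean classifier described above, using only the Python standard library. The function names and the toy 2D training data are assumptions made for illustration.

```python
import math
from collections import defaultdict


def fit_prototypes(examples):
    """Return {label: mean feature vector} from (features, label) pairs."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    # Average each feature dimension separately within a class.
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def nearest_mean_predict(prototypes, x):
    """Assign x to the class whose prototype is closest (Euclidean distance)."""
    return min(prototypes, key=lambda label: math.dist(prototypes[label], x))


# Toy training data (hypothetical): two features per object.
training = [
    ([1.0, 1.0], "A"), ([2.0, 1.0], "A"),
    ([6.0, 5.0], "B"), ([7.0, 7.0], "B"),
]

prototypes = fit_prototypes(training)
print(prototypes)
print(nearest_mean_predict(prototypes, [2.0, 2.0]))
```

Only one prototype per class is stored, so classifying a new object costs one distance computation per class rather than one per training example.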
Explain the nearest neighbour classifier method.
The nearest neighbour classifier saves the feature vectors for all the training examples. For a new object, it finds the closest training example (using Euclidean distance) and assigns the new object to the same class label as that closest example.
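A minimal sketch of the nearest neighbour classifier described above. The function name and the toy data are assumptions made for illustration.

```python
import math


def nearest_neighbour_predict(examples, x):
    """Return the label of the stored training example closest to x."""
    _, label = min(examples, key=lambda example: math.dist(example[0], x))
    return label


# Toy training data (hypothetical): every example is kept, not just a mean.
training = [
    ([1.0, 1.0], "A"), ([2.0, 1.0], "A"),
    ([6.0, 5.0], "B"), ([7.0, 7.0], "B"),
]

print(nearest_neighbour_predict(training, [5.0, 4.0]))
```

Because every training example is stored and compared, memory use and classification time grow with the size of the training set.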
Explain the k-nearest neighbour classifier method with k=3.
The k-nearest neighbour classifier with k=3 saves the feature vectors for all the training examples. For a new object, it finds the 3 closest training examples (using Euclidean distance). The new object is assigned to the class label that is the majority among these 3 nearest neighbours.
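A minimal sketch of the k-nearest neighbour classifier with k=3. The function name and toy data are assumptions made for illustration; the data are chosen so that the single nearest neighbour and the majority of the three nearest neighbours disagree.

```python
import math
from collections import Counter


def knn_predict(examples, x, k=3):
    """Return the majority label among the k training examples closest to x."""
    neighbours = sorted(examples, key=lambda example: math.dist(example[0], x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy training data (hypothetical).
training = [
    ([1.0, 1.0], "A"), ([2.0, 1.0], "A"), ([3.0, 3.0], "A"),
    ([4.0, 3.0], "B"), ([7.0, 7.0], "B"),
]
query = [3.6, 3.0]

print(knn_predict(training, query, k=1))  # single closest example decides
print(knn_predict(training, query, k=3))  # majority of the three closest decides
```

With two classes and an odd k there can be no tied vote. How ties should be broken with more classes is not covered by the flashcard, so this sketch simply relies on the ordering that `Counter.most_common` returns.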
Write down Bayes’ theorem.
p(H|E) = (p(E|H) * p(H)) / p(E)
Explain the interpretation of p(H|E) in Bayes’ theorem in relation to a computer vision system.
p(H|E) is the posterior probability that a hypothesis H (e.g., the object is a chair) is true, given the image evidence E. This is what the vision system needs to evaluate to determine the most likely explanation for the image data.
Explain the interpretation of p(E|H) in Bayes’ theorem in relation to a computer vision system.
p(E|H) is the likelihood: the probability that the image would contain the particular evidence E if hypothesis H were true. This is based on our understanding of how images are formed, such as how surface properties and lighting create certain images.
Explain the interpretation of p(H) in Bayes’ theorem in relation to a computer vision system.
p(H) is the prior probability, representing our initial assumptions about the likelihood of a hypothesis being true before seeing any evidence. If a hypothesis is initially improbable, stronger evidence is needed to support it.
Explain the interpretation of p(E) in Bayes’ theorem in relation to a computer vision system.
p(E) is the probability of observing the evidence E regardless of whether the hypothesis H is true. If the evidence is very common, it reduces our confidence in inferring a specific hypothesis based on that evidence.
In the production line problem, what is the probability of objA?
p(objA) = 0.75
In the production line problem, what is the probability of objB?
p(objB) = 0.25
In the production line problem, what is the probability of an indistinguishable image given objA at oriA?
p(I|objA) = 0.1
In the production line problem, what is the probability of an indistinguishable image given objB at oriB?
p(I|objB) = 0.2
Using Bayes’ theorem, if an indistinguishable image is observed, what is the probability it is objA at oriA?
p(objA|I) = (p(I|objA) * p(objA)) / p(I) = k * (0.1 * 0.75) = 0.075k, where k = 1/p(I) is a normalising constant that is the same for both hypotheses.
Using Bayes’ theorem, if an indistinguishable image is observed, what is the probability it is objB at oriB?
p(objB|I) = (p(I|objB) * p(objB)) / p(I) = k * (0.2 * 0.25) = 0.05k, where k = 1/p(I) is a normalising constant that is the same for both hypotheses.
In the production line problem, if an indistinguishable image is observed, which bin should the robot sort the object into to minimize errors?
The robot should sort the object into the bin for objA because p(objA|I) > p(objB|I), meaning it’s more likely to be objA.
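The production line calculation can be checked numerically. The sketch below assumes that objA and objB are the only two possibilities (their priors sum to 1), so p(I) can be obtained by summing the two products; the variable names are chosen for illustration.

```python
# Values taken from the production line flashcards.
priors = {"objA": 0.75, "objB": 0.25}      # p(H)
likelihoods = {"objA": 0.1, "objB": 0.2}   # p(I|H)

# Unnormalised posteriors: p(I|H) * p(H) for each hypothesis.
joint = {h: likelihoods[h] * priors[h] for h in priors}

# Assumption: the two hypotheses are exhaustive, so p(I) is the sum of the products.
p_I = sum(joint.values())

# Normalised posteriors p(H|I).
posteriors = {h: joint[h] / p_I for h in joint}

best = max(posteriors, key=posteriors.get)
print({h: round(p, 3) for h, p in posteriors.items()})
print(best)
```

Under that assumption p(I) = 0.075 + 0.05 = 0.125, giving p(objA|I) = 0.6 and p(objB|I) = 0.4. Dividing by p(I) rescales both posteriors equally, so the decision is the same whether or not the normalisation is carried out.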
What is the viewer-centred approach to object recognition?
The 3D object is modelled as a set of 2D images, showing different views of the object. Recognition occurs by matching the current view to a stored view.
What is the object-centred approach to object recognition?
A single 3D model is used to describe the object. Recognition involves decomposing the viewed object into its components (geons) and matching their arrangement to stored models.
What are geons designed to be?
Geons are designed to be sufficiently different from each other to be easily discriminated, robust to noise (identifiable even with missing parts), and view-invariant (look similar from most viewpoints).
What are the problems with the object-centred recognition-by-components theory?
Decomposing an image into geons can be difficult, many natural objects are hard to represent using geons, and geons cannot capture the finer details needed to identify individuals or distinguish similar objects.
What is the image-based approach to object recognition?
Each object is represented by storing multiple 2D views (images). Object recognition occurs when a current pattern matches a stored pattern.
What is template matching in the context of image-based recognition?
An early form of the image-based approach in which the current view is compared directly to stored templates. It is considered too rigid to account for the flexibility of human object recognition.
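A minimal sketch of template matching on tiny binary images. The flashcard does not say how views are compared, so the sum of squared pixel differences used here is an assumption, as are the function names and the toy templates.

```python
def ssd(image_a, image_b):
    """Sum of squared pixel differences between two equally sized images."""
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(image_a, image_b)
        for a, b in zip(row_a, row_b)
    )


def match_template(view, templates):
    """Return the name of the stored template most similar to the view."""
    return min(templates, key=lambda name: ssd(view, templates[name]))


# Hypothetical stored templates (3x3 binary images).
templates = {
    "vertical bar": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "horizontal bar": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
}

# A vertical bar with one missing pixel.
noisy_view = [[0, 1, 0], [0, 1, 0], [0, 0, 0]]
# The same vertical bar shifted one pixel to the left.
shifted_view = [[1, 0, 0], [1, 0, 0], [1, 0, 0]]

print(match_template(noisy_view, templates))
print(match_template(shifted_view, templates))
```

The noisy view is still matched to the vertical bar, but shifting the bar by a single pixel makes it match the horizontal bar template instead, which illustrates the rigidity of direct pixel-by-pixel comparison.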