w10 gemini Flashcards

1
Q

What is the difference between the “viewer-centred” and “object-centred” approaches to object recognition?

A

In the viewer-centred approach, the 3D object is modelled as a set of 2D images, showing different views of the object. In the object-centred approach, a single 3D model is used to describe the object.

2
Q

What is a geon?

A

Geons are simple three-dimensional shapes such as spheres, cubes, cylinders, cones, or wedges.

3
Q

What is a structural description?

A

A structural description is a representation of an object in terms of its component geons and their relative locations and sizes.

4
Q

Explain the nearest mean classifier method.

A

The nearest mean classifier calculates the mean of the feature vectors for all the training examples in each class (the prototype). For a new object, it finds the closest class prototype (using Euclidean distance) and assigns the new object to that class label.
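
A minimal NumPy sketch of this procedure (function and variable names are illustrative, not from the course material):

```python
import numpy as np

def nearest_mean_classify(train_X, train_y, x):
    """Assign x to the class whose mean feature vector (prototype) is nearest."""
    labels = np.unique(train_y)
    # One prototype per class: the mean of that class's training feature vectors.
    prototypes = np.array([train_X[train_y == c].mean(axis=0) for c in labels])
    # Euclidean distance from the new object x to each prototype.
    dists = np.linalg.norm(prototypes - x, axis=1)
    return labels[np.argmin(dists)]
```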

5
Q

Explain the nearest neighbour classifier method.

A

The nearest neighbour classifier saves the feature vectors for all the training examples. For a new object, it finds the closest training example (using Euclidean distance) and assigns the new object to the same class label as that closest example.

6
Q

Explain the k-nearest neighbour classifier method with k=3.

A

The k-nearest neighbour classifier with k=3 saves the feature vectors for all the training examples. For a new object, it finds the 3 closest training examples (using Euclidean distance). The new object is assigned to the class label that is the majority among these 3 nearest neighbours.
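
A minimal sketch of k-nearest neighbours (names illustrative, assuming NumPy-array inputs); with k=1 it reduces to the plain nearest neighbour classifier of the previous card:

```python
import numpy as np
from collections import Counter

def knn_classify(train_X, train_y, x, k=3):
    """Assign x to the majority class among its k nearest training examples."""
    # Euclidean distance from x to every stored training example.
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(dists)[:k]        # indices of the k closest examples
    votes = Counter(train_y[nearest])      # tally the class labels among them
    return votes.most_common(1)[0][0]      # return the majority label
```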

7
Q

Write down Bayes’ theorem.

A

p(H|E) = (p(E|H) * p(H)) / p(E)

8
Q

Explain the interpretation of p(H|E) in Bayes’ theorem in relation to a computer vision system.

A

p(H|E) is the posterior probability that a hypothesis H (e.g., the object is a chair) is true, given the image evidence E. This is what the vision system needs to evaluate to determine the most likely explanation for the image data.

9
Q

Explain the interpretation of p(E|H) in Bayes’ theorem in relation to a computer vision system.

A

p(E|H) is the likelihood that if hypothesis H were true, the image would contain particular evidence E. This is based on our understanding of how images are formed, such as how surface properties and lighting create certain images.

10
Q

Explain the interpretation of p(H) in Bayes’ theorem in relation to a computer vision system.

A

p(H) is the prior probability, representing our initial assumptions about the likelihood of a hypothesis being true before seeing any evidence. If a hypothesis is initially improbable, stronger evidence is needed to support it.

11
Q

Explain the interpretation of p(E) in Bayes’ theorem in relation to a computer vision system.

A

p(E) is the probability of observing the evidence E regardless of whether the hypothesis H is true. If the evidence is very common, it reduces our confidence in inferring a specific hypothesis based on that evidence.

12
Q

In the production line problem, what is the probability of objA?

A

p(objA) = 0.75

13
Q

In the production line problem, what is the probability of objB?

A

p(objB) = 0.25

14
Q

In the production line problem, what is the probability of an indistinguishable image given objA at oriA?

A

p(I|objA) = 0.1

15
Q

In the production line problem, what is the probability of an indistinguishable image given objB at oriB?

A

p(I|objB) = 0.2

16
Q

Using Bayes’ theorem, if an indistinguishable image is observed, what is the probability it is objA at oriA?

A

p(objA|I) = (p(I|objA) * p(objA)) / p(I) = k * (0.1 * 0.75) = 0.075k, where k = 1/p(I)

17
Q

Using Bayes’ theorem, if an indistinguishable image is observed, what is the probability it is objB at oriB?

A

p(objB|I) = (p(I|objB) * p(objB)) / p(I) = k * (0.2 * 0.25) = 0.05k, where k = 1/p(I)

18
Q

In the production line problem, if an indistinguishable image is observed, which bin should the robot sort the object into to minimize errors?

A

The robot should sort the object into the bin for objA because p(objA|I) > p(objB|I), meaning it’s more likely to be objA.
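
A worked check of the arithmetic behind this decision (the normalising constant k is 1/p(I), with p(I) obtained from the law of total probability):

```python
# Production line posteriors, written out in full.
p_A, p_B = 0.75, 0.25        # priors for objA and objB
p_I_A, p_I_B = 0.1, 0.2      # likelihoods of an indistinguishable image

p_I = p_I_A * p_A + p_I_B * p_B    # evidence: 0.075 + 0.05 = 0.125
post_A = p_I_A * p_A / p_I         # 0.075 / 0.125 = 0.6
post_B = p_I_B * p_B / p_I         # 0.050 / 0.125 = 0.4

print(post_A, post_B)  # 0.6 > 0.4, so sort into the objA bin
```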

19
Q

What is the viewer-centred approach to object recognition?

A

The 3D object is modelled as a set of 2D images, showing different views of the object. Recognition occurs by matching the current view to a stored view.

20
Q

What is the object-centred approach to object recognition?

A

A single 3D model is used to describe the object. Recognition involves decomposing the viewed object into its components (geons) and matching their arrangement to stored models.

21
Q

What are geons designed to be?

A

Geons are designed to be sufficiently different from each other to be easily discriminated, robust to noise (identifiable even with missing parts), and view-invariant (look similar from most viewpoints).

22
Q

What are the problems with the object-based recognition by components theory?

A

It can be difficult to decompose an image into geons, it’s difficult to represent many natural objects using geons, and it cannot detect finer details necessary for identifying individuals or similar objects.

23
Q

What is the image-based approach to object recognition?

A

Each object is represented by storing multiple 2D views (images). Object recognition occurs when a current pattern matches a stored pattern.

24
Q

What is template matching in the context of image-based recognition?

A

An early form of the image-based approach, in which the current view is directly compared to stored templates. It’s considered too rigid to account for the flexibility of human object recognition.

25
Q

What is the multiple views approach in the context of image-based recognition?

A

A more recent image-based approach where multiple views of objects are stored. Recognition occurs by matching the current view, and interpolation between stored views allows recognition from novel viewpoints.

26
Q

What are configural theories of object recognition?

A

Configural theories emphasize the global shape and relationships between features in object recognition.

27
Q

What are featural theories of object recognition?

A

Featural theories emphasize the individual features of an object in recognition.

28
Q

How do inverted faces relate to configural and featural processing?

A

Inverted faces are processed featurally, meaning individual features are processed independently, and the relationships between them are ignored. Upright faces are processed configurally, or holistically.

29
Q

What are the three main types of theories for how we categorize objects?

A

Rules, prototypes, and exemplars.

30
Q

How does the ‘rules’ theory explain object categorization?

A

Category membership is defined by abstract rules. Anything that satisfies the rule(s) for the category belongs to that category.

31
Q

What are some arguments for and against the ‘rules’ theory?

A

For: explains over-extension of grammar rules. Against: some members are better examples of a category (graded membership).

32
Q

How does the ‘prototypes’ theory explain object categorization?

A

We calculate the average (or prototype) of all individual instances from each category. A new stimulus is compared to these stored prototypes and assigned to the category of the nearest one.

33
Q

What are some arguments for and against the ‘prototypes’ theory?

A

For: explains why prototypical category members are accessed more quickly. Against: variations within a class cannot be represented.

34
Q

How does the ‘exemplars’ theory explain object categorization?

A

Specific individual instances of each category (‘exemplars’) are stored in memory. A new stimulus is compared to these stored exemplars and assigned to the category of the nearest one.

35
Q

What are some arguments for and against the ‘exemplars’ theory?

A

For: explains some kinds of mis-categorizations. Against: struggles with the concept of graded membership (some members being better examples).

36
Q

What is supervised learning in the context of classification?

A

Learning where the class for each data point in the training set is known. New data points are assigned to appropriate classes based on similarity to these labelled training examples.

37
Q

What is unsupervised learning in the context of classification?

A

Learning where the class for each data point is unknown. All data points are assigned to appropriate classes based on similarity without pre-existing labels.

38
Q

What is the nearest mean classifier (prototype) and when is it suitable?

A

It calculates the mean feature vector for each class and assigns new data to the class with the nearest mean. It’s suitable only if the data is linearly separable.

39
Q

What is the nearest neighbour classifier (exemplar) and when is it suitable?

A

It assigns new data to the class of the nearest training example. It’s suitable if the data is non-linearly separable.

40
Q

What is the k-nearest neighbours classifier (exemplar) and when is it suitable?

A

It assigns new data to the class that is the majority among its k nearest training examples. It’s suitable if the data is non-linearly separable.

41
Q

What are some common similarity measures used in classification?

A

Sum of Squared Differences (SSD), Euclidean distance, Sum of Absolute Differences (SAD, i.e. Manhattan distance), cross-correlation, normalised cross-correlation, and the correlation coefficient.
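
Minimal sketches of these measures for two equal-length feature vectors (a NumPy illustration, assuming 1D vector inputs):

```python
import numpy as np

def ssd(a, b):                 # Sum of Squared Differences
    return np.sum((a - b) ** 2)

def euclidean(a, b):           # Euclidean distance (square root of SSD)
    return np.sqrt(ssd(a, b))

def sad(a, b):                 # Sum of Absolute Differences (Manhattan distance)
    return np.sum(np.abs(a - b))

def cross_correlation(a, b):   # raw cross-correlation
    return np.sum(a * b)

def normalised_cross_correlation(a, b):
    return np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))

def correlation_coefficient(a, b):
    return np.corrcoef(a, b)[0, 1]
```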

42
Q

Describe the ‘What’ and ‘Where’ pathways in the cortical visual system.

A

The ‘What’ pathway (ventral stream) goes from V1 to the inferotemporal cortex and is involved in object identity and category information. The ‘Where’ pathway (dorsal stream) goes from V1 to the parietal cortex and is involved in spatial and motion information.

43
Q

How are receptive fields organized hierarchically in the visual cortex?

A

As you progress along a pathway, neurons’ preferred stimuli get more complex, receptive fields become larger, and there is greater invariance to location.

44
Q

Give examples of receptive fields at different stages (Eye to V1).

A

Eye -> LGN: Centre-surround Cells. LGN -> V1: Simple Cells (respond to edges/bars at a specific orientation and location). Simple Cells -> Complex Cells (respond to edges/bars of a specific orientation within a small region).

45
Q

What is the trend of receptive fields along the ventral pathway?

A

Receptive fields become larger, have higher complexity, and higher invariance to location.

46
Q

What are feedforward models of cortical hierarchy?

A

Models that propose a purely serial, feedforward sequence of cortical information processing, such as HMAX and CNNs (convolutional neural networks).

47
Q

Explain the HMAX model.

A

HMAX is a feedforward model that uses alternating layers of simple (S) and complex (C) cells to increase selectivity and invariance of receptive fields. S-cells respond to conjunctions, and C-cells respond to any input in a small neighbourhood.

48
Q

How do S-cells and C-cells operate in HMAX?

A

S-cells perform a summation (‘and’-like operation) increasing selectivity. C-cells perform a max operation (‘or’-like operation) increasing invariance.
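
A toy illustration of the two operations (hypothetical weights and responses, not the full HMAX model):

```python
import numpy as np

def s_cell(inputs, template):
    """'And'-like weighted sum: responds strongly only when the input
    matches the cell's template of conjoined features (selectivity)."""
    return float(np.dot(inputs, template))

def c_cell(s_responses):
    """'Or'-like max: responds if any afferent S-cell in its small
    neighbourhood responds (invariance to position and scale)."""
    return float(np.max(s_responses))
```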

49
Q

How can HMAX be seen as hierarchical template matching?

A

Features at one stage are built from features at earlier stages, forming increasingly complex templates.

50
Q

What is a Convolutional Neural Network (CNN)?

A

A hierarchical model similar to HMAX that uses standard image processing techniques: convolution and sub-sampling.

51
Q

How do convolution and sub-sampling relate to HMAX?

A

Convolution in CNNs is equivalent to the function of S-layers in HMAX (responding to conjunctions). Sub-sampling in CNNs is equivalent to the function of C-layers in HMAX (responding to any input in a small neighbourhood).
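
A minimal sketch of the two CNN operations on a 2D image using SciPy (kernel and pool size are illustrative assumptions):

```python
import numpy as np
from scipy.signal import correlate2d

def conv_layer(image, kernel):
    """Convolution stage: slides a template over the image,
    analogous to an HMAX S-layer (conjunction/selectivity)."""
    return correlate2d(image, kernel, mode='valid')

def subsample_layer(fmap, pool=2):
    """Sub-sampling stage: max over each pool x pool block,
    analogous to an HMAX C-layer (invariance)."""
    h = fmap.shape[0] - fmap.shape[0] % pool   # crop to a multiple of pool
    w = fmap.shape[1] - fmap.shape[1] % pool
    blocks = fmap[:h, :w].reshape(h // pool, pool, w // pool, pool)
    return blocks.max(axis=(1, 3))
```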

52
Q

What are recurrent models of cortical hierarchy?

A

Models that incorporate feedback connections and lateral connections within cortical regions, allowing for interaction and combination of bottom-up and top-down information.

53
Q

What are the two types of recurrent connections in the cortex?

A

(1) Lateral connections within a region enabling interaction between neurons in the same population. (2) Feedback connections conveying information from higher cortical regions to primary sensory areas.

54
Q

What are bottom-up processes in perception?

A

Using the information in the stimulus itself to aid in identification. They are stimulus-driven and discriminative.

55
Q

What are top-down processes in perception?

A

Using context, previous knowledge, and expectation to aid in identification. They are knowledge-driven and generative.

56
Q

How does Bayesian inference relate to bottom-up and top-down information?

A

Bayes’ Theorem describes an optimum method of combining bottom-up (likelihood) and top-down (prior) information to form a posterior probability.

57
Q

Explain Bayes’ Theorem using the terms posterior, likelihood, prior, and evidence.

A

Posterior, p(object|image): what we want to know, the probability that a particular object is present given the image.
Likelihood, p(image|object): what we can calculate, the probability that the particular image is a projection of the particular object.
Prior, p(object): what we know from prior experience, the probability that the particular object will be present in the environment.
Evidence, p(image): the probability of observing the image; it can often be ignored because it is the same for all possible interpretations.

58
Q

Why is vision considered an inverse problem?

A

Because we know the pixel intensities (outcomes) and want to infer the causes (objects in the scene).

59
Q

Why is vision considered ill-posed?

A

Because there are usually multiple solutions (multiple causes that could give rise to the same outcomes).

60
Q

How does the visual system compensate for the ill-posed nature of vision?

A

By using assumptions, constraints, or priors about the nature of the physical world.

61
Q

Give examples of priors used in Bayesian inference in vision.

A

Texture is circular and homogeneous, light comes from above, faces are convex, size is constant, neighbouring features are related, similar features are related, connected features are related, strings of letters form words, knowledge about image content.

62
Q

What is the difference between discriminative and generative methods in the context of Bayesian inference?

A

Discriminative methods model the posterior probability directly. Generative methods model the likelihood and the prior probability.