Object Recognition Flashcards
2 pathways of the brain
ventral and dorsal
what is object agnosia
intact primary visual functions, verbal descriptions and logical reasoning but immediate object recognition is lost
which stream is damaged in object agnosia
ventral
What kind of approach is Marr’s model of object recognition/vision
computational
What does Marr’s model of vision suggest
- vision is an information processing task
- we have to understand the nature of the task
- we have to understand how it can be accomplished
- we need to propose algorithms and mechanisms that can accomplish the visual task
- explanations of visual experience and visual physiology should come from an understanding of the implementation of those algorithms
Different models of object recognition used
template-matching model
feature-detector models
structural-description models
view-dependent models
template-matching model
match the exact pattern of light on the retina with previously remembered patterns.
A neuron is connected to a specific set of pixels and fires only when exactly those pixels are detected
problems with template-matching model
- very constrained - can’t cope with diversity of natural object variation
- doesn’t explain how we detect variances in images
- there would be too many template detectors
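Worked sketch (illustrative only): the brittleness of exact template matching can be shown with a toy binary image. The names `template_match`, `shift_right` and `T_TEMPLATE` are assumptions made for this example, not part of any model in these cards.

```python
def template_match(image, template):
    """Fire (return True) only if every pixel equals the stored template."""
    return image == template


def shift_right(image):
    """Shift each row one pixel to the right, padding the left edge with 0."""
    return [[0] + row[:-1] for row in image]


# Hypothetical stored template: a small letter "T" as a grid of 0/1 pixels.
T_TEMPLATE = [
    [1, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

same = template_match(T_TEMPLATE, T_TEMPLATE)                  # identical image
shifted = template_match(shift_right(T_TEMPLATE), T_TEMPLATE)  # same shape, moved one pixel
```

The one-pixel shift is enough to make the match fail, so every position, size and orientation would need its own template.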
what model solves the problems of the template-matching model
feature-detector model: looks at specific features in the image rather than pixels
Who put forward the idea of ‘feature-detectors’?
Hubel and Wiesel
what do feature-detector models suggest?
we can detect specific combinations of features rather than specific patterns
the image doesn’t have to exactly match a pre-existing template; it just needs to contain the same sub-features
what are demons in the feature-detector model
subroutines
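Worked sketch (illustrative only): treating demons as subroutines, one possible arrangement has a scoring routine per candidate object and a decision routine that picks the strongest response. The feature names, the three letters and the function names below are assumptions chosen for this example.

```python
# Hypothetical feature inventory for three letters (illustrative only).
LETTER_FEATURES = {
    "A": {"oblique_left", "oblique_right", "horizontal"},
    "H": {"vertical_left", "vertical_right", "horizontal"},
    "T": {"vertical_centre", "horizontal"},
}


def cognitive_demon(letter, detected):
    """Score a letter by the fraction of its features found in the image."""
    wanted = LETTER_FEATURES[letter]
    return len(wanted & detected) / len(wanted)


def decision_demon(detected):
    """Pick the letter whose scoring subroutine responds most strongly."""
    return max(LETTER_FEATURES, key=lambda letter: cognitive_demon(letter, detected))


# Features reported by the low-level detectors, including one irrelevant extra.
detected = {"vertical_left", "vertical_right", "horizontal", "noise_blob"}
best = decision_demon(detected)
```

Because scoring depends on which features are present rather than on exact pixels, the extra `noise_blob` feature does not prevent recognition.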
who put forward the structural description model
Marr & Nishihara (1978)
What did Marr & Nishihara (1978) suggest the goal of vision is in the structural description model
to describe the object unambiguously in its core geometric components
Marr & Nishihara’s criteria for good representation of high level vision
- Accessibility
- scope
- uniqueness
- stability
- sensitivity
what did Marr & Nishihara (1978) suggest the primitives (basic units of information in representation) are
objects are described in terms of axes and the volumes around them
the description is hierarchical
this means the object can be described at many scales
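Worked sketch (illustrative only): a hierarchical, axis-based description can be represented as a tree in which each part has an axis, a volume around it, and finer sub-parts. The `Part` class, its fields and all numeric values are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    name: str
    axis_length: float  # length of the part's main axis (arbitrary units)
    radius: float       # extent of the volume around that axis
    children: list = field(default_factory=list)


def describe(part, max_depth, depth=0):
    """List part names down to max_depth levels below the root."""
    names = [part.name]
    if depth < max_depth:
        for child in part.children:
            names.extend(describe(child, max_depth, depth + 1))
    return names


# Hypothetical values; only the nesting matters here.
hand = Part("hand", 1.0, 0.4)
forearm = Part("forearm", 3.0, 0.4, [hand])
arm = Part("arm", 6.0, 0.5, [forearm])
human = Part("human", 17.0, 2.0, [arm])

coarse = describe(human, 0)  # whole-object scale only
fine = describe(human, 3)    # down to the hand
```

The same object is described coarsely or finely just by changing how deep the tree is read.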
according to Marr & Nishihara (1978) how is information organised into an object description?
recognition:
by finding the closest model in the model store and specifying its parameters
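Worked sketch (illustrative only): finding the closest model in a model store can be pictured as a nearest-neighbour lookup over model parameters. The store contents, the two parameters and the use of Euclidean distance are assumptions made for this example, not details given by Marr & Nishihara.

```python
import math

# Hypothetical model store: (neck-to-body length ratio, relative leg thickness).
MODEL_STORE = {
    "giraffe": (4.0, 0.3),
    "horse": (1.0, 0.4),
    "dachshund": (0.4, 0.3),
}


def recognise(observed):
    """Return the stored model whose parameters lie closest to the observation."""
    return min(MODEL_STORE, key=lambda name: math.dist(MODEL_STORE[name], observed))


# Parameters extracted from an image (made-up values for illustration).
label = recognise((3.5, 0.35))
```

The observation need not equal any stored model exactly; recognition picks the nearest one and the observed values specify its parameters.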
What did Biederman (1987) suggest for structural description models
similar to Marr & Nishihara (1978) but proposed a specific “alphabet” of primitive volumes called geons, into which objects are decomposed
across what do the properties of geons stay similar
any 2D projection (i.e. across viewpoints)
how many geons are there estimated to be
fewer than 36
evidence for structural models
intersections of geons are particularly informative in object recognition e.g. it is hard to tell what an object is when straight lines or corners are removed
what are IT cells
higher-level visual neurons in the inferotemporal (IT) cortex
Pros of structural description models
explain invariance well
recognition is description not matching
evidence that structural information matters to both humans and neurons
cons of structural description models
extracting model parameters can be hard in real images
structural description is difficult for some objects e.g. campfires
driven by theoretical desirability rather than behavioural evidence
what does the view-dependent model suggest
that the arbitrary image has a brute association with the object
what are the primitives in the view-dependent model?
sub-regions of the image
consisting of lines, curves, textures, colour, shading etc
what kind of model is the view-based model
simple feed-forward activation
evidence for view-based models
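Worked sketch (illustrative only): one way to picture view-based, feed-forward recognition is a set of units each tuned to a stored view, with the object's response taken as the strongest unit. The Gaussian tuning curve, the 30-degree width and the stored angles are assumptions made for this example.

```python
import math

# Hypothetical viewpoints (in degrees) at which one object has been seen.
STORED_VIEWS = [0.0, 90.0]


def view_unit(stored_angle, seen_angle, width=30.0):
    """Response of one view-tuned unit; it falls off as the view rotates away."""
    diff = abs(stored_angle - seen_angle) % 360
    diff = min(diff, 360 - diff)  # shortest way round the circle
    return math.exp(-((diff / width) ** 2))


def object_response(seen_angle):
    """Feed-forward step: take the strongest response among stored views."""
    return max(view_unit(stored, seen_angle) for stored in STORED_VIEWS)


familiar = object_response(0.0)  # a view that was stored
novel = object_response(45.0)    # halfway between the two stored views
```

The response is weaker for the unfamiliar 45-degree view, so recognition in this sketch is not perfectly viewpoint invariant, and better coverage requires storing more views.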
human object recognition is not perfectly viewpoint invariant
Monkey IT neurons selective for particular viewpoints
simulations show good invariance and errors similar to those of human observers
pros of view-dependent model
straightforward
newer models based directly on what we know of physiology
abstract feature units are recombined
good evidence
cons of view-based models
humans often show good generalisation across viewpoints even for novel objects
more memory intensive than other models e.g. geon models
no understanding of underlying relationships