Video Module 7: Recognizing Objects Flashcards
template theory
a theory proposing that we recognize objects by comparing current input to stored images (templates) of every object and individual we’ve seen
issues:
- How could we store a practically infinite number of templates in memory?
- How is it that we can recognize objects under varying lighting, shadow, and perspectives?
feature net
a proposed model of how we detect shapes and objects, built as a hierarchy of layers of detectors
- we can imagine that different detectors for different features correspond to specific neurons
- detectors in lower layers integrate the information they receive and pass it on to higher layers of the feature net
- relies on the idea of a network of neurons
priming
the idea that the amount of stimulation needed to reach the recognition “threshold” of a particular detector is lower if that detector has been activated recently or frequently
- a primed detector takes less stimulation than normal to produce a signal (action potential)
- it takes less input to detect common stimuli than rare stimuli
- it takes less input to detect stimuli that you’ve recently seen
- a key aspect of how feature nets function
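The threshold idea above can be sketched in a few lines of Python. This is only an illustrative toy, not a model from the module: the `Detector` class, the threshold of 1.0, and the priming increment of 0.2 are all assumed values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Detector:
    """Hypothetical feature-net detector (illustrative only)."""
    name: str
    threshold: float = 1.0  # input needed to fire when unprimed (arbitrary units)
    priming: float = 0.0    # how far recent/frequent exposure has lowered the threshold

    def effective_threshold(self) -> float:
        # Priming lowers the amount of input needed, but never below zero.
        return max(self.threshold - self.priming, 0.0)

    def fires(self, stimulation: float) -> bool:
        # The detector signals once input reaches its (possibly lowered) threshold.
        return stimulation >= self.effective_threshold()

    def expose(self, boost: float = 0.2) -> None:
        # Each exposure primes the detector a little (assumed increment).
        self.priming += boost


rare = Detector("PN")      # rarely seen, so unprimed
common = Detector("CH")    # seen often, so primed
for _ in range(3):
    common.expose()

weak_input = 0.5
print(rare.fires(weak_input), common.fires(weak_input))
```

With the same weak input, the primed detector fires and the unprimed one does not, which is the sense in which common or recently seen stimuli take less input to detect.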
bigram detectors
detectors that respond to specific two-letter sequences in printed language; how primed each one is depends on how frequently its sequence occurs
- common letter sequences (e.g. CA, CH) are more primed than ones that are more rare (e.g. PN, TC)
- priming biases recognition toward the most likely interpretation when we only receive partial information
How do feature nets help us resolve ambiguity?
Feature nets rely on both sensory input (bottom-up) and top-down influence from priming. For example, in a feature net, bigram detectors that are more primed are more likely to fire. When we only receive partial information, the net therefore settles on the most likely interpretation of what we saw.
How might feature nets result in recognition errors?
Priming can sometimes cause recognition errors: we may guess incorrectly about what we saw because we expect what we are most commonly or recently exposed to. Feature nets are efficient in that they help us identify the most likely image, but the most likely image is not always the correct one.
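A toy sketch of how the same mechanism can both resolve ambiguity and produce errors. Everything here is assumed for illustration: the `PRIMING` values, the `recognize_bigram` helper, and the rule of simply adding bottom-up evidence to priming are not taken from the module.

```python
# Hypothetical priming levels for bigram detectors (higher = more primed).
PRIMING = {"CO": 0.8, "CQ": 0.05}


def recognize_bigram(evidence: dict, priming: dict = PRIMING) -> str:
    """Pick the bigram whose detector receives the most total activation.

    evidence maps each candidate bigram to its bottom-up support,
    i.e. how well the features actually seen match that bigram.
    """
    return max(evidence, key=lambda b: evidence[b] + priming.get(b, 0.0))


# Ambiguous input: the features fit "CO" and "CQ" equally well,
# so priming tips the decision toward the common bigram.
print(recognize_bigram({"CO": 0.5, "CQ": 0.5}))

# Error case: the stimulus really was "CQ" and the evidence slightly favors it,
# but the heavily primed "CO" detector still wins.
print(recognize_bigram({"CO": 0.5, "CQ": 0.7}))

# Clear input: strong bottom-up evidence overrides priming.
print(recognize_bigram({"CO": 0.3, "CQ": 1.5}))
```

The second call shows the trade-off: favoring the most probable bigram is usually right, but it misreads a genuinely unusual stimulus unless the sensory evidence is strong.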
What is the word superiority effect and how can it be explained by priming?
The phenomenon that letters are more easily recognizable in the context of a word than in isolation.
The word superiority effect is explained by the relative priming of bigram detectors: letters in a real word form common bigrams whose detectors are well primed, so weaker input is enough to recognize them.
- letters are most recognizable in the context of a real word, somewhat recognizable in a word-like sequence, and least recognizable in isolation or in a random, non-word-like sequence
What is recognition by components theory? How is it related to the idea of feature nets?
Recognition by components (RBC) proposes a model similar to that of feature nets. Just as letters are built from 2D features, objects are built from basic 3D shapes called geons. Geons are the basic building blocks of RBC.
- RBC is viewpoint independent, meaning that it proposes that the angle from which we view an object doesn’t matter, so long as we can recognize its geons
- It is more difficult to identify an object if its geons are obscured
What is viewpoint dependent recognition? How does it contrast with RBC?
Viewpoint dependent recognition is the idea that object recognition depends on the angle from which we see objects, not just on whether we can see their geons. RBC maintains that object recognition is not viewpoint dependent.
- Study results reveal that we are faster to recognize objects when they are at particular angles/viewpoints.
- Viewpoint dependent recognition is the most significant challenge to geon theory (RBC)
How do faces differ from other objects in terms of our ability to recognize them?
For faces, we are more sensitive to changes in configuration than changes in features. For objects, we are more sensitive to changes in features. Face recognition is also affected by orientation: it is more difficult to notice changes in faces when they are upside down (face inversion effect).
Lastly, face processing is based on the whole face rather than individual features: it is hard to identify one part of a face (e.g., the top half) when it appears as part of a coherent whole face (composite face effect).