Perception and Modularity 2: Face and Object Recognition Flashcards
What are the similarities between face and object recognition?
both are fast and automatic
we have a large repertoire for both (look at something and instantaneously know what it is)
How are the goals of face and object recognition different?
-when you recognize an object, you recognize it as a member of a class; being able to recognize individuals is the key aspect of face recognition
-objects get basic-level identification vs. individual identification for faces - "what is it?" vs. "who is it?"
What is "basic" about the basic level of identification?
-this is really a question about conceptual organization rather than perception
-dog, cat, and bird are more basic than phoebe, robin, and dove
-bird is more general than dove but not as general as animal, so it is still somewhat specific
What are the three main psycholinguistic properties for words of basic level categories?
-the preferred term when naming
-short
-appear first in children’s vocabulary
What information is the most important at the basic level for object recognition?
-parts
-two or three parts are often sufficient to identify an object at the basic level
-we have a fairly restricted set of parts that can be combined in different ways to provide an abstract illustration of the 3D shape of an object
What is some prima facie evidence that parts are identified rapidly and automatically?
-rapid serial visual presentation
-we can recognize the line drawing of an object
-if you were told to raise your hand when you see a flashlight on the screen, you could recognize it with only 100 ms of exposure - it takes longer to raise your hand than to recognize the flashlight
What did Tanaka and Farah do in their experiment with configural vs featural representations?
-taught people to recognize noses and doors in isolation
-they also taught people to recognize whole faces and houses
-they found that people are better at recognizing noses in the context of a whole face than in isolation, but equally good at recognizing doors whether in isolation or in a whole house
What does the chimera face illusion show in regards to face processing?
-the preference for configural processing of faces is so strong that it is not immediately obvious that a chimera face is a chimera, whereas for objects you can pull out parts fairly readily, even in unusual objects
How can we define the information that allows us to recognize faces through reverse engineering of face recognition?
we can use noise
-start with a base image created using morphing: an average of male and female faces with a neutral expression, so the gender is ambiguous and the expression is neutral
-overlay somewhat random noise built from patches at a range of orientations and spatial frequencies until we get a noise mask whose components (present or absent, and to what degree) are recoverable
-adding noise to the base image changes which array of cells is activated, so we can characterize what is landing on the retina and how the noisy images are similar to and different from one another
-by adding many noise masks to a face image, we can extract which noise pattern is important for which categorical judgment about the face
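A minimal sketch of this noise-based approach, with the noise built as sums of sinusoidal gratings at random orientations and spatial frequencies. The base image here is a blank placeholder and the observer responses are simulated at random, so every value is illustrative rather than from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

def sinusoidal_noise(size, n_components=40):
    """Sum of sinusoidal gratings at random orientations, spatial frequencies,
    and phases -- a rough stand-in for the oriented, band-limited noise
    described on this card."""
    ys, xs = np.mgrid[0:size, 0:size]
    noise = np.zeros((size, size))
    for _ in range(n_components):
        theta = rng.uniform(0, np.pi)            # orientation
        freq = rng.uniform(0.02, 0.25)           # cycles per pixel
        phase = rng.uniform(0, 2 * np.pi)
        amp = rng.uniform(0.2, 1.0)
        # project pixel coordinates onto the grating's direction of variation
        proj = xs * np.cos(theta) + ys * np.sin(theta)
        noise += amp * np.sin(2 * np.pi * freq * proj + phase)
    return noise / np.abs(noise).max()

size = 128
base = np.full((size, size), 0.5)                       # placeholder for the androgynous base face
masks = [sinusoidal_noise(size) for _ in range(200)]    # one noise mask per trial

# Each trial would show base + 0.3 * mask and the observer classifies it
# ("male" vs "female"); here the judgments are simulated at random.
responses = rng.integers(0, 2, size=len(masks))
# Averaging the masks that pushed judgments one way, minus those that pushed
# them the other way, recovers the noise pattern driving the categorical judgment.
classification_image = (
    np.mean([m for m, r in zip(masks, responses) if r == 1], axis=0)
    - np.mean([m for m, r in zip(masks, responses) if r == 0], axis=0)
)
```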
When we use noise to mask an image what are we replicating?
which Gabors (or simple cells in V1) are activated in response to the surface of a face
When we overlay different noise patterns on the neutral-expression average of male and female faces, what do we see? What question does this finding allow us to ask?
people’s judgments of gender and facial expression of the face change
-what patterns of noise shift the perception of gender and facial expression?
When we have a base image and two sets of class images, where in one set the expression changes from happy to sad and in the other the gender changes from female to male, where do we see distortion?
class image set one - happy to sad - distortion around mouth
class image set two - female to male - distortion around eyes and mouth (females have bigger eyes and lips)
How are expression and gender recognition different from individual recognition?
they are driven by local features like smiling and frowning and the relative size of the eyes and mouth
What information do we use when we try to tell people apart within large categories aka individual identification?
-Gabor filters - these are a reasonable model of the receptive fields of V1 neurons - sinusoids at different orientations and spatial frequencies provide an overcomplete representation at a specific retinotopic location - the location is defined by a Gaussian envelope (sketched below)
-example: Tom Cruise and John Travolta - add noise and ask "is this Tom Cruise or John Travolta?"; some noisy images are judged 50/50, others look very much like one or the other, and the average across people tells us how effectively a given noise mask reproduces one identity or the other - we can transform one of these faces into the other by applying a noise mask, which captures something about how these images are processed in primary visual cortex - we can add noise that taps into those processing properties
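A minimal sketch of a single Gabor filter as just described (a sinusoid at some orientation and spatial frequency, windowed by a Gaussian envelope), written in plain NumPy; the size, wavelength, and sigma values are illustrative assumptions.

```python
import numpy as np

def gabor_filter(size=31, wavelength=8.0, theta=0.0, sigma=4.0, phase=0.0):
    """A Gabor filter: a 2-D sinusoid at orientation `theta` and spatial
    frequency 1/`wavelength`, windowed by a Gaussian envelope that defines
    the filter's retinotopic location and extent."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    # project pixel coordinates onto the sinusoid's direction of variation
    x_theta = xs * np.cos(theta) + ys * np.sin(theta)
    envelope = np.exp(-(xs**2 + ys**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_theta / wavelength + phase)
    return envelope * carrier

# one filter from a hypothetical bank: medium frequency, 45-degree orientation
kernel = gabor_filter(theta=np.pi / 4, wavelength=6.0)
```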
What is a Gabor jet?
a collection of Gabors at a range (from low to high) of spatial frequencies (5) and orientations
What does summarizing the activity in a gabor jet give us?
a simplified model of the information in V1 at a specific location
i.e., the one at the top will respond to thick horizontal lines - going down, the others respond more weakly because the orientation is off - we can describe a patch of an image based on how similar it is to these Gabors
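A minimal sketch of a Gabor jet at one location, using scikit-image's gabor filter to build a 5-scale x 8-orientation bank and summarizing the response magnitudes at a single point; the specific frequencies, the test image (a crop of scikit-image's camera photo standing in for a face), and the location are illustrative assumptions.

```python
import numpy as np
from skimage.data import camera          # stand-in grayscale image (no face dataset assumed)
from skimage.filters import gabor

image = camera()[100:228, 100:228].astype(float) / 255.0   # small crop to keep it fast
frequencies = [0.05, 0.1, 0.15, 0.25, 0.4]                 # 5 scales, low to high (illustrative)
orientations = [i * np.pi / 8 for i in range(8)]           # 8 orientations

def gabor_jet(image, row, col):
    """Response magnitudes of a 5-scale x 8-orientation Gabor bank at one
    location: a simplified summary of V1 activity for that patch of image."""
    jet = []
    for freq in frequencies:
        for theta in orientations:
            real, imag = gabor(image, frequency=freq, theta=theta)
            jet.append(np.hypot(real[row, col], imag[row, col]))   # magnitude at the point
    return np.array(jet)                                           # 40 numbers per location

jet = gabor_jet(image, row=64, col=64)
```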
What does a Gabor-based representation of the face illustrate and how do you make one?
-if all the faces are lined up and facing the same direction, a simple grid works well
-can then plot, with spatial frequency on the x-axis (5 scales) and orientation on the y-axis (8 orientations), what each point on the grid responds to
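Extending the same idea to a whole face (a minimal sketch, assuming the faces are already aligned): filter the image once per scale/orientation pair and read the magnitudes out at a regular grid of points, giving one jet per grid location; the grid spacing, filter values, and stand-in image are again illustrative.

```python
import numpy as np
from skimage.data import camera          # stand-in for an aligned face image
from skimage.filters import gabor

image = camera()[100:228, 100:228].astype(float) / 255.0
frequencies = [0.05, 0.1, 0.15, 0.25, 0.4]             # 5 scales
orientations = [i * np.pi / 8 for i in range(8)]       # 8 orientations

# filter the whole image once per (scale, orientation) pair
magnitude_maps = np.stack([
    np.hypot(*gabor(image, frequency=f, theta=t))
    for f in frequencies for t in orientations
])                                                      # shape: (40, H, W)

# read the 40 magnitudes out at a regular grid of points -> one jet per point
step = 32
rows = np.arange(step // 2, image.shape[0], step)
cols = np.arange(step // 2, image.shape[1], step)
face_rep = np.array([magnitude_maps[:, r, c] for r in rows for c in cols])
# face_rep has shape (n_grid_points, 40): a Gabor-based representation of the face
```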
How can you get some viewpoint invariance in a Gabor-based representation of the face?
by quickly finding some landmarks and warping the grid
What is Gabor similarity?
-we can compare two images and get a numerical answer by computing the similarity (or distance) between their Gabor magnitudes
-similarity = sum(Ma * Mb) / sqrt(sum(Ma^2) * sum(Mb^2)), i.e., the normalized dot product of the two magnitude vectors Ma and Mb
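A minimal sketch of that computation: the normalized dot product (cosine) of two magnitude vectors Ma and Mb, such as the flattened grid-of-jets representations of two faces; the example vectors here are random placeholders.

```python
import numpy as np

def gabor_similarity(ma, mb):
    """sum(Ma * Mb) / sqrt(sum(Ma^2) * sum(Mb^2)): the normalized dot product
    of two Gabor-magnitude vectors. 1.0 means identical activation patterns;
    smaller values mean more metrically dissimilar images."""
    ma, mb = np.asarray(ma).ravel(), np.asarray(mb).ravel()
    return np.sum(ma * mb) / np.sqrt(np.sum(ma**2) * np.sum(mb**2))

rng = np.random.default_rng(0)
ma = rng.random(40)        # placeholder jet magnitudes for face A
mb = rng.random(40)        # placeholder jet magnitudes for face B
print(gabor_similarity(ma, mb))
```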
What is the match-to-sample task? With what speed can we do this?
people are presented with three faces, one on top and two on the bottom, and have to determine which of the bottom two faces is the same as the top one
(can do this quickly, with brief 300 ms stimulus presentations)
The greater the calculated Euclidean distance between the Gabor simple cell responses, what happens to the error rate in the match-to-sample task?
the error rate drops, because the greater the distance, the more metrically dissimilar the two faces are and the easier it is for people to tell them apart
Similarity between two faces during the match-to-sample task is predicted by a Gabor jet model with what accuracy?
-a computational V1 representation (the Gabor jet model) can predict very accurately how hard it is to tell two faces apart
-this supports the claim that face recognition is supported in part by processes that are sensitive to the metric similarity of the surface image
Why are people surprisingly bad at ignoring changes in lighting direction in speeded face recognition?
due to dependence on surface information
-lighting direction interferes with face recognition: sensitivity is lower under different lighting conditions and people are slower to identify faces, so reaction time increases while sensitivity decreases
How does contrast inversion affect face processing?
it disrupts face processing and appears to impact processing by reducing our ability to recover surface information
-face processing is dependent on surface information, as modeled by Gabor jets
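A minimal sketch of what contrast inversion does to a grayscale image scaled to [0, 1]: light becomes dark and vice versa (a photographic negative), which scrambles the shading cues that the surface-based account depends on; the test image is again a stand-in.

```python
import numpy as np
from skimage.data import camera   # stand-in grayscale image

image = camera().astype(float) / 255.0
inverted = 1.0 - image            # contrast (polarity) inversion: a photographic negative
```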
How does contrast inversion affect object processing?
object processing is less affected by this because it depends on parts more than surfaces (but cannot tell what the object is made of with contrast inversion)
What are five important aspects of face individuation that fail to explain much about object identification?
-surface based
-metric sensitivity
-preserves fine detail
-configural
-viewpoint dependent
What did Hayworth and Biederman do in their study for object recognition?
-subjects name a series of briefly presented pictures shown one at a time in a first block of trials
-in the second block the pictures include:
-identical pictures
-different exemplars of the same basic level object
-complements
-new items
What are the five things that could possibly mediate the priming of picture naming?
-local image features
-whole object templates
-basic level conceptual or lexical or word production priming
-subordinate level conceptual priming
-parts
Definitions:
Conceptual Priming: This involves the activation of related concepts in memory, facilitating the processing of related information. For example, seeing the word “doctor” might make you faster to recognize the word “nurse.”
Lexical Priming: This focuses specifically on the activation of words and their meanings. For instance, if you hear the word “bread,” you might be quicker to respond to “butter” due to their lexical association.
Word Production Priming: This type involves the facilitation of the production of words following exposure to related stimuli, often assessed in tasks requiring naming or generating words.
Subordinate Level: The most specific category (e.g., “beagle”).
In subordinate level conceptual priming, exposure to a specific example at the subordinate level can facilitate the recognition or processing of related, even more specific concepts. For example, if you hear “beagle,” you might find it easier to think of specific traits or related breeds like “basset hound” or “poodle.”
What were the findings of Hayworth and Biederman in their object recognition task for the complement created by deleting every other line and vertex in the image?
-people are faster and more accurate at all the objects after the first block
-the identical and the complement conditions had the same response time and error percentage, which were lower than for the different-exemplar condition
-this is for complements created by deleting every other line and vertex in the image, so that if you superimposed the two complements they would add up to the complete image - the parts are still preserved this way
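A toy sketch of that complement construction, assuming an object's line drawing is stored as a list of line segments (the segment list here is made up): alternating segments go to one complement and the rest to the other, so the two superimposed complements reproduce the full drawing while each complement still preserves recoverable parts.

```python
# an object's line drawing represented as a made-up list of line segments
segments = [((0, 0), (2, 0)), ((2, 0), (2, 2)), ((2, 2), (0, 2)), ((0, 2), (0, 0)),
            ((0, 0), (2, 2)), ((2, 0), (0, 2))]

complement_a = segments[0::2]   # keep every other line
complement_b = segments[1::2]   # the deleted lines form the other complement

# superimposing the two complements recreates the complete drawing
assert sorted(complement_a + complement_b) == sorted(segments)
```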