WEEK 3 - ACHIEVING OBJECT CONSTANCY Flashcards
Why do we need visual object constancy?
The visual system must recognise familiar objects whilst generalising over irrelevant variation due to depth rotations, plane rotations, and changes in size, position and lighting. Object constancy must be achieved quickly and accurately: there is no time for slow, iterative or double-checking processes.
What does object constancy allow us to do?
To access the SAME semantic information whichever view is seen.
What are the two factors needed for object constancy?
GENERALISATION across variation between stimuli in order to identify objects (the achievement of object constancy), and DISCRIMINATION between stimuli (categorisation). Both generalisation and discrimination are necessary to access semantic knowledge effectively from the input stimulus. The trade-off between the two is tricky for our visual system.
What are the four alternative accounts of how we generalise in order to categorise objects?
1. defining features. 2. templates. 3. multiple views plus transformations. 4. structural descriptions
What is the defining features account?
A unique feature distinguishes the target object from the alternative distractors, whatever the viewpoint. But are there defining features that uniquely discriminate all the everyday objects we know, across all the different views we can encounter in daily life? Probably not: many different views of everyday familiar objects are possible (what is unique about the side view of a car?). Conclusion: defining features are only useful if just a small set of distinctive objects must be distinguished, so this route probably has little role in everyday object recognition.
What did Hayward and Williams (2000) find in experiment 1, when coloured objects were shown in a picture-picture matching task from different viewpoints (e.g. rotated by 40 degrees)?
Experiment 1 had three groups. Group A: each object was a uniquely coloured shape. Group B: objects had unique parts. Group C: objects had neither unique colours nor unique parts. Without unique colours, people were much worse when the object was rotated between the first and second view. Group C was the slowest, followed by group B. Group A's performance was flat across rotations, i.e. view-invariant.
What did Hayward and Williams (2000) find when group A now saw all the objects in the same colour, all grey (experiment 2)?
Their performance got worse (response times were longer) as the number of degrees of rotation increased. Performance was now view-sensitive, whereas before it was view-invariant.
What are template theories?
They assume we store a different internal representation for each significantly different view of an object we encounter:
- a side view of a car is matched to a stored 2D representation of a side view of a car
- a front view of a car is matched to a stored front view of a car, etc.
Once enough different views of a given object have been stored, most input stimuli can be matched fairly directly to one of the stored views. This has large memory demands but needs less online computation (so may be faster) than structural description accounts. Unlike structural description theories, there is no decomposition of the image into its parts and no coding of the spatial relations between the parts. Storing every significantly different view of every object we see means an awful lot of information has to be held in memory: this is called a combinatorial explosion.
What is the psychological evidence for object-specific cells?
Hubel & Wiesel (1962) used electrophysiology to reveal a hierarchy of cells responding to increasingly complex inputs, from the simplest features (edges, corners) upwards. There was speculation that there would even be ‘grandmother cells’ that respond to one particular object. For example, single-cell recording studies have detected hand-specific neurons in monkey inferotemporal cortex (Gross et al. 1972). So there do seem to be cells in the brain that are very specialised, i.e. finely tuned to recognise things like hands.
For what other body part is there also good evidence of specific cells in monkeys?
The face (face-specific neurons in the inferotemporal cortex).
What did Bruce et al. (1981) find in a study of face-specific neurons?
These cells fired more intensely to more face-like stimuli than to various visual control stimuli, i.e. the cells fire preferentially for faces.
What is the imaging evidence for object-specific cells?
More recent fMRI studies have demonstrated separate regions in human fusiform gyrus for the perception of faces (Kanwisher et al. 1997), stick figures (Peelen and Downing 2005) and body parts (Downing et al. 2001). However, we shouldn't get carried away by this positive evidence, as it covers only a narrow range of stimulus types. We haven't found evidence of templates for bananas, trees, pens, etc. (i.e. everyday objects).
What are the other questions left open by the template account?
- how can so many views be stored and assessed efficiently?
- what if the input does not exactly match a stored template?
- when do we decide to store a new view?
- how do we recognise previously unseen views?
What is the conclusion about template theories?
Templates are usually considered too inflexible and too expensive in resources (e.g. memory capacity) to be a general solution. They may let us recognise a few types of biologically or socially important stimuli (faces, hands, human bodies), but they seem unlikely to be useful for general-purpose object recognition. There is, however, good evidence that we store a large amount of detailed visual information, but this does not necessarily mean that we just store templates.
What study suggests we can store a large amount of detailed visual information (evidence FOR the template theory)?
Brady et al. (2008): 14 people were shown 2500 photos over 5.5 hours and then did a two-alternative forced choice (2AFC) task. On each trial you get two different images and have to say which one you saw before. There were three test conditions (novel, exemplar and state), with the exemplar and state conditions the hardest. Conclusion: we have a huge and detailed storage capacity for visual images, as most people got the right answer in each condition.