Visual Perception Flashcards
Why do objects appear coloured?
Because they reflect different wavelengths of light from different parts of the visible spectrum.
What is Hue?
Hue is the quality that distinguishes red from blue.
What is brightness?
Brightness is the perceived intensity of light.
What is saturation?
Saturation characterises a colour as pale or vibrant.
What is a metamer?
A sensory stimulus that is perceptually identical to another stimulus, but physically different.
What are the 3 cone types? Which colours are they associated with?
S cones - blue
M cones - green
L cones - red
What is the principle of univariance?
Any single photopigment is colour-blind because an appropriate combination of wavelength and intensity can result in an identical neural response - this is the principle of univariance.
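As a rough illustration of univariance, the sketch below models a single photopigment whose output is one number: light intensity weighted by the pigment's sensitivity to that wavelength. The Gaussian sensitivity curve, the 560 nm peak, the 50 nm width and the names `sensitivity` and `response` are assumptions made for this example, not physiological values.

```python
import math

PEAK_NM = 560.0   # assumed peak sensitivity (illustrative, not a measured value)
WIDTH_NM = 50.0   # assumed width of the sensitivity curve

def sensitivity(wavelength_nm):
    """Toy Gaussian sensitivity of a single photopigment (an assumption)."""
    return math.exp(-((wavelength_nm - PEAK_NM) ** 2) / (2 * WIDTH_NM ** 2))

def response(wavelength_nm, intensity):
    """The pigment outputs one number: intensity weighted by sensitivity."""
    return intensity * sensitivity(wavelength_nm)

# A light at the peak wavelength...
r1 = response(560.0, 1.0)

# ...and a more intense light at a less effective wavelength, with the
# intensity chosen to compensate for the lower sensitivity.
matched_intensity = 1.0 / sensitivity(610.0)
r2 = response(610.0, matched_intensity)

print(r1, r2)  # the two responses agree up to floating-point rounding
```

Because the pigment reports only this one number, it cannot tell the two lights apart. Comparing the outputs of several pigments is what makes wavelength recoverable.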
What are dichromats?
Animals with only 2 pigments, such as non-primate mammals that rely heavily on sound and smell.
What are pentachromats?
Animals with up to 5 pigments, such as some birds that rely heavily on vision.
What type of chromat are humans? Why?
Trichromats, because we have 3 cone types.
Which type of cone do we have significantly less of than the other 2?
S cones (blue).
There are no __ cones in the fovea.
S.
Does cone distribution impact the ability to perceive colours?
No.
What is opponent coding theory?
The idea that colours are grouped into opposing pairs (blue & yellow, red & green).
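One common way to picture opponent coding is as simple sums and differences of the cone signals. The sketch below is a toy version only: the equal weights, the 0 to 1 response scale and the function name `opponent_channels` are assumptions for illustration, not measured physiology.

```python
def opponent_channels(l, m, s):
    """Combine L, M and S cone responses into three toy channels.

    Equal weights are assumed purely for illustration.
    """
    red_green = l - m               # positive = reddish, negative = greenish
    blue_yellow = s - (l + m) / 2   # positive = bluish, negative = yellowish
    luminance = l + m               # achromatic channel
    return red_green, blue_yellow, luminance

# A stimulus that mainly drives the L cones
rg, by, lum = opponent_channels(l=0.8, m=0.2, s=0.1)
print(rg > 0, by < 0)  # signals "red" on one axis and "yellow" on the other

# A stimulus that mainly drives the S cones
rg_s, by_s, lum_s = opponent_channels(l=0.1, m=0.1, s=0.9)
print(by_s > 0)  # signals "blue"
```

Each chromatic channel can signal only one member of its pair at a time, which is why the colours are described as opposing.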
___________ have chromatically opponent receptive fields. What does this mean?
Parvocellular RGCs.
This means that (for example) the centre might be excited by red light, whereas green light falling on the surround inhibits the cell.
Which layers of the LGN receive the input for the achromatic luminance channel?
1 and 2.
Which layers of the LGN receive the input for the two chromatic channels?
3 to 6.
What are the cardinal directions of colour space?
Red-green/cyan (0-180 hue angle) and blue/purple-yellow (90-270 hue angle). Nearly all cells in the LGN prefer stimuli modulated along one of these two axes.
What is a double opponent receptive field?
Where the centre is excited by red and inhibited by green, but the surround is excited by green and inhibited by red.
What is colour constancy? What is the change in perceived colour called?
The ability to assign a fixed colour to an object, even though the actual spectral info entering the eye changes in different illumination conditions.
Chromatic induction.
Damage to which area typically causes acquired cerebral achromatopsia (CVD)?
V4.
What are the % chances of colour blindness for XX and XY chromosomes in congenital CVD?
XX chromosomes: 0.5%
XY chromosomes: 8%
What types of cones does congenital CVD usually affect?
M or L cones.
What is an anomalous trichromat?
Where all 3 cone types are present, but one does not work optimally.
What do we use to detect colour vision deficiency?
Ishihara colour plates
Where does object recognition take place? What is this model called?
The two-stream model.
After the visual cortex, info is transmitted via 2 pathways.
1. The ventral stream (the what pathway)
2. The dorsal stream (the where/how pathway)
What functions is the ventral stream associated with?
Object recognition, memory.
What functions is the dorsal stream associated with?
Motion, location, saccadic control.
Who came up with the computational approach for object recognition?
David Marr
What are the 3 levels of analysis for the computational approach?
- Computational - What problem is being solved, and why (what is the goal)?
- Algorithmic - What representations, processes and sequences are used?
- Implementational - What physical machinery (e.g. neurons) carries this out?
What is the simplest model for object recognition?
Template matching models.
What are template-matching models?
You have a detector for (e.g.) letter A.
When an object appears in the receptive field of this detector that matches that template, it signals.
For this to work, we would need a detector for every possible orientation, scale, font – requiring an implausibly large brain!
The computer vision equivalent is the machines that read cheques; they work well, but the letters must be in exactly the expected location and orientation.
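The weakness of template matching can be sketched with a tiny example. The 4x4 text "images", the 0.9 threshold and the names `match_score` and `detects` are all hypothetical and chosen only for illustration.

```python
# Template for the letter T on a 4x4 grid ('#' = ink, ' ' = blank).
TEMPLATE_T = [
    "### ",
    " #  ",
    " #  ",
    "    ",
]

def match_score(image, template):
    """Fraction of pixels where image and template agree, compared in place."""
    total = 0
    agree = 0
    for img_row, tpl_row in zip(image, template):
        for img_px, tpl_px in zip(img_row, tpl_row):
            total += 1
            agree += img_px == tpl_px
    return agree / total

def detects(image, template, threshold=0.9):
    """The detector signals only if the match is good enough (assumed threshold)."""
    return match_score(image, template) >= threshold

# The same T in exactly the expected position
aligned = ["### ", " #  ", " #  ", "    "]

# The same T shifted one pixel right and one pixel down
shifted = ["    ", " ###", "  # ", "  # "]

print(detects(aligned, TEMPLATE_T))  # the detector signals
print(detects(shifted, TEMPLATE_T))  # the same letter, moved, is missed
```

Handling the shifted T would need a second template, and likewise for every other position, size, orientation and font, which is the storage problem described above.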
What is Selfridge’s Pandemonium model (1959) an example of? What is it?
A feature-detection model.
Selfridge described it as a set of demons with different jobs.
What are the 3 jobs of the demons in Selfridge’s Pandemonium model (1959)?
- The feature demons
  - These look at the image and write down how many examples of the feature they see.
- The cognitive demons
  - These shout if they think that combination of features applies to their image; the more confident they are, the louder they shout.
- The decision demons
  - These listen to the cognitive demons and decide who is shouting the loudest.
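The three jobs can be sketched as three small functions. This is a toy illustration rather than Selfridge's own formulation: the feature set, the feature counts per letter and the "loudness" formula are all made up for the example.

```python
# Hypothetical feature counts for a few letters (made up for illustration).
LETTER_FEATURES = {
    "A": {"horizontal": 1, "vertical": 0, "oblique": 2},
    "H": {"horizontal": 1, "vertical": 2, "oblique": 0},
    "L": {"horizontal": 1, "vertical": 1, "oblique": 0},
}

def feature_demons(strokes):
    """Each feature demon writes down how many examples of its feature it sees."""
    counts = {"horizontal": 0, "vertical": 0, "oblique": 0}
    for stroke in strokes:
        counts[stroke] += 1
    return counts

def cognitive_demons(counts):
    """Each cognitive demon shouts louder the better the counts fit its letter."""
    shouts = {}
    for letter, expected in LETTER_FEATURES.items():
        mismatch = sum(abs(counts[f] - expected[f]) for f in counts)
        shouts[letter] = 1.0 / (1.0 + mismatch)  # assumed loudness rule
    return shouts

def decision_demon(shouts):
    """The decision demon picks whoever is shouting the loudest."""
    return max(shouts, key=shouts.get)

# An image already broken into strokes (a stand-in for real feature extraction)
strokes = ["oblique", "oblique", "horizontal"]
counts = feature_demons(strokes)
shouts = cognitive_demons(counts)
print(decision_demon(shouts))  # the "A" demon matches every count exactly
```

Because the shouts are graded, a slightly degraded input (say, one extra stroke) still produces a loudest demon, which template matching with a strict threshold would not.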
Who came up with a structural-description model? What is this?
Marr & Nishihara (1978).
They believed the goal of the model is to describe the object unambiguously.
Therefore, the system must be invariant to transformations in viewpoint, illumination, etc.
This means the system must know which properties are invariant under transformation, and how other properties might vary.
What are the 3 properties of Marr and Nishihara’s structural description model?
The description should be:
- Object-centred (the coordinate system is based on the object itself, which negates the problem of transformation variance)
- Volumetric (only requires axis and size info, maintaining specificity without needing too much storage space)
- Modular and hierarchical (the object is described in terms of its axes and the volumes around them, meaning that it can be described at many scales and be matched)
What was Marr & Nishihara’s way of explaining recognition?
The “model store”.
Even if the object doesn’t match anything in your model store exactly, you can find the closest match and have enough information from the image and your memory to help you interact with it.
How did Biederman’s (1987) ideas differ from Marr & Nishihara’s?
Very similar, but instead of cylinders, objects are decomposed into geons (geometric ions).
How many geons did Biederman estimate there were?
Around 36.
How many possible 2-geon objects did Biederman propose there were?
Around 75,000.
What point was Biederman trying to get across in his geon experiment?
Obscuring a geon is more damaging to recognition than equal amounts of other information, particularly when you only have a short time to make the judgement.
What are some strengths of structural description models?
- Invariance is well explained
- Recognition relies on description rather than matching
- Graded representations cope with discrimination and generalisation
- Evidence that structural info matters to humans and to neurones
What are some weaknesses of structural description models?
- Extracting model parameters can be hard in real images
- Structural description can be hard for some objects (e.g. fire/crumpled paper)
- They are driven by theoretical desirability rather than behavioural/physiological evidence
What is a view-dependent model?
Recognition by matching input to the closest stored view.
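Matching to the closest stored view can be sketched as a nearest-neighbour lookup. The two-number "feature vectors", the stored values and the function name `recognise` are hypothetical; real models use far richer view representations.

```python
import math

# Hypothetical stored views: (label, feature vector). Values are made up.
STORED_VIEWS = [
    ("horse", (0.9, 0.1)),
    ("horse", (0.7, 0.3)),
    ("chair", (0.1, 0.9)),
    ("chair", (0.2, 0.6)),
]

def recognise(view):
    """Return the label of the stored view closest to the input view."""
    label, _ = min(STORED_VIEWS, key=lambda item: math.dist(item[1], view))
    return label

print(recognise((0.8, 0.2)))   # falls between the two stored horse views
print(recognise((0.15, 0.7)))  # closest to a stored chair view
```

Each object needs several stored views for this to work from different angles, which is why these models are more memory-intensive than a single structural description.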
What is brute-force association?
E.g. “I recognise that as a horse because I have seen a horse on many occasions, and it looks like that.”
What is the main difference between view-dependent and structural description models?
View-dependent models use a weighted approach between layers, rather than a winner-takes-all approach.
Which scientists came up with view-dependent models for object recognition?
Bülthoff & Edelman (1992), Riesenhuber & Poggio (1999).
What are the 3 viewpoints in the viewing sphere? (in order of difficulty)
Interpolation
Extrapolation
Orthogonal axis
Where is interpolation?
Between previous viewpoints
Where is extrapolation?
Beyond previous viewpoints, but in the same axis
Where is the orthogonal axis?
From a completely new viewpoint
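The three conditions can be sketched by describing each view with two angles. The angles used here (training views at 0 and 75 degrees of azimuth, all at 0 degrees elevation) and the function name `view_condition` are assumptions for illustration only.

```python
def view_condition(azimuth, elevation, trained_azimuths, trained_elevation=0):
    """Classify a test view relative to previously learned views.

    Assumes the learned views all lie along one axis: different azimuths
    at a single shared elevation.
    """
    if elevation != trained_elevation:
        return "orthogonal axis"   # rotated about a different axis
    if min(trained_azimuths) <= azimuth <= max(trained_azimuths):
        return "interpolation"     # between previous viewpoints
    return "extrapolation"         # beyond them, but on the same axis

trained = [0, 75]  # hypothetical training views, in degrees
print(view_condition(30, 0, trained))    # interpolation
print(view_condition(120, 0, trained))   # extrapolation
print(view_condition(30, 45, trained))   # orthogonal axis
```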
What are some strengths of view-dependent models?
- Straightforward
- Minimises transformations needed
- Newer models are based directly on what we know of physiology
- Abstract features are recombinable
- Lots of evidence
What are some weaknesses of view-dependent models?
- Humans often show quite good generalisation across viewpoints, even for new objects
- Still more memory intensive than (for example) the geon model
Object recognition must overcome changes in _______ and _______.
Rotation, lighting.
Is face recognition within-class or between-class recognition? Why?
Within-class.
This is because a face must be told apart from extremely similar distractors (other faces).
What is it called when we see faces where there are none?
Pareidolia.
What are the 2 features needed for face recognition?
- A symmetrical noise pattern
- A natural distribution of frequencies
Where in the brain are faces processed?
The FFA (fusiform face area).
What are the two main ideas on facial perception?
- FACES ARE SPECIAL
The domain specificity hypothesis
- We are born with mechanisms solely for facial recognition, operating differently to those for object recognition.
- FACES ARE NOT SPECIAL
The expertise hypothesis
- Face perception simply shows us how general object recognition mechanisms work for objects we are very well-practised at observing.
Domain specificity hypothesis evidence:
What were the findings of the neonatal face discrimination study?
Newborn babies prefer to look at face-like patterns more than non-face-like patterns (Johnson et al., 1991).
…But this might be a broader preference for top-heavy patterns (Simion et al., 2001)
…But babies as young as 1-4 days old seem to be able to tell their mother’s face from that of a stranger (Field et al., 1984).
Domain specificity hypothesis evidence:
What were the findings regarding prosopagnosia?
Some people cannot recognise (exclusively) faces
They often have different gaze patterns (Schwarzer et al., 2007)
This could be acquired through damage to the occipito-temporal regions (e.g. stroke, TBI).
Domain specificity hypothesis evidence:
What were the findings on the Inversion Effect?
We are more attuned to faces that are the correct orientation.
Domain specificity hypothesis evidence:
What were the findings on sensitivity to facial configuration?
Inversion of a face disrupts configural more than featural information.
This is evidence of holistic processing (the inability to attend to just one aspect of the face).
If we change one part of the face, the whole face then looks different.
Domain specificity hypothesis evidence:
What were the findings on the Part-whole effect?
Sub-parts of faces are not independently recognisable (Tanaka & Farah, 1993).
When given the whole face to learn, it was processed holistically.
When given the scrambled face to learn, face specific mechanisms were not activated and the component parts were processed individually.
Domain specificity hypothesis evidence:
What were the findings on the Composite effect?
We can’t help but see a whole face.
The Composite effect slows reaction time for aligned faces, but only when they are upright.
Expertise hypothesis: What were the findings of the effect of (un)familiarity?
We are much better at identifying people whose faces we have already seen than people who are unfamiliar to us (Jenkins et al., 2011).
Expertise hypothesis: What were the findings of the “other race” effect?
People are better at remembering, more accurate at matching and can make finer discriminations amongst faces of their own race than faces of another race.
Dutch celebrity example.
This is more evidence for the importance of practice/familiarity.
Expertise hypothesis: What were the findings about object inversion in experts?
Orientation is more critical in situations where the participant has extensive practice in making subtle object discriminations (Diamond & Carey, 1986).
Participants were tested on dog breeds. Only dog experts were worse at recognising dogs when they were inverted.
Expertise hypothesis: What were the findings on the Part-whole effect in objects?
Parts are often recognised better in their original context, not just faces (Gauthier & Tarr, 2002).
Expertise hypothesis: What were the findings on FFA activation in car experts?
The FFA may simply be an area for expertise.
Most voxels preferred faces. But the amount these voxels were activated to cars was correlated with how expert the person was with cars.