High Level Perception (P&C) Flashcards
What are the problems in object recognition?
Image changes with distance, position, lighting, occlusion of parts etc.
Gestalt Psychology
The whole is more than the sum of parts
‘Grouping principles’ of perceptual organisation (we group things in our mind based on the following criteria)
- Similarity
- Proximity
- Closure
- Good continuation
- Common fate
Define figure ground
- area bounded by contour (closure) is seen as separate object
- contours seen as belonging to one object at a time (we can either see two faces or a vase, but never at the same time)
Define problems behind Gestalt theories of object recognition:
- No mechanisms – only a description of phenomena
- The speculative neurophysiology behind the theory has not held up
- Assumed that perceptual grouping is independent of perceptual learning: there is no evidence for this and it is wrong anyway
Describe the stages of Marr’s model of recognition
- Primal sketch: 2 dimensional representation of luminance - edges, contours (raw primal sketch → full primal sketch)
- 2 1/2 D sketch: description of depth and orientation, shading, texture, motion and binocular disparity (viewpoint dependent)
- 3D model: description of 3D shape of objects (viewpoint invariant)
How does Marr’s model work
→ Analyse image with range of edge filters
→ Use Gestalt grouping (e.g. good continuation) to find the outline
→ Segment outline at concavities
→ Define arrangement of parts (cylinders)
- start with biggest cylinder (principal axis)
- work through progressively smaller cylinders
→ Match description of parts to 3D models in memory
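The first step above (edge filtering) can be sketched in a few lines. This is only a toy illustration: it assumes a made-up luminance grid and a single left-to-right difference filter with a hypothetical threshold, whereas the model itself uses a range of filters at different scales.

```python
# Toy luminance image (hypothetical values): dark on the left, bright on the right.
image = [
    [0.1, 0.1, 0.1, 0.9, 0.9, 0.9],
    [0.1, 0.1, 0.1, 0.9, 0.9, 0.9],
    [0.1, 0.1, 0.1, 0.9, 0.9, 0.9],
]

def horizontal_gradient(img):
    """Difference between each pixel and its right-hand neighbour."""
    return [[row[x + 1] - row[x] for x in range(len(row) - 1)] for row in img]

def edge_map(img, threshold=0.5):
    """Mark positions where luminance changes by more than `threshold` (assumed value)."""
    return [[abs(g) > threshold for g in row] for row in horizontal_gradient(img)]

edges = edge_map(image)
# Columns in the first row where an edge was detected.
edge_columns = [x for x, is_edge in enumerate(edges[0]) if is_edge]
print(edge_columns)
```

The only edge found sits between the third and fourth pixels, where luminance jumps; later stages would group such edge fragments into an outline.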
What does Marr’s model predict?
Marr’s model predicts that the visibility of the principal axis is important and that different orientations are equally easy to recognise
→ BUT, many objects are hard to recognise if upside down or if the principal axis is tilted towards the viewer
Describe Biederman’s Recognition by Components
- Edge extraction: surface characteristics: luminance, texture and colour
- Detect arrangement of edges: curvature, parallel, co-terminating, symmetry, co-linear, such arrangements do not alter with view
- Segment object into components (parts): detect concave parts
- Determine GEON type for each component (part): 36 GEONS are claimed to be enough to build a description of every possible object
→ determine arrangement of GEONS (above/below, bigger/smaller) and match GEON description to memory
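The final matching step can be sketched as a comparison between structural descriptions. The sketch below assumes each object is stored as a set of (geon, relation, geon) triples and scored by overlap; the geon names, relations and the scoring rule are hypothetical choices for illustration, not part of the theory as stated here.

```python
# Hypothetical structural descriptions held in memory.
MEMORY = {
    "mug": {("curved_handle", "side_of", "cylinder")},
    "bucket": {("curved_handle", "top_of", "cylinder")},
    "lamp": {("cone", "top_of", "cylinder")},
}

def match_score(description, stored):
    """Jaccard overlap between two sets of geon relations (assumed scoring rule)."""
    union = description | stored
    return len(description & stored) / len(union) if union else 0.0

def recognise(description):
    """Return the stored object whose description overlaps most with the input."""
    return max(MEMORY, key=lambda name: match_score(description, MEMORY[name]))

seen = {("curved_handle", "top_of", "cylinder")}
print(recognise(seen))
```

The mug and the bucket are built from the same two parts, so only the arrangement (handle on the side versus on top) tells them apart.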
Object processing pathways (in the monkey brain)
Parietal lobe and temporal lobe have different functions in our perception of objects
- object discrimination task: monkey presented with two objects and made to touch the new object over multiple trials (novel object detection) → temporal lobe
- landmark task: two lidded bowls, one with food inside, and an object placed closer to the bowl with food; the monkey must locate where the object is to find the food → parietal lobe
- lesions were then produced in different bits of cortex: with lesions to the temporal lobe, monkeys cannot work out which object is new but they can do the landmark task
- monkeys without the parietal lobe could not do the landmark task but could work out the novel object detection
- double dissociation
- damage to the temporal lobe stops the ability to recognise objects but spares object location; damage to the parietal lobe stops the ability to locate objects in space but spares object recognition
Describe object agnosia
- damage to ventral visual cortex (ventral visual processing) leads to difficulties in recognising objects
- they can see the object (like a carrot) but cannot put the parts of it together and recognise what the object is
- symptoms: no loss of intelligence, failure to recognise objects, no simple visual impairment, can see edges but cannot put them together, may draw object OK but not recognise drawing
Describe Patient DF case study
carbon monoxide poisoning (object agnosia)
→ tested on her ability to see and report the size, shape and orientation of objects: she could not report them, BUT she could guide her hand and fingers towards the object although she could not tell you what it was or how big it was
→ 2 tasks she was tested with
- Matching: holding a disc at the angle of a slot in front of her, DF could not match the angle of the disc with the angle of the slot
- Posting: if you asked her to just post the disc through the slot she could do it (can’t describe the angle but can act on the object)
Describe Titchener circles illusion
- Target circle’s perceived size is influenced by the surrounding array
1. Disks physically the same
2. Disks perceptually the same
the circle surrounded by smaller circles seems larger than the circle surrounded by large circles, but they are the same size
Define optic ataxia
(in dorsal visual processing)
→ difficulty completing visually guided reaching tasks
→ difficulty reaching in right direction
→ difficulty positioning fingers correctly towards an object
→ little relationship between grip aperture and object size
Describe the what and where pathways
Early models assume the purpose of vision is to construct an internal model of reality - foundation for all visually derived thought and action
Later accounts focus more on the requirements of vision - modularity based upon the uses to which vision can be put
Dorsal (where) stream and ventral (what) stream
- Ventral: identification of an object - object centered (shape/size/colour)
- Dorsal: action on an object - viewer centered (shape/surface)
Vision for perception and vision for action
How do we recognise the same object seen in more than one location?
- Lateral Occipital Cortex (LOC) = identity representation
What is LOC role in object location? → location tolerant object information and object tolerant location information
Hierarchical processing of visual information: Hubel & Wiesel’s findings in visual cortex
Simple → complex → hyper complex
Down the temporal lobe:
V1: edges
V2: contours
V4: colour and shape
PIT: simple features
AIT: elaborate features
Patterns processing in Inferotemporal cortex
Cell selectivity:
→ code shape (e.g. star shape) and colour and texture
→ respond to all objects with these properties
→ generalise across position
→ organised in columns through the cortex surface
→ posterior cells are orientation and size specific, anterior cells are less sensitive to orientation and size
so it doesn’t matter what angle the shape is at or how big it is: the visual details of the stimulus become less and less important going down the temporal lobe, because the general understanding of the object is what matters most
Hierarchical model of object recognition
(Riesenhuber & Poggio 1999) view based module
Input image - bent paper clip
Simple cells respond to different orientations
→ Some V1 cells respond to the simple lines
V1 simple cells provide input to complex cells, and these feed into hypercomplex cells
___
→ Anatomically and psychologically plausible - based upon known connections and properties of brain cells from V1 to IT
→ based on earlier hierarchical models
→ Copes with viewpoint dependence and viewpoint independence
→ Incorporates theories of learning
→ Copes with multiple objects and objects in different contexts
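The simple-to-complex step of this hierarchy can be sketched as follows. The sketch assumes a one-dimensional input and uses MAX pooling over position, the pooling operation proposed in Riesenhuber & Poggio’s published model; the template and input values are hypothetical.

```python
def simple_cell_responses(signal, template):
    """Slide one feature template along the input: one simple cell per position."""
    width = len(template)
    return [
        sum(s * t for s, t in zip(signal[i:i + width], template))
        for i in range(len(signal) - width + 1)
    ]

def complex_cell_response(signal, template):
    """Pool simple cells tuned to the same feature at different positions with MAX."""
    return max(simple_cell_responses(signal, template))

edge_template = [-1, 1]          # responds to a dark-to-bright step
edge_left = [0, 1, 1, 1, 1, 1]   # step near the left of the input
edge_right = [0, 0, 0, 0, 1, 1]  # the same step, shifted to the right

print(simple_cell_responses(edge_left, edge_template))
print(simple_cell_responses(edge_right, edge_template))
print(complex_cell_response(edge_left, edge_template),
      complex_cell_response(edge_right, edge_template))
```

Different simple cells fire for the two inputs, but the complex cell gives the same response to both, which is one way tolerance to position can build up across the hierarchy.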
Object recognition and context
context can help recognise objects that you’d otherwise find difficult to understand
Bottom up processing model
- from the bottom: low level feature detectors -> mid level pattern detectors -> high level object detectors
Top down influence model
- from the top: memorised concepts -> high level object detectors -> mid level pattern detectors
Top down and bottom up processing
Information flow is bottom up and top down
Expectations of what something might be lower the recognition threshold for likely items
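One way to picture a lowered threshold is the toy rule below. All numbers and object names are hypothetical, and the fixed threshold drop is an assumption made for illustration rather than a measured quantity.

```python
BASE_THRESHOLD = 0.6  # hypothetical evidence needed to report an unexpected object
EXPECTED_DROP = 0.3   # hypothetical reduction for objects the context makes likely

def recognised(evidence, candidate, expected):
    """Report the candidate if bottom-up evidence reaches its (context-adjusted) threshold."""
    threshold = BASE_THRESHOLD - (EXPECTED_DROP if candidate in expected else 0.0)
    return evidence >= threshold

kitchen = {"loaf", "kettle"}  # objects a kitchen scene makes likely
weak_evidence = 0.4           # same weak bottom-up signal in both cases

print(recognised(weak_evidence, "loaf", kitchen))
print(recognised(weak_evidence, "mailbox", kitchen))
```

The same weak evidence is enough for the expected object but not for the unexpected one, so top-down expectation changes the outcome without changing the input.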
Ascending and Descending information
Anatomy - more connections descend than ascend
→ 10 times more information going back down the visual system from temporal to LGN
Eye -> LGN -> V1 -> V2 -> V4 -> Temporal cortex
Context within scene and objects
- Within scene - recognition is easier with correct context
- Within objects - word biases interpretation
Word superiority effect
detecting a letter is easier when in a word
Expectations: object permanence
- We expect an object moved behind a screen to reappear with the same form when screen removed
- Infants search for hidden objects as they understand they still exist
→ an object becomes hidden behind a screen; infants older than 6 months are surprised if the object is gone when the screen is removed
→ they expect the object to still be there
Object representation
Similarity underlies how objects are represented
- similarity guides classification, naming and behaviour
- What makes objects similar or dissimilar
- If two objects are similar = neural representation similar
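The idea that similar objects have similar neural representations can be sketched by comparing activity patterns. The patterns below are invented numbers, and Euclidean distance is just one assumed choice of dissimilarity measure.

```python
import math

# Hypothetical activity patterns (one number per recording site) for three objects.
patterns = {
    "apple":  [0.9, 0.8, 0.1, 0.2],
    "orange": [0.8, 0.9, 0.2, 0.1],
    "hammer": [0.1, 0.2, 0.9, 0.8],
}

def dissimilarity(a, b):
    """Euclidean distance between two activity patterns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dissimilarity_matrix(pats):
    """Pairwise dissimilarity for every pair of objects."""
    names = list(pats)
    return {(m, n): dissimilarity(pats[m], pats[n]) for m in names for n in names}

rdm = dissimilarity_matrix(patterns)
# Is the apple pattern closer to the orange pattern than to the hammer pattern?
print(rdm[("apple", "orange")] < rdm[("apple", "hammer")])
```

A matrix like this can be built from brain activity and from similarity judgements, and the two can then be compared to see which object properties the neural patterns track.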
What did Cichy et al. investigate?
investigated the link between brain activity and different object properties
→ participants judged the similarity of objects by shape, function, colour and background, and freely without instructions
→ important to characterise the dimensions that relate psychological and neural representations of objects
Describe the findings for object representation research
Perceived similarity of objects related to ventral visual cortex activity
→ Representations emerge within 200ms: object colour first and earlier in hierarchy, object shape next and later
→ Shape, background, free arrangement (same place and time) in ventral temporal cortex
→ Object representations distributed and based upon similarity
How might AI perform object recognition
→ AI should provide us with information that is accurate and meaningful
→ usually: supervised learning - you train the network on known objects
- Algorithm must be suited to task
- Must have a training dataset with appropriate coverage (led to the rise of multi million datasets)
→ convolutional neural networks + 1.2M object database + 2.5M scene database ~ human performance
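Supervised learning on labelled examples can be sketched with a much simpler classifier than a convolutional network. The sketch below swaps in a nearest-centroid rule and a tiny invented training set with two hypothetical features (elongation, roundness); it only illustrates the train-then-predict pattern, not how a deep network works.

```python
def train(examples):
    """Average the feature vectors of each labelled class (nearest-centroid)."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the new feature vector."""
    def sq_dist(centroid):
        return sum((f - c) ** 2 for f, c in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Hypothetical training set: (elongation, roundness) -> label.
training_set = [
    ([0.9, 0.1], "carrot"), ([0.8, 0.2], "carrot"),
    ([0.1, 0.9], "apple"), ([0.2, 0.8], "apple"),
]
model = train(training_set)
print(predict(model, [0.7, 0.3]), predict(model, [0.3, 0.7]))
```

The classifier can only separate objects that the training set covers, which is why dataset coverage matters as much as the algorithm.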