Problem 5 - DONE Flashcards
perception of objects
levels of vision
- low-level vision = extraction of basic features from image
- -> V1 (sensitivity to lines/edges in specific locations)
- middle-level vision = organisation into groups of elements of visual scene
- -> V2 (sensitivity to border ownership, real and illusory contours)
- high-level vision = object recognition and scene understanding
- -> V4 (sensitivity to complex attributes)
perceptual organisation
= process by which elements in environment become perceptually grouped
- goal: create perception of objects
two components:
- grouping = visual events are put together into units/objects
- segregation = separating one area/object from another
structuralism
= distinguished between sensations and perceptions
- -> sensations = elementary processes; occur due to stimulation of the senses
- -> perceptions = more complex, conscious experiences; e.g. our awareness of objects
=> rejected by Gestalt psychology
- apparent movement
- illusory contours
Gestalt psychology
= "Gestalt", roughly translated, means configuration
- how are configurations formed from smaller elements?
apparent movement
= illusion of movement; although movement is perceived, nothing is actually moving
- physically: two images flashing on/off, separated by period of darkness
–> perceptually: we don’t see the darkness, because the perceptual system fills in movement during the period of darkness (connecting both images)
- demonstration against structuralism
–> there are no sensations in the dark space between the flashing images
=> the whole is different than the sum of its parts
illusory contours
= illusion of contour; although contour is perceived (edges create triangle), there actually are no physical edges present
- demonstration against structuralism
–> there are no sensations along contours, there is only white between
=> the whole is different than the sum of its parts
gestalt organising principles
- determine how elements in a scene can become grouped together
- connect bottom-up processing (elements that occur in environment) with top-down processing (our knowledge/memory of what usually belongs together)
- heuristics ≠ laws
- -> likelihood principle = we perceive the object that is most likely to have caused the pattern of stimuli we have received
principle of good continuation
= points that when connected result in straight or smoothly curving lines are seen as belonging together, and the lines tend to be seen in such a way as to follow the smoothest path
- example: objects that are partially covered by other objects are seen as continuing behind the covering object
- also in auditory perception
principle of prägnanz/good figure
= every stimulus pattern is seen in such a way that the resulting structure is as simple as possible
- example: Olympic symbol (we see it as five circles and not as a larger number of more complicated shapes)
principle of similarity
= similar things appear to be grouped together
- -> colour, shape, size, orientation
- also in auditory perception
principle of proximity/nearness
= things that are near each other appear to be grouped together
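A toy sketch of grouping by proximity, assuming dots on a single line and a hypothetical `max_gap` threshold (neither is from the notes). It only illustrates the idea that small gaps bind elements together; it is not a model of the visual system.

```python
def group_by_proximity(positions, max_gap):
    """Group 1D positions: neighbours no further apart than max_gap share a group."""
    groups = []
    for x in sorted(positions):
        # Extend the current group if this dot is close to the previous one.
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)
        else:
            groups.append([x])
    return groups

# Hypothetical dot positions: three clusters separated by larger gaps.
dots = [0, 1, 2, 6, 7, 8, 12, 13]
groups = group_by_proximity(dots, max_gap=1.5)
print(groups)
```

With a larger `max_gap`, all dots would merge into one group, which loosely mirrors how grouping depends on relative rather than absolute spacing.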
principle of common fate
= things that are moving in the same direction appear to be grouped together
- even if objects are dissimilar
principle of common region
= elements that are within same region of space appear to be grouped together
- example, circles within ovals: common region overpowers proximity (by proximity, the nearby circles would be grouped together; instead, circles inside the same oval are perceived as belonging together)
principle of uniform connectedness
= a connected region with the same visual properties is perceived as a single unit
–> same visual properties: lightness, colour, texture, motion
perceptual segregation
= perceptual separation of one object from another
- determine properties of figure and ground
- determine what causes us to perceive one area as figure and other as ground
- -> problem of figure-ground segregation
problem of figure-ground segregation
- figure = the area we see as a separate object; it stands out from its background
- -> more thing-like, memorable, seems to be in front
- -> border ownership (the border is perceived as belonging to the figure)
- ground = background
- -> not memorable
- -> near borders: unformed material, without specific shape
reversible figure-ground
= patterns that can be perceived differently/alternately depending on what we see as figure/ground
what factors determine which area is figure?
- area: lower region of a display
- -> no left-right preference
- symmetry: symmetrical areas/convex regions
- meaningfulness: displays that look familiar to us
- grouping
inference in perception
= use knowledge of physical and semantic regularities to infer what is present in a scene
- retinal ambiguity = pattern of stimulation on retina that can be caused by many different objects
- theory of unconscious inference = some perceptions are the result of unconscious assumptions we make about the environment
- -> likelihood principle
bayesian inference
= perception is combination of current stimulus and our knowledge about conditions of the world
–> what is and is not likely to occur
- statistical technique to quantify inferential perception
–> makes it possible to calculate the probability that the world is in a particular state given a particular observation
two factors:
- prior probability = our initial belief about how probable each outcome (hypothesis) is
- likelihood = how consistent the available evidence is with each outcome (hypothesis)
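A minimal sketch of how the two factors combine, assuming Bayes' rule in its standard form: the probability of each hypothesis given the observation is proportional to its prior probability times the consistency of the evidence with that hypothesis (the likelihood). The hypotheses and all numbers below are made up for illustration.

```python
def posterior(priors, likelihoods):
    """Combine priors and likelihoods via Bayes' rule; returns normalised posteriors."""
    unnormalised = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalised.values())
    return {h: value / total for h, value in unnormalised.items()}

# Hypothetical observation: a dark patch on a surface.
# Priors: how common each cause is assumed to be in the world (invented values).
priors = {"shadow": 0.7, "dark_paint": 0.3}
# Likelihoods: how well the observed patch fits each cause (invented values).
likelihoods = {"shadow": 0.5, "dark_paint": 0.9}

result = posterior(priors, likelihoods)
print(result)
```

In this made-up case the evidence fits "dark_paint" better, but the higher prior for "shadow" still makes it the more probable interpretation.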
object recognition
- "what" pathway/ventral stream
- several problems posed by the environment are easily solved by humans:
- -> inverse projection problem
- -> viewpoint invariance
inverse projection problem
= determining the object responsible for a particular image on the retina
- starting with image on retina –> extending rays out from eye
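A small geometric sketch of why the retinal image is ambiguous, assuming a simplified eye where only the visual angle of an object matters. The function name `visual_angle_deg` and the sizes and distances are hypothetical choices for illustration.

```python
import math

def visual_angle_deg(size, distance):
    """Visual angle (degrees) subtended by an object of a given size at a given distance."""
    return math.degrees(2 * math.atan(size / (2 * distance)))

# Two very different objects (units are arbitrary but must match, e.g. metres):
small_near = visual_angle_deg(size=0.1, distance=1.0)
large_far = visual_angle_deg(size=1.0, distance=10.0)

# Both subtend the same angle, so the image alone cannot tell them apart.
print(small_near, large_far)
```

Any object lying along the same rays extended out from the eye would give the same result, which is the ambiguity the inverse projection problem describes.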
viewpoint invariance
= ability to recognise an object seen from different points of view
–> enables us to tell whether faces seen from different angles are the same
theories of object recognition
- naive template theory
- recognition-by-components model
naive template theory (image descriptions)
= visual system recognises objects by matching the neural representation of the image (low-level features) with a stored representation in memory (template)
- problem: one would need way too many templates
- solution: conceptual match (instead of matching each point to one template)
- -> matching of structural descriptions = description of an object in terms of the basic nature of its parts and the relationship between those parts
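A toy sketch of point-by-point template matching, assuming tiny binary grids as stand-ins for images (the grids and the `match_score` helper are hypothetical). It illustrates why a naive scheme would need a separate template for every position, size, and orientation.

```python
def match_score(image, template):
    """Fraction of cells where image and template agree (point-by-point comparison)."""
    cells = [
        pixel == expected
        for image_row, template_row in zip(image, template)
        for pixel, expected in zip(image_row, template_row)
    ]
    return sum(cells) / len(cells)

# Stored template for the letter "L" (1 = ink, 0 = background).
template_L = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
]
# The same letter shifted one column to the right.
shifted_L = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 1],
]

exact_score = match_score(template_L, template_L)
shifted_score = match_score(shifted_L, template_L)
print(exact_score, shifted_score)
```

The shifted letter is still an "L" to a human observer, yet its score against the stored template drops sharply.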
recognition-by-components model (structural descriptions)
= objects are recognised by identities and relationships of their component parts
- we have a library of basic shapes/geons in our brain
- -> geons (“geometric icons”) = basic set of 3D shapes
- collection of non-accidental features (viewpoint invariance) –> robust to noise/orientation
- problems:
- -> object perception is not completely viewpoint invariant (letter recognition is viewpoint dependent) + cannot distinguish objects that differ in detail (naive template theory)
- -> cannot explain language of object recognition
- two objects can have exactly the same structural description/ visual representation but a different name
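A rough sketch of comparing structural descriptions, assuming each object is written as a set of (part, relation, part) triples. The part names, relation names, and the mug/bucket example are illustrative choices, not a definitive encoding of geons.

```python
from typing import FrozenSet, Tuple

# One relation between two parts: (part, relation, part). Hypothetical encoding.
Relation = Tuple[str, str, str]

def describe(*relations: Relation) -> FrozenSet[Relation]:
    """Build a structural description from part-relation-part triples."""
    return frozenset(relations)

# Same parts, different relation between them.
mug = describe(("curved_handle", "side_of", "cylinder"))
bucket = describe(("curved_handle", "on_top_of", "cylinder"))

# The description contains no viewpoint information,
# so a mug seen from another angle gets the same description.
mug_other_view = describe(("curved_handle", "side_of", "cylinder"))

print(mug == mug_other_view, mug == bucket)
```

Because the description holds only parts and relations, two objects with identical descriptions cannot be told apart this way, whatever they are called.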
levels of object recognition
- entry-level category = label that comes to mind most quickly when we identify an object (e.g. bird)
- subordinate level = more specific name for the object (e.g. eagle)
- superordinate level = more general name for the object (e.g. animal)
un-learn viewpoint invariance
- has to be suppressed in order to learn to read letters
- letters do not stay the same from every point of view (b and d)
- visual word form area: suppresses viewpoint invariance for letter recognition