Perception and action Flashcards by Elena Binns

What was Berkeley’s (1709) insight into why depth perception is ambiguous?

Distance, of itself invisible (Berkeley 1709)
Each ray of light projected onto the retina could have originated from an infinite number of points along its path.
There is no information intrinsic to the image associated with the ray that conveys the distance of its source.
Across a whole set of points projected onto the retina, there is an infinite number of possible objects or scenes from which the image could have arisen.
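The ambiguity can be illustrated with a minimal Python sketch, assuming an idealised pinhole eye. The `project` function, the focal length and the example coordinates are illustrative assumptions, not part of the original notes. Points at very different distances along one ray land on the same image location.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (x, y, z) onto the image plane."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

# One ray through the pinhole, sampled at three very different distances.
ray_direction = (0.2, -0.1, 1.0)
distances = (1.0, 5.0, 50.0)
points_on_ray = [tuple(d * c for c in ray_direction) for d in distances]

images = [project(p) for p in points_on_ray]
for d, (u, v) in zip(distances, images):
    print(f"distance {d:5.1f} -> image point ({u:.3f}, {v:.3f})")
```

Because the distance cancels in the division by `z`, nothing in the image point alone says how far away its source was.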

What evidence existed prior to Gibson regarding direct/indirect perception?

Direct sources of depth and distance, accommodation and vergence, may provide estimates over short distances.
Binocular disparity may provide information about depth but needs to be scaled for viewing distance in order to be interpreted correctly.
It is not obvious how this can be achieved over anything but the nearest parts of the scene.
Our estimates of depth and distance are supplemented by cues and clues which Berkeley suggested arose from our ability to associate them with distance through learning.
These associations then represent heuristics which the visual system relies on to interpret the scene.
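Why disparity needs to be scaled for viewing distance can be sketched in Python using the standard small-angle approximation, in which a depth interval at viewing distance D produces a disparity of roughly I * depth / D^2 for an interocular separation I. The function name and the 6.5 cm interocular separation are assumptions made for illustration.

```python
def depth_from_disparity(disparity_rad, viewing_distance_m, interocular_m=0.065):
    """Depth interval implied by a disparity, under the small-angle approximation."""
    return disparity_rad * viewing_distance_m ** 2 / interocular_m

disparity = 0.001  # the same retinal disparity, in radians
for distance in (0.5, 1.0, 2.0):
    depth = depth_from_disparity(disparity, distance)
    print(f"viewed at {distance} m -> depth interval {depth * 100:.2f} cm")
```

The same disparity corresponds to four times the depth when the viewing distance doubles, so disparity alone cannot be interpreted without an estimate of distance.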

What is the issue of linear perspective?

Difficulties arise with this as the information conveyed by pictorial cues can be ambiguous.
Lines belonging to two completely different objects in a scene, such as the turret and the road in Magritte’s work, can be of identical length and angle and could equally well represent converging lines in the vertical plane or parallel lines receding into the distance in the horizontal plane.

What is meant by indirect perception?

Based on these considerations there is a school of psychology that believes that perception is indirect.
There isn’t sufficient information in the retinal image to specify unambiguously the structure of the scene that gave rise to it.
The visual system must rely on cues and clues, multiple sources of incomplete information, in order to arrive at the best possible interpretation of the scene that is consistent with the information available and our prior knowledge.
This happens through a process called unconscious inference (Helmholtz), hypothesis testing (Gregory), or intelligent thought-like processes (Irving Rock).

What is meant by direct perception?

James Gibson led an alternative school of philosophy that believes perception is direct.
Gibson argued that there is no need to make inferences in perception, as there is more than enough information available to interpret the image, and this information is acquired directly.
He proposed replacing the classical approach with one that emphasises the perception of surfaces in the environment.
Ground consists of surfaces at different distances and slants composed of textural elements
Need an appropriate ecological geometry to describe the environment
It is the structure in the light rather than stimulation by light that furnishes information for visual perception
Rejected the claim that the retinal image is the starting point for visual processing
The whole array of light rays reaching an observer, after structuring by surfaces and objects in the world, provides direct information about the layout of those surfaces and objects and about movement within the world and by the observer.

What is meant by optic flow?

Those studying static ray diagrams tend to overlook one of the richest sources of information, found in the changing image of the moving observer: optic flow.
Optic flow is the change in the pattern of light falling on the retina associated with movement of the observer through the scene or movement of parts of the scene relative to the observer.
It provides powerful cues about depth and distance in the scene as well as the 3D shapes of the objects within it.
The optic array consists of an infinite collection of light rays of different wavelengths and intensities emitted from different sources in the scene. The rays form a hierarchical and overlapping set of solid angles corresponding to the boundaries of objects. Changes in the pattern or properties of the light from one solid angle to another signal boundaries in the world where one object may partially occlude another.
Textured surfaces structure the light reflected off them, so the resulting array could inform an observer about the shapes and orientations of those surfaces.
Variations in the information in the optic array are produced by movements of the observer and by the motion of objects in the world.

What is motion parallax?

Familiar from looking out of the window of a travelling train or car.
As one looks out, the motion vectors are largest nearest the observer, where they correspond to the observer’s speed, and diminish with distance.
The speed and direction of flow are relative to the fixation point.
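A rough Python sketch of this geometry, assuming an observer translating sideways at constant speed, objects directly abeam of the observer, and small angles. Under those assumptions an object's angular velocity relative to the fixated point is approximately v * (1/d - 1/f), where d is the object's distance and f the fixation distance. The function name, the sign convention and the example values are illustrative choices.

```python
def relative_flow(observer_speed, object_distance, fixation_distance):
    """Angular velocity (rad/s) of an object relative to the fixated point.

    Positive values mean the image streams against the direction of travel,
    negative values mean it moves with the observer.
    """
    return observer_speed * (1.0 / object_distance - 1.0 / fixation_distance)

speed = 30.0      # m/s, roughly a train
fixation = 100.0  # m, distance of the fixated object
for distance in (10.0, 50.0, 100.0, 500.0):
    print(distance, round(relative_flow(speed, distance, fixation), 3))
```

Objects nearer than the fixation point stream backwards, fastest when closest, while objects beyond it drift in the direction of travel.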

What do expansion and contraction tell us?

The pattern of optic flow associated with moving forwards and backwards through the environment.
Motion vectors spread outwards when moving forwards or when an object looms towards an observer and they move inwards when moving backwards or the object recedes.
The focus of expansion or contraction provides a cue for the direction of motion.
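The link between heading and the focus of expansion can be sketched in Python, assuming a pinhole eye with unit focal length and pure translation of the observer (no eye or head rotation). The function names and example values are assumptions for illustration only.

```python
def flow_vector(x, y, depth, translation):
    """Image velocity at (x, y) for pure observer translation (tx, ty, tz).

    Pinhole model with unit focal length; tz > 0 means moving forwards.
    """
    tx, ty, tz = translation
    return ((x * tz - tx) / depth, (y * tz - ty) / depth)

def focus_of_expansion(translation):
    """Image point from which the flow radiates (requires tz != 0)."""
    tx, ty, tz = translation
    return (tx / tz, ty / tz)

heading = (0.5, 0.0, 2.0)          # moving forwards and slightly to the right
foe = focus_of_expansion(heading)
print("focus of expansion:", foe)
print("flow at the focus:", flow_vector(foe[0], foe[1], 10.0, heading))
print("flow right of the focus:", flow_vector(0.75, 0.0, 10.0, heading))
```

Reversing the sign of the translation leaves the focus in the same place but flips every vector, turning expansion into contraction.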

What evidence is there from the kinetic depth effect?

Movement can also provide a powerful cue for inferring shape.
When we perceive a pattern of seemingly random lines, setting them in motion enables us to recognise them as the edges of a complex 3D shape projected onto a flat surface.
We can easily recover the third dimension to perceive the original shape.
This underlies our ability to interpret scenery and objects depicted in computer games and movies.

What evidence is there from the stereokinetic effect?

Seemingly flat black and white patterns change shape into a solid hollow cone when rotated
Stereo = solid

What evidence is there from random dot kinematograms?

From setting white dots on a black background in motion we can see that they represent a pair of hollow rotating cylinders, purely from the motion vectors depicted by the dots.
As long as they move we can see the cylinders in 3D; when they stop we can only see the dots on the surfaces of those cylinders projected onto a flat 2D plane.
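How such a display could be generated is sketched below in Python, assuming a single transparent cylinder (rather than a pair), orthographic projection and rotation about the vertical axis. All function names and parameter values are hypothetical. Each frame is just a flat set of 2D points; the 3D structure is carried only by how the points move between frames.

```python
import math
import random

def cylinder_dots(n_dots, height=2.0, seed=0):
    """Random dots on a cylinder surface, stored as (angle, height) pairs."""
    rng = random.Random(seed)
    return [(rng.uniform(0.0, 2.0 * math.pi), rng.uniform(0.0, height))
            for _ in range(n_dots)]

def project_frame(dots, time, angular_speed=1.0, radius=1.0):
    """Orthographic 2D image of the dots after rotation about the vertical axis."""
    return [(radius * math.sin(angle + angular_speed * time), y) for angle, y in dots]

def image_speed(angle, time, angular_speed=1.0, radius=1.0):
    """Horizontal image speed of one dot: fastest at the centre, zero at the edges."""
    return radius * angular_speed * math.cos(angle + angular_speed * time)

dots = cylinder_dots(200)
frames = [project_frame(dots, t / 30.0) for t in range(60)]  # 2 s at 30 frames/s
```

Dots on the near and far surfaces move in opposite directions, and their speed falls to zero at the edges, which is the motion signature the visual system can use to recover the cylinder.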

What can be concluded from demonstrations like random dot kinematograms?

Motion adds so much more information to an image
It appears to be very easy for us to extract the third dimension from motion
We can infer shape and action from relatively sparse sets of motion vectors

What is exterospecific information?

Information such as depth, distance and shape that relates to the layout of the scene and the objects within it.

What is propriospecific information?

Information about our own movements, which is essential for guiding our actions through the environment.

What is expropriospecific information?

The binary distinction between exterospecific and propriospecific information obscures the fact that animals interact with their environment, so they need expropriospecific information to control action adequately.
This is information about the position, orientation and movements of the animal’s body relative to its environment (linked to the idea of affordances).
Lee’s classic swinging room experiments illustrate the distinctions between these types of information and the importance of optic flow over mechanoreceptors and the vestibular system in controlling body sway or balance.

How has knowledge about optic flow changed thought concerning the maintenance of balance?

Maintaining stance and balance is a fundamental motor skill requiring expropriospecific information about the orientation and sway of the body relative to its environment.
Traditionally it was thought that we rely on proprioceptive signals from stretch and pressure receptors in the muscles of the feet and inner ankles, as well as vestibular information from the inner ear.

What evidence is there from David Lee’s (1974) swinging room experiments?

David Lee demonstrated the importance of optic flow using the swinging room: a large box with no floor, suspended from a track on the ceiling so that it could be moved back and forth.
The walls of the room are covered with texture so that when it moves it creates the same pattern of optic flow as would be seen by the observer if they were swaying.
The observer is asked to stand in the box and the box is moved slightly. The observer compensates for their perceived swaying by swaying in the same direction as the box.
An expanding flow field would normally suggest that the observer was tilting forwards, so they compensate by swaying backwards.
By moving the room back and forth, the observers could be induced to sway backward and forward, unconsciously, to such an extent that Lee described them as being hooked like puppets.

What is the trolley variation of Lee’s studies in the 1970s?

Lee asked the observer to push a trolley which was mechanically coupled to the swinging room, so that as the observer walked forward the wall facing them moved away faster than they stepped towards it, and when they walked backwards the wall facing them moved towards them faster than they stepped away from it.
When walking forward the observer reported that they were moving backwards, which is compatible with the optic flow pattern generated by the receding wall, and vice versa.

How does the performance of toddlers differ in Lee’s experiments?

The effect is more pronounced in toddlers, who will typically fall over backwards if the wall moves towards them and forwards if the wall moves away.
Performance is worse in children than in adults up to the age of ten, which suggests that the visual system trains and calibrates the motor system.

How did Redfern and Furman (1994) demonstrate the interdependence of the vestibular and visual systems in balance?

In adults whose vestibular systems of the inner ear have been affected by disease in later life, vision regains the role of controlling postural sway, and they sway more strongly than controls.
This suggests that the visual system trains and calibrates the motor system.
Optic flow is still more influential than the motor system, because all individuals tested, including healthy adults, exhibit the same sway response to the moving wall.
The control of posture involves the integration of mechanical and visual information and adapts to changes caused by growth or sensory loss but is dominated by information from optic flow.

How does vision control action?

The basic function of vision is to obtain information for controlling activity which goes on at an unconscious level.
We must control balance and body sway, which are primarily controlled by vision.
The brain takes visual information more seriously than information from muscles and balance mechanisms.
Toddlers fall over when the walls move even though the ground is completely still.
The optic flow patterns generated by these walls were relatively subtle and the observers were unaware that anything was happening when the walls moved.
If the extent or speed of the room's movement is increased, we feel as if we’re moving.
These experiments pitted optic flow information against proprioceptive and vestibular information and found that optic flow completely dominates in the perception of self-motion.
This also suggests that the dominance of vision could be useful for tuning up or calibrating the other senses.

What is Gibson’s theory?

The purpose of vision is not for producing visual representations or internal images of objects, but rather guiding action.
The starting point is not the retinal image, but the optic array - the spatial pattern of light rays impinging on the eye from all surfaces in the scene.
Movement causes transformations in the optic array that are meaningfully related to changes in the relative positions of light sources, surfaces and the observer and produce optic flow.
The process of vision involves identifying invariants, which are things that remain constant despite the transforming optic array.
E.g. the size of an object changes with distance from an observer but the relative sizes of two objects stay the same.
Perception is direct because the transforming optic array can unambiguously specify the properties of the surrounding world.
There is no need for cognitive processes to interpret the images.
The end point of perception is not internal representations or conscious percepts, but affordances - what, in the way of interaction, an object offers an observer.
E.g. a handle affords graspability.
Affordances make sense in the context of action: action implies movement, and movement implies optic flow and decisions about how to act.
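The relative-size invariant mentioned above can be illustrated with a minimal Python sketch, assuming pinhole projection and two objects standing side by side at the same distance. The function name and the object heights are made up for illustration.

```python
def projected_size(physical_size, distance, focal_length=1.0):
    """Image size of an object under pinhole projection."""
    return focal_length * physical_size / distance

post, person = 3.0, 1.8  # heights in metres, standing side by side
for distance in (5.0, 10.0, 40.0):
    a = projected_size(post, distance)
    b = projected_size(person, distance)
    print(f"{distance:5.1f} m: sizes {a:.3f}, {b:.3f}, ratio {a / b:.3f}")
```

Each image size shrinks as the observer moves away, but the ratio between them is unchanged, which is the kind of constancy an invariant is meant to capture.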

What is Lee’s time-to-contact model?

Judging distance is crucial for 3D perception
Not clear that we are actually capable of judging distance over any but the shortest distances without resorting to potentially unreliable pictorial cues.
Lee argued that if the purpose of perception is to guide action then knowing when to act might be more useful than knowing how far away an object is.
E.g. when driving we apply the brakes based on how soon we are about to hit an object rather than on a set braking distance.
Possible to estimate time to contact directly without having to access distance or to rely on information other than that which is directly available to the visual system
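Time to contact: $\tau = Z(t) / V(t) = r(t) / v(t)$, where $Z(t)$ is the distance of the object, $V(t)$ is its approach speed, $r(t)$ is the retinal eccentricity of a point on its image and $v(t)$ is that point's retinal speed of expansion.
$r(t)$ and $v(t)$ can be measured directly at the retina without having to rely on any information about the distance $Z(t)$.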
As long as we only need to know when to act so as to interact with a moving object, no information need be added to that already available in the retinal image.
The visual system only needs to read out the speed of expansion of the optic flow field at any given eccentricity and look up time to contact.
Can all be implemented using place coding.
If the visual system can obtain τ (tau) from an expanding retinal image then information about time to contact with a surface is directly available to the brain to compute when to time actions.
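The idea that time to contact is available from the image alone can be sketched in Python. This is a simplified illustration, assuming an object approaching head-on at constant speed and a small-angle approximation to image size; the function names, the finite-difference step and the example values are assumptions, not Lee's formulation.

```python
def tau_from_image(theta_now, theta_before, dt):
    """Time to contact from image size and its rate of expansion only."""
    expansion_rate = (theta_now - theta_before) / dt
    return theta_now / expansion_rate

def image_size(object_size, distance):
    """Small-angle approximation to the angular size of an object."""
    return object_size / distance

def simulated_tau(object_size, speed, distance, dt=0.001):
    """Observer's tau estimate for an object approaching at constant speed."""
    theta_before = image_size(object_size, distance + speed * dt)
    theta_now = image_size(object_size, distance)
    return tau_from_image(theta_now, theta_before, dt)

# Same true time to contact (distance / speed = 1 s), different sizes and speeds.
print(simulated_tau(object_size=0.5, speed=10.0, distance=10.0))
print(simulated_tau(object_size=5.0, speed=40.0, distance=40.0))
```

Note that `tau_from_image` never receives the object's size, distance or speed: both estimates come out close to one second because those quantities cancel in the ratio of image size to expansion rate.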

What are real life examples of time to contact?

long-jump

braking

What evidence did Wang and Frost (1992) provide for the neural basis of time to contact?

A neural basis for the required place coding was found by Wang and Frost (1992) in pigeons. In the experiments, pigeons were presented with a graphical simulation on a monitor screen of a solid patterned ball moving in depth. They found a group of cells that were strongly selective for the simulated direction of the ball and fired only when the ball appeared to be on a direct course towards the eye. Each cell fired at a different specific value of the ball's time to contact with the eye, and this value remained constant over changes in both the size and approach speed of the ball. This implies that each cell is tuned to a particular time to contact with an approaching object. These results strongly suggest that τ is computed in a visual pathway at the level of single cells. Combined, this neurophysiological evidence and the behavioural evidence support the hypothesis that information from optic flow expansion guides locomotion by influencing when to act. Information from optic flow therefore enables organisms to interact with their environments.
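The size and speed invariance described here can be mimicked with a toy Python model. This is a hypothetical model cell, not Wang and Frost's analysis: it assumes a ball approaching head-on at constant speed, a small-angle approximation to image size, and a cell that fires once the ratio of image size to expansion rate drops to its preferred value. All names and numbers are illustrative.

```python
def time_left_when_cell_fires(preferred_tau, ball_size, speed,
                              start_distance=50.0, dt=0.001):
    """Time remaining before contact when a tau-tuned model cell first fires.

    The model cell sees only the image size and its expansion rate.
    """
    distance = start_distance
    while distance > 0.0:
        theta = ball_size / distance                    # image size (small angle)
        theta_dot = ball_size * speed / distance ** 2   # rate of expansion
        if theta / theta_dot <= preferred_tau:
            return distance / speed
        distance -= speed * dt
    return 0.0

for size, speed in ((0.1, 5.0), (0.4, 5.0), (0.1, 20.0)):
    print(size, speed, round(time_left_when_cell_fires(1.0, size, speed), 2))
```

A bank of such cells with different preferred values would give a place code for time to contact, in the sense that which cell is firing indicates how long remains.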

What are affordances?

Gibson argues that the end point of perception is not internal representations or conscious percepts but affordances: what, in the way of interaction, an object offers an observer. This seems to imply that the object is in control of the person's actions, although this is not what Gibson meant.

What evidence is there from neuropsychology for affordances?

Milner and Goodale studied two types of patients with visual impairments caused by localised brain damage.
Optic ataxia: damage to the parietal cortex. Patients are able to describe the orientation of a slot in a board held up in front of them but are unable to turn their hand to the correct orientation when asked to poke their hand through the slot.
Visual agnosia: damage to the occipitotemporal cortex. Patients are unable to describe the orientation of the slot but are able to turn their hand to the correct orientation when asked to put it through the slot.
This pairing of complementary deficits associated with distinct cortical areas is a classic double dissociation.
It implies the existence of two separate systems or pathways: one for visual perception and one for visually guided action.
What is intact in visual agnosia but impaired in optic ataxia is something akin to affordance: in optic ataxia the visual system is no longer able to index the action appropriate to the object.

What contention is there over the interpretation of illusions between the two schools of psychology?

The linear perspective cues in Magritte’s painting are ambiguous: we cannot tell whether they represent converging lines in the vertical plane or parallel lines receding into the distance in the horizontal plane.
This is because they are projections of 3D features onto a 2D canvas.
Looking at the actual scene (with one eye shut to discount binocular disparity) and moving your head from side to side resolves this ambiguity easily.
The turret in the vertical plane would slide across, whereas the avenue in the horizontal plane would rotate.
Optic flow resolves the ambiguity: if the head is allowed to move, there is sufficient information available from which to reconstruct the 3D scene.
Many illusions taken as support for indirect perception, such as the Ittelson cards and the Ames room, only work because the observer is required to look through a peephole, which effectively prevents them from moving their head and eliminates optic flow.
One interpretation of illusions like this is that they show how susceptible the visual system is to ambiguity, implying that perception is indirect.
Gibson would counter that they are ecologically invalid because they deliberately deprive the viewer of the essential information that could be picked up directly were they allowed to move their head.
Example: forced perspective shots.

Why is so much of the brain's resources invested in processing binocular disparity?

The only time binocular disparity might be more useful than optic flow is when we keep our heads still.
Gibson would argue this is ecologically invalid, but is it correct to describe it as such?
There may be occasions in real life where an animal needs to keep its head perfectly motionless, to avoid being detected by a predator or giving its position away to prey.
It would still need to perceive the layout of the scene and the things within it.
A well-designed system should be able to appraise the scene whether it is safe to move the head or not.
If that means relying on different sources of information, optic flow or binocular disparity, in different circumstances, then it should.
The concept of ecological validity is irrelevant in this context and should not be used as a reason for disregarding data.

What can we conclude?

The key question is whether there is sufficient information in the retinal image to specify unambiguously the structure of the scene that gave rise to it.
If there isn’t sufficient information, we have to rely on cues and clues and unconscious inferences to reconstruct the scene: perception is indirect.
If there is sufficient information, there is no need to involve cognitive processes to reconstruct the scene: perception is direct.
The difference seems to be whether or not the head is allowed to move.
A well-designed system ought to be able to reconstruct the scene in 3D whether or not head movements are permitted.

Perception and action Flashcards

(30 cards)