cue integration Flashcards by Annie Morsi

Ernst + Banks (2002): LANDMARK

maximum likelihood estimate:

discriminate sizes of two objects presented sequentially (2-IFC task)
could either only be seen, only be felt, or both seen and felt at the same time
visual = random dot stereogram portraying a bar of a given size. varied reliability by adding noise to depth of dots that formed the pattern
haptic = two force-feedback devices (one for index finger and one for thumb)
good study as it employed single cue conditions first, to determine reliability of each estimate (using JNDs)
used variance info to construct MLE and make quantitative predictions
then crossmodal conditions: standard vs comparison stimulus = determine PSE and relative weights of each cue. how does this compare to predictions from model?
relative visual reliabilities = 0.78 for 0% noise (high visual weight so PSE close to visual standard), 0.48 for 133% noise, 0.16 for 200% noise (PSE close to haptic) etc
smooth transition from visual dominance to haptic dominance as noise increases
empirical JNDs matched predicted ones = a stricter test of optimal combination than PSEs alone
visual dominance only occurs when the variance of the visual estimate is lower than the variance of the haptic estimate
first study to make quantitative predictions
first study to show that final estimates have lower variance, i.e. cue integration occurs to improve precision
can replace 2nd step of MWF
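
The MLE prediction described on this card can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function name `mle_combine` and the example numbers are hypothetical, and the single-cue noise is assumed to be Gaussian and independent.

```python
def mle_combine(est_v, sigma_v, est_h, sigma_h):
    """Reliability-weighted average of a visual and a haptic estimate.

    Assumes independent Gaussian noise on each single-cue estimate.
    """
    r_v = 1.0 / sigma_v ** 2          # reliability = inverse variance
    r_h = 1.0 / sigma_h ** 2
    w_v = r_v / (r_v + r_h)           # relative visual weight
    w_h = 1.0 - w_v                   # relative haptic weight
    combined = w_v * est_v + w_h * est_h
    sigma_c = (1.0 / (r_v + r_h)) ** 0.5   # predicted combined SD
    return combined, w_v, sigma_c


# Hypothetical single-cue values (not data from the study)
combined, w_v, sigma_c = mle_combine(est_v=55.0, sigma_v=2.0,
                                     est_h=52.0, sigma_h=4.0)
print(combined, w_v, sigma_c)
```

The predicted combined SD is never larger than the better single-cue SD, which is the variance reduction the card refers to.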

however:

open questions: correspondence problem? level of fusion? prior assumptions? (model is not bayesian)
assumes noise is uncorrelated, but this may not hold e.g. Landy & Kojima (2001). final estimate is still better than either single cue though (Oruc 2003)

Gepshtein (2003)

judged distance between two parallel surfaces
random-element stereograms (parallel, oblique or perpendicular) and force-feedback devices
within-modality experiments: vision alone = precision of estimates varied with surface orientation, while haptic alone = precision didn’t vary with orientation
humans behave like ideal observer i.e. combine visual and haptic info, weighting them as function of orientation, so combined estimate is more precise and approaches statistical optimality
+ reassigns weights as reliability of cues change
= evidence for MLE
good study because it measured component reliabilities separately and used this info to make quantitative predictions about combined percept which can be compared to empirical observations
observed and predicted PSEs were very similar, although observed and predicted JNDs differed consistently. probably due to small amount of correlated noise which MLE model doesn’t take into account

Shams (2000)

vision doesn’t always win during integration: no. of beeps often influenced no. of flashes reported
“auditory capture”
audition is more reliable for temporal judgements

Alais & Burr (2004)

ventriloquism effect
one sense doesn’t simply dominate or “capture” the other as previously thought; instead the effect is explained by a simple model of optimal cue integration of visual and auditory info
model combines visual and auditory cues, weighted by their reliability
e.g. image more blurry = ventriloquism effect weaker (more weight to audition)

Ernst (2004) REVIEW

online determination of variance achieved with population codes
model assumes that noise distributions of sensory estimates are independent, however this is not always the case, e.g. Landy & Kojima (2001) could involve correlated noise estimates. this would reduce reliability of integrated estimates compared to if variances were independent, however there is still a benefit of integration even with correlated noise
correspondence problem: how are cues actually integrated? must be solved before integration can occur. usually if they occur at same spatial location / same time but not if spatial discrepancy too large or temporal sequence not appropriate. what are the limits for this?
Effect of top-down influences such as learning, memory and attention? MLE is bottom-up only. prior determined by gaussian distribution = bayesian model

Hillis (2002)

oddity task: pick out “odd” stimulus when giving 2 examples of standard and 1 comparison
manipulated individual signals independently. harder to detect discrepancies for disparity/texture within vision than it is for visual/haptic (addresses correspondence problem)
cues from same modality are combined so lose single-cue info
cues from same modality fused together to form perceptual metamers (more strongly coupled!). observed metameric behaviour in disparity-texture condition where discrimination performance for combined stimuli is worse than for single cue condition

Hospedales (2009)

MLE models derived from probabilistic perspective but not bayesian about uncertain correspondence between observations = mandatory fusion models that assume observations correspond
vs. causal inference models which also infer causal structure of multisensory observations

Oruc (2003)

even if noise distributions of individual estimates show correlation, there is still benefit to combining sensory info as combined estimate will be more reliable
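
The claim that integration still helps under correlated noise can be checked with the standard minimum-variance linear combination of two correlated estimates. This is a sketch under stated assumptions, not the model fitted in the paper: noise is Gaussian with correlation `rho`, and `combined_variance` is a hypothetical helper name.

```python
def combined_variance(sigma1, sigma2, rho):
    """Variance of the best linear unbiased combination of two estimates
    whose noise has correlation rho (requires a non-zero denominator,
    e.g. not sigma1 == sigma2 with rho == 1)."""
    denom = sigma1 ** 2 + sigma2 ** 2 - 2.0 * rho * sigma1 * sigma2
    return sigma1 ** 2 * sigma2 ** 2 * (1.0 - rho ** 2) / denom


independent = combined_variance(2.0, 4.0, rho=0.0)   # uncorrelated noise
correlated = combined_variance(2.0, 4.0, rho=0.3)    # mildly correlated noise
best_single = min(2.0 ** 2, 4.0 ** 2)                # better single-cue variance
print(independent, correlated, best_single)
```

With these illustrative values the correlated case is worse than the independent case but still better than the best single cue.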

Knill & Saunders (2003)

stereo and texture info integrated/optimally combined to form estimate
weighting of texture info depends on surface slant (more slanted = cue more reliable)
individual differences in subjects’ thresholds for discriminating slant from texture and stereo cues predict subjective cue weightings in each individual

Dekker (2015)

cue integration involves specialised neural circuits that fuse signals from same or different modalities
fMRI study with children ages 6-12, viewed displays with two visual cues (binocular disparity and relative motion)
children >10.5 years = sensory fusion in V3B (visual area thought to integrate depth cues in adult brain)
children <10.5 years = no evidence for sensory fusion in any visual area
also a shift in perceptual performance at age 10-11
brain circuits that fuse cues take a long time to develop (higher-level process)

Gori (2008)

two-alternative forced choice procedure to measure discrimination thresholds: thresholds improved by 30% from 5 to 10 years
also used dual-modality condition where visual and haptic info were in conflict: PSEs in adults or older children were always shifted towards visual cue, but 5 year olds = PSE shifted towards haptic cue for size task (even though it’s more unreliable) and visual info for orientation task
before 8 years old integration of visual and haptic info is far from optimal
either vision or touch dominates totally, even if dominating sense is more unreliable
8-10 years = statistically optimal integration (follows MLE prediction)
during development perceptual systems are constantly being recalibrated, so cross-sensory comparison is crucial. using one sense to calibrate the other is initially more important than combination of the two sources

Petrini (2014)

most developmental studies looking at multisensory integration in children use vision
what happens when visual cues not available? does integration of auditory-haptic cues also develop late? or even at same time?
by 8-10, children can optimally integrate visual-haptic cues
but by same age, children do not show optimal integration for auditory-haptic
suggests optimal integration of non-visual cues for object size discrimination might occur later in life
although 10-11 year olds have lower discrimination thresholds than 7-8 year olds their performance on the bimodal condition does not differ, suggesting the lack of development between these ages is specific to the ability to reduce variability when given both cues

Hillis (2004)

texture and disparity cues
increased slant = weight of texture cue increases as it becomes more reliable
increased distance = disparity weight decreases (texture reliability does not change with distance but it becomes relatively more reliable as disparity becomes less reliable)
pattern of individual differences in disparity and texture estimations, which can be seen in single-cue conditions, carries over to integration
observed PSEs and JNDs were in line with those predicted by MLE, suggesting humans use a statistically optimal strategy for combining disparity and texture info

van Beers (1999)

proprioception and visual info
had to match position of seen right hand (above table) to position of unseen left hand (under table). varied reliability of visual info available using prism lenses
weights used to integrate cues were related to precision of each type of info
results predicted by MLE model

Ernst (2006)

humans integrate cues in statistically optimal fashion, weighted according to individual cue reliability, in order to get most reliable estimate (MLE)
however, MLE assumes all sensory cues undergo uniform mandatory fusion, but some may interact more strongly than others e.g. disparity-texture more strongly coupled than visual-haptic
introduce bayesian coupling prior which determines strength of coupling between cues (uses probability distribution of mappings between signals)
if mapping never changes = cues fused
mapping changes / distribution is spread = cues not fused as tightly but may still interact
no mapping = flat coupling prior so independent cues don’t fuse
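
The three coupling regimes above (fused, partially coupled, independent) fall out of one formula if the coupling prior is taken to be a Gaussian on the difference between the two signals. That Gaussian form, the function name `coupled_estimates` and the numbers are assumptions made for illustration; `sigma_c` is the spread of the coupling prior.

```python
import math


def coupled_estimates(x1, x2, sigma1, sigma2, sigma_c):
    """MAP estimates of two signals given noisy measurements x1, x2 and a
    Gaussian coupling prior on (s1 - s2) with standard deviation sigma_c."""
    total = sigma1 ** 2 + sigma2 ** 2 + sigma_c ** 2
    diff = x2 - x1
    s1 = x1 + (sigma1 ** 2 / total) * diff   # pulled towards x2
    s2 = x2 - (sigma2 ** 2 / total) * diff   # pulled towards x1
    return s1, s2


fused = coupled_estimates(10.0, 14.0, 1.0, 1.0, sigma_c=0.0)         # mapping never changes
partial = coupled_estimates(10.0, 14.0, 1.0, 1.0, sigma_c=2.0)       # spread distribution
separate = coupled_estimates(10.0, 14.0, 1.0, 1.0, sigma_c=math.inf) # flat coupling prior
print(fused, partial, separate)
```

With `sigma_c = 0` both estimates collapse onto the MLE-weighted average; with an infinitely wide prior each estimate stays at its own measurement.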

Foss-Feig (2010)

children with autism report flash-beep illusion more often than controls
suggests multisensory cue integration is impaired (extended temporal binding window)
issues with sensory integration could help explain impaired communication / social interaction etc - don’t combine social cues as normal? although experiment very low-level so hard to generalise

Moore + Fletcher (2012)

cues integrate to form posterior = sense of agency
dopamine dysregulation results in excessive salience placed on external events (Kapur 2003)
could alter weightings applied in cue integration and explain +ve symptoms such as delusions of control or hallucinations

Fetsch (2011)

during visual-vestibular cue conflict task, monkeys reweight individual sensory signals on a trial-by-trial basis depending on cue reliability
firing of neurons in dorsal MST corresponded to reweighting (neural weights in MST depended on stimulus reliability)

Seilheimer (2013)

problem of causal inference i.e. whether two cues should be integrated (only if they come from same source)
accounted for by model of bayesian causal inference which first calculates the probability that two sensory cues have the same underlying cause, before bayesian cue integration takes place
also: little experimental evidence from manipulating bayesian priors to see how they affect cue integration? without an informative prior (i.e. with a flat prior), the bayesian estimate reduces to MLE
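
The first step of bayesian causal inference, computing the probability that two cues share a cause, can be sketched as follows. This assumes a standard Gaussian generative model (Gaussian sensory noise and a Gaussian prior over source position); the function name, parameter names and numbers are hypothetical and not taken from the card.

```python
import math


def prob_common_cause(x1, x2, sigma1, sigma2, sigma_p, mu_p=0.0, p_common=0.5):
    """Posterior probability that measurements x1 and x2 share one source.

    Assumes Gaussian noise (sigma1, sigma2) and a Gaussian prior over
    source position (mean mu_p, SD sigma_p).
    """
    v1, v2, vp = sigma1 ** 2, sigma2 ** 2, sigma_p ** 2
    # Likelihood of both measurements under one shared source
    d = v1 * v2 + v1 * vp + v2 * vp
    q = ((x1 - x2) ** 2 * vp + (x1 - mu_p) ** 2 * v2 + (x2 - mu_p) ** 2 * v1) / d
    like_one = math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(d))
    # Likelihood under two independent sources
    q1 = (x1 - mu_p) ** 2 / (v1 + vp)
    q2 = (x2 - mu_p) ** 2 / (v2 + vp)
    like_two = math.exp(-0.5 * (q1 + q2)) / (
        2.0 * math.pi * math.sqrt((v1 + vp) * (v2 + vp)))
    return like_one * p_common / (like_one * p_common + like_two * (1.0 - p_common))


near = prob_common_cause(0.0, 0.0, sigma1=2.0, sigma2=2.0, sigma_p=10.0)
far = prob_common_cause(0.0, 20.0, sigma1=2.0, sigma2=2.0, sigma_p=10.0)
print(near, far)
```

Coincident cues give a high probability of a common cause and widely separated cues give a probability near zero, so integration would only be applied in the first case.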

Roitman (2002)

neurons in monkey LIP accumulate sensory info until enough collected for decision to be made
could cue integration sometimes work the same way when there are multiple cues?
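
Accumulation to a decision bound can be illustrated with a simple discrete-time accumulator. This is a generic sketch, not the analysis from the paper: the function name, the parameters (`drift`, `noise_sd`, `bound`) and the values are all hypothetical.

```python
import random


def accumulate_to_bound(drift, noise_sd, bound, rng, max_steps=10_000):
    """Add noisy evidence step by step until |evidence| reaches the bound.

    Returns (choice, steps): choice is +1 or -1, or 0 if no bound was hit.
    """
    evidence = 0.0
    for step in range(1, max_steps + 1):
        evidence += drift + rng.gauss(0.0, noise_sd)
        if abs(evidence) >= bound:
            return (1 if evidence > 0 else -1), step
    return 0, max_steps


rng = random.Random(0)
# Noise switched off so the outcome depends only on evidence strength
weak = accumulate_to_bound(drift=0.5, noise_sd=0.0, bound=2.0, rng=rng)
strong = accumulate_to_bound(drift=1.0, noise_sd=0.0, bound=2.0, rng=rng)
print(weak, strong)
```

Stronger evidence reaches the bound in fewer steps; a multi-cue version could, as the card speculates, sum evidence from several cues into the same accumulator.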

Gepshtein (2005)

data only follows MLE prediction / cues only optimally combined when visual and haptic info spatially coincident (correspondence problem)

Weiss (2002)

motion illusions can occur as result of system which tries to estimate local image velocities
standard estimation theory
assumption that slower motions are more likely to occur than faster ones (prior)
model could account for variety of motion illusions / percepts
low contrast grating is less reliable = appears to move more slowly (more reliance on prior)
narrow rhombus with low contrast appears to move diagonally (vs high contrast or fat rhombus which is perceived correctly as moving horizontally)
likelihood estimates less reliable so more reliance on prior = less precise posterior
cannot be fully explained by MLE
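
The effect of the slow-motion prior can be sketched with a Gaussian likelihood and a zero-centred Gaussian prior. The Gaussian forms, the helper name `perceived_speed` and the numbers are assumptions for illustration; lower contrast is modelled simply as a wider likelihood.

```python
def perceived_speed(measured, sigma_like, sigma_prior):
    """Posterior mean speed for a Gaussian likelihood centred on the
    measured speed and a Gaussian prior centred on zero (slow motion)."""
    shrink = sigma_prior ** 2 / (sigma_prior ** 2 + sigma_like ** 2)
    return shrink * measured


high_contrast = perceived_speed(10.0, sigma_like=1.0, sigma_prior=2.0)
low_contrast = perceived_speed(10.0, sigma_like=4.0, sigma_prior=2.0)
print(high_contrast, low_contrast)
```

The less reliable (low-contrast) measurement is pulled further towards zero, which matches the note that low-contrast gratings appear to move more slowly.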

Landy (1995)

MODIFIED WEAK FUSION

promotion: absolute depth cues can promote/disambiguate other cues which are missing a parameter e.g. binocular disparity only absolute when viewing distance specified, whereas motion parallax is absolute as it is specified by retinal velocity
e.g. stereo vision can promote texture and determine object as either convex/concave
robustness: sensory processing is fallible and can provide discrepant cues e.g. shading relies on prior assumption that light comes from above, but if this is not the case then cue will provide incorrect info. system discounts highly inconsistent cue
linearity: dynamic weighting of cues where weight is function of cue reliability
e.g. reduce contrast = texture cue becomes very unreliable, or increase viewing distance = stereo reliability decreases, so that cue gets less weight in the final estimate
linearity relies on ancillary cues: stereo reliability determined by convergence of eyes, motion parallax by self-movement signals from vestibular system = determine weights
only interactions needed for promotion are permitted before full weighted cue combination
weights corresponding to different depth cues should vary from location to location in a scene
= results in robust statistical estimator
evidenced by perturbation analysis 2IFC (cylinder with conflicting texture and motion cues)
overall, MWF very influential
consistent with many results
can make predictions about how weights will change, but not what weights should be (so no quantitative predictions)
also only focused on depth in vision; how is auditory info etc integrated?

Young (1993)

change weights of texture and motion by adding noise

Johnston (1994)

change weights by viewing geometry i.e. downweight stereo by increasing viewing distance

Buckley (1993)

stereo/texture cues: stereo cue given much higher weight with real objects compared to simulated video displays

Ernst (2000)

modify weights through experience and feedback: training with stimuli either consistent with texture or disparity cue affected weighting in later cue estimation

cue integration Flashcards

(27 cards)