cue integration Flashcards
1
Q
Ernst & Banks (2002): LANDMARK
A
maximum likelihood estimate:
- discriminate sizes of two objects presented sequentially (2-IFC task)
- could either only be seen, only be felt, or both seen and felt at the same time
- visual = random dot stereogram portraying a bar of a given size. varied reliability by adding noise to depth of dots that formed the pattern
- haptic = two force-feedback devices (one for index finger and one for thumb)
- good study as it employed single cue conditions first, to determine reliability of each estimate (using JNDs)
- used variance info to construct MLE and make quantitative predictions
- then crossmodal conditions: standard vs comparison stimulus = determine PSE and relative weights of each cue. how does this compare to predictions from model?
- relative visual reliabilities = 0.78 for 0% noise (high visual weight so PSE close to visual standard), 0.48 for 133% noise, 0.16 for 200% noise (PSE close to haptic) etc
- smooth transition from visual dominance to haptic dominance as noise increases
- empirical JNDs match predicted ones = even stronger test of optimal combination than PSEs
- visual dominance only occurs when variance of visual estimate is lower than variance of haptic estimate
- first study that made quant predictions
- first study that showed that final estimates have lower variance than either single-cue estimate, so cue integration improves precision
- can replace 2nd step of MWF (modified weak fusion)
however:
- correspondence problem? level of fusion? prior assumptions? not bayesian
- assumes noise across cues is uncorrelated, but it could be correlated (e.g. Landy & Kojima 2001). combined estimate is still better than either cue alone (Oruc 2003)
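A minimal sketch of the MLE predictions described on this card, assuming independent Gaussian noise on each cue. The function names and the example JND values are hypothetical, not taken from the study. Because the weights depend only on the ratio of variances, single-cue JNDs can be used directly in place of standard deviations, as long as both are measured the same way.

```python
import math

def mle_weights(jnd_v, jnd_h):
    """Relative weights from single-cue JNDs (reliability = 1 / variance)."""
    r_v = 1.0 / jnd_v ** 2   # visual reliability
    r_h = 1.0 / jnd_h ** 2   # haptic reliability
    w_v = r_v / (r_v + r_h)
    return w_v, 1.0 - w_v

def mle_combined_jnd(jnd_v, jnd_h):
    """Predicted two-cue JND; never larger than the better single-cue JND."""
    return math.sqrt((jnd_v ** 2 * jnd_h ** 2) / (jnd_v ** 2 + jnd_h ** 2))

# Hypothetical example: vision twice as precise as touch
w_v, w_h = mle_weights(1.0, 2.0)
print(w_v, w_h, mle_combined_jnd(1.0, 2.0))
```

Raising the visual JND (as adding noise to the stereogram does) lowers w_v, which is the smooth shift from visual to haptic dominance noted above.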
2
Q
Gepshtein (2003)
A
- judged distance between two parallel surfaces
- random-element stereograms (parallel, oblique or perpendicular) and force-feedback devices
- within-modality experiments: vision alone = precision of estimates varied with surface orientation, while haptic alone = precision didn’t vary with orientation
- humans behave like ideal observer i.e. combine visual and haptic info, weighting them as function of orientation, so combined estimate is more precise and approaches statistical optimality
+ reassigns weights as reliability of cues change
= evidence for MLE - good study because it measured component reliabilities separately and used this info to make quantitative predictions about combined percept which can be compared to empirical observations
- observed and predicted PSEs were very similar, although observed and predicted JNDs differed consistently. probably due to small amount of correlated noise which MLE model doesn’t take into account
3
Q
Shams (2000)
A
- vision doesn’t always win during integration: no. of beeps often influenced no. of flashes reported
- “auditory capture”
- audition is more reliable for temporal judgements
4
Q
Alais & Burr (2004)
A
- ventriloquism effect
- one sense doesn’t simply dominate or “capture” the other as previously thought; instead the effect is explained by a simple model of optimal cue integration of visual and auditory info
- model combines visual and auditory cues, weighted by their reliability
- e.g. image more blurry = ventriloquism effect weaker (more weight to audition)
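A rough sketch of how reliability weighting produces both "visual capture" and its reversal, assuming independent Gaussian noise. The function name, locations and sigma values are hypothetical and chosen only for illustration. Visual blur is modelled as a larger visual sigma.

```python
def perceived_location(x_v, x_a, sigma_v, sigma_a):
    """Reliability-weighted average of visual and auditory location estimates."""
    w_v = sigma_a ** 2 / (sigma_v ** 2 + sigma_a ** 2)  # visual weight
    return w_v * x_v + (1.0 - w_v) * x_a

# Visual stimulus at 0 deg, sound at 5 deg, auditory sigma fixed at 4 deg.
# As blur (sigma_v) grows, the percept moves from the visual location
# toward the auditory one.
for sigma_v in (1.0, 4.0, 16.0):
    print(sigma_v, perceived_location(0.0, 5.0, sigma_v, 4.0))
```

With equal sigmas the percept sits halfway between the two locations, so neither sense "captures" the other.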
5
Q
Ernst (2004) REVIEW
A
- online determination of variance achieved with population codes
- model assumes that noise distributions of sensory estimates are independent, however this is not always the case, e.g. Landy & Kojima (2001) could involve correlated noise. this would reduce reliability of integrated estimates compared to independent noise, however there is still a benefit of integration even with correlated noise
- correspondence problem: how are cues actually integrated? must be solved before integration can occur. usually if they occur at same spatial location / same time but not if spatial discrepancy too large or temporal sequence not appropriate. what are the limits for this?
- Effect of top-down influences such as learning, memory and attention? MLE is bottom-up only. prior determined by gaussian distribution = bayesian model
6
Q
Hillis (2002)
A
- oddity task: pick out “odd” stimulus when given 2 examples of standard and 1 comparison
- manipulated individual signals independently. harder to detect discrepancies for disparity/texture within vision than it is for visual/haptic (addresses correspondence problem)
- cues from same modality are combined so lose single-cue info
- cues from same modality fused together to form perceptual metamers (more strongly coupled!). observed metameric behaviour in disparity-texture condition where discrimination performance for combined stimuli is worse than for single cue condition
7
Q
Hospedales (2009)
A
- MLE models are derived from a probabilistic perspective but are not bayesian about the uncertain correspondence between observations = mandatory fusion models that assume observations correspond
- vs. causal inference models which also infer causal structure of multisensory observations
8
Q
Oruc (2003)
A
- even if noise distributions of individual estimates show correlation, there is still benefit to combining sensory info as combined estimate will be more reliable
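A minimal sketch of the point on this card, using the standard minimum-variance linear combination of two unbiased estimates with correlated Gaussian noise. This is a textbook formula used here for illustration, not necessarily the exact model in the paper. The function name and example values are hypothetical, and it assumes |rho| < 1.

```python
def combine_correlated(sigma_1, sigma_2, rho):
    """Optimal weight on cue 1 and variance of the combined estimate.

    rho is the correlation between the two cues' noise (assumed |rho| < 1).
    """
    cov = rho * sigma_1 * sigma_2
    denom = sigma_1 ** 2 + sigma_2 ** 2 - 2.0 * cov
    w_1 = (sigma_2 ** 2 - cov) / denom
    var = sigma_1 ** 2 * sigma_2 ** 2 * (1.0 - rho ** 2) / denom
    return w_1, var

# Hypothetical cues with sigma 1 and 2: compare independent vs correlated noise
print(combine_correlated(1.0, 2.0, 0.0))
print(combine_correlated(1.0, 2.0, 0.3))
```

With rho = 0 this reduces to the usual MLE rule. With positive correlation the combined variance is higher than in the independent case but does not exceed the variance of the better single cue.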
9
Q
Knill & Saunders (2003)
A
- stereo and texture info integrated/optimally combined to form estimate
- weighting of texture info depends on surface slant (more slanted = cue more reliable)
- individual differences in subjects’ thresholds for discriminating slant from texture and stereo cues predict subjective cue weightings in each individual
10
Q
Dekker (2015)
A
- cue integration involves specialised neural circuits that fuse signals from same or different modalities
- fMRI study with children ages 6-12, viewed displays with two visual cues (binocular disparity and relative motion)
- children >10.5 years = sensory fusion in V3B (visual area thought to integrate depth cues in adult brain)
- children <10.5 years = no evidence for sensory fusion in any visual area
- also a shift in perceptual performance at age 10-11
- brain circuits that fuse cues take a long time to develop (higher-level process)
11
Q
Gori (2008)
A
- two-alternative forced choice procedure to measure discrimination thresholds: thresholds improved by 30% from 5 to 10 years
- also used dual-modality condition where visual and haptic info were in conflict: PSEs in adults or older children were always shifted towards visual cue, but 5 year olds = PSE shifted towards haptic cue for size task (even though it’s more unreliable) and visual info for orientation task
- before 8 years old integration of visual and haptic info is far from optimal
- either vision or touch dominates totally, even if dominating sense is more unreliable
- 8-10 years = statistically optimal integration (follows MLE prediction)
- during development perceptual systems are constantly being recalibrated, so cross-sensory comparison is crucial. using one sense to calibrate the other is initially more important than combination of the two sources
12
Q
Petrini (2014)
A
- most developmental studies looking at multisensory integration in children use vision
- what happens when visual cues not available? does integration of auditory-haptic cues also develop late? or even at same time?
- by 8-10, children can optimally integrate visual-haptic cues
- but by same age, children do not show optimal integration for auditory-haptic
- suggests optimal integration of non-visual cues for object size discrimination might occur later in life
- although 10-11 year olds have lower discrimination thresholds than 7-8 year olds their performance on the bimodal condition does not differ, suggesting the lack of development between these ages is specific to the ability to reduce variability when given both cues
13
Q
Hillis (2004)
A
- texture and disparity cues
- increased slant = weight of texture cue increases as it becomes more reliable
- increased distance = disparity weight decreases (texture reliability does not change with distance but it becomes relatively more reliable as disparity becomes less reliable)
- pattern of individual differences in disparity and texture estimations, which can be seen in single-cue conditions, carries over to integration
- observed PSEs and JNDs were in line with those predicted by MLE, suggesting humans use a statistically optimal strategy for combining disparity and texture info
14
Q
van Beers (1999)
A
- proprioception and visual info
- had to match position of seen right hand (above table) to position of unseen left hand (under table). varied reliability of visual info available using prism lenses
- weights used to integrate cues were related to precision of each type of info
- results predicted by MLE model
15
Q
Ernst (2006)
A
- humans integrate cues in statistically optimal fashion, weighted according to individual cue reliability, in order to get most reliable estimate (MLE)
- however, MLE assumes all sensory cues undergo uniform mandatory fusion, but some may interact more strongly than others e.g. disparity-texture more strongly coupled than visual-haptic
- introduce bayesian coupling prior which determines strength of coupling between cues (uses probability distribution of mappings between signals)
- if mapping never changes = cues fused
- mapping changes / distribution is spread = cues not fused as tightly but may still interact
- no mapping = flat coupling prior so independent cues don’t fuse
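A simplified sketch of the coupling-prior idea, under the assumption that the likelihoods are Gaussian and the prior is a zero-mean Gaussian on the difference between the two signals, with spread sigma_c. The function name, parameter names and values are hypothetical. It returns the most probable (MAP) value of each signal given two measurements.

```python
def couple(x_1, x_2, sigma_1, sigma_2, sigma_c):
    """MAP estimates of two signals under a Gaussian coupling prior.

    sigma_c = 0     -> mapping never changes, cues fully fused (MLE estimate)
    sigma_c large   -> nearly flat prior, estimates stay independent
    """
    total = sigma_1 ** 2 + sigma_2 ** 2 + sigma_c ** 2
    s_1 = x_1 + (sigma_1 ** 2 / total) * (x_2 - x_1)
    s_2 = x_2 + (sigma_2 ** 2 / total) * (x_1 - x_2)
    return s_1, s_2

# Conflicting measurements 0 and 10; cue 1 is the more reliable one
for sigma_c in (0.0, 2.0, 1e6):
    print(sigma_c, couple(0.0, 10.0, 1.0, 2.0, sigma_c))
```

Intermediate values of sigma_c pull each estimate toward the other without making them identical, which corresponds to cues that interact but are not fully fused.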