Reading 3 Flashcards
computational models of visual word recognition
- dual route model
- interactive activation model
- PDP model
central issues of computational models of visual word recognition
- how is print converted to sound and meaning
- symbolic knowledge vs statistical learning
- how is correct word selected from similar alternatives
- how is lexical knowledge learnt
dual route model (Coltheart, 1978)
computational implementation: dual route cascade model (DRC)
two routes from print to sound:
- orthographic/lexical route
- grapheme-phoneme/non-lexical route
- -> pronunciation most frequently associated with each grapheme, used to predict regular pronunciation of nonwords
NEED two routes to explain:
(1) Exception words: eg colonel, pint, head
- -> this should use orthographic route
(2) Nonwords: eg slint, pead
- -> this should use grapheme-phoneme route
both routes operate in parallel, response determined by their combined influence on phoneme units in response buffer
lexical route operates more quickly, esp. for high frequency words; for low frequency words the lexical route completes more slowly, so the output of the non-lexical route can compete with it at the response buffer
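A minimal sketch of the two routes, assuming a toy lexicon and a toy rule table (both hypothetical, with made-up pseudo-phonetic spellings). It is not the DRC implementation, only an illustration of why exception words need the lexical route and nonwords need the grapheme-phoneme route:

```python
# Lexical route: whole-word lookup (hypothetical entries and phonetic codes).
LEXICON = {
    "pint": "p-ai-n-t",
    "mint": "m-i-n-t",
    "head": "h-e-d",
    "bead": "b-ee-d",
}

# Non-lexical route: most frequent pronunciation of each grapheme (hypothetical).
GPC_RULES = {
    "ea": "ee",
    "p": "p", "i": "i", "n": "n", "t": "t", "m": "m",
    "h": "h", "d": "d", "b": "b", "s": "s", "l": "l",
}


def lexical_route(letters):
    """Return the stored pronunciation, or None for unknown strings (nonwords)."""
    return LEXICON.get(letters)


def gpc_route(letters):
    """Assemble a pronunciation grapheme by grapheme, trying two-letter graphemes first."""
    phonemes = []
    i = 0
    while i < len(letters):
        two = letters[i:i + 2]
        if len(two) == 2 and two in GPC_RULES:
            phonemes.append(GPC_RULES[two])
            i += 2
        else:
            phonemes.append(GPC_RULES.get(letters[i], "?"))
            i += 1
    return "-".join(phonemes)


def name_aloud(letters):
    """Run both routes and flag a conflict when they disagree."""
    lexical = lexical_route(letters)
    rule_based = gpc_route(letters)
    return {
        "lexical": lexical,
        "gpc": rule_based,
        "conflict": lexical is not None and lexical != rule_based,
    }


for item in ["mint", "pint", "head", "slint", "pead"]:
    print(item, name_aloud(item))
```

In this sketch the routes disagree only for exception words (pint, head), which is where the competition at the response buffer would arise; nonwords (slint, pead) get a pronunciation from the rule route alone.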
critical evidence for DRC model
- regularity x frequency interaction
high frequency word: no difference in RT for retrieving regular vs irregular words
low frequency word: slower in retrieving irregular word
- cognitive neuropsychology: acquired dyslexia
-phonological dyslexia:
words>nonwords
lexical route intact, GPC route damaged
-surface dyslexia:
nonwords>words
lexical route damaged, GPC route intact
- ->double dissociation
- ->independent systems
Interactive Activation (and Competition) Model (McClelland & Rumelhart, predecessor of TRACE)
• Hierarchical layers of interconnected nodes
(feature -> letter -> word)
- Parallel, interactive activation
- activated nodes send positive and negative activation to nodes at higher and lower levels
- Identification occurs when activation in a node exceeds ‘threshold’
- Threshold depends on frequency: less evidence needed for common words
- Lateral inhibition within levels
- Competition between activated nodes to select best matching word
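A rough sketch of competition at the word level only, assuming a hypothetical three-word lexicon, made-up parameter values and made-up thresholds (lower for the more common word). It leaves out the feature level and top-down feedback, so it is an illustration of lateral inhibition plus frequency-dependent thresholds rather than the published model:

```python
# Hypothetical identification thresholds: common words need less evidence.
THRESHOLDS = {"work": 0.70, "word": 0.80, "cork": 0.85}


def letter_match(word, stimulus):
    """Proportion of letter positions where the word matches the stimulus."""
    return sum(w == s for w, s in zip(word, stimulus)) / len(word)


def run_ia(stimulus, thresholds=THRESHOLDS, steps=30, excite=0.2, inhibit=0.1):
    """Return (identified word, number of cycles), or (None, steps) if nothing wins."""
    act = {word: 0.0 for word in thresholds}
    for step in range(1, steps + 1):
        total = sum(max(a, 0.0) for a in act.values())
        new_act = {}
        for word, a in act.items():
            bottom_up = excite * letter_match(word, stimulus)  # support from letters
            rivals = total - max(a, 0.0)                       # lateral inhibition
            new_act[word] = min(1.0, max(-0.2, a + bottom_up - inhibit * rivals))
        act = new_act
        over = [w for w in act if act[w] >= thresholds[w]]
        if over:
            return max(over, key=act.get), step
    return None, steps


fast = run_ia("work")
slow = run_ia("work", thresholds={**THRESHOLDS, "work": 0.90})
print(fast, slow)
```

Raising the threshold for "work" (treating it as a rarer word) makes the same stimulus take more cycles to identify, which is the sketch's analogue of the word frequency effect.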
benchmark phenomena
- word frequency effects:
- -> identification threshold lower for common words
- word superiority effect:
- -> present a stimulus briefly, then present two alternatives for the last letter
- -> letters in words receive top-down support from word nodes, so they are identified better than when the stimulus is a nonsense string
- pseudoword effect:
- -> wordlike nonwords activate nodes for similar words
- semantic priming:
- -> active word nodes activate their semantic features at concept level
- -> top-down effects from concept level
can IA model simulate regularity effects
Full model assumes that spoken and written words activate same word nodes
Positive and negative connections between letters and phonemes
➔ phonological influences on visual word recognition
➔ ‘regularity effects’ due to consistency of pronunciation of letters/graphemes NOT non-lexical rules
rules vs statistics
regularity vs consistency
regularity: regular according to grapheme-phoneme rules e.g. ea in bean, a in came
consistency: statistically consistent, same letter combination pronounced the same way in many similar words, e.g. all in fall
DRC model compared regular consistent words with irregular inconsistent words, confounds regularity and consistency
Andrews (1982): no significant effect of regularity
significant effects of frequency and consistency, AND a frequency x consistency interaction: consistency only affects low frequency words
DRC cannot explain why LF regular consistent (bean) and regular inconsistent (bead) should be different: both follow GPC rules
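One way to make "consistency" concrete is as the proportion of words sharing a spelling body that also share its pronunciation. The sketch below assumes a tiny hypothetical lexicon with made-up pseudo-phonetic codes, so the numbers are illustrative and not real corpus statistics:

```python
# Hypothetical toy lexicon: word -> (orthographic body, pronunciation of that body).
TOY_LEXICON = {
    "bean": ("ean", "een"),
    "lean": ("ean", "een"),
    "mean": ("ean", "een"),
    "bead": ("ead", "eed"),
    "plead": ("ead", "eed"),
    "knead": ("ead", "eed"),
    "head": ("ead", "ed"),
    "dead": ("ead", "ed"),
}


def consistency(word, lexicon=TOY_LEXICON):
    """Share of words with the same body that pronounce it the same way as `word`."""
    body, sound = lexicon[word]
    neighbours = [s for b, s in lexicon.values() if b == body]
    return neighbours.count(sound) / len(neighbours)


for item in ["bean", "bead", "head"]:
    print(item, round(consistency(item), 2))
```

Here bean and bead both follow the GPC rule for "ea", yet bead scores lower because its body is also pronounced differently in head and dead; that graded difference is what a purely rule-based account does not capture.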
interactive activation vs PDP
both connectionist models
interactive activation:
•‘Symbolic’ nodes for letters, words
•Computational implementation: ‘hardwired’ lexical knowledge
•Does not explain how the ‘nodes’ are learned
PDP:
•No hardwired knowledge
•Learn “distributed” representations
Parallel distributed processing (PDP) models
orthography - phonology - semantics
Connectionist (“neural”) networks
• set of interconnected processing nodes
• a “propagation rule” for spreading activation through the network
Learning in PDP networks
• learning algorithms (e.g., delta rule, back propagation)
• Multi-level architecture: ‘hidden [internal] units’ facilitate learning of complex [non-linear] associations
• Extract statistical regularities between Orthography, Phonology, semantics
Memory structure is an emergent property of the distributed dynamic processing
assumptions of PDP models
Spreading activation
• Based on analogy with neural firing.
• Each input to a unit has a level of activation and a weight on its connection – positive or negative
• Net input to a unit is sum of (activations) x (weights)
– $\mathrm{net}_i = \sum_j w_{ij} a_j$ (sum over the inputs $j$ to unit $i$)
– Output is transformed by a (non-linear) transfer function
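A small sketch of the net input and transfer step for a single unit. The logistic function is assumed here as one common choice of non-linear transfer function; the notes do not say which one is used, and the example weights and activations are made up:

```python
import math


def net_input(weights, activations):
    """Sum of (activation x weight) over all inputs to the unit."""
    return sum(w * a for w, a in zip(weights, activations))


def logistic(net):
    """Squash the net input into the range 0..1 (assumed transfer function)."""
    return 1.0 / (1.0 + math.exp(-net))


weights = [0.5, -1.0, 2.0]      # positive and negative connection weights
activations = [1.0, 1.0, 0.5]   # activation levels of the sending units
net = net_input(weights, activations)
print(net, logistic(net))
```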
back propagation learning rules
Error correction learning
• connection weights initially random
• compute output for particular input pattern
• compare with desired output
- -> adjust weights (a little bit) to reduce error
REPEAT
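The error-correction loop can be sketched with the delta rule on a single logistic unit. This is a simplification: back propagation extends the same idea to networks with hidden units. The training set (logical OR), learning rate, number of epochs and random seed are all assumptions chosen for illustration:

```python
import math
import random


def logistic(net):
    return 1.0 / (1.0 + math.exp(-net))


def predict(weights, inputs):
    """Output of the unit; the last weight acts on a constant bias input of 1."""
    x = list(inputs) + [1.0]
    return logistic(sum(w * a for w, a in zip(weights, x)))


def train_delta(patterns, epochs=2000, rate=0.5, seed=0):
    rng = random.Random(seed)
    n_inputs = len(patterns[0][0])
    # connection weights initially random
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    history = []
    for _ in range(epochs):                       # REPEAT
        sse = 0.0
        for inputs, target in patterns:
            out = predict(weights, inputs)        # compute output for this input
            error = target - out                  # compare with desired output
            sse += error ** 2
            delta = error * out * (1.0 - out)     # error scaled by logistic slope
            x = list(inputs) + [1.0]
            # adjust each weight a little bit in the direction that reduces error
            weights = [w + rate * delta * a for w, a in zip(weights, x)]
        history.append(sse)
    return weights, history


OR_PATTERNS = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, history = train_delta(OR_PATTERNS)
print(history[0], history[-1])
```

The summed squared error recorded in `history` should fall across epochs as the weights settle.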
PDP model of lexical access
- PDP models provide an explanation of how knowledge is acquired (error correction learning eg back propagation) that is lacking from ‘symbolic’ models like dual route and IA model
- PDP model consistent with knowledge of neural mechanisms
- “Regularity effects” due to inconsistent O-P associations, as in IA model, NOT separate lexical and rule systems
➔ all knowledge represented in associative networks NOT discrete rules
evidence for PDP model
- words with inconsistent O-P mappings are more difficult to learn
- -> yield weaker O-P connections
- -> PDP network shows graded (LF vs HF) effects of consistency of pronunciation
e.g. buoy > bear > bead > bean
- semantic information contributes to resolving pronunciation of low frequency inconsistent words
- -> high imageability LF inconsistent words suffer less in humans
- -> also simulated in PDP model
how does PDP explain surface dyslexia
PDP model trained without semantic network has specific difficulty learning low frequency exception words
➔ Semantic information helps to resolve inconsistencies eg bead, tread, pint
(how do you remember pint is pint if you don’t know what it means?)
➔semantic impairment selectively disrupts identification of inconsistent words
➔ Semantic dementia
DRC vs PDP models
percentage of GPC regular pronunciations: lanch 92.5%, pouth 69.6%, jind 83%, toup 10.4%
neither model accurately simulates human data
- DRC predicts regular pronunciations for nearly all items, underestimates influence of consistency
- PDP overestimates probability of irregular pronunciations eg lome, plove, bost
–> need to combine
Harm & Seidenberg (2004): Need to model children’s developmental trajectory
70000 learning trials weighted by frequency, 10% phonological, 10% semantic, 40% comprehension (phonology -> semantics), 40% production (semantics -> phonology)
simulates how children learn: phonological and semantic representations are learned first, forming the semantics-phonology network first
orthographic associations are then bolted on to the semantics-phonology network structure
model's nonword pronunciations are more consistent with human naming data
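As a quick worked check of the training mix, the stated percentages can be turned into trial counts. This is only arithmetic on the figures given in these notes (70000 trials; 10/10/40/40 split); the variable and function names are hypothetical:

```python
TOTAL_TRIALS = 70000

# Percentages of trials per task, as listed in the notes.
TASK_PERCENT = {
    "phonological": 10,
    "semantic": 10,
    "comprehension (phonology -> semantics)": 40,
    "production (semantics -> phonology)": 40,
}


def trials_per_task(total, percents):
    """Integer number of trials for each task type."""
    return {task: total * pct // 100 for task, pct in percents.items()}


counts = trials_per_task(TOTAL_TRIALS, TASK_PERCENT)
for task, n in counts.items():
    print(task, n)
```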
Which model is best?
- Behavioural (and neuroimaging) data for healthy subjects favour connectionist models (i.e. IA, PDP) BUT neuropsychological data show the double dissociation predicted by dual route model
- PDP models CAN simulate selective disruption of word vs nonword reading but evidence debated
- Questions can be raised about clarity of acquired syndromes
BUT is evidence from acquired dyslexias relevant to reading development/dyslexia? cf neuroconstructivism (Karmiloff-Smith)
➔Modularity may emerge from experience
- Models designed to answer different questions
• Dual route and IA models are ‘hard-wired’: programmer ‘stipulates’ the content and form of lexical knowledge
• PDP models investigate how knowledge is acquired: How complex structures emerge from simple neurally plausible principles
- Both models simulate an independent lexical module that provides the input for comprehension processes
cognitive neuroscience approaches
Neuroimaging methods
eg Positron Emission Tomography (PET)
Functional magnetic resonance imaging (fMRI)
• Measure changes in blood flow/oxygen
➔ link to cognitive activity is inferential
• Spatially precise BUT poor time resolution: PET~ 60 secs; fMRI: ~ 40-100 ms
• Use “subtraction methods” to isolate brain regions involved in cognitive processes
Pugh et al 2001 hierarchical fMRI
hierarchically structured tasks varying demands on phonological decoding and analysis
- visual-spatial processing
- -> line orientation matching
- orthographic processing
- -> letter case matching bbBb-bbBb
- simple phonological analysis
- -> simple letter rhyme p vs v
- phonological assembly
- -> non-word rhyme eg leat and jete
- lexical-semantic processing
- -> category judgement eg corn and rice from same category
use subtraction procedures to isolate cognitive processes
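The subtraction logic can be sketched as a region-by-region difference between two tasks that are assumed to differ in only one process. The activation values below are invented purely for illustration and are not data from the study:

```python
# Hypothetical mean activation per region in two tasks (arbitrary units).
RHYME_TASK = {"inferior frontal": 1.5, "temporo-parietal": 1.25, "occipito-temporal": 1.0}
CASE_TASK = {"inferior frontal": 0.5, "temporo-parietal": 0.75, "occipito-temporal": 1.0}


def subtract(task, baseline):
    """Activation in `task` minus activation in `baseline`, region by region."""
    return {region: task[region] - baseline[region] for region in task}


difference = subtract(RHYME_TASK, CASE_TASK)
for region, value in difference.items():
    print(region, value)
```

Regions with a large positive difference are the ones attributed to the extra process (here, phonological processing in rhyme relative to case judgement); a region equally active in both tasks drops out.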
Pugh et al 2001 findings
difference between rhyme and case judgement task: phonological processing
(grey/white = R>C, black = C>R)
Reading disabled show more activation in rhyme relative to case judgement task in frontal areas, and show more bilateral activation than non-impaired readers (left dominant)
Rather than simply isolating “hot spots” at which groups differ, need to consider “relations among distinct brain regions that function co-operatively…during reading”
difference in brain activation between reading-disabled and non-impaired
1. “Anterior”: Broca’s area/inferior frontal gyrus
•More active for NW than words
•More active in poor readers
•Function: output phonology
2. “Temporo-parietal (dorsal)”: Wernicke’s area/inferior parietal
•More active for NW than words
•Late response
•More active in good readers
•Function: O-P-S integration
3. “Occipito-temporal (ventral)”: extrastriate/occipito-temporal
•More active for words than NW in all tasks
•Early response
•More active in good readers
•Function: visual word form “gateway” system
Pugh et al 2001 conclusions
Critical assumptions
•Specialised module for printed words: occipito-temporal region
•Compensatory processes: anterior region
Normal reading development:
• Initially depends on temporo-parietal circuit to learn to decode and integrate phonology, orthography and lexical-semantics
• With practice/experience, occipito-temporal system (originally responsible for rapid visual and auditory integration) becomes “entrained” to linguistic structures in temporo-parietal system
➔fast automatic “word form system”: ‘the brain’s letterbox’
Reading disability:
- impairments in development of temporo-parietal system impede establishment of adequate linkages between O-P-S and development of visual word form system
- Compensatory strategies include increased reliance on frontal processes and increased use of right hemisphere “homologues”
comprehension processes vs expert lexicon in skilled reading
comprehension processes
•Shared with spoken language
•Influenced by general reasoning abilities
•Depend on efficiency of working memory
•General across languages
expert lexicon
•Precise lexical codes that integrate orthography and phonology
•Autonomous modular processes
•Reading-specific skill
•Language-specific
•Best indexed in English by spelling
Hulme et al (2012) training study
- 152 five-year-old children selected for low verbal ability
- 20 weeks’ training in letter knowledge and phoneme awareness in book reading compared with training in speaking/listening skills
- Benefits of training were caused by changes in phoneme awareness and letter-sound knowledge
Suggate (2010) meta-analysis
85 studies of the effects of reading interventions
Phonics = Strong focus on explicit instruction in grapheme-phoneme correspondence and phonemic awareness
Other = Meaning focus, limited phonemic awareness, mixed
before grade 2: phonics stronger intervention effect
after grade 2: other intervention better
fundamental cause of reading disability
“phonological core”: essential to link orthography with phonology and meaning
Failure to acquire these processes has reverberating consequences: “Matthew effects” (Stanovich)
– academic achievement, motivation, self-esteem etc
– vocabulary growth
– sequential, logical processing
- the problems become more general BUT may arise from the same fundamental causes
- Preventatively oriented instruction is essential to avoid reverberating consequences of early failure