Lecture 8 - Categorical Perception and Learning Flashcards
Statistical Learning
Through mere exposure, we seem to learn
what kinds of things go with other kinds of things.
We do learn contingencies over time,
so the line between associative and non-associative learning starts to blur.
Through perceptual learning, we seem to BUILD and STORE
specific stimulus distinctions.
• These stimulus features can be used to identify and
categorize different types of things.
- Once established, these feature categories become the basis for top-down perceptual processing (e.g. recognizing feathers on male and female chicks).
- We have a storehouse of different objects and can filter that down to the environment.
Example: Perceiving breaks between words
The segmentation problem
How do you find the breaks when there's always a continuous signal?
- there are no physical breaks in the continuous acoustic signal of speech. [High computational complexity]
– Top-down processing, including knowledge a listener has about a language,
affects perception of the incoming speech stimulus (parse the speech as it’s coming in).
– Segmentation is affected by context, meaning, and our knowledge of word structure.
non-associative learning
helps us see how we respond to and distinguish stimuli in the environment
perceptual learning - we become better and better at telling things apart
associative learning
find different contingencies (between two different stimuli: classical; between response and outcome: operant)
just building contingencies between two things (learning language)
What kind of learning reviewed so far seems specifically
useful for speech segmentation?
Statistical learning
helps us know when the breaks are coming: knowing the probabilities of when certain syllables tend to follow other syllables
Saffran, Aslin & Newport (1996)
demonstrated that
infants can detect word boundaries from differences in
transitional probabilities between syllables. [innate tendency]
- we have the innate ability to track different contingencies
• A continuous stream of sounds becomes segmented.
…bidakupadotigolabubidakutupiro…
…bidaku/padoti/golabu/bidaku/tupiro…
• And this should apply to natural speech.
…lookattheprettybaby…
…look/at/the/pretty/baby…
High likelihood PRE –> TTY
High likelihood BA –> BY
Low likelihood TTY –> BA
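The idea of tracking transitional probabilities can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: the word order in `order`, the two-letter syllable split, and the 0.5 threshold are all assumptions made for the example, and the function names are hypothetical.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Estimate P(next | current) = count(current, next) / count(current)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])  # the last syllable has no successor
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def segment(syllables, tps, threshold):
    """Place a word boundary wherever the transitional probability dips below threshold."""
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tps[(prev, nxt)] < threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words

# Assumed word order for the demo: each word is followed by varying words.
order = ["bidaku", "padoti", "golabu", "bidaku", "tupiro", "padoti", "bidaku",
         "golabu", "tupiro", "golabu", "padoti", "tupiro", "bidaku"]
stream = "".join(order)  # continuous stream with no breaks

# Assumption: every syllable is exactly two letters (consonant + vowel).
syllables = [stream[i:i + 2] for i in range(0, len(stream), 2)]

tps = transitional_probabilities(syllables)
words = segment(syllables, tps, threshold=0.5)  # 0.5 is an arbitrary cutoff
print("/".join(words))
```

In this toy stream, syllable pairs inside a word (such as bi –> da) always occur together, while pairs that span a word boundary (such as ku –> pa) occur only some of the time, so cutting at the low-probability transitions recovers the original words.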
Perceiving features
In order to track probabilities, we need to first distinguish
basic features (e.g. syllables) of the stimulus.
have to be able to ID syllables and be able to build those categories up
Some feature detection seems to be innate.
constraints
• Frogs have ‘bug’ detectors: a group of cells that detects the size, shape, and movement pattern of bugs and induces the frog to flick out its tongue (Lettvin et al., 1959).
• The visual system has simple and complex edge detectors (for straight lines and edges); these occur as early as you can train the system
(Hubel & Wiesel, 1959, 1962).
• Babies have phonetic discrimination for all language
sounds up to 10 months of age.
But all of these feature detectors seem to be shaped by both experience and ‘top-down’ influences.
we have all these innate abilities to detect things in the environment, but we can shape them with top-down knowledge
- experience dependent plasticity
- Critical periods (e.g. phonetic discrimination)
- Mere exposure and discrimination training
- we can form many many different types of representations
How do we (as babies) initially discriminate the different
phonemes (speech sounds) that make up syllables?
Acoustic speech waveform –> Phonemes: [d] [da] [di] [du] –> Words: Don, dean, dune
babies can make discrimination from the acoustic signals that make up syllables
we pull out phonemes (smallest perceived sound from a sound signal)
phonemes can be attached to vowel sounds to create a syllable, and those syllables create words
Sound spectrograms
are often used to show changes in frequency
and intensity for speech.
– These are plotted by frequency (and amplitude) over time.
– Formants are the enhanced
(darker) bands of frequencies.
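To make "frequency (and amplitude) over time" concrete, here is a minimal sketch that builds a tiny spectrogram by hand. It is not real speech: the signal is a synthetic tone that steps from 500 Hz to 1500 Hz, and the sample rate, frame length, and helper names are assumptions chosen so the arithmetic is easy to follow.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed)
FRAME = 64          # samples per analysis frame (assumed), so 125 Hz per bin

def tone(freq_hz, n_samples):
    """Synthesize a pure sine tone (a stand-in for one band of energy)."""
    return [math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE) for t in range(n_samples)]

def frame_magnitudes(frame):
    """Amplitude at each frequency bin for one frame (plain discrete Fourier transform)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]

def spectrogram(signal):
    """One row per time frame, one column per frequency bin."""
    return [
        frame_magnitudes(signal[i:i + FRAME])
        for i in range(0, len(signal) - FRAME + 1, FRAME)
    ]

# A crude change in frequency over time: 500 Hz, then 1500 Hz.
signal = tone(500, 256) + tone(1500, 256)
spec = spectrogram(signal)

# The strongest bin in each frame plays the role of the darkest band in a plot.
peak_hz = [max(range(len(m)), key=m.__getitem__) * SAMPLE_RATE / FRAME for m in spec]
print(peak_hz)
```

Each row of `spec` is one slice of time, and the largest value in a row marks the frequency with the most energy, which is what shows up as a dark band on a plotted spectrogram. Real speech has several such bands at once and the bands glide rather than jump.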
Consonants
are produced by a constriction of
the vocal tract (using the articulators).
Formant transitions
rapid changes in frequency preceding or following
consonants as you’re producing a sound
(e.g. when you produce a “duh” or “buh”)
This results in production of the basic unit of
speech sound – the phone.
phone
speech signal
the basic unit of
speech sound