Ling 290 Midterm 2 Flashcards
When is the first stage of speech development complete? What does it consist of?
Age 3-4; longer SLVT (children have lower resonances than infants) and epiglottis can’t articulate with velum
When does larynx descent begin?
3 months
Describe puberty in boys (age and effects)
8-15 years; 2 major changes
1- SLVT: larynx descends further which doesn’t affect pitch but “filter” changes give lower resonances
2- laryngeal: vocal folds become up to 60% longer and become thicker (gaining mass) so pitch drops 1 octave
Describe puberty in girls (age and effects)
8-15 years; vocal cords become longer and gain some mass, pitch becomes slightly lower
Changes gradual and not very noticeable
Describe speech perception in a fetus
2 mos- ears begin to develop
6 mos- ears developed (including inner ear)
Speech/sound perception also requires brain development
What are uterine sounds?
Mother’s heartbeat, breathing, digestion
How old is speech language therapy?
About 70 years old
Name six causes of speech disorders/inadequacies
Congenital malformation
Diseases
Accident/injury
Surgery
Behaviour related
Idiopathic (unknown cause)
What does ALS stand for
Amyotrophic lateral sclerosis
Fancy name for stroke
Cerebrovascular accident
Four categories of disorders and describe
1- voice disorders (generally w/ laryngeal function)
2- articulation disorders (w/ teeth, tongue, lips producing sound)
3- phonologic disorders (systematic disorders that affect groups of sounds ex: voiceless sounds)
4- fluency disorders (w/ flow, etc)
Name this fluency disorder and describe it
Stuttering (UK=stammering)
Includes involuntary repetition, prolongation or cessation of sound
Describe fluent speech
Smooth, comfortable tempo; appropriate pausing (unfilled and filled); few false starts or repetitions
Difference between filled or unfilled pause
Filled=filled with filler words or sounds such as um, ya know, etc
Unfilled=silence
What factors can differentiate a person’s fluency at different times?
Fatigue
Drinking
Emotion
Etc
What is cluttering?
Rapid or irregular speaking rate with long breaks and spurts of speech; poorly-planned utterances and speaker often unaware of impairment
What disorder is 3-4x more common in boys?
Stuttering
What’s the percentage of children who can recover from stuttering by age 16 with therapy or spontaneously?
80%
Etiology of stuttering?
The prevailing view is that it’s a neurophysiological dysfunction that disrupts the precise timing of speech; likely genetic, as it runs in families and is more common in identical twins
Why is stuttering hard to diagnose?
Bc it’s hard to distinguish between stuttering as a normal developmental dysfluency and as a pathological one (repetition of whole words vs parts of words/prolongation of sounds)
Describe treatment of stuttering
Aimed at abnormal speech behaviour and emotional problems of the stutterer; timed syllabic speech (even stress), shadowing of therapist, delayed auditory feedback, Edinburgh masker, STAR therapy
What is an Edinburgh masker?
Device placed against the throat, with “stethoscope”-style earpieces that play masking noise so the stutterer can’t hear themselves talk
What is STAR therapy?
Therapy for stuttering, stands for structuring, targeting, adjustment and regulation
What is dysarthria? 5 characteristics
Motoric dysfunction that impairs speech by making
- sounds repeated/longer
- breathy voice
- strained voice
- audible inspiration
- variable rate with short rushes of speech
Etiology of dysarthria?
Head injury, cerebral palsy, neurological disorders, CVA (stroke)
Describe Broca’s aphasia
A non-fluent aphasia, difficulty producing speech, may utter short phrases or words
-function words often omitted
What was Lenneberg’s hypothesis about language acquisition?
“Critical period” is up to the onset of puberty (WRONG)
When do infants lose the ability to distinguish between nonnative sounds?
6-12 months; reflects language-specific experience
Shift from language-general to language-specific perception happens at what age?
Between 6-12 mos
Describe the PAM (perceptual assimilation model)
Best tested whether infants and adults could distinguish nonnative sounds that also didn’t fit into any English phonemic category; both adults and infants could do it
What information do vowels carry? (4 things)
- speaker’s identity
- emotional tone
- pragmatic context
- phonemic info
Effects of experience begin earlier for vowels or consonants?
Vowels
For vowel distinguishing, which age group performed better?
6-8 mos better than 10-12 mos
How does consonant perception differ from vowel perception?
Vowel perception is organized differently, effects of experience may not be as pronounced, and effects of experience begin earlier
Phonetics
The scientific study of speech sounds
Extra linguistic
Sounds that don’t fit into the linguistic systems of consonants, vowels, etc (important to keep in mind that sounds that are extralinguistic in English could be part of systems in other languages)
What does a scientific study consist of for phoneticians and what is it based on?
Attempt a comprehensive, systematic, and objective account of the speech data they are describing.
Descriptions based on auditory impressions of the phoneticians, but later translated to internationally agreed system of symbolization/Articulatory labels
Now using instruments to investigate speech as well
Pro and con of phoneticians using instrumentation to measure speech?
Con=uncertainties in results bc interpreting info is not always straightforward
Pro=eliminate subjective influence on interpretation of data
Three aspects of spoken communication in phonetics
Articulatory phonetics- use of vocal organs to produce sounds
Acoustic phonetics- study of sound waves of speech
Auditory phonetics- study of reception of speech sounds by the hearer
Perceptual phonetics
Branch of auditory phonetics looking at how the brain sorts out and interprets incoming signals decoded by the auditory system
Neurophonetics
Looks at phonetic plans that are created and implemented neurologically in order for a spoken message to take place
Clinical phonetics
Application of scientific study and description of speech sounds to speech sounds of people with speech problems
Velopharyngeal inadequacy
Incomplete formation of the palate (cleft palate) and other disorders that affect the ability to close off the nasal cavity from the oral cavity during speech
Effects of voice disorders
Not being able to distinguish between voiced and voiceless
Pitch differences (intonation patterns)
Could change voice quality (breathy, creaky, harsh quality thru using ventricular folds as well or instead of vocal cords)
Difficulty to transcribe their speech
Four main types of child speech disorders
Delayed normal
Consistent deviant
Inconsistent deviant
Developmental verbal dyspraxia (DVD)
What aspect of speech is not dealt with as thoroughly in phonetics courses?
Suprasegmentals
What group does aphasia belong to and what is it?
Acquired neurological disorders; acquired language disorder (has many subtypes)
Apraxia
Can be paired with aphasia; not due to problems with nervous system or muscles of speech, but with voluntary control over the system
Ex: can say a greeting at one time but later can’t
Diff between pre and post lingually deaf speech
Prelingually deaf- never had auditory feedback system
Post lingually deaf- witness gradual erosion of accuracy of speech over time but alternative feedback routes (ex: vision) can help, tendency for fully open vowels to become more open and vice versa
both
Non labials subject to deletion or replacement by glottal stop
Difficulties with fricatives
Simplification of clusters
Speech synthesis
Automatic generation of speech using linguistically salient acoustic or articulatory properties, or spoken units that are selected and controlled using computational commands
What two inventions improved upon the vocoder synthesizer
Parametric (or formant) synthesizers get control info by analyzing relevant acoustic parameters in speech or by rules operating on a character string (ex: text)
LPC synthesis uses representation of speech signals as set of coefficients that try to predict the signal from past values in the time domain (good with resonance and pitch movements but fails in natural voice quality bc invariant nature of glottal pulses)
What brought much improved naturalness and intelligibility to synthetic speech after the 1990s?
Widespread use of variable-length speech unit selection approach (selecting from a database of human sounds)
Mastery of phonemes is described how and when is mastery usually achieved
Percentage of correct production of sounds; going by 75% as mastery (meaning they’re pronounced right at least 75% of the time), most children will have achieved this by 7-8 years old
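A minimal sketch of how the percent-correct criterion could be computed. The function names and the sample attempts are hypothetical, and the 75% default simply mirrors the criterion named on the card.

```python
def percent_correct(attempts):
    """Return the percentage of correct productions in a list of booleans."""
    if not attempts:
        raise ValueError("need at least one attempt")
    return 100.0 * sum(attempts) / len(attempts)


def is_mastered(attempts, criterion=75.0):
    """A sound counts as mastered when percent correct meets the criterion."""
    return percent_correct(attempts) >= criterion


# Hypothetical scoring of one child's attempts at /s/: True = produced correctly.
s_attempts = [True, True, False, True, True, True, False, True]
print(percent_correct(s_attempts))  # 75.0
print(is_mastered(s_attempts))      # True
```

Raising or lowering `criterion` shifts the age at which a sound counts as mastered, which is one reason published mastery ages vary.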
Why should we be cautious when looking at averages of when children master phonemes?
Differences among language and dialect must be taken into account and there is a great variability in ages
Is it easy to state the age when speech development stops?
No because speech undergoes gradual adjustments all the time
What does phonemic mastery rely on?
Perception of acoustic cues for phonemic recognition
Control over the muscles of the speech production system
Application of phonological regularities of the language
WHICH DEPEND ON
Maturation of nervous system/larynx/vocal tract
Describe the complexity of speech
Person can produce speech at 7-8 syllables per second (20+ phonemes per second)
Each phoneme has its own spatiotemporal characteristics
Involves more motor fibres and temporal precision than any other motor activity
Describe prenatal speech development
Hearing at 5 mos, fetus hears maternal sounds
Describe speech development at birth
Can discriminate all sounds of all languages
Transitions to breathing to support life
Vocalization begins
Describe speech development at birth to 1 mo
Vegetative/reflexive stage of phonemic development; vocalizations=fussing, crying, belches, hiccups, cooing
Describe speech development at 2-3 mos
Cooing stage: simple vocalizations made mostly of vowels but sometimes with limited consonants
Describe speech development at 4-5 mos
Expansion stage: vocal tract looks more adult
Phonemic repertoire increases
Describe speech development at 6-10 mos
Babbling stage: sequences of syllables like ba ba ba
Syllables more reliably structured
Prosodic patterns similar to adult speech
Vocalizations reflect 1st language
Describe speech development at 11-18 mos
Auditory discrimination of speech is tuned to ambient language and children may lose discrimination of some contrasts from other non native languages
Phonetic inventory growing but limited
Describe speech development at 19-24 mos
By age of 2, children have 10-20 consonants and sufficient phonetic ability to learn many new words
Establishing phonological principles that will guide further lexical acquisition
Describe speech development at 25-36 months
More growth in phonetic inventory + vocab/syntax
Stuttering first noticed at this age
Describe speech development at 3-4 years
Almost all vowels mastered with many consonants
Describe speech development at 4-6 years
Almost at phonetic mastery except for FRICATIVE noises
Describe speech development at 6-9 years
Phonemic mastery usually complete, refinements in speech production continue
Describe speech development at 9+ years
Speech development complete but developmental changes can be seen (voice change in puberty for example)
Dennis Fry- what is Homo loquens?
Talking animal
Name the 4 main divisions of the tongue
Tip
Blade
Dorsum
Root
Significance of fetus hearing mother’s voice in womb
Creates auditory bias toward mother’s voice and the dominant language used in the home
7 places of articulation according to Kent
Bilabial
Labio-dental
Lingua-dental
Lingua-alveolar
Lingua-palatal
Lingua-velar
Glottal
What causes the first vocalization at birth? Is it always crying?
Lungs filling with air for the first time; not always crying, sometimes cooing etc
Difference between neonate and adult vocal tract
Neonate: higher larynx
Tongue almost fills oral cavity
No teeth
Vocal tract on gentle angle from larynx to lips
Interstructural coupling
Ex: muscular interaction between lips and jaw
Did speech movements come from early movements for eating/swallowing?
Recent research indicates that no, they develop separately and have their own patterns of interstructural coupling
Could come from vegetative behaviours that essentially use the same musculature
What creates the new speech capabilities at 4 months?
Adjustments of articulatory system (tongue jaw lips) to produce more complex sound patterns + voice energy from larynx increases repertoire to resemble adult speech
Most common sounds heard around 4 mos (during remodelling of vocal tract anatomy)
Vowel sounds in bid, bed, bug
Consonant sounds: glottal and lingua-velar stops
When is the bridge between simple vocalizations of early infancy and complex sound patterns that lead to words?
6-10 mos
What happens to the vowel/consonant ratio in the 6-10 month period?
More consonants develop
Most prominent sounds in babbling? Which one has a universal appearance in infant vocalizations?
Bilabial and lingua-alveolar; bilabial= universal
When do phonemic differences between languages start to strongly influence vocalization?
After 6-7 mos
Three names for type of babbling like “ba ba ba”
Canonical babbling, multi-syllabic babbling, repetitive babbling
Babbling is a major stride towards speech in 2 ways
1- marks a connection between audition and motor control (as in child’s hearing status + phonetic patterns in ambient language)
2- signals motor accomplishment in that infant can produce reliable and regular syllable patterns that will lead to word formation
3 vowel and consonant positions most common during second half-year of life
Vowel: central, mid front, low front
Consonant: voiced stops, fricatives, glide
Consonants usually in syllable-initial position (3 types)
Voiced stops (b d g)
Nasals (m n)
Fricative (h)
Consonants usually in syllable-final position (4 sounds)
T H M S
Consonant inventory at 8-18 mos?
6
When does consonant inventory go up from the 6 at 8-18 mos?
18-22 mos with inventory of 10-20 consonants
Which grows faster: inventory of syllable-initial consonants or syllable-final?
Syllable-initial (voiced)
What is the phonological rule that expresses systematic relationship between child’s production and the adult target
X → Y / Z (X is replaced by or realized as Y in the environment Z)
Final consonant devoicing
Consonants in word-final positions are produced as voiceless rather than voiced ex: pick for pig
Cluster reduction
Consonant clusters are reduced to simpler forms such as single consonants ex: tick for stick
Stopping
Fricative consonants are replaced by stops ex: tip for sip
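The three processes above can be sketched as rewrite rules over toy transcriptions. This is a simplified illustration, not a full phonological model: each character stands for one phoneme, and the mapping tables and function names are assumptions made for the example.

```python
import re

# Assumed (partial) mapping tables for the illustration.
VOICED_TO_VOICELESS = {"b": "p", "d": "t", "g": "k", "z": "s", "v": "f"}
FRICATIVE_TO_STOP = {"s": "t", "z": "d", "f": "p", "v": "b"}


def final_devoicing(word):
    """voiced consonant -> voiceless / __ # (word-final position)."""
    if word and word[-1] in VOICED_TO_VOICELESS:
        return word[:-1] + VOICED_TO_VOICELESS[word[-1]]
    return word


def cluster_reduction(word):
    """s -> (deleted) / # __ stop (word-initial s + stop cluster)."""
    return re.sub(r"^s(?=[ptk])", "", word)


def stopping(word):
    """fricative -> stop (applied everywhere in this toy version)."""
    return "".join(FRICATIVE_TO_STOP.get(ch, ch) for ch in word)


print(final_devoicing("pɪg"))    # pɪk  ("pick" for "pig")
print(cluster_reduction("stɪk"))  # tɪk  ("tick" for "stick")
print(stopping("sɪp"))            # tɪp  ("tip" for "sip")
```

Each function encodes one X → Y / Z statement: the dictionary gives X and Y, and the position check or regex lookahead gives the environment Z.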
What sounds do 25-36 mos have trouble with still?
Fricatives (s and z)
Liquids (r and l)
Describe vowel mastery at 3-4 years old
Almost mastered all vowels except rhotic (r-coloured) vowel in the word bird
How does losing teeth affect speech around 6 years?
Temporary period of fricative misarticulation because of loss of central incisors
Why is it harder for children to speak than adults?
Children need to build higher air pressure below vocal folds to be as loud as an adult
Slower speaking rates
Less precise in spatiotemporal patterns of articulation
Mutation
Also known as adolescent voice change
Happens between 10-12 years old
Change in frequency more pronounced in males
When were synthesizers first unveiled?
1939 (New York World’s Fair)
How was vocoder synthesis improved upon over the 50 years after the first synthesizer was unveiled?
Parametric synthesis
Linear predictive coding (LPC) synthesis
What do parametric (formant) synthesizers do?
Get control info by analyzing relevant acoustic parameters in speech or by rules operating on a character string (ex: text)
What does LPC (linear predictive coding) synthesis do?
Uses representation of the speech signal as a set of coefficients that try to predict the signal from past values in the time domain
Accounts for resonance/pitch movements but fails in natural voice quality because of invariant nature of glottal pulses
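A minimal sketch of the "predict the signal from past values" idea. It fits the coefficients by plain least squares on a synthetic tone; this is an assumption chosen for clarity and not necessarily the estimation method any particular LPC synthesizer uses. All function names are hypothetical.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augment the matrix so row operations update both sides together.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


def lpc_coefficients(signal, order):
    """Least-squares a[k] such that signal[n] ~ sum(a[k] * signal[n-1-k])."""
    gram = [[0.0] * order for _ in range(order)]
    cross = [0.0] * order
    for n in range(order, len(signal)):
        past = [signal[n - 1 - k] for k in range(order)]
        for i in range(order):
            cross[i] += past[i] * signal[n]
            for j in range(order):
                gram[i][j] += past[i] * past[j]
    # Normal equations: gram * a = cross.
    return solve_linear_system(gram, cross)


def predict(signal, coeffs, n):
    """Predict sample n from the samples just before it."""
    return sum(a * signal[n - 1 - k] for k, a in enumerate(coeffs))


# Synthetic test signal: a pure cosine, which two coefficients predict exactly.
omega = 2 * math.pi * 0.05
tone = [math.cos(omega * n) for n in range(200)]
coeffs = lpc_coefficients(tone, order=2)
print(coeffs)  # close to [2*cos(omega), -1]
```

Real speech needs more coefficients (one pair per resonance is a common rule of thumb), and what the prediction leaves over has to be supplied by an excitation signal such as glottal pulses.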
TTS systems most widely used application of speech synthesis in: (2 things)
Information technology
Aids for the blind/people with other disabilities
5 modules of TTS systems
Text normalization
Morphological analysis/parsing
Lexicons
Grapheme to phoneme rules
Phonological rules
“Alex”? Describe
Most natural synthetic speech voice in American English; found on Apple computers
Can handle large amounts of users or workload without strain
Describe Nuance
Over 2 decades, came to dominate the American English speech technology market by connecting with or acquiring other TTS providers
Now supplies 2 TTS systems: RealSpeak and Vocalizer 5
Names of Nuance’s two TTS systems
RealSpeak and Vocalizer 5
Describe Tellme’s synthetic voice
Zira; allows spoken query and TTS response for stocks, sports, news, etc
Used for general text output but sounds unnatural and prosody absent
Cereproc
Scottish-based; accents American, Scottish, Black Country, southern British English
European Acapela Group
TTS in 25 languages that use 50 standard voices or user’s own voice
Loquendo
TTS for all major European languages, Australian English, some minority Italian languages, and variety of South American dialects
Ivona
TTS in Romanian, Polish, American English and two British English voices developed with the Royal National Institute of Blind People (UK)
Neospeech
American English, Japanese, Mandarin Chinese, Korean, Latin-American Spanish
People who depend on speech synthesis
The blind, physically impaired, repetitive stress injuries, dyslexics, autistic children, people with language impairments
Why are Apple computers good for blind or low vision people?
Has VoiceOver that offers more than TTS; uses speech to describe status of the comp and actions and activities as they occur
Provides additional keyboard commands/other input options, like Braille display input
Integrated into Mac operating system
What is Baldi?
A “talking head” created by Massaro and his colleagues; realistic visual speech and can read aloud web pages etc
Users control rate of speaking, facial expression and emotion
No wifi necessary
Who benefits from visible speech?
Deaf-oral children
Autistic children
Students of English as second or foreign language, reading-disabled (including dyslexics), first language learners
What is Vocabulary Wizard and what are its results?
It’s a talking agent that helps deaf kids who can hear a bit from helpers to improve speech production and perception and increase their vocabulary
Text normalization
Transforms text to make it consistent; it is performed before text is processed to generate synthesized speech
Makes correct interpretation of lower/upper case letters, removing punctuation and diacritics from letters and expanding abbreviations
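A minimal sketch of the normalization steps listed above (case folding, abbreviation expansion, diacritic and punctuation removal). The abbreviation table is a tiny hypothetical sample; a real TTS front end would use far larger lexicons and context-sensitive rules.

```python
import re
import unicodedata

# Hypothetical abbreviation table for the illustration.
ABBREVIATIONS = {"dr.": "doctor", "mr.": "mister", "vs.": "versus"}


def normalize_text(text):
    """Lower-case, expand abbreviations, strip diacritics and punctuation."""
    tokens = []
    for token in text.split():
        lowered = token.lower()
        # Expand abbreviations before punctuation is removed,
        # because the trailing period is part of the lookup key.
        lowered = ABBREVIATIONS.get(lowered, lowered)
        # Decompose accented letters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFD", lowered)
        no_marks = "".join(
            ch for ch in decomposed if unicodedata.category(ch) != "Mn"
        )
        # Remove anything that is not a letter, digit, or underscore.
        cleaned = re.sub(r"[^\w\s]", "", no_marks)
        if cleaned:
            tokens.append(cleaned)
    return " ".join(tokens)


print(normalize_text("Dr. Müller visited the café, OK?"))
# doctor muller visited the cafe ok
```

The order of steps matters: stripping punctuation first would turn "Dr." into "dr" and the abbreviation lookup would miss it.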
What was created to help synthetic voices sound natural?
Loquendo created expressive cues which gets naturalness and accuracy through their unit selection
When was female speech finally synthesized better?
1990s
How have synthetic voices changed over the past two decades?
Represent more sexes, cultures, accents
Still lacking African American voices though
What does van Santen claim that TTS systems require to sound perfectly natural?
Real world knowledge
What is important to know when using speech analysis software?
It doesn’t interpret the analysis or highlight what is important to the second language learner in the display of the program. You must have prior knowledge of speech acoustics to use these tools.
Compare old speech analysis software to today’s
Old: racks of specialized equipment and a huge computer for only the program
New: desktop or laptop with a good microphone, quiet space; only need external hardware for the more expensive programs
What and when does speech analysis software measure?
Measurements made of silences, frequencies of components of speech, transitions between sounds and duration of sound length.
Measurements can be made before or after teaching about one or more of these aspects in classroom or research setting.
What two techniques are used to count/measure silences and filled pauses in speech?
Spectrogram and waveform
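A rough sketch of how unfilled pauses could be located in a waveform by looking for stretches of low amplitude. The threshold and minimum duration are assumed values that would need tuning per recording, and filled pauses (um, uh) carry energy, so they would not be caught this way and would still need the spectrogram or a listener.

```python
def find_silences(samples, sample_rate, threshold=0.02, min_duration=0.25):
    """Return (start_sec, end_sec) spans where |amplitude| stays below threshold."""
    silences = []
    start = None  # index where the current quiet stretch began, if any
    for i, value in enumerate(samples):
        if abs(value) < threshold:
            if start is None:
                start = i
        else:
            # A loud sample ends the quiet stretch; keep it only if long enough.
            if start is not None and (i - start) / sample_rate >= min_duration:
                silences.append((start / sample_rate, i / sample_rate))
            start = None
    # Handle a quiet stretch that runs to the end of the recording.
    if start is not None and (len(samples) - start) / sample_rate >= min_duration:
        silences.append((start / sample_rate, len(samples) / sample_rate))
    return silences


# Toy "recording" at 100 samples per second: sound, a 0.5 s pause, sound,
# then a 0.1 s gap that is too short to count as a pause.
rate = 100
recording = [0.5] * 100 + [0.0] * 50 + [0.5] * 100 + [0.0] * 10
print(find_silences(recording, rate))  # [(1.0, 1.5)]
```

The minimum duration keeps brief dips, such as zero crossings or stop closures, from being counted as pauses.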
Prosody
Rise and fall of pitch (intonation) in vowels, rhythm of duration of vowels and variations in loudness of the vowels across an utterance.
Called SUPRASEGMENTALS because pitch/rhythm/loudness change in patterns across the speech sound segments
SUPRASEGMENTALS (3)
Tone
Vowel length
Features like nasalization and aspiration
Global sentence intonation tracks
Replicate word stresses and phrase final pitch declination
Helps L2 English learners learn that there is a tendency in English to fall in pitch at the ends of phrases and that pitch rises on stressed syllables within words
Reflect only voiced speech, as they track F0 (fundamental frequency) and voiceless sounds break the pitch track
What is the most studied characteristic of speech production in both L1 and L2 speech research?
VOT (voice onset time) = time from explosion of a stop consonant beginning a word to the beginning of voicing for the following vowel
Three types of voice onset time
Pre-voiced
Short-lag
Long-lag: voiceless stops in English
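A small sketch of how VOT could be computed from two hand-marked time points and sorted into the three categories. The 30 ms boundary between short-lag and long-lag is an assumed round figure for illustration; actual boundaries vary by language and place of articulation.

```python
def voice_onset_time_ms(burst_time_s, voicing_onset_s):
    """VOT in milliseconds: voicing onset minus stop release (burst)."""
    return (voicing_onset_s - burst_time_s) * 1000.0


def classify_vot(vot_ms, long_lag_threshold_ms=30.0):
    """Sort a VOT value into pre-voiced, short-lag, or long-lag."""
    if vot_ms < 0:
        # Voicing starts before the release.
        return "pre-voiced"
    if vot_ms < long_lag_threshold_ms:
        return "short-lag"
    return "long-lag"


# Hypothetical measurements (seconds) read off a waveform/spectrogram display.
vot = voice_onset_time_ms(burst_time_s=0.500, voicing_onset_s=0.560)
print(classify_vot(vot))    # long-lag
print(classify_vot(10.0))   # short-lag
print(classify_vot(-80.0))  # pre-voiced
```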
Across languages, what tends to be longer: vowels before voiced or voiceless consonants?
Voiced
Spectral characteristics
Related to frequency components of speech sounds (vowels and some consonants)
What’s measurable in spectrographic displays of speech and what’s more important?
Steady-state FORMANTS and transitions; transitions in and out of vowels could be more important
English uses how many vowels?
12
2 different spectral characters of consonants compared to vowels
Consonants are shorter and show more abrupt changes in speech signal
PRAAT
Free
PC and Mac
One of the most freely available programs
Does everything from basic waveform and spectrogram analysis to specialized complex analyses of interest to researchers
Organized differently so requires knowledge
KayPentax Computerized Speech Lab (CSL)
Commercial Windows program
For controlled speech acquisition/analysis with no contamination of computer noise
Extensive set of analysis including waveform, spectrographic, and frequency analyses
Speech Filing System
Set of speech acquisition and analysis tools with tutorials
Free; Windows
Has downloadable modules like WASP (waveform/spectrograph) and PROREC (records words or sentences)
Multi speech
Commercial; Windows
Modification of computerized speech lab for use without the dedicated sound acquisition hardware
Uses the integrated computer sound card to perform same functions as CSL
TF32
Windows; commercial
Analyzes speech frequencies in diff ways
Displays waveform and spectrogram, determines centre of gravity for fricatives, will display pitch
SIL Speech Analyzer and WaveSurfer
Free, Windows only
Basic software programs that offer waveform and spectrogram analysis as well as editing functions found in other programs
MRI can gather what info?
Looks at structures in oral cavity including soft tissue (tongue) and bone
Ultrasound shows what info?
Looks at soft tissue but nothing else; high frequency sound waves used to see movement of tongue
Electromagnetic articulography (EMA)
Developed for speech sound research
Uses sensors attached to tongue jaw lips to display tongue movements when makes sounds
Speaker sits under helmet containing the hardware
Motion sensor systems (optical tracking)
Sensors attached to the body part whose movement is being studied
Electroglottograph (EGG)
Instrument that examines vocal fold vibration for both voiced and voiceless sounds; may be used for analysis of singing voice
Electrodes placed on larynx registering place of contact between vocal folds
Electropalatography (EPG)
Pseudo-palate containing electrodes is used to see tongue to palate contact
4 segmental articulation errors
Deletion
Substitution
Insertion
Distortion
Treatment for aphasia
Language impairment based treatment= develops improved intonation, fluent reading/writing, improved word-finding
Dysphonia (can be spasmodic)
Rare voice disorder which affects laryngeal muscle control in speech, hoarse or creaky voice or none at all
Laryngectomy
Complete or partial surgical removal of larynx usually bc cancer; changes source not filter
4 types of laryngectomee speech
Oesophageal speech: swallowing air and burping it up
External voice prosthesis: hand held device that buzzes, held against throat
In-dwelling vocal prosthesis: hole in throat (trachea) made for breathing, silicone prosthesis inserted between trachea and lower pharynx
Laryngeal transplant: uncommon
Two important aspects of good speech
Intelligibility and naturalness
6 steps of TTS
1- text input
2- pre-processing (text normalization)
3- linguistic processing
4- phonetic processing
5- synthesis (SSR speech synthesis by rule)
6- speech output