Quiz 9 - Basic Audition, Speech Perception Flashcards
What does the SUPER outer ear do for our hearing
Gives vibration about the environment to the next part of the ear
Role of semicircular canals
- Balance/body position
- senses side-to-side and up-down movement
For speech, which BIG parts of the ear do we usually focus on?
Outer ear
middle ear
cochlea
Technically speaking, sound WAVES stop at what part in the ear?
the eardrum –> then fluid starts to move
What is the change from the middle ear to the cochlea (hint : MEIM)
Mechanical energy –> electrical pulses
Role of oval window (to further the sound)
- Gives pressure to the fluid INSIDE the cochlea
Basilar membrane is comprised of ______ (2)
base and apex
Base houses _______ frequencies like ____ and the apex houses ______ frequencies
high range; [s], low range
T or F : the basilar membrane is one long thin structure
false, base is NARROW and THICK and apex is WIDE and THIN
Which one, the base or the apex, is more FLEXIBLE
Apex
What is a specific characteristic of the basilar membrane and frequencies and WHY does it matter?
- NON-linear ordering
caused by differences in amplitude and frequency
Loudness of a sound is proportional to the amount of :
1. Pascals
2. pressure
3. decibels
4. Intensity
5. none of the above
- none of the above
Loudness of a sound is correlated with _____
energy
The other more sophisticated term for loudness : ______
intensity
Why does intensity rely on amplitude and the frequency?
because we care about both the size of the pressure variation (amplitude) and how fast that variation changes (frequency)
Unit used to measure sound pressure
Pascals
Define sound pressure
The deviation of local pressure from the average atmospheric pressure
T or F : a microphone can measure the sound pressure in air
True
Unit for intensity
W/m²
Sound intensity is the ______ carried by sound waves per unit area in a direction ___________ to that area
- pressure; parallel
- resonator; parallel
- power; perpendicular
- pressure; perpendicular
- power; perpendicular
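Worked example (not part of the quiz): a minimal Python sketch of intensity as power per unit area. The function names are hypothetical, and the point-source version assumes the power spreads evenly over a sphere, which the card does not state.

```python
import math

def sound_intensity(power_w, area_m2):
    """Intensity in W/m^2: power passing perpendicularly through an area."""
    return power_w / area_m2

def intensity_at_distance(power_w, distance_m):
    """Assumption: ideal point source radiating evenly in all directions,
    so the power is spread over a sphere of radius distance_m."""
    sphere_area = 4 * math.pi * distance_m ** 2
    return sound_intensity(power_w, sphere_area)

print(sound_intensity(0.01, 2.0))  # 0.005
# Under the point-source assumption, doubling the distance quarters the intensity.
print(intensity_at_distance(1.0, 2.0) / intensity_at_distance(1.0, 1.0))
```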
T or F : amplitude and frequency are related to the alternating compressing and rarefaction of particles
F - it's sound pressure
Reference value for sound pressure for human hearing
P0 = 20 µPa (20 micropascals)
T or F : the measure dB SPL is the same as dB but it is only analyzed in speech language
F, it measures acoustic intensity, and is relative to a reference point
Unit measuring acoustic intensity
dB SPL
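Worked example (not part of the quiz): a small Python sketch of how a pressure in pascals becomes a dB SPL value relative to the 20 µPa reference. It uses the standard 20·log10 pressure-ratio formula, which is not spelled out on the cards; the function name is hypothetical.

```python
import math

P0 = 20e-6  # reference sound pressure in pascals (20 µPa)

def db_spl(pressure_pa, reference_pa=P0):
    """Sound pressure level in dB SPL, relative to the reference pressure."""
    return 20 * math.log10(pressure_pa / reference_pa)

print(db_spl(20e-6))  # the reference pressure itself sits at 0 dB SPL
print(db_spl(1.0))    # 1 Pa is roughly 94 dB SPL
```

A pressure equal to the reference gives 0 dB SPL, so 0 dB does not mean "no sound", only "same as the reference".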
T or F : sones measure actual loudness
F - perceived loudness, because it is a subjective scale
Perceived loudness will be _____ the sone value
- 50% - 1/2
- 100% - 1
- 25% - 1/4
- 200% - 2
- 200% - 2
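Worked example (not part of the quiz): a rough Python sketch of the sone scale, on which 2 sones is meant to sound twice as loud as 1 sone. The conversion from phons is a commonly quoted approximation (about a doubling of loudness per 10 phons above 40 phons); phons are not mentioned on the cards, so treat this as an outside assumption.

```python
def phons_to_sones(phons):
    """Approximate loudness in sones.
    Assumption: 1 sone = 40 phons, and loudness doubles every 10 phons.
    The approximation is usually only applied at or above about 40 phons."""
    return 2 ** ((phons - 40) / 10)

for level in (40, 50, 60):
    print(level, "phons ->", phons_to_sones(level), "sones")
```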
Decibels are typically measured on a ______ scale
log
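Worked example (not part of the quiz): a short Python sketch of what "log scale" means for decibels. Multiplying the intensity by a fixed factor always adds the same number of dB. It uses the standard 10·log10 intensity-ratio formula; the function names are hypothetical.

```python
import math

def intensity_ratio_to_db(ratio):
    """Level difference in dB for a given ratio of two intensities."""
    return 10 * math.log10(ratio)

def db_to_intensity_ratio(db):
    """Inverse: the intensity ratio that corresponds to a dB difference."""
    return 10 ** (db / 10)

# Each tenfold increase in intensity adds 10 dB.
for ratio in (1, 10, 100, 1000):
    print(ratio, "x intensity ->", intensity_ratio_to_db(ratio), "dB")
```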
Why are we more sensitive to lower frequencies than higher ones?
Apex is larger/longer
There's a _____ amplitude in _____ frequency
- Lower; lower
- higher; higher
- higher; lower
- equal; equal
- higher; lower
Difference between Hertz and Bark scale (2 each)
- Hertz scale - linear frequency
- same distances mean equal acoustic distances in physical parameters
- Bark scale - psychoacoustic
- same distances represent equal distances in PERCEPTION (so more sones)
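Worked example (not part of the quiz): a Python sketch comparing the same 100 Hz step at low and high frequencies on the Bark scale. The conversion is Traunmüller's (1990) approximation, one of several published Hz-to-Bark formulas; the cards do not say which one the course uses.

```python
def hz_to_bark(freq_hz):
    """Approximate Bark value for a frequency in Hz.
    Assumption: Traunmüller (1990) formula, chosen here for illustration."""
    return 26.81 * freq_hz / (1960 + freq_hz) - 0.53

low_step = hz_to_bark(200) - hz_to_bark(100)     # 100 Hz step near the bottom
high_step = hz_to_bark(5100) - hz_to_bark(5000)  # 100 Hz step near the top

# Equal steps in Hertz are not equal steps in Bark.
print(round(low_step, 3), round(high_step, 3))
```

The low-frequency step comes out much larger in Bark than the high-frequency step, which is the sense in which Hertz is linear and Bark is perceptual.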
T or F : linguistic categories are language specific
T
Two large methods for studying speech perception
- Categorization or Identification tasks
- Discrimination Tasks
Why do researchers add noise to speech signals in speech perception?
Researchers are interested in the errors made in speech studies caused by distractions
In signal to noise ratio, what does 0 dB SNR mean :
- There’s no speech in the signal
- There’s no noise in the signal
- There’s no noise or speech in the signal
- Speech and noise are equal
- Speech and noise are equal
If dB SNR is negative (e.g. -4), what does it mean?
It means noise is louder than speech in the signal
If dB SNR is positive (e.g. 4), what does it mean?
It means speech is louder than noise in the signal
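Worked example (not part of the quiz): a minimal Python sketch of dB SNR computed from RMS amplitudes, showing the zero, negative, and positive cases from the cards above. The function names and input values are hypothetical.

```python
import math

def snr_db(speech_rms, noise_rms):
    """Signal-to-noise ratio in dB from the RMS amplitudes of speech and noise."""
    return 20 * math.log10(speech_rms / noise_rms)

def noise_rms_for_snr(speech_rms, target_snr_db):
    """RMS level the noise must have to reach a target SNR for given speech."""
    return speech_rms / (10 ** (target_snr_db / 20))

print(snr_db(1.0, 1.0))  # equal levels give 0 dB SNR
print(snr_db(0.5, 1.0))  # noise louder than speech: negative
print(snr_db(2.0, 1.0))  # speech louder than noise: positive
```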
in dB SNR, what sounds are the hardest to distinguish
voiceless and voiced fricatives
in dB SNR, what sounds are easier to distinguish and WHY
nasals and liquids, because of formant-like structure
According to data presented in the graph, which of the following sounds is least intelligible even when the signal is 12 dB higher than the noise?
A. [g]
B. [z]
C. [n]
D. [l]
B. [z]
How does phonology affect perception?
Languages categorize phonemes depending on the phonological contrasts seen in a certain language
How do we get the outputs of the multi-dimensional scaling? and what does it show?
- We get it from the confusion matrix, where people’s perception of sounds is different from the reality of the sounds
- Shows how similar the sounds are to one another: the distance between two sounds on the scale reflects how similar they are
In the MDS, if phonemes are close together, what does it mean?
That means that they are sounds that are easily confused with each other
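Worked example (not part of the quiz): a Python sketch of one common way to turn a confusion matrix into the distances that MDS then arranges in space. The counts are made up, and the similarity and distance formulas (a Shepard-style similarity followed by a negative log) are an assumption about the method, not something stated on the cards.

```python
import math

# Hypothetical response counts: confusions[stimulus][response]
confusions = {
    "f":  {"f": 70, "th": 25, "s": 5},
    "th": {"f": 30, "th": 60, "s": 10},
    "s":  {"f": 4,  "th": 6,  "s": 90},
}

def similarity(c, i, j):
    """Confusions between i and j, relative to correct responses for i and j."""
    return (c[i][j] + c[j][i]) / (c[i][i] + c[j][j])

def distance(c, i, j):
    """Perceptual distance: more confusable pairs get smaller distances.
    Note: a pair that is never confused has similarity 0, and log(0) fails."""
    return -math.log(similarity(c, i, j))

print(round(distance(confusions, "f", "th"), 2))  # often confused: small distance
print(round(distance(confusions, "f", "s"), 2))   # rarely confused: large distance
```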
In MDS, if you have to name the x and y axis, what would it be
- constriction size; voicing
- location of constriction; voicing
- manner of constriction; size of constriction
- voicing; location of constriction
- voicing; location of constriction
In acoustics and intelligibility, what does sine-wave speech show?
General idea of formants, very basic formant info
Which of the following statements about loudness are true? (Select two)
a. Loudness depends only on the amplitude of the sound wave.
b. dB SPL measures perceived loudness directly without any reference levels.
c. Loudness is influenced by both amplitude and frequency of the sound wave.
d. Sones are used to measure perceived loudness on a subjective scale.
c. Loudness is influenced by both amplitude and frequency of the sound wave.
d. Sones are used to measure perceived loudness on a subjective scale.
A researcher presents two stimuli in a categorization task:
Stimulus A: A sound with a voice onset time (VOT) of 10 ms.
Stimulus B: A sound with a VOT of 30 ms.
Based on what you know about how acoustic information is mapped onto linguistic categories, what would likely determine how listeners categorize these sounds?
a. Whether the listener is in a noisy or quiet environment.
b. The absolute acoustic differences between Stimulus A and Stimulus B.
c. The listener’s ability to perceive differences in amplitude.
d. The listener’s experience with distributions of VOTs in their native language.
d. The listener’s experience with distributions of VOTs in their native language.
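Worked example (not part of the quiz): a toy Python sketch of why the same two stimuli can be categorized differently depending on the VOT distributions a listener has experienced. Each category is modelled as a normal distribution and the stimulus goes to the more likely category. All means and standard deviations are invented for illustration and do not describe any real language.

```python
from statistics import NormalDist

def classify(vot_ms, categories):
    """Return the label whose VOT distribution makes this stimulus most likely."""
    return max(categories, key=lambda label: categories[label].pdf(vot_ms))

# Hypothetical listener groups with different VOT experience (values invented).
language_1 = {"b": NormalDist(10, 8), "p": NormalDist(60, 15)}
language_2 = {"b": NormalDist(-80, 20), "p": NormalDist(15, 8)}

for vot in (10, 30):
    print(vot, "ms:", classify(vot, language_1), "vs", classify(vot, language_2))
```

With these invented numbers, the first group puts the two stimuli in different categories, while the second group hears both as the same category.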
Which of the following are key differences between a neurogram and a spectrogram? (Select two)
a. A neurogram represents frequency response patterns on a non-linear cochlear scale, while a spectrogram typically uses a linear Hertz scale.
b. A neurogram shows sound intensity over time, while a spectrogram shows sound frequency over time.
c. A neurogram replaces frequencies with cochlear positions and amplitude with neural activation.
d. A neurogram shows the same spectral details as a spectrogram but uses a different color scheme
a. A neurogram represents frequency response patterns on a non-linear cochlear scale, while a spectrogram typically uses a linear Hertz scale.
c. A neurogram replaces frequencies with cochlear positions and amplitude with neural activation.
Mary is studying the effect of different speech sounds on the spike rate of a neural population in an under-described language. Which of the following sound sequences would involve the least steep spike?
a. [usu]
b. [ana]
c. [aba]
d. [ubu]
b. [ana]
A researcher is investigating whether listeners can distinguish between two similar speech sounds, such as [p] and [b]. Participants hear three sounds in each trial:
the first sound is a [p] produced with a short VOT
the second sound is a [b] produced with a longer VOT
the last one is either [p] or [b], randomly selected
The participants’ task is to identify whether the last sound matches the first or the second. What is the experimental design of this task?
a. Categorization
b. AX
c. ABX
d. AXB
c. ABX
T or F : log-scaling frequencies gives them more of a curved pattern, so that we can see the degree of steepness and sensitivity
F - log-scaling frequencies gives a more linear pattern
Which formant is associated with more cochlear space ?
F1
Between F1 and F2, which formant has a larger range?
F2 - range of 1500Hz
Two vowels with different F1 and F2 will be more sensitive to what formant change?
F1, because there’s more cochlear space
Difference between neurograms and spectrograms (2)
- Frequencies REPLACED by cochlear positions
- Amplitude REPLACED by neural activation
On a neurogram, we see neural _______
firing
T or F : Raw frequencies are the most reliable and meaningful frequencies to language users
F : they can misguide us
What is the solution for making raw frequencies more reliable and meaningful
More scales than just Hertz to represent more cochlear patterns and positions
Identify the 2 ways listeners have to look at pitch and HOW they get to pitch (hint : Taylor Swift)
- Temporal - tracking the duration of the LARGEST repeating cycle
- Spectral - calculated from the harmonics
T or F : the space between two adjacent harmonics can also be described as the F0
T
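Worked example (not part of the quiz): a minimal Python sketch of the spectral route to pitch, where F0 is recovered from the spacing between adjacent harmonics. The function names and the 200 Hz value are hypothetical.

```python
def harmonics(f0_hz, count):
    """First `count` harmonics of a fundamental: f0, 2*f0, 3*f0, ..."""
    return [f0_hz * n for n in range(1, count + 1)]

def f0_from_harmonic_spacing(freqs_hz):
    """Estimate F0 as the average gap between adjacent harmonics."""
    gaps = [upper - lower for lower, upper in zip(freqs_hz, freqs_hz[1:])]
    return sum(gaps) / len(gaps)

h = harmonics(200, 5)
print(h)                                   # [200, 400, 600, 800, 1000]
print(f0_from_harmonic_spacing(h))         # 200.0
# The spacing still gives 200 Hz when the lowest harmonic is left out.
print(f0_from_harmonic_spacing(h[1:]))     # 200.0
```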
Why is a power spectrum not the best way to see pitch perception?
- Not linear so not clear
- Is too busy - has too many factors
- Doesn’t show how the cochlea perceives every harmonic separately
- Shows equal spacing between harmonics, which isn’t how the cochlea deals with harmonics
- Shows equal spacing between harmonics, which isn’t how the cochlea deals with harmonics
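Worked example (not part of the quiz): a Python sketch of why equal spacing in Hertz is misleading. On a logarithmic frequency axis, used here only as a rough stand-in for cochlear place (an assumption), the gaps between adjacent harmonics shrink as harmonic number goes up.

```python
import math

def harmonic_gaps_in_octaves(count):
    """Gap between harmonic n and n+1 in octaves, for n = 1 .. count-1.
    The gap depends only on the harmonic numbers, not on F0 itself."""
    return [math.log2((n + 1) / n) for n in range(1, count)]

gaps = harmonic_gaps_in_octaves(8)
for n, gap in enumerate(gaps, start=1):
    print(f"harmonic {n} -> {n + 1}: {gap:.2f} octaves")
```

The first gap is a full octave, and each later gap is smaller, so higher harmonics crowd together even though they are evenly spaced in Hertz.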
Up to which harmonic at the low end are the harmonics spaced far enough apart to be seen separately?
The lowest 7 or 8
How does loudness affect frequency isolation?
Higher intensity sounds will make a bigger disruption in the membrane, meaning that it makes it harder to perceive and sense small differences in freqs
Which kind of frequency causes the most disruption and why?
Low freq sounds because they are processed in the apex and so the signal has to travel further, disrupting the sounds processed before it
T or F : the cochlea works best when it is being moved
F : when it is unmoved and preparing for the action of new waves
What does excitement mean in terms of the cochlea?
When sound waves aren’t startling the cochlea but rather are allowing it to function properly
Although processed in different areas, low and high frequencies share the greater need to be _______ : (choose the best)
- Dampened
- Amplified
- Kept equal
- Boosted
- Boosted
Subjective perception depends on ________
loudness
Our brain is constantly re - _______________ sounds
a. constructing
b. playing
c. adjusting
c. adjusting
On a neural response system scale, the x axis is time, and the y axis is spike rate, where spike rate = _______
neural firing
In change sensitivity with our neural response, which group of sounds have the least steep spike rate :
a. [ba], [ma], [ka]
b. [na], [ra], [la]
c. [la], [ma], [ka]
b. [na], [ra], [la] - nasals and approximants
Sensitivity to change : vowels are often transcribed with _____-point formant values
single
T or F : in some types of niche discrimination tasks, you will be asked to categorize sounds as 1st or 2nd, etc.
F - you are asked whether or not a sound is different from another
The AX, ABX, and AXB tasks are known as __________ and _________ tasks
- Labelling; discriminative
- Non labelling; identification
- Non labelling; categorizing
- Non labelling; discriminative
- Non labelling; discriminative
Explain : AX task + what kinds of pairing of sounds presented
AX - 1 or 2 unique sounds, 2 sounds overall
AB, AA, BB, BA
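Worked example (not part of the quiz): a small Python sketch that builds the four AX pairings from two stimuli and tags each with the expected same/different answer. The function name and dictionary keys are hypothetical.

```python
from itertools import product

def ax_trials(a="A", b="B"):
    """All ordered pairs of two stimuli, with the correct same/different answer."""
    return [
        {"first": x, "second": y, "answer": "same" if x == y else "different"}
        for x, y in product([a, b], repeat=2)
    ]

for trial in ax_trials():
    print(trial["first"] + trial["second"], "->", trial["answer"])
```

In a real experiment the order of trials would normally be shuffled and each pairing repeated; that is left out here to keep the sketch short.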
Average ISI in perception tasks
500ms
Speeded AX task - explain the diff from AX
Response time needs to be faster to map onto your auditory mode of perception (initial stage) and not the language specific phonetics
Explain AXB
3 sounds, 2 unique (A and B) - is X the same as A or B?
Explain 2AFC - name + stuff
2 alternative forced choice
2 stimuli presented –> then played together?
is it AB or BA
The YES-NO (single sound given : did you hear a ‘t’?) task is part of :
Identification
Discrimination
Identification
The Labeling (single sound given : was the sound like ‘th’ in ‘these’ or ‘th’ in ‘things’) task is part of :
Identification
Discrimination
Identification
How many unique sounds are played in an ABX task?
- 1
- 2
- 3
- 3
- 2
T or F : naturally produced stimuli can usually be recorded off your computer at home
F - more likely to use a sound booth and a mic
T or F : usually in naturally produced stimuli, a human is making the sounds
T
T or F : Both synthetic and hybrid stimuli ONLY use software to generate sounds
F - synthetic uses software but humans can make hybrid stimuli by simply modifying already made sounds
What is the main downside of Praat MFC compared with PsychoPy + PsyToolkit?
Praat MFC can only run experiments on a local machine, others are online
In perception experiments - the 4 main types of analysis (general)
- Counts
- Proportions
- d-prime
- Reaction time (RT) - like in RT speed
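Worked example (not part of the quiz): a minimal Python sketch of d-prime using the simple yes/no formula, z(hit rate) minus z(false-alarm rate). For a same/different task this assumes a "hit" is answering "different" on a different trial and a "false alarm" is answering "different" on a same trial; other task designs use other d-prime models, so treat this as an illustration only.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index: z(hit rate) - z(false-alarm rate).
    Rates of exactly 0 or 1 are not allowed (inv_cdf raises an error),
    so such rates are usually nudged slightly before this step."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

print(d_prime(0.5, 0.5))            # no sensitivity: hits equal false alarms
print(round(d_prime(0.8, 0.2), 2))  # clearly above chance
```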