LING330: Quiz #5 Flashcards

1
Q

What do you have to do when you have interacting forces?

A

Add amplitudes of positive forces

Subtract amplitudes of the negative forces

2
Q

What info do we have to know to define a sinusoid wave?

A

Frequency + amplitude (not phase, as it’s not important for speech sounds)

3
Q

A graph of frequency and amplitude of different waves (with the algebraic sum of the waves) is called a…

A

Spectrum

4
Q

Is a complex wave sinusoidal?

A

No, but it is periodic: its cycle repeats regularly (in a pattern)

5
Q

What is the basic frequency (the rate at which the pattern repeats) of a complex wave called?

A

Fundamental frequency (f0)

6
Q

What determines the pitch of the sound wave?

A

Its fundamental frequency (f0)

7
Q

Harmonics

A

Component frequencies; their different frequencies and amplitudes are what give a sound its quality (why the same note on a piano and a violin sounds different)

8
Q

Fundamental frequency is always equal to…

A

The greatest common factor of the component frequencies (this number is also where all the numbers of the different waves line up and start over again together)
**the more sinusoids you add together, the more fast changing and complex a pattern you can create
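The greatest-common-factor rule can be checked with a few lines of Python. This is only a sketch: the function name `fundamental_frequency` and the example component frequencies are made up for illustration, and it assumes the components are whole numbers of Hz.

```python
from functools import reduce
from math import gcd

def fundamental_frequency(component_hz):
    """Return the greatest common factor of whole-number component frequencies (Hz)."""
    return reduce(gcd, component_hz)

# Hypothetical complex waves built from these sinusoidal components
print(fundamental_frequency([200, 300, 400]))  # 100
print(fundamental_frequency([300, 450]))       # 150
```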

9
Q

General description of how devices for recording sound work

A

Transfer patterns of speech vibration from the air to a more durable medium

10
Q

Describe Edison’s phonograph (1877)

A

Used a stylus to magnify sound waves that came from sound vibrations and etch them into a revolving wax cylinder (later onto a plastic disk)
Stylus ran back over grooves=vibration replicated

11
Q

Magnetic recorders (invented in 1898 and continued to improve over following decades)

A

Microphone membrane converts sound vibrations into voltage variation in an electric current
This electric current was then used to create a varying magnetic field
Metal wire or tape passed through the field was magnetized in the corresponding pattern
Playback=running tape back through the magnetic “heads” of the recorder, which converted the magnetic field back to electricity and then back to membrane vibrations in a speaker
THUS specific sound events could be preserved and replayed

12
Q

Kymograph

A

Kymo=Greek word for wave
Talker speaks into mask connected to a tube
Other end of tube=pressure sensitive membrane connected to a stylus
Air pressure variations of speech caused the membrane and stylus to vibrate
Stylus rested on revolving drum covered with smoked paper
As drum revolved, stylus etched out a white line that directly recorded air pressure variations

13
Q

What is and what isn’t visible on Jones’s kymograph?

A

Can see:
Periodic vibrations of vowels
Weak vibration of voiced sounds
Differences in duration of sounds

Can’t see:
Complexity of vocalic wave form

14
Q

Oscilloscopes and sound spectrographs

A

Like tape recorders, used a microphone to transfer patterns of vibration in the air into patterns of variation in electrical current
Sound spectrograph:
Used principle similar to Edison’s revolving wax cylinder, but used an electronic filter to separate frequency bands
-could only analyze about 2 seconds of speech at a time (2 or 3 words)
-short speech sample recorded onto magnetic disk then sample replayed multiple times
-each time sample replayed = output passed through variable electronic filter (set to let pass only a specific range of electromagnetic frequencies, like the bass/treble knob on a radio)
-instead of releasing sound, output of electronic filter fed into a moving electric stylus that would etch a dark line onto chemically treated paper attached to a revolving drum
-darkness of burned line=amount of electricity coming through filter=amount of speech energy within that specific frequency range

15
Q

How was acoustic analysis done at the beginning of the 21st century?

A
By computer (fast, accurate, easily portable on laptop)
Disadvantage: computers can't handle analog signals directly (speech waves=analog signals, and computers can only process info represented digitally, as numbers)
16
Q

Analog signal

A

Continuously varying wave (like second hand of a clock sweeping smoothly around an old fashioned clock face)
Speech waves=analog signals
Computers can’t process these

17
Q

How is analog to digital (A to D) conversion done?

A

Through SAMPLING
Aka taking repeated measurements at regular intervals (ex: collecting temp every hour and connecting the dots, which approximates the analog wave)
In speech sampling: microphone converts sound pressure wave into variation in electric current (with strength of current proportional to air pressure)
-sound card component in computer=measures the voltage of electric current at regular intervals and records the measurements
-record of measurements=digital representation of speech wave
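Sampling can be sketched in Python as measuring a wave at evenly spaced times. This is an illustration only: a pure sinusoid stands in for the microphone voltage, and the function name `sample_sine` and all the numbers are assumptions, not part of the course material.

```python
import math

def sample_sine(freq_hz, amplitude, sampling_rate_hz, duration_s):
    """Measure a sinusoid at regular intervals and return the list of measurements."""
    n_samples = int(round(sampling_rate_hz * duration_s))
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sampling_rate_hz)
        for n in range(n_samples)
    ]

# Hypothetical example: a 100 Hz wave measured 1,000 times per second for 0.01 s
samples = sample_sine(100, 1.0, 1000, 0.01)
print(len(samples))  # 10 measurements = the digital record of one full cycle
```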

18
Q

What two questions must be addressed to get a high-quality signal for a sample?

A

1- how often to sample (SAMPLING RATE)

2- how precisely to measure (QUANTIZATION)

19
Q

The higher the sampling rate…

A

The more info the digital representation will contain

20
Q

How fast do you have to sample to detect the presence of a sinusoidal component in a complex wave?

A

TWICE as fast as the highest frequency you want to measure
This is because you have to capture a measurement twice within its period (once in positive phase and once in negative phase)

21
Q

The Nyquist limit

A

Highest frequency that can be captured at a given sampling rate
(Ex: Nyquist limit for a sampling rate of 44,000 Hz is 22,000 Hz)
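The "twice as fast" rule and the Nyquist limit are the same relationship read in two directions. A minimal sketch (function names are made up for illustration):

```python
def nyquist_limit(sampling_rate_hz):
    """Highest frequency that can be captured at a given sampling rate."""
    return sampling_rate_hz / 2

def min_sampling_rate(highest_freq_hz):
    """Sampling rate needed to capture a given highest frequency."""
    return 2 * highest_freq_hz

print(nyquist_limit(44000))      # 22000.0
print(min_sampling_rate(22000))  # 44000
```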

22
Q

Explain aliasing

A

If frequencies above Nyquist limit are present in the signal being sampled, the wrong shape will appear because the info between the sampled points is lost and the connected dots will take the shape of a much lower frequency (a wave of a much longer period)
Basically: high frequency masquerades as a lower frequency
Result=distortion of digital signal
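A rough Python sketch of the masquerade, assuming a hypothetical sampling rate of 1,000 Hz (Nyquist limit 500 Hz). The function names and numbers are illustrative only. A 900 Hz wave sampled at this rate gives the same dots as a 100 Hz wave, just flipped upside down.

```python
import math

def alias_frequency(freq_hz, sampling_rate_hz):
    """Frequency that a sampled sinusoid appears to have after sampling."""
    folded = freq_hz % sampling_rate_hz
    return min(folded, sampling_rate_hz - folded)

def sample(freq_hz, sampling_rate_hz, n_samples):
    """Sample a unit-amplitude sinusoid at regular intervals."""
    return [math.sin(2 * math.pi * freq_hz * n / sampling_rate_hz)
            for n in range(n_samples)]

print(alias_frequency(400, 1000))  # 400 (below the Nyquist limit, so unchanged)
print(alias_frequency(900, 1000))  # 100 (above the limit, shows up as 100 Hz)

# The sampled points of the 900 Hz wave match a sign-flipped 100 Hz wave
high = sample(900, 1000, 20)
low = sample(100, 1000, 20)
same_shape = all(abs(h + l) < 1e-9 for h, l in zip(high, low))
```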

23
Q

What kind of wave do you get when you combine two simple sinusoids?

A

A complex wave
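Sketch of combining two sinusoids by adding their amplitudes point by point. The 100 Hz and 200 Hz components, the amplitudes, and the 1,000 Hz sampling rate are assumptions chosen for illustration.

```python
import math

def sinusoid(freq_hz, amplitude, sampling_rate_hz, n_samples):
    """Sampled simple (sinusoidal) wave."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sampling_rate_hz)
            for n in range(n_samples)]

def add_waves(wave_a, wave_b):
    """Add two waves sample by sample to get a complex wave."""
    return [a + b for a, b in zip(wave_a, wave_b)]

rate = 1000  # samples per second (hypothetical)
complex_wave = add_waves(sinusoid(100, 1.0, rate, 30), sinusoid(200, 0.5, rate, 30))

# The sum is not a sinusoid, but it repeats every 10 samples (1/100 s),
# so it is periodic at 100 Hz
repeats = all(abs(complex_wave[n] - complex_wave[n + 10]) < 1e-9 for n in range(20))
```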

24
Q

How is aliasing avoided?

A

By removing all frequencies above the Nyquist limit from the sound signal before the analog-to-digital conversion takes place
Done automatically by a program when the signal is at the electrical stage by passing the current through a low-pass filter (all frequencies above Nyquist limit are blocked)

25
Q

Quantization

A

How precise each measurement is (the higher the precision (more decimal places), the more space it takes up on a computer)
Must decide how much rounding error can be accepted
Computer audio systems will default to 16 bits per sample, but 8 bits doesn’t sound bad
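The trade-off between precision and space can be sketched with simple arithmetic. Function names are made up; the 44,000 Hz rate is borrowed from the Nyquist example, and one uncompressed channel is assumed.

```python
def quantization_levels(bits_per_sample):
    """Number of distinct values each measurement can take."""
    return 2 ** bits_per_sample

def storage_bytes(sampling_rate_hz, bits_per_sample, duration_s):
    """Bytes needed for one uncompressed channel (8 bits per byte)."""
    return sampling_rate_hz * bits_per_sample * duration_s // 8

print(quantization_levels(8))         # 256
print(quantization_levels(16))        # 65536
print(storage_bytes(44000, 8, 1))     # 44000 bytes for one second
print(storage_bytes(44000, 16, 1))    # 88000 bytes for one second
```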

26
Q

Why are digital recordings better quality than analog tape recordings?

A

In analog recordings: plastic or metal tape=stretches and distorts + noise from turning cogs and hissing tape travelling through heads could never be completely eliminated

27
Q

Signal-to-noise ratio (SNR)

A

Goal is to maximize this in recordings

Recording should be as clean and clear as possible (more signal, less noise)

28
Q

How do you get the most signal possible in a recording?

A

Take full advantage of the system’s DYNAMIC RANGE (adjust the range to match the sound)

29
Q

Quantization error

A

Background noise in a recording due to representation of the continuous analog signal as a series of discrete levels
**the higher the bit rate (more bits per sample), the more levels available and the lower the quantization error
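A small sketch of why more bits means less quantization error. It assumes amplitudes scaled to the range -1.0 to 1.0 and evenly spaced levels, and it uses one cycle of a sinusoid as a stand-in signal; real sound cards may differ in detail.

```python
import math

def quantize(value, bits):
    """Round a value in [-1.0, 1.0] to the nearest of 2**bits evenly spaced levels."""
    step = 2.0 / (2 ** bits - 1)
    return -1.0 + round((value + 1.0) / step) * step

def max_quantization_error(bits, n_points=1000):
    """Largest rounding error over one cycle of a unit-amplitude sinusoid."""
    wave = [math.sin(2 * math.pi * n / n_points) for n in range(n_points)]
    return max(abs(quantize(v, bits) - v) for v in wave)

error_8 = max_quantization_error(8)
error_16 = max_quantization_error(16)
print(error_16 < error_8)  # True: more levels, smaller rounding error
```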

30
Q

Clipping

A

When the recording level is turned up too far for the system’s dynamic range, so the amplitude peaks are cut off in the recording
Result: distortion
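Clipping in miniature, assuming the system can only represent amplitudes between -1.0 and 1.0 (the limit and the sample values are hypothetical):

```python
def clip(samples, limit=1.0):
    """Cut off any amplitude peak that falls outside the representable range."""
    return [max(-limit, min(limit, s)) for s in samples]

too_loud = [0.5, 1.4, -2.0, -0.3]
print(clip(too_loud))  # [0.5, 1.0, -1.0, -0.3]
```

The peaks at 1.4 and -2.0 are flattened to the limit, so the recorded shape no longer matches the original wave.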

31
Q

Factors to remember when recording to fully utilize the dynamic range without clipping

A
  • outside noise in the environment
  • speakers raise and lower their voices, turn their heads, shift their bodies
  • papers crinkle with scripts
  • *head mounted microphone set to the side of the speaker’s lips can reduce variation + watch level meter
32
Q

Uni-directional vs omni-directional microphones

A

Uni: designed for single talker
Omni: pick up sound from all directions so best for recording multiple speakers on one channel

33
Q

The most basic representation of a speech file

A

A waveform

Aka a graph of changes in air pressure (amplitude) over time

34
Q

What type of sound has the highest relative amplitude in a waveform?

A

Vowels (bc mouth is open)
Also complex repeating pattern (periodic)
**diffs between absolute amplitude in vowels of different waveforms are just due to variation in how loudly each utterance is spoken

35
Q

Sonorant consonants like nasals and laterals look like vowels in waveforms. What’s the difference?

A

Lower amplitude

Less complexity

36
Q

Waveforms of voiced stops

A

Periodic
Lower amplitude than vowels and sonorants (sound of the vocal folds opening and closing is beating through the closed vocal tract)
Transient burst when closure is released into the vowel
In American English: [b, d, g]=periodic energy dies down during the closure unless stop is between other voiced sounds

37
Q

Voiceless fricatives (waveform)

A

No repeating pattern, appear as random noise
Strident fricatives=high amplitude
Non strident= may have very low amplitude, may be hard to distinguish from voiceless stops (clue: fricatives not followed by burst)

38
Q

Voiceless stops (waveform)

A

Easy to spot bc silent during closure phase (no amplitude) so appear as flat line in waveform (unless there is background noise)
Usually followed by a burst
Aspirated stops: followed by aspiration noise

39
Q

Voiced fricatives (waveform)

A

Combine periodicity and noise

As with voiced stops: periodicity can die out toward the end of the consonant

40
Q

Marking off segments based on points of closure and release etc in a waveform

A

Segmentation

Speech analysis programs allow for this

41
Q

How do you see the difference between aspirated and unaspirated consonants?

A

In differences in VOT (voice onset time) aka the amount of time that elapses between the release of the consonant and the onset of periodicity for the vowel
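VOT is just a subtraction of two time points marked on the waveform. A sketch with made-up times (the function name and the measurements are hypothetical, not data from any recording):

```python
def voice_onset_time_ms(release_s, voicing_onset_s):
    """Time from consonant release to the onset of periodicity, in milliseconds."""
    return (voicing_onset_s - release_s) * 1000.0

# Hypothetical segmentation points read off two waveforms (seconds)
aspirated = voice_onset_time_ms(release_s=0.100, voicing_onset_s=0.160)
unaspirated = voice_onset_time_ms(release_s=0.100, voicing_onset_s=0.110)
print(round(aspirated), round(unaspirated))  # longer VOT for the aspirated stop
```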

42
Q

Spectral analysis

A

Allows us to analyze segment quality (allows us to quantify, visualize and analyze component frequencies and thus to quantify, visualize and analyze the details of sound quality)
Involves algorithms which mathematically analyze the signal in order to accomplish what the electronic filters in the sound spectrograph did: to test the strength of diff frequencies that might be present

43
Q

Waveform of glottal vibration

A

A “sawtooth” wave aka steep upslope (because of pressure increase when the vocal folds are blown open) and then a gradual decrease (as they’re pulled together by the Bernoulli effect)
Periodic pattern

44
Q

How do harmonic frequencies of vocal fold vibration relate to the fundamental frequency?

A

Harmonic frequencies will always occur at integer multiples of the fundamental frequency (the period of each sub-vibration has to fit exactly into the period of the fundamental)
Ex: for a voice with an f0 of 100 Hz, harmonics occur at 200 Hz, 300 Hz, 400 Hz, etc.
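The integer-multiple rule as a short sketch (the function name and the 500 Hz ceiling are chosen only for illustration):

```python
def harmonics_above_f0(f0_hz, max_hz):
    """Integer multiples of f0 (2 x f0, 3 x f0, ...) up to and including max_hz."""
    return [f0_hz * k for k in range(2, max_hz // f0_hz + 1)]

print(harmonics_above_f0(100, 500))  # [200, 300, 400, 500]
```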

45
Q

The lower the voice = the more ___ the harmonics

A

The more dense the harmonics

46
Q

Why are women’s voices disadvantaged in spectral analysis?

A

Typically higher f0 = fewer harmonics present + more breathiness = harmonics that are present may have lower amplitude (especially at higher frequencies)
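Harmonic density follows from the integer-multiple rule: harmonics are spaced f0 apart, so a lower f0 packs more of them into the same frequency range. A sketch with hypothetical f0 values of 100 Hz and 200 Hz and a 4,000 Hz ceiling:

```python
def harmonic_count(f0_hz, max_hz):
    """How many integer multiples of f0 fall at or below max_hz."""
    return max_hz // f0_hz

print(harmonic_count(100, 4000))  # 40
print(harmonic_count(200, 4000))  # 20
```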

47
Q

White noise

A

APERIODIC sound

Pressure variations are totally random

48
Q

Wide/broad-band spectrogram vs a narrow band spectrogram

A

Wide/broad band spectrogram: formant frequencies (regions of high amplitude energy; reflecting changes in resonance frequencies as vocal tract articulators change position) show up as broad bands (in wide band=spectra taken from short windows of speech signal at frequent intervals so the changes over short time periods are evident)

Narrow band spectrogram: when windows at less frequent intervals are used; individual harmonics can be distinguished but time dimension is less precise
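The time/frequency trade-off can be sketched with a common rule of thumb: an analysis window of duration T can only separate frequencies roughly 1/T apart. The exact figure depends on the window shape, and the window lengths and f0 below are assumptions for illustration.

```python
def frequency_resolution_hz(window_s):
    """Approximate smallest frequency spacing a window of this duration can separate."""
    return 1.0 / window_s

def resolves_harmonics(window_s, f0_hz):
    """Harmonics are f0 apart, so they are distinguishable if the resolution is finer than f0."""
    return frequency_resolution_hz(window_s) < f0_hz

short_window = 0.005  # 5 ms: wide-band style (hypothetical setting)
long_window = 0.025   # 25 ms: narrow-band style (hypothetical setting)

print(resolves_harmonics(short_window, 100))  # False: harmonics blur into broad bands
print(resolves_harmonics(long_window, 100))   # True: individual harmonics visible
```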