Post Midterm 2 Flashcards
Music is a form of _ conversation or _ sound
Emotional, organized
_: Quality of tones that range from high to low, often organized on a musical scale
Pitch
Pitch: Quality of _ that range from _ to _, often organized on a musical scale
tones, high, low
_: the experience of a sequence of pitches as belonging together
Melody
Melody: the experience of a _ of _ as belonging together
sequence, pitches
_ : Refers to various qualities of sound that differ across musical instruments
Timbre
_, _, _ (HCD): Qualities of sound (positive or negative) that emerge when multiple pitches are played together
Harmony, Consonance, Dissonance
_ refers to a temporal structure created by the inter-onset interval of notes (the time between the onset of notes, not the duration of those notes)
Rhythm
Some adaptive functions of music:
Playing a foundational role in the development of _ (Humans _ before they _)
Attracting _ partners
Playing a role in _ _ and _ _
language, sang, spoke
sexual
social bonding, group cohesion
On the other hand, music may have emerged as a byproduct of other systems that have adaptive functions, such as _, _, _ ('auditory cheesecake', Pinker, 1997)
hearing, language, and emotion
Although musical conventions differ across cultures, there are many universal aspects of music:
Music can give rise to various _
Sequences of notes close in pitch are _ _
_ sing to their _
People listening to music tend to start _ with various properties of music
Has a _ context
emotions
grouped together
Caregivers, infants
moving in sync
social
Music is associated with various positive outcomes:
Musical training improves _ (e.g. math, greater emotional sensitivity, language skills, timing perception, etc.)
Music produces _ feelings
Music evokes _
performance in other areas
positive
memories
MEAM
Music-Evoked Autobiographical Memory
The ability of music to evoke memories may be particularly useful for those experiencing various forms of _ _
El Haj et al. (2013) found listening to two minutes of familiar music led to better _ _ in a group of _ patients, as compared to two minutes of silence
(healthy controls did/did not show this difference)
cognitive decline
memory retrieval, Alzheimer’s,
did not
The reasons for music's cognitive enhancement probably relate to the _ _ of the brain caused by listening to music, including but not limited to:
Amygdala and nucleus accumbens ( _ )
The hippocampus ( _ )
Cerebellum and motor cortex ( _ )
_ cortex (while reading music, watching performances), _ cortex (touch feedback while playing instruments), _ cortex (modelling the structure of a piece of music, generating expectations, etc.)
widespread activation
emotion, memory, movement, Visual, sensory, Prefrontal
The Beat: _ spaced intervals of _, which can occur even when there are no _
This creates a framework for _ components of music to _ _ (notes, rhythm, etc.)
Equally spaced intervals of time, which can occur even when there are no notes
other components of music to 'fit into' (notes, rhythm, etc.)
Equally spaced intervals of time, which can occur even when there are no notes
beat
Grahn and Rowe (2009) had participants listen to either 'beat' or 'non-beat' stimuli (beat stimuli included shorter notes that fell directly on the beat, increasing awareness of it)
Found that _ _ activity was greater for _ stimuli, as compared to _ stimuli
Also found greater neural connectivity between _ _ areas and _ structures while listening to beat stimuli
Connectivity can be assessed by…
basal ganglia, beat, non-beat
cortical motor, subcortical
measuring how correlated activity across areas is, or in other words checking whether activity in one region can predict activity in another
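A minimal sketch of the correlation idea above (not Grahn and Rowe's actual analysis; the region names and signals are simulated for illustration): connectivity is high when activity in one region tracks activity in the other.

```python
import numpy as np

# Simulated activity for two regions that share a common driving signal
rng = np.random.default_rng(0)
shared = rng.standard_normal(200)                          # e.g. a beat-related signal
motor_cortex = shared + 0.5 * rng.standard_normal(200)     # simulated region 1
basal_ganglia = shared + 0.5 * rng.standard_normal(200)    # simulated region 2

# 'Connectivity' here = how well activity in one region predicts the other
r = np.corrcoef(motor_cortex, basal_ganglia)[0, 1]
print(f"correlation between regions: r = {r:.2f}")         # high r -> strongly connected
```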
Chen et al. (2008) found activity in the _ cortex increased across three conditions that involved listening to beat stimuli:
Tapping along, Listening with the intention of tapping along later, Passive listening
Activity increased the most in the '_ _' condition (followed by the '_' condition, then passive listening)
demonstrates a close link between the _ and _
premotor
tapping along, listening with the intention of tapping along later
beat and movement
Fujioka et al. (2012) measured brain waves while people listened to music and found they _ in time with the _
This isn't quite the same thing, of course, but, at least in a superficial way, it resembles what happens during _ locking
oscillated in time with the beat
phase
Meter: the organization of _ into _ or _ (often accenting the first beat in each bar)
Metrical structure can be created by _ notes in various ways (playing a note louder, with a stronger attack, etc.)
the organization of beats into bars or measures (often accenting the first beat in each bar)
accentuating
the organization of beats into bars or measures (often accenting the first beat in each bar)
meter
Rhythm refers to a _ structure created by the inter-onset _ of _ (the _ between the onset of notes, not the _ of those notes)
The beat can be likened to the pulse of music, creating a _ _ that the melody fits into to create a rhythmic pattern
temporal structure created by the inter-onset interval of notes (the time between the onset of notes, not the duration of those notes)
regular framework
temporal structure created by the inter-onset interval of notes (the time between the onset of notes, not the duration of those notes)
rhythm
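To make the beat/IOI distinction above concrete, here is a minimal sketch with made-up onset times: the rhythm comes from the inter-onset intervals (differences between onsets, ignoring durations), while the beat is just an equally spaced grid that continues whether or not a note lands on it.

```python
import numpy as np

onsets = np.array([0.0, 0.5, 1.0, 1.75, 2.0])    # hypothetical note onset times (seconds)
durations = np.array([0.4, 0.2, 0.5, 0.2, 0.3])  # note durations (irrelevant to the IOIs)

iois = np.diff(onsets)                           # inter-onset intervals: [0.5 0.5 0.75 0.25]
print("IOIs (rhythm):", iois)

beat = np.arange(0.0, 2.5, 0.5)                  # equally spaced beat grid, every 0.5 s
print("beat times:", beat)                       # the beat continues even where no note falls
```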
Syncopation occurs when
This produces a violation of expectations of sorts, and can lead to decreased/increased neural activation
notes are played ‘off the beat’, creating what can be described as a ‘jumpiness’
increased
notes are played ‘off the beat’, creating what can be described as a ‘jumpiness’
syncopation
Iversen et al. (2009) recorded MEG activity when presenting the same stimuli but with instructions to either imagine accents on the first or second note
Our ability to change _ with our mind is reflected indirectly/directly by activity in the brain
meter, directly
Phillips-Silver and Trainor (2005) studied whether how we move influences our perception of meter - 7 month-old infants listened to an auditory stimulus with ambiguous meter
The infants were bounced up and down in either a duple or triple pattern
A head turning procedure was then used to test whether the infants now preferred listening to an auditory stimulus with either a duple or triple pattern of meter
Infants preferred listening to the pattern that was _ with how they were bounced
consistent
t/f: Different languages have different _ patterns
_ words are not stressed, whereas _ words are, across many languages
In English, content words typically come after function words, so the dominant stress pattern is _
e.g. the book
In Japanese, stress pattern:
stress
Function, content
short-long
long-short
Participants listened to a series of alternating short and long tones
where each group of tones begins and ends is ambiguous
Native English speakers more likely to perceive meter as _
Native Japanese speakers more likely to perceive meter as _
short-long
long-short
Auditory stream integration refers to grouping…
melody refers to the experience of a sequence of …
Similar to the Gestalt principle of similarity and/or proximity, tones that are close together (in pitch) should be more likely to be _ _ (i.e. as part of a single/coherent melody)
The most common interval size between notes within a melody is _ semitones
notes/tones/sounds, etc. together within a single perceived stream to form a coherent melody
pitches as belonging together
grouped together, 1-2
notes/tones/sounds, etc. together within a single perceived stream to form a coherent melody
Auditory stream integration
Certain trajectories of notes are also common, such as the _ _ (involving rising and falling tones)
In general, large 'jumps' in the distance between notes are more likely to _ in pitch
These jumps often involve the melody 'turning around' to fill in the gap (referred to as _ _, i.e. the 'missing' notes between whatever notes were just played)
arch trajectory
increase
gap fill
Tonality refers to
Beginning/ending compositions with the _ note is a common practice
e.g. a song in the key of C might start and end on a C note
A tonal hierarchy can be specified, which indicates…
The tonic (1st note) has the greatest stability, then the _ note, then the _ note
organizing pitches around the note associated with the composition’s key (referred to as the tonic)
tonic
how well each note fits into a scale
5th, 3rd
Krumhansl and Kessler (1982) had participants rate how well a probe tone matched with a scale they just heard
Found the _ was rated as more compatible than other notes (‘tonal hierarchy’), which may be based on _ _
tonic
prior experience
We can contrast two general approaches to understanding the relationship between music and emotion:
The Cognitivist Approach:
The Emotivist Approach:
listeners can perceive the emotional meaning of a piece of music, but they don't actually feel the emotions
listeners' emotional response to music involves actually feeling the emotions
listeners can perceive the emotional meaning of a piece of music, but they don't actually feel the emotions
listeners' emotional response to music involves actually feeling the emotions
cognitivist approach
emotivist approach
Music is sometimes described as producing ‘thrills’ (Oxford dictionary definition:
Most commonly reported physical responses of musicians (Sloboda, 1991): shivers, laughter, lump in throat, tears
a nervous emotion or tremor caused by intense emotional excitement… producing a slight shudder or tightening through the body), or 'chills'
Eerola et al. (2013) found that _ and _ had considerable effects on valence
Various other effects have been noted:
Greater Loudness: _ arousal, _ scary,_ peaceful
Higher Registers (pitch): _ scary, _ happy
Greater Dissonance: _ tension
key and tempo
Greater Loudness: + arousal, + scary, - peaceful
Higher Registers (pitch): - scary, + happy
Greater Dissonance: + tension
Many cognitive processes can be understood from the perspective of our system attempting to build (and constantly update) a model of what we think is _ _ to be happening in our environment
This involves making many predictions
_ of expectations can create weak/strong (emotional or otherwise) responses
most likely
Violations
strong
t/f: Like language, music has syntax (or ‘rules’) that govern how we expect the pieces are supposed to come together
In the context of language, the _ _ component is thought to index awareness of _ _
Researchers have tried measuring how the brain responds to violations of various kinds of _ _ using similar methods
t
P600 ERP
syntax violations
musical expectations
Patel et al. (1998) recorded P600 activity for listeners of a phrase that was followed by one of three target chords:
Larger/smaller P600s found for the target chord from the _ key (and to a lesser extent, the _ key)
t/f: people recognize these violations in similar ways as with language
Larger
far, near
t
Electrical Responses: Unexpected notes can generate a response that is referred to as the _ (ERAN), which occurs in the right hemisphere just a bit earlier/later than the P600 just discussed
Brain Scanning: particularly active brain areas while listening to music are the _ (associated with processing emotion), the _ (associated with reward), and the _ (associated with memories)
early right anterior negativity (ERAN)
earlier
amygdala, NAc, hippocampus
Salimpoor et al. (2011) asked participants to rate the intensity of ‘chills’ and pleasure while listening to music and found that both were positively/negatively associated with activity in the NAcc
This was interpreted as relating to _ activity (e.g. release)
Mallik et al. (2017) found that _ , an opiate antagonist, reduced/increased the emotional response to music
This may implicate _ in the emotional experience of music
positively, dopaminergic
naltrexone, reduced, endorphins
Amygdala damage has been found to:
_ the pleasurable musical ‘chill’ response (Griffiths et al., 2004),
Disrupt the ability to perceive the _ tone of a piece of music (Gosselin et al., 2005)
Patients with parahippocampal damage have been found to rate dissonant music as being slightly _ (in contrast to healthy controls, who rate it as _)
Reduce/prevent, emotional
pleasant, unpleasant
Although operating in a different modality, many of the challenges faced by the perceptual system when interpreting speech are similar in nature to those associated with _ perception
A common theme is trying to reduce _ to derive the most likely meaning
visual
ambiguity
The acoustic signal (or acoustic stimulus) is the term for the _
This is produced by air that is pushed up from the _ through the _ _ and then into the _ _
stimulus for speech
lungs, vocal cords, vocal tract
Vowels are produced by _ of the _ _ that accompanies changes in the shape of the _ _
These changes are caused by moving _ (structures like the tongue, lips, teeth, jaw, and soft palate), which change the _ _ of the vocal system
The change in resonance produces _ in pressure at a number of frequencies (referred to as _ )
Each vowel is associated with a characteristic series of _
vibration, vocal cords, vocal tract
articulators, resonant frequency
peaks, formants, formants
The formant with the lowest frequency is called the _ _ , the _ _ is the next highest, etc.
These can be visualized using _ _
first formant (F1), second formant (F2), sound spectrograms
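As a rough illustration of how formants show up in a spectrogram, the snippet below builds a synthetic 'vowel' from two sine components at assumed F1/F2-like frequencies (not real speech) and computes its spectrogram with SciPy; the strongest energy bands sit near those frequencies.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 16000                                   # sampling rate (Hz)
t = np.arange(0, 1.0, 1 / fs)
f1, f2 = 700, 1200                           # assumed, roughly /a/-like formant values
vowel = np.sin(2 * np.pi * f1 * t) + 0.5 * np.sin(2 * np.pi * f2 * t)

freqs, times, power = spectrogram(vowel, fs=fs, nperseg=512)
peak_hz = freqs[np.argmax(power.mean(axis=1))]
print(f"strongest frequency band is near {peak_hz:.0f} Hz")   # close to F1 here
```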
Consonants are produced by _ of the _ _
Formant transitions: rapid changes in _ preceding or following consonants
constrictions, vocal tract
frequency
t/f: Although vowels and consonants are fundamental units of speech, these are not the smallest ‘building blocks’ of speech
Phoneme: the _ unit of speech capable of changing the _ of a word
e.g. ‘bit’ becomes ‘pit’ by changing the ‘b’ to a ‘p’
t
smallest, meaning
While words are ‘built’ by putting together _ in different combinations, the acoustic signal produced for any given phoneme is _
This can be referred to as a _
Nevertheless, providing yet another example of perceptual constancies (e.g. colour constancy, size constancy, etc.), our perceptual systems can still recognize differing _ signals as representing the same _
phonemes, variable
lack of invariance
acoustic, phoneme
The sounds produced by a single phoneme can be different depending on what phoneme comes before and after it ( _ )
e.g. we perceive the ‘b’ sound in ‘bat’ and ‘boot’ as essentially being the same ‘b’ sound, although our mouth shape changes (which affects the acoustic signal)
coarticulation
Additional sources of variability come from differences across _
e.g. It has been estimated that there are 50 ways to pronounce the word 'the' (Waldrop, 1988), related to such factors as pitch, accent, speed of speaking, etc.
speakers
Some sources of variability can differ between _ _ , as well as within the _ person (at different times)
‘ _ ’ pronunciation is one such example
e.g. the 't' in 'best buy' might not always be pronounced ('bes buy'), or the second 'd' in 'did you' ('dijoo'), etc.
different people, same
Sloppy
_ perception occurs with speech, given that a wide range of acoustic cues results in the perception of a _ number of sound categories
e.g. we perceive wavelengths between about 450-479 nm as being blue, whereas wavelengths starting at about 480 nm get categorized as green
Categorical, limited
In speech perception, one continuous property that seems to be related to categorical perception is _ , which is the delay between when a speech sound begins and when the vocal cords start vibrating
Same acoustic signal can be perceived differently by varying the VOT
Thus, even though VOT is a continuous property, the listener perceives only two categories, depending on the VOT: /da/ on one side of the boundary, /ta/ on the other side
voice onset time (VOT)
voice onset time (VOT): delay between when a _ sound begins and when the _ _ start _
speech, vocal cords, vibrating
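A toy sketch of the categorical perception idea: VOT varies continuously, yet the reported percept flips between just two categories at a boundary (the ~35 ms boundary below is an assumed value for illustration, not a measured one).

```python
def perceived_phoneme(vot_ms, boundary_ms=35):
    # Continuous VOT goes in, but only two discrete perceptual categories come out
    return "/da/" if vot_ms < boundary_ms else "/ta/"

for vot in (0, 10, 20, 30, 40, 60, 80):
    print(f"VOT = {vot:2d} ms -> perceived as {perceived_phoneme(vot)}")
```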
Like many things we perceive, speech perception can be influenced by _ _
Visual information provided is one such influence (and can be referred to as ‘ _ _ perception’)
The _ Effect was introduced in our very first lecture and involves visual input changing speech perception
multimodal integration, audiovisual speech
McGurk
Kriegstein (2005) presented participants with stimuli using both familiar and unfamiliar voices, while using fMRI
The _ was found to be activated for all speech stimuli (consistent with prior work associating it with speech perception), though _ (but not _ ) voices also activated the _
Provides a physiological basis for a link between _ and _
superior temporal sulcus (STS)
familiar, unfamiliar, fusiform face area (FFA)
speech perception and facial processing
Phonemes are less/more easily perceived when they appear in a _ context
Represents an influence of _ - _ perception
Rubin et al. (1976): Participants recognized phonemes more quickly when presented as part of _ words, as compared to ‘ _ ’ words (e.g. bat vs. baf)
more, meaningful
top-down
real, nonsense
Missing phonemes can also be ‘ _ _’ based on expectations (the _ _ effect)
Warren (1970): embedded a cough in a recording of a sentence and asked participants to report: 1: where in the sentence the cough occurred, and 2: where any phonemes were missing
Participants could not accurately place _ the cough occurred, nor recognize the _ _ that was removed
filled in, phonemic restoration
where, missing phoneme
Miller and Isard (1963) asked participants to ‘shadow’ (listen with headphones and repeat aloud what is heard) three kinds of sentences:
Grammatically correct sentences; Anomalous sentences that correctly follow grammatical rules but do not make sense, e.g. 'gadgets kill passengers from the eyes'
Ungrammatical strings of words, e.g. 'between gadgets highways passengers the steal'
Accuracy decreased across _ type (89% for grammatically correct, 79% for anomalous, 56% for ungrammatical)
Adding background noise produced a _ , but more _, pattern (63% for grammatically correct, 22% for anomalous, 3% for ungrammatical)
These results show that arranging words in a meaningful pattern enhances our ability to _ them, which can demonstrate an effect of _ _ on the perception of linguistic representations
sentence
similar, extreme
recognize, prior knowledge
A similar problem affects visual perception (identifying individual objects, separating figure/ground, etc.)
As with visual perception, _ processing assists us with this based on factors such as context, meaning, prior knowledge, etc.
The _ problem refers to the fact that, because there are no physical breaks in the continuous acoustic signal, speech segmentation (perceiving individual words) can be a challenge
e.g. How do we distinguish ‘ice cream’ from ‘I scream’, and ‘big girl’ from ‘big Earl’?
Listening to a _ language makes this problem more obvious
top-down
segmentation
foreign
The segmentation problem refers to the fact that, because there are no _ breaks in the continuous _ signal, _ segmentation (perceiving individual words) can be a challenge
physical, acoustic, speech
As we learn a language, we _ begin to acquire expectations for how sounds/words are likely (or unlikely) to be _ _ in particular combinations, based on how frequently we hear those particular combinations
implicitly, put together
The chance that one sound will follow another in a language is referred to as _ _
A (very) general example of this would be that English speakers expect at least one vowel to occur after every few consonants or so
We learn these associations _, through _ learning
transitional probability
implicitly, statistical
Saffran et al. (1996) used a head-turning procedure to test whether eight month-old infants show evidence for statistical learning
Four nonsense words (e.g. bidaku, golabu, padoti, turpiro) were combined in random orders to create two minute-long strings of sounds (e.g. bidakugolabupadotiturpiro)
Transitional probability of two syllables within words = 100%, e.g. 100% of the time the infants heard 'bi', it was followed by 'da' (from bidaku)
Transitional probability of two syllables between words = 33%, e.g. 33% of the time the syllable 'ku' (from bidaku) was presented, it was followed by 'go' (from golabu), because golabu followed bidaku on one third of trials
_ word: stimuli taken from the previously heard nonsense words
_ word: novel stimuli made by combining syllables from different nonsense words
If the infants perceive the nonsense words as things they've heard before (i.e. if they recognize them), the whole-word stimuli should seem more/less novel, which should make them more/less interesting than the more novel part-words
The results supported this prediction, suggesting that the infants were already using _ learning by that age
whole, part
less, less
statistical
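A minimal sketch of the transitional probability computation behind this logic (illustrative only, not Saffran et al.'s actual stimuli or analysis): TP(A -> B) = count of A followed by B, divided by count of A. Within-word pairs come out near 1.0; between-word pairs come out much lower.

```python
import random
from collections import Counter

random.seed(0)
# Three of the nonsense words, written as syllable lists (illustrative only)
words = [["bi", "da", "ku"], ["go", "la", "bu"], ["pa", "do", "ti"]]
stream = [syl for _ in range(300) for syl in random.choice(words)]

pair_counts = Counter(zip(stream, stream[1:]))   # counts of each adjacent syllable pair
first_counts = Counter(stream[:-1])              # counts of each syllable as a 'first' element

def tp(a, b):
    # Transitional probability: how often syllable a is immediately followed by syllable b
    return pair_counts[(a, b)] / first_counts[a]

print(f"within-word  TP('bi' -> 'da') = {tp('bi', 'da'):.2f}")  # ~1.0
print(f"between-word TP('ku' -> 'go') = {tp('ku', 'go'):.2f}")  # ~0.33 with three words
```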
Davis et al. (2005) used noise-vocoded speech (a method to add noise to an acoustic signal) and asked participants to identify the words they were perceiving in each sentence
Accuracy was close to _ % for the first sentence and gradually _ across sentence number
Additional information provided by the preceding sentences provides some context and can lead to _ effects, in which related words in later sentences are _ to identify
Speaks to the role of _ processing in perceiving language
0, increases, pop-out, easier, top-down
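For context, a rough sketch of what noise-vocoding involves (band count, band edges, and the placeholder signal are assumptions; Davis et al.'s exact procedure may differ): split the signal into frequency bands, extract each band's slow amplitude envelope, and use that envelope to modulate band-limited noise, which keeps temporal cues but discards fine spectral detail.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 16000
speech = np.random.randn(fs)                 # placeholder for 1 s of real speech

def bandpass(x, lo, hi, fs, order=4):
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

band_edges = [100, 400, 1000, 2500, 6000]    # 4 illustrative frequency bands
vocoded = np.zeros_like(speech)
for lo, hi in zip(band_edges[:-1], band_edges[1:]):
    band = bandpass(speech, lo, hi, fs)
    envelope = np.abs(hilbert(band))         # slow amplitude envelope of this band
    carrier = bandpass(np.random.randn(len(speech)), lo, hi, fs)
    vocoded += envelope * carrier            # envelope-shaped noise replaces the band

# Intelligibility of `vocoded` speech generally improves as more bands are used.
```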
Although it has largely fallen ‘out of fashion’, Liberman et al. (1963, 1967) proposed a motor _ _ _ perception
This was partly developed as a response to the lack of _ (or, in simpler terms, the presence of _ !) associated with phonemes contained with the acoustic signal
The thinking was, even if the physical properties of the acoustic signal change as a function of things like the speaker (e.g. accents), coarticulation, etc., if everyone produces phonemes using the same mouth movements, maybe our perceptual system can use _ representations as a ' _ ' (invariant, i.e. not variable) form of phoneme representation
motor theory of speech
invariance, variability, motor, stable
D’Ausilio et al. (2009) used TMS to test the hypothesis that enhancing activity in _ regions may assist with the ability to recognize various _ of speech
Found that stimulation in areas associated with _ movements sped up/slowed down responses to sounds of speech involving the same/different areas (with a similar result for areas/stimuli related to tongue movements)
Recall that the _ organization of the somatosensory cortex allows for these kinds of precise manipulations
motor, sounds
lip, sped up, same
somatotopic
Silbert et al. (2014) used fMRI to check neural activity across two conditions:
Telling a story ( _ ), Listening to a story ( _ )
Found both _ and _
production, comprehension
differences and overlap
Note that some models consider networks supporting production and comprehension as _
separate
Broca’s aphasia results from damage to Broca’s area (in the _ lobe)
These patients have speech that is slow, laboured, and often involves jumbled sentences, which all seem to relate to a general deficit in processing sentence structure (grammar, syntax, etc.)
This general deficit may most obviously affect _, though it can also cause problems with comprehension
frontal
production
Wernicke’s aphasia: results from damage in Wernicke’s area (in the _ lobe)
Patients can speak fluently and form grammatically _ sentences, though the content is _ and not _
These patients have even more difficulty with _ (as compared to those with Broca’s aphasia)
May be associated with word deafness in extreme cases (inability to recognize words)
temporal, correct, disorganized, meaningful
comprehension
A 'voice area' in the _ has been identified that is activated more strongly by _ than other sounds (Belin et al., 2000)
‘Voice cells’ in the _ lobe of monkeys have also been found which respond more strongly to recordings of _ calls than to calls of other animals (Perrodin et al., 2011)
Also in the temporal lobe, single-cell recordings have located neurons in humans that respond more strongly to _ (Mesgarani et al., 2014)
superior temporal sulcus (STS), voices
temporal, monkey, phonemes
Some of the neurons identified in Mesgarani et al. (2014) also seem tuned to selectively respond to more general phonetic features, such as:
_ _ articulation: what you actually do with your articulators (how you move your tongue, lips, etc. when pronouncing certain phonemes)
_ of articulation: where in your mouth the articulators are manipulated (back of the throat, front of mouth close to teeth, etc.)
Manner of
Place
mirror neuron implications?
The last few slides discussed parts of the temporal lobe which are involved with recognizing speech (which coincides with the general location of the ‘what’, or ventral, stream)
Recall that we saw in the previous chapter how sound localization seems to involve the where/how, or dorsal, stream
The dual stream model of speech perception proposes that the ventral and dorsal pathways are involved with identifying sounds of speech and representing movements associated with sounds of speech, respectively
Eimas et al. (1971): tested habituation with different VOT’s using suckling (rather than looking) time
After habituating to a baseline phoneme, the infants _ to one kind of change (that adults would perceive as being a different phoneme) but not another (that adults would not perceive as being a different phoneme)
dishabituate
If first habituating to a particular acoustic signal with a VOT of 20 ms (sounds like 'ba' to an adult), they then _ to an increase in the VOT to 40 ms (sounds like 'pa' to an adult)
If first habituating to a particular acoustic signal with a VOT of 60 ms, they do/do not dishabituate to an increase in the VOT to 80 ms (both sound like 'pa' to an adult)
dishabituate
do not
While infants’ ability to perceive phonemes improves with time/experience, their ability to discern sounds not commonly used in their native language _
Kuhl et al. (2000) study:
At 6 months of age, American and Japanese infants are equally good at discriminating phonemes /ra/ and /la/
By about 12 months, the American infants have gotten _, while the Japanese infants have gotten _
diminishes
better, worse
Results of Kuhl (2000) demonstrate a typical trade-off related to experience-dependent plasticity: we get _ at what we practice, and become _ at what we don’t
The _ _ hypothesis proposes that our brain 'gates' specific mechanisms that are important/required for speaking particular languages
better, worse
social gating