Human Perception (Auditory) Flashcards
How we perceive sound (from sound wave to auditory nerve)
Sound waves travel through the air and enter the ear canal, where they cause the eardrum to vibrate. These vibrations are transmitted through the three middle ear bones (the ossicles) to the cochlea, a fluid-filled, spiral-shaped organ in the inner ear. Hair cells in the cochlea convert the mechanical vibrations into neural signals, which are sent through the auditory nerve to the brain, where they are perceived as sound.
How changes in sound pressure level (dB) translate to perception
Perceived loudness grows roughly logarithmically with sound pressure, which is why level is expressed in decibels. As a rule of thumb, a 10 dB increase in sound pressure level is perceived as about a doubling of loudness, and a 20 dB increase as about a quadrupling, even though 20 dB corresponds to a tenfold increase in sound pressure. This compression is what lets the ear cope with a very wide range of sound pressures.
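The rule of thumb above can be sketched in a few lines. This is only an illustration: the function names are made up for this card, and the "+10 dB is about twice as loud" relation is an approximation that holds best at moderate levels.

```python
def pressure_ratio(delta_db: float) -> float:
    """Sound pressure ratio that corresponds to a level change in dB."""
    return 10 ** (delta_db / 20)


def loudness_ratio(delta_db: float) -> float:
    """Approximate perceived loudness ratio, assuming +10 dB is about 2x as loud."""
    return 2 ** (delta_db / 10)


if __name__ == "__main__":
    for delta in (10, 20):
        print(delta, "dB:",
              "pressure x", round(pressure_ratio(delta), 2),
              "| loudness x", round(loudness_ratio(delta), 2))
```

A 20 dB step multiplies the sound pressure by 10 but is heard as only about four times as loud.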
At which frequencies we can hear, and how hearing is frequency-dependent
Humans can typically hear frequencies in the range of 20 Hz to 20,000 Hz, with the greatest sensitivity around 2-4 kHz. Hearing is frequency-dependent because different parts of the cochlea respond to different frequencies: the basal end (near the middle ear) is most sensitive to high frequencies and the apical end is most sensitive to low frequencies. As a result, two tones with the same sound pressure level but different frequencies can be perceived as differently loud.
How we know from where in the room a signal is coming from – and even how large the room is
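We can determine the location of a sound source in a room by using spatial cues, which include the difference in sound arrival time at the two ears, the difference in sound level at the two ears, and the filtering of sound by the head, outer ears, and body. The size of the room is inferred from its reflections: the delay between the direct sound and the first reflections, and how long the reverberation lasts.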
What masking is (temporal and frequency-dependent masking, including spatial release from masking)
Masking is the phenomenon where one sound makes it harder to hear another sound. Temporal masking occurs when a loud sound makes it difficult to hear a quieter sound that comes shortly before it (backward masking) or shortly after it (forward masking). Frequency-dependent masking occurs when a sound at a certain frequency makes it difficult to hear simultaneous sounds at nearby frequencies. Spatial release from masking refers to the phenomenon where the masking effect is reduced when the masking sound and the target sound come from different locations.
Please describe the effect of inter-aural time differences. For which frequencies does it work and why is it limited to those frequencies?
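The inter-aural time difference (ITD) is the difference in arrival time of a sound at the two ears, which the brain uses to locate a source in the horizontal plane. It works best for low frequencies, below roughly 1.5 kHz. At higher frequencies the wavelength becomes shorter than the distance between the ears, so the phase difference between the two ear signals becomes ambiguous and no longer indicates a unique direction.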
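To get a feel for the size of the ITD, here is a minimal sketch using Woodworth's spherical-head approximation, ITD = (r / c) * (theta + sin(theta)). The head radius of 8.75 cm and the speed of sound of 343 m/s are assumed typical values, not measurements, and the function name is hypothetical.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at about 20 degrees C


def itd_seconds(azimuth_deg: float) -> float:
    """Woodworth approximation of the ITD for a distant source.

    azimuth_deg: 0 is straight ahead, 90 is directly to one side.
    The approximation is intended for azimuths between 0 and 90 degrees.
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


if __name__ == "__main__":
    for azimuth in (0, 30, 60, 90):
        print(azimuth, "deg:", round(itd_seconds(azimuth) * 1e6), "microseconds")
```

Under these assumptions the largest ITD, for a source directly to one side, is about 0.66 ms.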
Please describe the cone of confusion for spatial perception
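The cone of confusion is a cone-shaped surface around the axis through the two ears on which all source positions produce nearly the same inter-aural time and level differences. The brain therefore cannot tell these positions apart from the inter-aural cues alone, which leads to front/back and up/down confusions. The ambiguity is resolved by head movements and by the spectral filtering of the outer ear.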
We have learned about the concept of RT60 reverberation time. How can we estimate the RT60 time in a room that is not perfectly quiet?
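The reverberation time RT60 is the time it takes for the sound level to decay by 60 dB after the source is stopped. In a room that is not perfectly quiet, the decay disappears into the background noise before it has dropped the full 60 dB. We therefore excite the room with a loud impulse or interrupted noise, measure the decay over a smaller range that stays above the noise floor, for example from -5 dB to -25 dB (T20) or from -5 dB to -35 dB (T30), and extrapolate the fitted slope to 60 dB.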
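One common way to cope with a noise floor is to fit a straight line to the early part of the decay and extrapolate it to 60 dB. The sketch below assumes the decay curve is already available as levels in dB relative to the starting level; the function name and the default -5 dB to -25 dB evaluation range (often called T20) are illustrative choices.

```python
def rt60_from_decay(times_s, levels_db, upper_db=-5.0, lower_db=-25.0):
    """Estimate RT60 by fitting the decay between upper_db and lower_db.

    times_s:   sample times in seconds
    levels_db: level at each time, in dB relative to the level at the start
    """
    # Keep only the points inside the evaluation range, above the noise floor.
    points = [(t, l) for t, l in zip(times_s, levels_db)
              if lower_db <= l <= upper_db]
    if len(points) < 2:
        raise ValueError("not enough points in the evaluation range")

    # Least-squares straight line through the selected points.
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_l = sum(l for _, l in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in points)
    slope_db_per_s = sxy / sxx  # negative for a decaying sound

    # Extrapolate the fitted slope to a 60 dB drop.
    return -60.0 / slope_db_per_s


if __name__ == "__main__":
    # Synthetic decay of 120 dB/s that levels off at a -40 dB noise floor.
    times = [i * 0.01 for i in range(100)]
    levels = [max(-120.0 * t, -40.0) for t in times]
    print(round(rt60_from_decay(times, levels), 3), "s")
```

In the synthetic example the sound never decays by more than 40 dB, yet the fit over the clean part of the curve still yields an RT60 estimate.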
Give an overview of the localization of sound sources
Localization of a sound source refers to the ability of the brain to determine the position of a sound in space. There are several cues that our brain uses to localize sound:
Interaural time difference (ITD) - This is the difference in time that sound waves take to reach each ear. It helps the brain determine the horizontal location of a sound source.
Interaural level difference (ILD) - This is the difference in level of a sound between the two ears, caused by the acoustic shadow of the head. It helps the brain determine the horizontal location of a sound source and is most effective at high frequencies.
Spectral colouring - This is the direction-dependent filtering of a sound by the outer ear, head, and body. It helps the brain determine the elevation of a sound source and whether it is in front or behind.
Direct-to-reverberation ratio - This is the ratio of the direct sound to the reflected sound in a room. It helps the brain determine the distance of a sound source.
Initial time gap - This is the time delay between the arrival of the direct sound and the first reflections in a room. It helps the brain determine the distance of a sound source.
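As an illustration of the direct-to-reverberation ratio, the sketch below splits a room impulse response into a direct part and a reverberant part and compares their energies in dB. The function name and the 2.5 ms window after the direct-sound peak are assumptions chosen for this example, not a fixed standard.

```python
import math


def direct_to_reverberant_db(impulse_response, sample_rate_hz,
                             direct_window_s=0.0025):
    """Direct-to-reverberant energy ratio of an impulse response, in dB.

    Everything up to direct_window_s after the strongest peak counts as
    direct sound; everything later counts as reverberation.
    """
    # Assume the strongest sample marks the arrival of the direct sound.
    peak = max(range(len(impulse_response)),
               key=lambda i: abs(impulse_response[i]))
    split = peak + int(round(direct_window_s * sample_rate_hz)) + 1

    direct_energy = sum(x * x for x in impulse_response[:split])
    reverb_energy = sum(x * x for x in impulse_response[split:])
    if reverb_energy == 0:
        return math.inf  # no reverberant energy at all
    return 10 * math.log10(direct_energy / reverb_energy)


if __name__ == "__main__":
    # Toy impulse response: one direct spike followed by a weak tail.
    ir = [1.0] + [0.0] * 9 + [0.1] * 10
    print(round(direct_to_reverberant_db(ir, 1000, direct_window_s=0.002), 2), "dB")
```

A nearby source gives a high ratio and a distant source a low one, which is why the ratio serves as a distance cue.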