
SPATIAL RELEASE FROM MASKING
Ruth Y. Litovsky
University of Wisconsin-Madison, Waisman Center, 1500 Highland Avenue, Madison, Wisconsin 53705
Spatial release from masking in adults
In complex auditory environments multiple sounds occur, such as people uttering speech that is of interest, as well as speech sounds with uninteresting content. Additionally, humans spend a great deal of their waking hours in social, work-related and learning environments that contain maskers: background noise, music and various other environmental sounds, all of which can vary in direction, amplitude and familiarity to the listener, and have the potential to interfere with information transmitted by the speech signal. To communicate using spoken language, listeners must be able to use auditory cues to attend to the speech source of interest and ignore other sounds. When you next find yourself in a “cocktail party” environment, imagine the incredible processing the auditory system must carry out to segregate speech from noise.
The ability to segregate speech from maskers is determined by a complex set of auditory computations. This problem was named the “cocktail party effect” 60 years ago (Cherry, 1953; Pollack and Pickett, 1958) and has been the topic of dozens of studies since, in normal-hearing adults and also in children. It has also become a focal point for populations of hearing-impaired individuals, who often experience difficulty hearing speech in noisy situations. These populations include listeners with hearing loss who are fitted with hearing aids, as well as individuals who are deaf and undergo surgery to receive cochlear implants so that they can hear.
Regarding the analysis of acoustic inputs, the auditory mechanisms involved in source segregation either process information from each ear separately (monaural) or compare the information arriving at the two ears and use the interaural (between-the-ears) differences (binaural). In addition, in the process of segregating the target speech signal from competing sounds, the human brain engages in higher-order processes such as auditory attention and memory.
Concerning the acoustic cues in a normally functioning auditory system, when sounds reach the ears from a particular location in space, the spherical shape of the head renders an important set of acoustic cues. Figure 1 provides a schematic of the directionally dependent cues that would potentially be available to listeners in the horizontal plane for a brief signal such as a click. In the horizontal plane, sources presented from directly in front or behind reach the ears at the same time and with the same intensity. Sources that are displaced to the side will reach the near ear before reaching the far ear. Thus, a binaural cue known as interaural time difference (ITD) varies with spatial location; however, the auditory system is particularly sensitive to ITD at frequencies below 1,500 Hz. For amplitude-modulated signals such as speech, ITD cues are also available from differences in the timing of the envelopes (the slowly varying amplitude) of the stimuli. Interaural level difference (ILD) is a second binaural cue that results from the fact that the head creates an acoustic “shadow,” so that the near ear receives a greater intensity than the far ear. ILDs are particularly robust at high frequencies and can be negligible at frequencies as low as 500 Hz.
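To make these magnitudes concrete, the short sketch below estimates ITD using the Woodworth spherical-head approximation, a standard textbook formula that is not developed in this article: for a rigid spherical head of radius a, ITD is roughly (a/c)(theta + sin theta), where c is the speed of sound and theta the source azimuth. The head radius and speed of sound used here are typical assumed values, not measurements from the article.

import math

def itd_woodworth(azimuth_deg, head_radius_m=0.0875, speed_of_sound_m_s=343.0):
    # Woodworth approximation for a rigid spherical head:
    # ITD = (a / c) * (theta + sin(theta)), theta in radians.
    # 0 deg = straight ahead (ITD = 0), 90 deg = directly to the side.
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound_m_s) * (theta + math.sin(theta))

# The 45-degree source of Fig. 1 yields roughly 0.38 ms of delay;
# at 90 degrees the estimate approaches the often-cited maximum near 0.66 ms.
for az in (0, 45, 90):
    print(f"azimuth {az:3d} deg -> ITD {itd_woodworth(az) * 1e6:6.0f} microseconds")

Consistent with the text, the approximation predicts zero ITD for a source directly in front (or behind) and a maximum for a source directly to the side.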
When listening to speech in noise, spatial cues play an important role in improving speech understanding. The improvement emerges when one compares a condition in which the signal and masker are co-located (for example, both at 0 degrees, directly in front of the listener) with a condition in which they are spatially separated (e.g., the target speech at 0 degrees in front and the masker at 90 degrees to the right). This example is illustrated in the schematic in Fig. 2.
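Spatial release from masking (SRM) is commonly quantified as the difference between the speech reception thresholds (SRTs) measured in these two configurations. A minimal sketch of that arithmetic follows; the threshold values are hypothetical, chosen only to illustrate the sign convention, and are not data from this article.

def spatial_release_db(srt_colocated_db, srt_separated_db):
    # SRM in dB: a positive value means the listener tolerated a poorer
    # signal-to-noise ratio once the masker was spatially separated.
    return srt_colocated_db - srt_separated_db

srt_colocated = -2.0  # hypothetical SRT (dB SNR), target and masker both at 0 deg
srt_separated = -8.0  # hypothetical SRT (dB SNR), masker moved to 90 deg
print(f"SRM = {spatial_release_db(srt_colocated, srt_separated):.1f} dB")  # SRM = 6.0 dB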
Fig. 1. In panel A, a schematic of a sound source presented at 45° to the left of the listener is depicted. Panel B shows the time waveforms of impulse responses recorded in the left (thick line) and right (thin line) ear canals for that sound source. Panel C shows the amplitude spectra for the same source, also recorded in the left (thick line) and right (thin line) ear canals. The left-ear response occurs sooner than the right-ear response (see B), hence the interaural time difference (ITD). In addition, the left-ear response has greater amplitude (see C), hence the interaural level difference (ILD).