Page 40 - Spring2019

Hearing in the Classroom
Figure 1. This cartoon illustrates the cocktail party problem in the classroom. In this example, acoustic waveforms are produced by three sources: (1) noise is produced by a computer projector in the classroom; (2) speech is produced by the teacher; and (3) speech is produced by two classmates who are also talking. The fundamental problem is that the acoustic waveforms produced by all three sound sources combine in the air before arriving at the students’ ears. To follow the teacher’s voice, students must “hear out” and attend to their teacher while disregarding the sounds produced by all other sources.
Immaturity at any stage of processing can impact the extent to which students in the classroom hear and understand the target voice. For example, spectral resolution refers to the ability to resolve the individual frequency components of a complex sound. Degraded spectral resolution is one consequence of congenital hearing loss, specifically sensorineural hearing loss caused by damage to the outer hair cells in the cochlea. This degraded peripheral encoding may reduce audibility of the target speech, making it difficult for adults or children with sensorineural hearing loss to perform auditory scene analysis. Perhaps less obvious, immature central auditory processing could result in the same functional outcome in a child with normal hearing. For example, the perceptual consequence of a failure to selectively attend to the speech stream produced by the teacher, while ignoring classmates’ speech, is reduced speech understanding, even when the peripheral encoding of the teacher’s speech provides all the cues required for recognition.
Maturation of Peripheral Encoding
Accurate peripheral encoding of speech is clearly a prerequisite for speech recognition. However, sensory representation of the frequency, temporal, and intensity properties of sound does not appear to limit auditory scene analysis during the school-age years. The cochlea begins to function in utero, before the onset of visual functioning (Gottlieb, 1991). Physiological responses to sound provide evidence that the cochlea is mature by term birth, if not earlier (e.g., Abdala, 2001). Neural transmission through the auditory brainstem appears to be slowed during early infancy, but peripheral encoding of the basic properties of sound approaches the resolution observed for adults by about six months of age (reviewed by Eggermont and Moore, 2012; Vick, 2018).
38 | Acoustics Today | Spring 2019
A competing noise masker can interfere with the peripheral encoding of target speech if the neural excitation produced by the masker overlaps with the neural representation of the target speech. This type of masking can be more severe in children and adults with sensorineural hearing loss than in those with normal hearing. Sensorineural hearing loss is often due to the loss of outer hair cells in the cochlea (reviewed by Moore, 2007). As mentioned above, outer hair cell loss degrades the peripheral encoding of the frequency, intensity, and temporal features of speech, which, in turn, impacts masked speech recognition. Indeed, multiple researchers have demonstrated an association between estimates of peripheral encoding and performance on speech-in-noise tasks for adults with sensorineural hearing loss (e.g., Dubno et al., 1984; Frisina and Frisina, 1997).
Additional evidence that competing noise interferes with the perceptual encoding of speech comes from the results of studies evaluating consonant identification in noise by adults (e.g., Miller and Nicely, 1955; Phatak et al., 2008). Consonant identification is compromised in a systematic way across individuals with normal hearing when competing noise is present, presumably because patterns of excitation produced by the target consonants and masking noise overlap on the basilar membrane (Miller, 1947). In the classroom example shown in Figure 1, overlap in excitation patterns between speech produced by the teacher and noise produced by the projector can result in an impoverished neural representation of the teacher’s spoken message, although this depends on the relative levels of the two sources and distance to the listener. The term energetic masking is often used to describe the perceptual consequences of this phenomenon (reviewed by Brungart, 2005).
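The dependence on the relative levels of the two sources can be made concrete with a small numerical sketch. The Python below is illustrative only and is not taken from any of the studies cited here: it assumes a 200 Hz tone as a stand-in for the teacher's voice, Gaussian noise as a stand-in for the projector, and hypothetical helper names (`rms`, `snr_db`, `scale_masker_to_snr`, `mix`). It shows how a masker can be rescaled to a chosen signal-to-noise ratio in decibels and how the two waveforms add sample by sample before reaching the listener.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def snr_db(target, masker):
    """Signal-to-noise ratio in dB, computed from the two RMS levels."""
    return 20.0 * math.log10(rms(target) / rms(masker))


def scale_masker_to_snr(target, masker, desired_snr_db):
    """Return a rescaled copy of masker so the target-to-masker SNR equals desired_snr_db."""
    gain = rms(target) / (rms(masker) * 10.0 ** (desired_snr_db / 20.0))
    return [gain * x for x in masker]


def mix(target, masker):
    """Waveforms from separate sources add sample by sample in the air."""
    return [t + m for t, m in zip(target, masker)]


# Hypothetical stand-ins (assumptions for illustration, not real recordings):
# a 200 Hz tone for the teacher and Gaussian noise for the projector,
# one second at a 16 kHz sampling rate.
rate = 16000
rng = random.Random(0)
teacher = [math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]
projector = [rng.gauss(0.0, 1.0) for _ in range(rate)]

projector_at_0_db = scale_masker_to_snr(teacher, projector, 0.0)
projector_at_6_db = scale_masker_to_snr(teacher, projector, 6.0)

# What arrives at the ear is the sum of the two waveforms.
at_ear = mix(teacher, projector_at_6_db)

# Moving from 0 dB to 6 dB SNR lowers the masker power by a factor of 10 ** 0.6.
power_ratio = (rms(projector_at_0_db) / rms(projector_at_6_db)) ** 2
```

Under these assumptions, a 6 dB improvement in signal-to-noise ratio corresponds to roughly a fourfold reduction in masker power relative to the target, and a 3 dB improvement to roughly a twofold reduction.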
Despite mature peripheral encoding, school-age children have more difficulty understanding speech in noise than adults do. For example, 5- to 7-year-old children require a 3-6 dB more favorable signal-to-noise ratio (SNR) than adults to achieve comparable speech detection, word identification, or sentence recognition performance in a speech-shaped noise masker (e.g., Corbin et al., 2016). Speech-in-noise recognition gradually improves until 9-10 years of age, after which mature performance is generally ob-