Page 30 - Spring 2015

Pushing the Envelope of Auditory Research with Cochlear Implants
Figure 2. Basic speech processing performed by a CI. The Full Sentence stage shows the waveform and spectrogram of the sentence “A large size in stockings is hard to sell.” The Analysis Stage bandpass filters the signal into contiguous bands. For this example, there are 4 channels that together cover 200 to 8,000 Hz. The corner frequencies are logarithmically spaced. For each channel, the waveform and spectrogram are shown. Surrounding each waveform is a black line, which represents the envelope information of that channel. In the Envelope Extraction Stage, the envelope of each channel was extracted by using the Hilbert transform and then low-pass filtering at 160 Hz. After Envelope Extraction, the envelopes are used to modulate the amplitude of electrical pulse trains for CI users, as shown in the Carrier Stage. Alternatively, sine tones or narrowband noises can be modulated by the envelopes and then summed into a single waveform. That waveform is a vocoded version of the original and is considered a CI simulation, as shown in the Vocoder Synthesis Stage.
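The logarithmic spacing of the corner frequencies mentioned in the Figure 2 caption can be illustrated with a short sketch. The 4 channels and the 200 to 8,000 Hz range come from the caption; the function name `log_spaced_edges` is hypothetical, and clinical processors may use other frequency allocations.

```python
import numpy as np

def log_spaced_edges(f_lo=200.0, f_hi=8000.0, n_channels=4):
    """Return n_channels + 1 corner frequencies (Hz), equally spaced on a log axis.

    Hypothetical helper for illustration; defaults follow the Figure 2 example.
    """
    # geomspace gives a constant ratio between neighboring corner frequencies.
    return np.geomspace(f_lo, f_hi, n_channels + 1)

edges = log_spaced_edges()
for lo, hi in zip(edges[:-1], edges[1:]):
    print(f"channel: {lo:7.1f} Hz to {hi:7.1f} Hz")
```

With logarithmic spacing, every channel spans the same frequency ratio, so the high-frequency channels are much wider in hertz than the low-frequency ones.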
  lations rather than all the fast changes in the temporal fine structure. So the envelope is extracted for each channel (see Figure 2, Envelope Extraction Stage).
The last stage is the carrier stage. The envelope extracted from each channel is used to modulate the amplitude of a high-rate electrical carrier signal (see Figure 2, Carrier Stage). These modulated electrical pulse trains bypass the dead hair cells in the cochlea and excite the spiral ganglia of the auditory nerve directly; the resulting neural activity is interpreted as sound and speech. Because only the envelopes are transmitted, people often say that CIs convey temporal but not spectral information. This is not quite correct, however, because a single channel of temporal envelope information is not very intelligible (which is why the field moved away from single-electrode implants long ago). Rather, CIs present temporal envelope information on a relatively limited number of spectral channels. The number of spectral channels that we use is enough to convey vowel information (Laback et al., 2004), but many finer spectral profile analysis tasks seem inaccessible to CI users (Goupell et al., 2008).
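The three stages (analysis, envelope extraction, carrier) can be sketched as a sine vocoder, that is, the CI simulation of Figure 2 rather than actual electrical stimulation. The 4 channels, the 200 to 8,000 Hz range, the logarithmic spacing, the Hilbert transform, and the 160-Hz low-pass cutoff follow the Figure 2 example. Everything else is an assumption made for illustration: the function name `vocode`, the fourth-order Butterworth filters, the zero-phase filtering, and the choice of a sine carrier at the geometric center of each band.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def vocode(x, fs, n_channels=4, f_lo=200.0, f_hi=8000.0, env_cutoff=160.0):
    """Sine-vocode signal x (1-D array, sample rate fs in Hz, fs > 2 * f_hi)."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)   # log-spaced corners
    # Assumed filter design: 4th-order Butterworth, applied forward and backward.
    env_sos = butter(4, env_cutoff, btype="lowpass", fs=fs, output="sos")
    t = np.arange(len(x)) / fs
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Analysis stage: isolate one band of the input.
        band_sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, x)
        # Envelope extraction: Hilbert magnitude, then 160-Hz low-pass.
        env = np.abs(hilbert(band))
        env = np.maximum(sosfiltfilt(env_sos, env), 0.0)  # clip filter undershoot
        # Carrier stage (simulation): envelope modulates a sine tone.
        carrier = np.sin(2 * np.pi * np.sqrt(lo * hi) * t)
        out += env * carrier
    return out

# Demo input: a 1-kHz tone with a slow 4-Hz amplitude modulation.
fs = 22050
t = np.arange(int(0.5 * fs)) / fs
x = (1 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
y = vocode(x, fs)
```

In this demo the 1-kHz tone falls in the second channel, so the output is dominated by that channel's carrier: the slow modulation survives, but the original 1-kHz fine structure is replaced by the carrier frequency. This is the sense in which the processing keeps envelope information on a small number of spectral channels.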
28 | Acoustics Today | Spring 2015
Clearly, the total amount of information has been massively reduced by this process compared with the information available in normal hearing. Fortunately, the primary purpose of a CI is to convey speech information. Speech is an incredibly robust signal with plenty of redundancy. It is this redundancy that allows cell phones to pass a very limited spectrum and total amount of information, which, in turn, avoids clogging the cell phone networks. Likewise, probably the only reason a CI works as well as it does is that we have an input signal that is so robust to degradation.
There is a wonderful visual analogy for how a CI works (Harmon and Julesz, 1973). Row A of Figure 3 shows two pictures whose resolution starts out very low and gradually increases from left to right. On the far left, you can discriminate the pictures but you may not be able to identify them. On the far right are the full-resolution pictures. But much like a CI, you do not need very high resolution before you have a good guess about what is in the pictures. And with some practice, almost everyone can identify the low-resolution presidents in Row B.
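The visual analogy can be made concrete with a small sketch that lowers the resolution of an image by averaging over square blocks of pixels. This is only an illustration of the idea, assuming a 2-D grayscale array; it does not claim to reproduce the procedure used for Figure 3, and the function name `block_average` is hypothetical.

```python
import numpy as np

def block_average(img, block):
    """Replace each block x block tile of a 2-D image with its mean value.

    Rows and columns that do not fill a whole tile are cropped.
    """
    h, w = img.shape
    h2, w2 = h - h % block, w - w % block
    tiles = img[:h2, :w2].reshape(h2 // block, block, w2 // block, block)
    means = tiles.mean(axis=(1, 3))          # one value per tile
    # Expand back to the cropped size so the coarse image can be displayed.
    return np.repeat(np.repeat(means, block, axis=0), block, axis=1)

img = np.arange(16, dtype=float).reshape(4, 4)
coarse = block_average(img, 2)
```

Larger block sizes correspond to the left side of Row A, where only coarse structure remains, much as a small number of channels leaves only the coarse spectral shape of speech.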



























































































