Summer 2018


Figure 4. Spectrographic images of two productions of the phrase “shelled egg,” patterned after productions examined in Tajima et al. (1997). Top: duration patterns from a nonnative speaker of English (a native speaker of Mandarin Chinese); bottom: duration patterns from a native speaker of American English. Angled lines, portions of the spectral images associated with the text at the bottom; red lines, mapping of timing patterns between the two productions, by connecting parallel segments; blue lines, “extra” segments in the nonnative productions, where there is a strong, vowel-like release to the last consonant in “shelled” and “egg.”
The intelligibility evaluation scores, shown in Figure 5, tell the story of the effect of the hybridization. Native productions with native timing patterns (top left red circle) yielded intelligibility estimates around 90%, whereas nonnative productions with nonnative timing patterns were much less intelligible, about 40% (bottom right blue circle). Once these timing patterns were artificially corrected, the hybrid nonnative speech with the “corrections” from the native speech was substantially more intelligible, rising almost 20%. The intelligibility shifts went in both directions: native speech, with almost 90% phrase accuracy in noise, lost nearly 10% when hybridized with the timing patterns of the nonnative speaker.
The point of these results is fairly clear: there is more to speaking a second language like a native than just getting the individual consonant and vowel qualities right. The whole ensemble of consonants and vowels needs to be executed with the right timing patterns to be treated as native speech.
Figure 5. Estimates of intelligibility for native speakers of Mandarin and of English, with timing patterns imprinted from matched English and Mandarin productions. Data from Tajima et al. (1997).
A closer look at Tajima et al.’s (1997) results suggests how disastrously unintelligible accented speech can be when it occurs with background noise. (The data in Figure 5 were obtained by embedding the speech in −5-dB masking noise.) The results also point to a particularly troublesome locus for the listeners’ errors and thus to yet another aspect of different languages that presents a challenge to the nonnative speaker. Many of the most troublesome errors occurred in cases in which the hybridization process had to either add or delete an acoustical segment completely. For example, as illustrated in Figure 4, one of the phrases was “shelled egg,” in which the native speakers neatly run the final ‘d’ in “shelled” into the vowel at the beginning of “egg.” Chinese native speakers, by contrast, produced a very robust consonant release for the ‘d’ in “shelled.” This nonnative production was heard by listeners as various things, including the phrase “shall I ask?” Although the intended form has only two syllables, “shelled” and “egg,” the perceived form ended up with three, one of which, apparently the “I,” corresponds to the strong release of the ‘d’ consonant. There were quite a number of these sorts of errors, including, for example, the intended “limit contour” being heard as “leave it on the counter,” with “on” corresponding to the release of the ‘t’ in “limit,” and the intended “change color” being heard as “twenty caller,” with the ‘ty’ in “twenty” corresponding to the ‘ge’ in “change.”
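The bookkeeping behind such a hybridization can be pictured with a toy sketch. It is not the procedure used by Tajima et al. (1997). It assumes each production has already been hand-segmented into labeled pieces, and every label, duration, and name in it (`Segment`, `timing_plan`, the “ə” release label) is invented for illustration. Matched segments receive a stretch factor toward the other speaker's timing, and segments with no counterpart, such as the vowel-like releases, are flagged for deletion. Actual resynthesis of the audio is omitted.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Segment:
    label: str          # rough phone label (hypothetical transcription)
    duration_ms: float  # invented duration, for illustration only


def timing_plan(source, target):
    """Plan how to impose the target's timing on the source production.

    Returns (label, action, factor) tuples. "stretch" rescales a matched
    segment by factor; "delete" drops a source segment with no counterpart;
    "insert" marks a target segment missing from the source.
    """
    matcher = SequenceMatcher(
        None,
        [s.label for s in source],
        [t.label for t in target],
        autojunk=False,
    )
    plan = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            for s, t in zip(source[i1:i2], target[j1:j2]):
                plan.append((s.label, "stretch", t.duration_ms / s.duration_ms))
        else:
            # "delete", "insert", or "replace": no one-to-one partner exists.
            plan += [(s.label, "delete", None) for s in source[i1:i2]]
            plan += [(t.label, "insert", None) for t in target[j1:j2]]
    return plan


# Invented durations; "ə" stands for the vowel-like consonant release.
nonnative = [Segment(l, d) for l, d in [
    ("sh", 120), ("e", 90), ("l", 80), ("d", 60), ("ə", 70),
    ("e", 110), ("g", 70), ("ə", 80),
]]
native = [Segment(l, d) for l, d in [
    ("sh", 100), ("e", 120), ("l", 60), ("d", 30), ("e", 130), ("g", 60),
]]

plan = timing_plan(nonnative, native)
for label, action, factor in plan:
    print(label, action, "" if factor is None else f"x{factor:.2f}")
```

With these made-up inputs, the two “ə” releases are the only segments marked for deletion, which mirrors the cases described above in which the hybridization had to remove a segment entirely.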
The reason for this heavy releasing effect seems to be related to the possible sequences of sounds in Mandarin Chinese and English. English allows words to end with consonants, such as the last one in “limit” and “change,” but Mandarin
Summer 2018 | Acoustics Today | 13
