Page 12 - Summer 2018

Foreign Accent
the outset is that the foreign accent views the speech signal from the point of view of the listener. A point that is highlighted in this article is just how stringent listeners are; they are often very good at detecting a foreign accent.
Foreign Accent Perception:
The Devil Is in the Details?
Traditional research on speech over the last two centuries usually reduces the complexity of the speaking and listening processes by heavily simplifying the speech signal, suppressing much of the information available in the signal. Traditionally, we viewed the signal at the level of an alphabetic code similar to western writing systems. This tradition is so pervasive that disciplines related to language and speech science have developed a standard writing system, the International Phonetic Alphabet. This written form of speech strips away a lot of information from the signal. It does so by categorizing bits of the signal into a fairly small number of consonants and vowels.
However, research on perceiving the speech of nonnative speakers amply shows that a language’s dictates concerning what a speaker is expected to do are much more finely grained and exacting than a representation by transcription would lead one to believe. Nonnative speakers who are very proficient may, in good listening conditions, produce the basic consonant and vowel categories of a language well enough to convey most of the words in a language. But even so, there is much in the speech signal that listeners who are native to the language community easily detect as being different from what is typical of other (native) speakers. Detecting this is termed perceiving a foreign accent, and it turns out that there is a lot of information in the speech signal of the nonnative speaker that can lead listeners to believe that an individual is not a native speaker of the language.
A most striking demonstration of just how picky our expectations for native speech can be was presented at the Miami meeting of the ASA in 2008. Park et al. (2008) reported highlights of a series of perceptual experiments aimed at determining the locus of information that indicates that a particular speaker is someone who grew up in Korea (reported in detail in Park, 2008). The logic of the experiments was to present various portions of speech with a variety of consonant and vowel combinations, thereby controlling the specific linguistic information available to the listeners. On one extreme were clips with fairly complicated two-syllable words, such as “blanket,” “breakfast,” and “razor,” giving many opportunities for speakers to diverge from native
10 | Acoustics Today | Summer 2018
productions. Other productions were shorter, progressing down to specific syllables with various combinations of consonants. On the far extreme were productions of just the vowel “ah” ([ɑ], the vowel in the word “pot”). Each speaker just produced the vowel by itself. These isolated vowel productions were included to find a minimal case where the native and nonnative speakers converged to be indistinguishable. Then, by comparing different added speech segments, Park (2008) sought to determine which segments were contributing most to the perception of a foreign accent.
Surprisingly, the perceptual responses showed that the native and nonnative productions, even for these isolated vowels, were distinguishable.
These results are explicated in Figure 1, which presents measures of discriminability for different types of speech. Park treated the perception of a foreign accent as a signal-detection task like the classic signal-monitoring tasks from the early 1950s. See Macmillan and Creelman (2005) for an extensive treatment of how to measure performance in such tasks. Here, the assumption is that a person’s nonnative origin is encoded as information embedded in the speech signal and that the listener’s task is to detect the presence of that information. The standard measure for such tasks is d', a statistic that evaluates the number of errors the listeners make, both errors in missing the fact that the speaker is nonnative and errors in falsely thinking that a native speaker is nonnative. Average d' values for the experiments are plotted in Figure 1. A d' value of 0 would indicate that the listeners cannot detect the foreign accent. Values of d' for such things as native speakers identifying most consonants produced by native speakers in a quiet environment are typically above 2. Values around 1 generally indicate performance well above chance, although not with extremely high accuracy rates.
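As a rough illustration of how d' relates the two kinds of error, the sketch below uses the textbook equal-variance definition, d' = z(hit rate) − z(false-alarm rate), where z is the inverse of the standard normal cumulative distribution. The function name, the log-linear correction, and the response counts are assumptions made for this example; they are not taken from Park (2008), whose exact computation is not described here.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Equal-variance d' from raw response counts (hypothetical helper).

    hits               nonnative speaker judged nonnative
    misses             nonnative speaker judged native
    false_alarms       native speaker judged nonnative
    correct_rejections native speaker judged native
    """
    # Log-linear correction (an assumption of this sketch): adding 0.5 to
    # each count keeps both rates strictly between 0 and 1, where z is finite.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Invented counts for illustration only.
chance = d_prime(50, 50, 50, 50)    # listener guessing at random
moderate = d_prime(70, 30, 30, 70)  # above chance, far from perfect
strong = d_prime(84, 16, 16, 84)    # few errors of either kind
print(chance, moderate, strong)
```

With these invented counts, the guessing listener yields a d' of 0, and d' grows as both misses and false alarms become rarer, which is why it serves as a single summary of the two error types.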
Not surprisingly, listeners were well above chance at detecting the foreign accent with the two-syllable words. With the one-syllable words, the accuracy rates were similar. What is most striking is that the estimates that speakers were nonnative never converged to the chance level. Even in the cases where the productions included just a single vowel spoken by itself, listeners could still detect some sort of accent in the productions at rates greater than chance.
Various analyses following up on this result showed large differences between the various listeners in their ability to detect the accent in this vowel-only condition. Further analyses failed to reveal any obvious differences in measurements of the acoustics in these cases. Speech science has developed a





















































































