
…perception and processing in this population so that we may better understand the mechanisms that underlie delays and deficits in language acquisition.
Neurobiological techniques for studying speech perception in children with ASD
A number of informative neurobiological techniques have recently been applied to the study of individuals with autism and related disorders, making the study of even severely affected individuals possible. Event-related potentials (ERP) provide a direct measure of neural processing that can help reveal underlying processing differences between affected and unaffected individuals. ERP studies of speech and tone processing have identified a characteristic auditory waveform. ERP research has generally found that verbal children with autism spectrum disorders are less attentive to speech than typically developing controls, exhibiting differences in the P3, a brain component that reflects attention to stimuli in the environment (Courchesne et al., 1985; Dawson et al., 1988). Kuhl et al. (2005) used mismatch negativity (MMN), an ERP component that reflects discrimination of a stimulus change, and reported a correlation among degree of social deficits, expressive language skill, and the ability to discriminate a consonant-vowel syllable contrast in preschool-aged children with ASD. Further, Lepistö and colleagues (2005, 2006) examined speech perception in individuals with ASD using ERP techniques and found poorer discrimination of duration changes in speech sounds in affected individuals than in typically developing controls, even in those with Asperger syndrome, in whom language development is relatively spared.
Magnetoencephalography (MEG), like ERP, is a non-invasive technique used to assess perceptual processing in individuals with ASD. MEG makes use of the magnetic fields produced by electrical activity in the brain. Using MEG, researchers have found that children with ASD demonstrate a significantly delayed electrophysiological response to a change in both speech (vowels) and non-speech stimuli as compared to typically developing controls (Oram Cardy et al., 2005). Overall, these studies reveal that children with autism spectrum disorders exhibit difficulty in discriminating and/or recognizing sounds, a difficulty that can be detected at a basic level of processing.
Another method employed in speech perception research with individuals with autism and related disorders is functional magnetic resonance imaging (fMRI). Using volumetric measurements of blood flow, fMRI can provide information about activation in areas of the brain implicated in processing various types of stimuli. Gervais and colleagues (2004) reported an atypical pattern of response to vocal sounds in verbal adult males with autism as measured with fMRI. As a group, the affected individuals showed significantly less activation of the superior temporal sulcus, an area associated with perception of speech, and of voices in particular (Belin et al., 2000). Using fMRI, Bigler et al. (2007) reported a dissociation between language skill and superior temporal gyrus size in affected individuals as compared to typically developing controls.
Further, brain scans of affected individuals studied by Herbert and colleagues (2002) revealed abnormal asymmetry in brain regions associated with language.
Audiovisual speech perception
A major focus of research on the development of language has been the detection of the sounds of speech. This research studies language use in a non-communicative setting, involving a single language user: the subject. However, a great deal of our daily communication takes place in a face-to-face context. Accordingly, it is not surprising that visual information about speech has been shown to influence what typical listeners hear, assisting not just in the recognition of speech in noise (Sumby and Pollack, 1954), but in the perception of unambiguous speech as well (Desjardins et al., 1997; Reisberg et al., 1987). One powerful demonstration of the influence of visual information on what is heard is the perceptual integration of mismatched audiovisual (AV) speech. McGurk and MacDonald (1976) first demonstrated this by presenting mismatching audio and video consonant-vowel (CVCV) tokens to perceivers. Perceivers watching these dubbed productions sometimes reported hearing consonants that combined the places of articulation of the visual and auditory tokens (e.g., visual /ba/ + auditory /ga/ heard as /bga/), that “fused” the two places (e.g., visual /ga/ + auditory /ba/ heard as /da/), or that reflected the visual place information alone (visual /va/ + auditory /ba/ heard as /va/). This visual influence on mismatched speech, called the “McGurk effect,” has been described as compelling for those perceivers who get the effect, occurring even when a perceiver is aware of how the stimuli have been manipulated (Massaro, 1987). In addition, the McGurk effect is robust and has been demonstrated in the context of a number of manipulations, including asynchronous auditory and visual signals (Munhall et al., 1996), non-frontal views of the speaker’s face (Massaro, 1998), size reduction of visual stimuli (Jordan and Sergeant, 1998), point-light displays of the articulators (Johnson and Rosenblum, 1996), and presentation of very brief visual stimuli (Irwin et al., 2006). In addition to natural speech, synthetic speakers have been developed to allow precise manipulation of the auditory and visual speech signals (e.g., Massaro, 1998; Rubin and Vatikiotis-Bateson, 1998). The ability to integrate audiovisual speech is thought to be present at birth (Meltzoff and Kuhl, 1994), and visual influence on heard speech has been demonstrated in infants as young as 5 months of age (Rosenblum et al., 1997).
The robustness of the visual influence on heard speech and the early age at which it emerges (e.g., Rosenblum et al., 1997) suggest that the use of visual speech information is a central part of typical perceptual development. In typical perceivers, sensitivity to visual speech information is evident in infancy (Rosenblum et al., 1997) and is thought to foster native language acquisition (Legerstee, 1990). In contrast, children with ASD show reduced social gaze to others’ faces when speech is produced (Hobson et al., 1988). This reduc-