Sensory Modality and Speech Perception
Future research can be designed to test additional aspects of the theory. Fortunately, the theory makes some very specific predictions. For example, if multisensory perception is actually a consequence of common supramodal information contained in both light and sound, then "integration" functionally occurs at the level of the stimulus input. If this is true, evidence of integration should be observed at the earliest stages. Potentially, early integration is already evidenced by (1) visual modulation of early auditory brain areas and (2) crossmodal influences of low-level speech features, such as the voice onset time distinguishing "p" from "b." However, other researchers have argued that the modalities stay separate up through the determination of words (Bernstein et al., 2004a). Future research will need to examine additional evidence for early versus later integration of the channels.
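To make this kind of low-level feature concrete, the short Python sketch below (ours, not from the article) labels a bilabial stop as "b" or "p" from its voice onset time alone. The roughly 25-ms category boundary and the function names are illustrative assumptions, not measured values.

```python
# Toy illustration (not from the article): categorizing "b" vs. "p"
# from voice onset time (VOT) alone. English voiced and voiceless stops
# differ largely in VOT; a boundary near ~25 ms is a common textbook
# approximation, used here purely as an assumption.

VOT_BOUNDARY_MS = 25.0  # assumed category boundary, illustrative only

def categorize_stop(vot_ms: float) -> str:
    """Label a bilabial stop as 'b' (short VOT) or 'p' (long VOT)."""
    return "b" if vot_ms < VOT_BOUNDARY_MS else "p"

if __name__ == "__main__":
    for vot in (5.0, 15.0, 40.0, 70.0):
        print(f"VOT = {vot:4.1f} ms -> /{categorize_stop(vot)}/")
```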
Relatedly, if, as the supramodal approach claims, integration is a function of the input itself, then integration should be "impenetrable" to other cognitive influences (e.g., higher level linguistics, attention). However, a number of studies have shown higher level lexical influences on the strength of the McGurk effect (e.g., Brancazio, 2004), contrary to the prediction of the supramodal account. As intimated above, however, the McGurk effect is not a straightforward tool for measuring integration. Very recent research suggests that lexical influences may actually bear on postintegration categorization of segments (Dorsi, 2019). Still, more research is needed to determine the degree to which multisensory integration is impenetrable to outside cognition.
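The distinction between these two accounts can be made concrete with a toy simulation. The Python sketch below is ours, not a model from the cited studies: the multiplicative combination rule, the evidence values, and the lexical bias term are all assumptions. It simply shows how the same lexical bias can yield different percepts depending on whether it applies before or after integration.

```python
# Toy sketch (our illustration, not a model from the cited papers) of the
# question at issue: does lexical knowledge bias the audiovisual evidence
# before it is integrated, or only the categorization that follows
# integration? All numbers and the combination rule are assumptions;
# evidence values are assumed to lie in [0, 1].

def integrate(a: float, v: float) -> float:
    """Multiplicatively combine auditory and visual support for 'b'."""
    return (a * v) / (a * v + (1.0 - a) * (1.0 - v))

def categorize(support_b: float) -> str:
    """Pick the category with the greater support."""
    return "b" if support_b > 0.5 else "d"

def percept_penetrable(a: float, v: float, bias: float) -> str:
    # Lexical bias contaminates each modality's evidence pre-integration.
    return categorize(integrate(a + bias, v + bias))

def percept_postintegration(a: float, v: float, bias: float) -> str:
    # Integration is impenetrable; bias shifts only the later decision.
    return categorize(integrate(a, v) + bias)

if __name__ == "__main__":
    a, v, bias = 0.6, 0.3, 0.1  # ambiguous trial, pro-'b' lexical bias
    print("bias before integration:", percept_penetrable(a, v, bias))       # 'b'
    print("bias after integration: ", percept_postintegration(a, v, bias))  # 'd'
```

With these assumed values the two accounts come apart: biasing the evidence itself yields a "b" percept, whereas biasing only the post-integration decision leaves the percept at "d," which is the kind of contrast experiments on lexical influence try to detect.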
Finally, although work has been conducted to discover supramodal information across audio and visual channels, similar principles may apply to the haptic channel as well. As discussed, the haptic channel seems to induce the same perceptual and neurophysiological cross-sensory modulations as audio and visual speech. It is less clear how an informational form in the haptic stream could be supramodal with the other channels (but see Turvey and Fonseca, 2014). Future research can address this question to explain the miraculous abilities of Rick Joy to provide his speech brain with articulatory information from a most surprising source.
References
Alsius, A., Paré, M., and Munhall, K. G. (2018). Forty years after hearing lips and seeing voices: the McGurk effect revisited. Multisensory Research 31(1-2), 111-144.
Altieri, N., Pisoni, D. B., and Townsend, J. T. (2011). Some behavioral and neurobiological constraints on theories of audiovisual speech integration: A review and suggestions for new directions. Seeing and Perceiving 24(6), 513-539.
Arnold, P., and Hill, F. (2001). Bisensory augmentation: A speechreading advantage when speech is clearly audible and intact. British Journal of Psychology 92(2), 339-355.
Bernstein, L. E., Auer, E. T., Jr., and Moore, J. K. (2004a). Convergence or association? In G. A. Calvert, C. Spence, and B. E. Stein (Eds.), Handbook of Multisensory Processes. MIT Press, Cambridge, MA, pp. 203-220.
Bernstein, L. E., Auer, E. T., Jr., and Takayanagi, S. (2004b). Auditory speech detection in noise enhanced by lipreading. Speech Communication 44(1), 5-18.
Bertelson, P., and de Gelder, B. (2004). The psychology of multisensory perception. In C. Spence and J. Driver (Eds.), Crossmodal Space and Crossmodal Attention. Oxford University Press, Oxford, UK, pp. 141-177.
Brancazio, L. (2004). Lexical influences in audiovisual speech perception. Journal of Experimental Psychology: Human Perception and Performance 30(3), 445-463.
Brancazio, L., and Miller, J. L. (2005). Use of visual information in speech perception: Evidence for a visual rate effect both with and without a McGurk effect. Perception & Psychophysics 67(5), 759-769.
Burnham, D., Ciocca, V., Lauw, C., Lau, S., and Stokes, S. (2000). Perception of visual information for Cantonese tones. Proceedings of the Eighth Australian International Conference on Speech Science and Technology, Australian Speech Science and Technology Association, Canberra, December 5-7, 2000, pp. 86-91.
Callan, D. E., Callan, A. M., Kroos, C., and Vatikiotis-Bateson, E. (2001). Multimodal contribution to speech perception revealed by independent component analysis: A single sweep EEG case study. Cognitive Brain Research 10(3), 349-353.
Callan, D. E., Jones, J. A., and Callan, A. (2014). Multisensory and modality specific processing of visual speech in different regions of the premotor cortex. Frontiers in Psychology 5, 389. https://doi.org/10.3389/fpsyg.2014.00389.
Callan, D. E., Jones, J. A., Munhall, K., Callan, A. M., Kroos, C., and Vatikiotis-Bateson, E. (2003). Neural processes underlying perceptual enhancement by visual speech gestures. NeuroReport 14(17), 2213-2218. https://doi.org/10.1097/00001756-200312020-00016.
Calvert, G. A., Bullmore, E. T., Brammer, M. J., Campbell, R., Williams, S. C., McGuire, P. K., and David, A. S. (1997). Activation of auditory cortex during silent lipreading. Science 276(5312), 593-596. https://doi.org/10.1126/science.276.5312.593.
Delvaux, V., Huet, K., Piccaluga, M., and Harmegnies, B. (2018). The perception of anticipatory labial coarticulation by blind listeners in noise: A comparison with sighted listeners in audio-only, visual-only and audiovisual conditions. Journal of Phonetics 67, 65-77.
Dias, J. W., and Rosenblum, L. D. (2016). Visibility of speech articulation enhances auditory phonetic convergence. Attention, Perception, & Psychophysics 78, 317-333. https://doi.org/10.3758/s13414-015-0982-6.
Dorsi, J. (2019). Understanding Lexical and Multisensory Context Support of Speech Perception. Doctoral Dissertation, University of California, Riverside, Riverside.
Fowler, C. A., and Dekle, D. J. (1991). Listening with eye and hand: Cross-modal contributions to speech perception. Journal of Experimental Psychology 17(3), 816-828.
Fowler, C. A., Shankweiler, D., and Studdert-Kennedy, M. (2015). "Perception of the speech code" revisited: Speech is alphabetic after all. Psychological Review 123(2), 125-150.