Page 27 - Fall 2011
Fig. 2. (top) Schematic depictions of four kinds of “nonlinear phenomena”. Each vocalization begins in stable, harmonic form, then undergoes bifurcation to a different vocal-fold vibration regime. (bottom) A rhesus monkey scream that includes each of the nonlinear phenomena illustrated above. While less than a second in duration, the scream includes at least 22 bifurcations among qualitatively distinct vibration regimes.

tively. A key point is that the two vocal folds influence one another when vibrating, and thereby constitute a coupled, nonlinear-dynamical system. All vocalizations are therefore technically “nonlinear” in nature, with the vocal folds exhibiting characteristic vibration regimes that represent attractor states familiar from classic chaos theory (Wilden et al., 1998; Fitch et al., 2002). It is nonetheless useful to differentiate between harmonic vocalizations and nonlinear phenomena, illustrated in Fig. 2. The former reflect regular, well-synchronized vibration, while the latter include abrupt frequency jumps, perceptually jarring spectral sidebands believed to be produced by laryngeal amplitude-modulation effects, and viscerally grating deterministic chaos (Riede et al., 2004).

While not yet systematically documented, nonlinear phenomena are likely present in every mammalian vocal repertoire, specifically including primates. A critical implication is that the biomechanics of the larynx itself can be primary in determining the qualitatively distinct vocal-types a given species produces (Brown et al., 2003). In other words, whereas the vocalizer’s central nervous system determines global “system parameters” such as sub-glottal air pressure and laryngeal muscle tensions, the larynx itself is the ultimate arbiter of vocal-fold behavior.
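The idea that a small change in a global parameter can push a nonlinear system from one attractor to a qualitatively different one can be illustrated with a toy example. The sketch below uses the logistic map, a standard textbook system from chaos theory. It is not a model of the vocal folds, and the function name, starting state, and parameter values are all assumptions chosen only for illustration. The control parameter `r` loosely plays the role of a "system parameter" such as sub-glottal pressure.

```python
def count_regime_states(r, transient=2000, samples=200, digits=6):
    """Count distinct long-run values of the logistic map x -> r*x*(1-x).

    One value means a steady state, a few values mean a periodic cycle,
    and many values suggest chaotic behaviour.
    """
    x = 0.4  # arbitrary starting state (assumption)
    for _ in range(transient):
        # Discard start-up behaviour so only the attractor remains.
        x = r * x * (1.0 - x)
    seen = set()
    for _ in range(samples):
        x = r * x * (1.0 - x)
        seen.add(round(x, digits))
    return len(seen)


if __name__ == "__main__":
    # Modest steps in r move the system across bifurcation points.
    for r in (2.8, 3.2, 3.5, 3.9):
        print(f"r = {r}: {count_regime_states(r)} distinct long-run states")
```

The point of the sketch is only that the same simple rule settles into qualitatively different regimes depending on a single parameter. The regimes of a real larynx arise from coupled tissue and airflow mechanics that this map does not represent.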
As in other nonlinear systems, the coupled vocal folds show “exquisite sensitivity” to minor changes in global parameters, with even very small changes potentially producing near-instantaneous bifurcation into qualitatively different vibratory regimes and associated acoustics.

Humans as primates: Overall, it is clear that the human voice has ancient phylogenetic roots. Vocal-tract design is fundamentally similar across mammals, including humans, with corresponding operating principles. As in other primates and non-primate mammals alike, the human larynx is a nonlinear-dynamical system whose vibration regimes represent attractor states that give rise to a range of qualitatively different source signals. Any such energy is subsequently shaped by supralaryngeal cavities, including when the source is simply turbulence in the airflow. In the absence of species-specific modifications, supralaryngeal filtering effects are expected to be similar in humans and larger-bodied mammals.

Humans are also clearly mammal-like in being endowed with a repertoire of highly heritable, emotion-triggered signals such as spontaneous crying and laughter (Owren and Goldstein, 2008). These sounds emerge in recognizable form very early in life, without any apparent need to practice or even to first hear the sounds from others (Owren et al., 2011). Infant crying in particular is marked by chaotic vibration (Mende et al., 1990) resembling that observed in nonhuman primate screaming (Tokuda et al., 2002). Spontaneous, emotion-triggered vocalizations remain important even as the child gains increasing volitional control over sound production and begins to speak.

Humans do have their own specializations, of course, including a thick, highly mobile tongue used to flexibly alter supralaryngeal resonances, and an exceptional degree of volitional control over sound production (Owren et al., 2011).
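The relationship between laryngeal source energy and supralaryngeal shaping described above can be made concrete with a minimal source-filter sketch. It assumes a pulse train as a crude stand-in for periodic phonation and a single two-pole resonator as a stand-in for one vocal-tract resonance. The sample rate, fundamental frequency, resonance frequencies, bandwidth, and all function names are illustrative assumptions, not values from the studies cited here.

```python
import math

SAMPLE_RATE = 16000  # Hz (assumed)


def pulse_train(f0, n_samples):
    """Crude periodic 'source': one unit pulse per fundamental period."""
    period = round(SAMPLE_RATE / f0)
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def resonator(signal, center_hz, bandwidth_hz):
    """Two-pole resonator standing in for a single vocal-tract resonance."""
    r = math.exp(-math.pi * bandwidth_hz / SAMPLE_RATE)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * center_hz / SAMPLE_RATE)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def magnitude_at(signal, freq_hz):
    """Magnitude of one Fourier component, computed directly."""
    w = 2.0 * math.pi * freq_hz / SAMPLE_RATE
    re = sum(s * math.cos(w * i) for i, s in enumerate(signal))
    im = sum(s * math.sin(w * i) for i, s in enumerate(signal))
    return math.hypot(re, im)


def strongest_harmonic(signal, f0, max_hz=4000):
    """Frequency (Hz) of the harmonic of f0 carrying the most energy."""
    harmonics = range(1, int(max_hz // f0) + 1)
    return f0 * max(harmonics, key=lambda k: magnitude_at(signal, k * f0))


if __name__ == "__main__":
    source = pulse_train(200, 8000)  # same source in both cases
    for resonance_hz in (800, 2400):
        shaped = resonator(source, resonance_hz, 100)
        print(resonance_hz, strongest_harmonic(shaped, 200))
```

In this sketch the source is identical in both cases and only the resonance changes, yet the spectral emphasis of the output moves with it. That is the sense in which a mobile tongue can reshape the sound without any change in the laryngeal vibration regime.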
Because supralaryngeal filtering is largely static in nonhuman primates (although see Riede and Zuberbühler, 2003), their vocalizations can be characterized as fundamentally “laryngeal” in nature. In other words, vocal quality is primarily determined by the laryngeal vibration regime involved, which is also the case for spontaneous crying and laughter in humans. In contrast, human speech is marked by a relative paucity of source-energy types: essentially, quasi-periodic phonation versus turbulent noise. Speech production is thus importantly “supralaryngeal,” with the tongue, mandible, and lips used to flexibly and dynamically create the many sounds of each different language.

Human vocal-fold structure and response also show important developmental changes (Schweinfurth and Thibeault, 2008; Hartnick et al., 2005). One evident consequence is that the vibration regimes underlying the psyche-shattering shrieks and screams characteristic of young children

26 Acoustics Today, October 2011