Page 43 - Winter 2020

One Singer, Two Voices
Johan Sundberg, Björn Lindblom, and Anna-Maria Hefele
Introduction: Rendering Melodies with Overtones
A single singer but two voices? Experience that situation by visiting and check the second movie with the title “Sehnsucht nach dem Frühlinge (Mozart) — Anna-Maria Hefele” (AMH). There, coauthor AMH sings a song by Mozart, first with her singing voice and then with two simultaneous voices: a drone (a low-pitched, continuously sounding tone) plus a whistle-like high-pitched tone that renders the melody. How is this possible? That is the question that we pose here. Let us start by recalling how sounds are created by the instrument AMH is playing, the human voice.
Vocal Sound Generation
Figure 1 shows a frame from the movie mentioned above: a magnetic resonance imaging (MRI) scan with the various parts of the voice organ labeled. Voice production is the summed result of three processes: (1) compression of air below the vocal folds; (2) vocal fold vibration, quasi-periodically chopping airflow from the subglottal region; and (3) filtering of the acoustic signal of this pulsatile airflow.
©2021 Acoustical Society of America. All rights reserved.
The overpressure of air below the folds throws them apart, thus allowing air to pass through the slit between them. Then, aerodynamic conditions reduce the air pressure along the folds, which, together with the elasticity of their tissue, closes the slit. The same pattern is then repeated, thus generating vocal fold vibration.
The vibration generates a pulsatile airflow, as seen in Figure 2A, producing sound, the voice source. The pitch is determined by the vibration frequency. Because the waveform is far from sinusoidal, this airflow signal is composed of a number of harmonic partials. In other words, the frequency of partial number n is n × fo, where fo is the frequency of the lowest partial, the fundamental or vibration frequency. The amplitudes of the partials tend to decrease with their frequency; the amplitude of partial n tends to be something like 12 dB stronger than the amplitude of partial 2n. The spectrum envelope of the voice source is rather smooth and has a negative slope, as seen in Figure 2B.
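The harmonic relationship and the spectral slope described above can be sketched numerically. The following is a minimal illustration, assuming an idealized source whose level falls by exactly 12 dB per doubling of frequency; the function name `source_partials` and the example fundamental of 110 Hz are hypothetical choices made for this sketch, not values from the article.

```python
import math

def source_partials(fo_hz, n_partials, slope_db_per_octave=-12.0):
    """List (partial number, frequency in Hz, level in dB re the fundamental).

    Assumes an idealized voice source: partial n lies at n * fo and the
    level drops by a fixed number of dB per octave (doubling of frequency).
    """
    partials = []
    for n in range(1, n_partials + 1):
        freq_hz = n * fo_hz
        # The fundamental is the 0 dB reference; log2(n) counts octaves above it.
        level_db = slope_db_per_octave * math.log2(n) if n > 1 else 0.0
        partials.append((n, freq_hz, level_db))
    return partials

# Example: the first eight partials of a 110 Hz drone (assumed value).
for n, freq_hz, level_db in source_partials(fo_hz=110.0, n_partials=8):
    print(f"partial {n}: {freq_hz:7.1f} Hz, {level_db:6.1f} dB")
```

In this idealization, partials 2, 4, and 8 sit 12, 24, and 36 dB below the fundamental. A real voice source only approximates such a slope.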
The voice source is injected into the vocal tract (VT), which is a resonator. Hence, it possesses resonances at certain frequencies. Partials with frequencies close to a VT resonance frequency are enhanced and partials further away are attenuated (see Figure 2C). Therefore, the spectrum envelope of the sound radiated from the lip opening (Figure 2D) contains peaks at the VT resonance frequencies and valleys in between them. In this sense, the VT resonances form the spectrum envelope of the sound emitted to the free air. Probably for this reason, VT resonances are frequently referred to as formants.
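The filtering step can be illustrated with a rough sketch. It treats each formant as a simple resonance (a complex pole pair, a common textbook model) and adds its gain in dB to the level of each source partial. The formant frequencies (500, 1500, 2500 Hz), the 80 Hz bandwidths, the 100 Hz fundamental, and all function names are assumptions chosen for illustration, not values from the article. Sound radiation at the lips is ignored for simplicity.

```python
import math

def formant_gain_db(freq_hz, formant_hz, bandwidth_hz):
    """Gain in dB of one resonance (complex pole pair), 0 dB at 0 Hz."""
    half_bw = bandwidth_hz / 2.0
    numerator = formant_hz ** 2 + half_bw ** 2
    denominator = (math.sqrt((freq_hz - formant_hz) ** 2 + half_bw ** 2)
                   * math.sqrt((freq_hz + formant_hz) ** 2 + half_bw ** 2))
    return 20.0 * math.log10(numerator / denominator)

def filtered_levels(fo_hz, n_partials, formants):
    """Return (frequency in Hz, level in dB) for each partial after filtering.

    Source levels fall by an assumed 12 dB per octave; each (frequency,
    bandwidth) pair in `formants` contributes its gain in dB.
    """
    levels = []
    for n in range(1, n_partials + 1):
        freq_hz = n * fo_hz
        source_db = -12.0 * math.log2(n)
        filter_db = sum(formant_gain_db(freq_hz, f, b) for f, b in formants)
        levels.append((freq_hz, source_db + filter_db))
    return levels

# Hypothetical formants as (frequency in Hz, bandwidth in Hz) pairs.
ASSUMED_FORMANTS = [(500.0, 80.0), (1500.0, 80.0), (2500.0, 80.0)]

for freq_hz, level_db in filtered_levels(100.0, 30, ASSUMED_FORMANTS):
    print(f"{freq_hz:7.1f} Hz: {level_db:6.1f} dB")
```

With these assumed values, the partials at 500 Hz and 1500 Hz come out stronger than their immediate neighbors, even though the source alone falls steadily with frequency. That is the peak-and-valley envelope the text describes.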
The frequencies of the formants are determined by the shape of the resonator composed of the pharynx and the mouth cavities, the VT. For example, the VT length has
  Figure 1. Magnetic resonance (MR) image of Anna-Maria Hefele’s (AMH’s) head and throat, taken from the video where she performs a Mozart melody in overtone singing technique.
 Volume 17, issue 1 | Spring 2021 • Acoustics Today 43
