Page 30 - Summer 2006

ception of the spectral timbre of a single unmodulated component (sine wave) is different depending on how long the signal was played, so long as the length of time exceeds a minimum auditory threshold of timbre perception. Additionally, the human auditory system has the ability to track the changing pitch of a frequency-modulated sinusoid over time, while retaining the perception of sine wave timbre just as if there were no frequency modulation. Neither of these attributes is true of the classical Fourier spectrum of such sounds: a short sine wave supposedly has a larger and louder band of extra frequency components than a longer one, and a chirp has a similar band of such “components” surrounding the single modulated one we know is there. So, the timbre analysis that is performed by the ear and brain is probably not like a Fourier analysis, since we do not auditorily experience the spectral smearing of frequencies when they are modulated.
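The smearing attributed to the Fourier spectrum here can be checked numerically. The following is a minimal sketch in Python (standard library only) and is not part of the original article; the tone frequency, burst lengths, transform length, and helper names are all illustrative assumptions. It zero-pads a short and a long burst of the same sine wave to a common length and counts how many Fourier bins lie within 3 dB of the spectral peak.

```python
import cmath
import math

def dft_magnitude(x, n_fft):
    """Magnitude of the zero-padded DFT of x for bins 0..n_fft//2."""
    mags = []
    for k in range(n_fft // 2 + 1):
        acc = 0j
        for n, value in enumerate(x):
            acc += value * cmath.exp(-2j * math.pi * k * n / n_fft)
        mags.append(abs(acc))
    return mags

def half_power_width(mags):
    """Number of bins whose magnitude is within 3 dB of the peak."""
    peak = max(mags)
    return sum(1 for m in mags if m >= peak / math.sqrt(2))

def tone(n_samples, cycles_per_sample):
    """A plain unmodulated sine burst (assumed test signal)."""
    return [math.sin(2 * math.pi * cycles_per_sample * n)
            for n in range(n_samples)]

N_FFT = 512      # common zero-padded length (assumption)
FREQ = 0.125     # tone frequency in cycles per sample (assumption)

short_width = half_power_width(dft_magnitude(tone(16, FREQ), N_FFT))
long_width = half_power_width(dft_magnitude(tone(128, FREQ), N_FFT))
print("bins within 3 dB of peak:", short_width, "(short) vs", long_width, "(long)")
```

Under these assumed settings the short burst should occupy several times as many bins as the long one, even though both contain a single unmodulated component.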
Components, by any other frame, would sound complete
The conventional spectrogram, as with all so-called time-frequency representations, provides us with a picture of how the signal’s energy is distributed in time and frequency (Fourier’s frequency, that is). This is the root of its problems. Rather than trying to improve this representation, which decades of signal processing research have shown to be impossible to any significant extent, a few independent thinkers worked to develop a new kind of image that would show the time course of the instantaneous frequencies of the components in a multicomponent signal. The seminal papers which put forth the original technique, called the modified moving window method, were written by Kunihiko Kodera and his colleagues C. de Villedary and R. Gendrin.7,8 Their jumping off point was a paper published by Rihaczek9 which demonstrated the connection between instantaneous frequency and the argument (phase) of a complex analytic signal: basically a complex version of a real signal, where the nonphysical imaginary part is related by the Hilbert transform to the physical real part. The main insight added by Kodera et al. was the recognition that the short-time Fourier transform, a 2D function having complex values whose magnitude yields the spectrogram, can be regarded as a “channelizing” of the signal into a bunch of complex analytic signals, one for each Fourier spectral frequency, from which the instantaneous frequencies of line components in the signal can be computed.
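The connection between instantaneous frequency and the phase of an analytic signal can be illustrated with a short numerical sketch. The Python code below (standard library only) is an illustration under stated assumptions, not the method of any of the cited papers: it builds the analytic signal by discarding the negative-frequency half of a discrete Fourier transform, then differences the phase of successive samples. The test signal (a Hann-tapered linear chirp) and all function names are hypothetical choices.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out

def analytic_signal(x):
    """Complex signal whose real part is x (even length assumed):
    positive-frequency bins are doubled, negative ones are zeroed."""
    n = len(x)
    spec = dft(x)
    for k in range(n):
        if 0 < k < n // 2:
            spec[k] *= 2
        elif k > n // 2:
            spec[k] = 0
    return dft(spec, inverse=True)

def instantaneous_frequency(z):
    """Phase advance between successive samples, in cycles per sample."""
    return [cmath.phase(z[i + 1] * z[i].conjugate()) / (2 * math.pi)
            for i in range(len(z) - 1)]

N = 256
F0, F1 = 0.1, 0.2            # assumed start and end frequencies
RATE = (F1 - F0) / N         # frequency increase per sample
chirp = [(0.5 - 0.5 * math.cos(2 * math.pi * n / N))          # Hann taper
         * math.cos(2 * math.pi * (F0 * n + 0.5 * RATE * n * n))
         for n in range(N)]

inst_freq = instantaneous_frequency(analytic_signal(chirp))
mid = N // 2
print("estimated:", inst_freq[mid], "expected:", F0 + RATE * (mid + 0.5))
```

The estimate is read off in the middle of the signal, where the taper is near its maximum; near the ends the amplitude approaches zero and the phase is poorly determined.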
Unfortunately their method was almost perfectly ignored by the signal processing and acoustics communities at that time and for many years afterward, an all-too-common attribute of truly original work. More recently, however, the idea has been revived through alternative methods,10,11 and also followed up with theoretical improvements,12 and these developments have led to the adoption of the reassigned (also more wordily called the time-corrected instantaneous frequency) spectrogram by a small number of applied researchers.
“One especially useful improvement on the spectrogram was devised and implemented as software thirty years ago, but it was somehow drowned in the signal processing hubbub.”
To understand the new approach, one must consider the signal as a superposition, not of pure sine waves as Fourier taught us, but rather of the generalized line components already mentioned, which may have amplitude or frequency modulation. The objective now is to compute the instantaneous frequencies of these line components as the signal progresses through time. Because of a duality within the procedure, it is also possible to compute a sort of “instantaneous time point” for the excitation of each line component along the way, which is technically the group delay associated with each digital time index.
Note that since there is no mathematically unique way to decompose a signal into line components, the decomposition always depends upon the analysis frame length, with longer frames able to capture lower-frequency components and also to resolve multiple components which are close in frequency. Yet, each different decomposition of a signal is an equally valid representation of its physical nature; it is up to the analyst to decide whether a long or short frame analysis is appropriate to highlight the physical aspects that are of interest in a particular case.
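The dependence on frame length can be made concrete with a small experiment. The sketch below, in Python (standard library only), is an illustration under assumed settings rather than anything taken from the article: two tones 0.01 cycles per sample apart are analyzed with a 32-sample and a 512-sample Hann-windowed frame, and the prominent spectral peaks are counted in each case. The frequencies, frame lengths, frequency grid, and function names are all hypothetical.

```python
import cmath
import math

def windowed_spectrum(frame, freqs):
    """Magnitude of the Hann-windowed Fourier transform of one frame,
    evaluated at the given frequencies (cycles per sample)."""
    n = len(frame)
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * m / n) for m in range(n)]
    mags = []
    for f in freqs:
        acc = 0j
        for m in range(n):
            acc += hann[m] * frame[m] * cmath.exp(-2j * math.pi * f * m)
        mags.append(abs(acc))
    return mags

def count_peaks(mags):
    """Count local maxima that reach at least half of the largest value."""
    floor = 0.5 * max(mags)
    return sum(1 for i in range(1, len(mags) - 1)
               if mags[i] >= floor and mags[i - 1] < mags[i] >= mags[i + 1])

F_A, F_B = 0.100, 0.110      # two closely spaced tones (assumption)
signal = [math.cos(2 * math.pi * F_A * n) + math.cos(2 * math.pi * F_B * n)
          for n in range(512)]
grid = [0.05 + 0.0005 * i for i in range(221)]   # 0.05 to 0.16 cycles/sample

short_peaks = count_peaks(windowed_spectrum(signal[:32], grid))
long_peaks = count_peaks(windowed_spectrum(signal, grid))
print("peaks seen by short frame:", short_peaks, "by long frame:", long_peaks)
```

With these settings the short frame merges the two tones into a single broad component, while the long frame separates them. Both descriptions are legitimate decompositions of the same signal at different time scales.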
As Kodera et al. demonstrated, the partial derivative in time of the complex short-time Fourier transform (STFT) argument defines the channelized instantaneous frequencies, so that in the digital domain each quantized frequency bin is assumed to contain, at most, one line component, and the instantaneous frequency of that component is thus computed as the time derivative of the complex angle in that frequency bin.10 Dually, the partial derivative in frequency of the STFT phase defines the local group delay, which provides a time correction that pinpoints the precise occurrence time of the excitation of each of the line components.10
In a reassigned spectrogram (e.g., Figs. 2 and 4) the computed line components are plotted on the time-frequency axes, with the magnitude of the STFT providing the third dimension just as in the conventional spectrogram. One literally reassigns the time-frequency location of each point in the spectrogram to a new location given by the channelized instantaneous frequency and local group delay, whereas in the conventional spectrogram the points are plotted on a simple grid, at the locations of the Fourier frequencies and the time indices. A by-product of this is that the reassigned spectrogram is no longer a time-frequency representation, nor even the graph of a function; the images presented here are 3D scatterplots with their z-axis values shown by a colormap. This article will not delve into the particular algorithms that may be used to compute a reassigned spectrogram; the problem there is in essence how one computes the time and frequency derivatives of the complex STFT argument. An historical and technical review of the subject, complete with a variety of algorithms for computation, is provided elsewhere by the authors.13
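Although the algorithms themselves are left to the cited review, the two phase derivatives can be illustrated with a minimal sketch. The Python code below (standard library only) is not the authors' algorithm; it simply approximates each derivative by a one-step finite difference of the STFT phase, using a Hann window centered on the frame time so that the group delay is measured relative to the frame center. The function names `stft_bin` and `reassign`, the window length, and the two test signals are hypothetical.

```python
import cmath
import math

def stft_bin(x, t, k, n_win):
    """One STFT value: Hann-windowed frame centered on sample t, bin k."""
    half = n_win // 2
    acc = 0j
    for m in range(-half, half):
        n = t + m
        if 0 <= n < len(x):
            w = 0.5 + 0.5 * math.cos(2 * math.pi * m / n_win)  # peak at m = 0
            acc += w * x[n] * cmath.exp(-2j * math.pi * k * m / n_win)
    return acc

def reassign(x, t, k, n_win):
    """Return (corrected time, instantaneous frequency, magnitude)
    for the spectrogram point at frame time t and frequency bin k."""
    here = stft_bin(x, t, k, n_win)
    later = stft_bin(x, t + 1, k, n_win)   # one sample later, same bin
    above = stft_bin(x, t, k + 1, n_win)   # same frame, next bin up
    # Phase difference in time: channelized instantaneous frequency
    # in cycles per sample.
    cif = cmath.phase(later * here.conjugate()) / (2 * math.pi)
    # Negative phase difference in frequency: local group delay in samples.
    lgd = -cmath.phase(above * here.conjugate()) * n_win / (2 * math.pi)
    return t + lgd, cif, abs(here)

N_WIN = 64
# Steady tone that falls between Fourier bins 6 and 7 (assumed test signal).
tone = [math.cos(2 * math.pi * 0.1037 * n) for n in range(256)]
# Single click at sample 140, off the center of the frame at t = 128.
click = [1.0 if n == 140 else 0.0 for n in range(256)]

tone_time, tone_freq, _ = reassign(tone, 128, 7, N_WIN)
click_time, _, _ = reassign(click, 128, 7, N_WIN)
print("tone reassigned to frequency", tone_freq, "at time", tone_time)
print("click reassigned to time", click_time)
```

In this sketch the tone is moved from the bin center frequency to its own frequency with essentially no time correction, while the click is moved from the frame center to the sample at which it occurred. Repeating the calculation over every frame and bin, and plotting the magnitude at the reassigned coordinates, would give the kind of scatterplot described above.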
28 Acoustics Today, July 2006




































