
the use of deep neural networks for producing sounds that approximate the salient features of the tonal qualities of an instrument). Physics-based modeling relies on solving equations comprising a mathematical model of the processes at work in an instrument and the body of the musician playing it. Beyond tone production alone, physics-based modeling can provide specific and detailed predictions that can be compared with real phenomena to test the validity and relevance of the physical model itself or to determine whether additional phenomena need to be included in the model. This can contribute to a greater understanding of musical acoustics overall.
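To make the "simulate the physics" idea concrete, a heavily simplified, physics-inspired example is the Karplus-Strong plucked-string algorithm, which models a string as a recirculating delay line damped by a loss filter. The sketch below is illustrative only, not a full physical model; the sampling rate, decay value, and duration are arbitrary choices:

```python
import numpy as np

def karplus_strong(freq, duration, sr=44100, decay=0.996, seed=0):
    """Plucked-string model: a delay line whose length sets the pitch,
    excited by noise (the 'pluck') and damped by an averaging filter
    that stands in for energy losses in the string and its supports."""
    rng = np.random.default_rng(seed)
    n = int(sr / freq)                   # delay-line length -> pitch
    buf = rng.uniform(-1.0, 1.0, n)      # random initial displacement
    out = np.empty(int(sr * duration))
    for i in range(out.size):
        out[i] = buf[i % n]
        # average adjacent samples (lowpass) and attenuate (decay)
        buf[i % n] = decay * 0.5 * (buf[i % n] + buf[(i + 1) % n])
    return out

tone = karplus_strong(220.0, 1.0)  # roughly one second of an A3 string
```

Even this toy model exhibits physically meaningful behavior: the pitch follows from the delay length, and the tone decays and darkens over time, as a real plucked string does.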
A large class of “applied” musical acoustics use cases involves only the production of instrument sounds, for example, as virtual instruments for musicians and composers. In these use cases, “convincing” sounds need not be derived from detailed physics simulations. Methods for synthesizing musical instrument sounds from a reduced set of salient factors have been available for decades and are continually improving in quality; however, they are usually carefully crafted by human experts with domain-specific knowledge. An alternative approach, gaining popularity and success in recent years, is to have a deep neural network “learn” its own representation automatically, using only a large corpus of audio recordings. Creating a NAS-based instrument model may not yield physical insight, but it also requires none, and thus can be developed by those without domain expertise. Such models may require hundreds of hours of recorded audio and days of computation to train. Once trained, however, they can execute quickly enough to be used in real time, generating arbitrary numbers of variations without requiring a new physics simulation of all the processes involved.
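The "learn a representation from audio alone" loop can be caricatured in a few lines. The sketch below is deliberately not a deep network: it trains a tiny linear autoencoder by gradient descent to compress magnitude-spectrum frames of synthetic tones into an 8-dimensional code. Real NAS systems replace this with deep, nonlinear architectures and far larger corpora, but the ingredients, a corpus, an encoder/decoder pair, and a reconstruction loss, are the same. All sizes and the learning rate here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Corpus": magnitude spectra of short sine tones at random frequencies
sr, n_samples = 4096, 256
t = np.arange(n_samples) / sr
freqs = rng.uniform(100.0, 1000.0, size=200)
X = np.stack([np.abs(np.fft.rfft(np.sin(2 * np.pi * f * t))) for f in freqs])
X /= np.linalg.norm(X, axis=1, keepdims=True)     # unit-norm frames

D, k = X.shape[1], 8               # spectrum size, latent code size
W_enc = 0.01 * rng.standard_normal((D, k))        # encoder weights
W_dec = 0.01 * rng.standard_normal((k, D))        # decoder weights

lr, losses = 0.1, []
for _ in range(500):
    Z = X @ W_enc                  # encode: frame -> latent code
    X_hat = Z @ W_dec              # decode: latent code -> frame
    err = X_hat - X
    losses.append(float((err ** 2).sum() / len(X)))
    # gradient descent on the mean reconstruction error
    g_dec = (Z.T @ err) * (2 / len(X))
    g_enc = (X.T @ (err @ W_dec.T)) * (2 / len(X))
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc
```

The reconstruction error falls as training proceeds: the network has "learned" a compact representation of the corpus with no physical knowledge built in, which is the essential NAS premise.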
Both physics-based modeling and NAS provide viable pathways to synthesizing realistic musical instrument sounds, and the choice between them will likely depend on the experience and preferences of the developers. Currently, we are unaware of any direct comparison of sound quality between neural-generated and physically modeled sounds; this is an avenue we hope to see explored in the future. We are also intrigued by the possibility of developers using physically modeled sounds to train neural network systems. Such an approach would mirror other areas of physics in which neural networks trained on large-scale physics simulations (e.g., of earthquake propagation) are being used to produce suitable approximate results for novel system parameters at a fraction of the time and expense needed to rerun a full physical model. Physics-based modeling and NAS can thus be seen as complementary methods for advancing the field of musical acoustics.
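As a toy version of that simulation-trained-surrogate idea, the sketch below runs a cheap stand-in for a "full" physical simulation (a noise-excited feedback loop, playing the role of something far more expensive) at a handful of damping settings, then fits a small polynomial surrogate that predicts a summary of the output, here total output energy, for novel settings without rerunning the simulation. The simulator, the choice of summary statistic, and the polynomial degree are all illustrative assumptions:

```python
import numpy as np

def simulate_energy(decay, freq=200.0, sr=8000, duration=2.0, seed=0):
    """Stand-in 'physical simulation': a noise-excited, lossy feedback
    loop (a plucked-string-like model); returns total output energy."""
    rng = np.random.default_rng(seed)
    n = int(sr / freq)
    buf = rng.uniform(-1.0, 1.0, n)
    total = 0.0
    for i in range(int(sr * duration)):
        total += buf[i % n] ** 2
        buf[i % n] = decay * 0.5 * (buf[i % n] + buf[(i + 1) % n])
    return total

# Run the "expensive" simulation at a few training parameter values...
train_decays = np.array([0.990, 0.992, 0.994, 0.996, 0.998])
train_energy = np.array([simulate_energy(d) for d in train_decays])

# ...and fit a cheap surrogate (here, a quadratic in the decay value)
coeffs = np.polyfit(train_decays, train_energy, 2)

# The surrogate answers for a novel parameter without a new simulation
predicted = np.polyval(coeffs, 0.995)
```

Less damping means longer sustain and thus more total energy, so the surrogate's prediction for an unseen parameter should fall sensibly between its neighbors' simulated values, at essentially zero cost per query.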
Spring 2020 | Acoustics Today | 27