Page 55 - Winter 2021

are varied but include techniques such as drawing attention to portions of the input signal that were responsible for strong activations of a neural network.
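As a rough illustration of what "drawing attention to portions of the input" can mean, the sketch below uses occlusion: each window of a toy signal is zeroed in turn, and the drop in a unit's activation is recorded. The single ReLU unit, its weights, the signal, and the function names are all hypothetical stand-ins chosen for illustration; they are not taken from any method cited in this article.

```python
def unit_activation(signal, weights, bias=0.0):
    """ReLU activation of one hypothetical unit: max(0, w . x + b)."""
    total = sum(w * x for w, x in zip(weights, signal)) + bias
    return max(0.0, total)


def occlusion_saliency(signal, weights, window=2):
    """Drop in activation when each window of the input is zeroed out."""
    baseline = unit_activation(signal, weights)
    drops = []
    for start in range(0, len(signal), window):
        occluded = list(signal)
        for i in range(start, min(start + window, len(signal))):
            occluded[i] = 0.0
        drops.append(baseline - unit_activation(occluded, weights))
    return drops


# Toy input (assumed): a burst in samples 4-5 that the unit responds to.
signal = [0.1, 0.0, 0.1, 0.0, 1.0, 1.0, 0.0, 0.1]
weights = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0]

saliency = occlusion_saliency(signal, weights, window=2)
most_salient = max(range(len(saliency)), key=lambda k: saliency[k])
print(saliency, most_salient)  # [0.0, 0.0, 2.0, 0.0] 2
```

Only the third window (samples 4-5) changes the activation when removed, so it is the portion of the input that an explanation of this kind would highlight. Practical interpretability methods for deep networks are considerably more elaborate.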
Another exciting avenue of machine learning research is to build systems that take advantage of physical knowledge. An example can be seen in Raissi et al. (2019), who trained deep neural networks with priors grounded in the physics of the problem domain. One can envision acoustic systems that have prior knowledge about, for example, transmission loss and channel characteristics, and such systems may be a promising area for future research.
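One very simple way to picture a physics-grounded prior is a least-squares fit with a penalty that pulls a parameter toward its physically expected value. The sketch below fits a spreading coefficient a in TL(r) = a log10(r) and penalizes departures from the spherical-spreading value of 20. This is a deliberately minimal stand-in, not the approach of Raissi et al. (2019), which works with deep networks; the measurements, the prior weight, and the function name are assumptions made for illustration.

```python
import math


def fit_spreading_coefficient(ranges_m, tl_db, prior_coeff=20.0, prior_weight=1.0):
    """Fit TL(r) = a * log10(r), penalizing departures of a from prior_coeff.

    Minimizes sum((a * x_i - y_i)**2) + prior_weight * (a - prior_coeff)**2
    with x_i = log10(r_i); this quadratic has a closed-form minimizer.
    """
    x = [math.log10(r) for r in ranges_m]
    numerator = sum(xi * yi for xi, yi in zip(x, tl_db)) + prior_weight * prior_coeff
    denominator = sum(xi * xi for xi in x) + prior_weight
    return numerator / denominator


# Hypothetical data: only two ranges, with the second reading biased high.
ranges_m = [10.0, 100.0]
tl_db = [20.0, 46.0]

a_data_only = fit_spreading_coefficient(ranges_m, tl_db, prior_weight=0.0)
a_with_prior = fit_spreading_coefficient(ranges_m, tl_db, prior_weight=5.0)
print(a_data_only, a_with_prior)  # 22.4 21.2
```

With so little data, the unconstrained fit follows the biased reading, whereas the penalized fit is drawn back toward the spherical-spreading value. The weight on the prior controls how strongly the physics is trusted relative to the measurements.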
We hope that this “gentle” introduction to machine learning will inspire readers to dig deeper into the possible uses of machine learning in their own acoustics problems. The Journal of the Acoustical Society of America published a special issue in 2021 on the use of machine learning in acoustics, and this collection of papers provides a wide range of example applications, including medical applications, speech, oceanography, bioacoustics, and music. We hope that this collection stimulates the wider adoption of machine learning within the field of acoustics. A growing number of published acoustics papers use these techniques, and it is likely that machine learning will become a valuable component in the acoustician’s toolkit.
This work was supported by Office of Naval Research Awards N00014-17-1-2867 and N00014-20-1-2029. We thank Arthur Popper for his valuable suggestions in the development of this manuscript.
Amodei, D., Ananthanarayanan, S., Anubhai, R., Bai, J., Battenberg, E., Case, C., Casper, J., Catanzaro, B., Cheng, Q., Chen, G., and Chen, J. (2016). Deep speech 2: End-to-end speech recognition in English and Mandarin. In Proceedings of The 33rd International Conference on Machine Learning, New York, NY, June 19-24, 2016, vol. 48, pp. 173-182.
Benson, J., Chapman, N. R., and Antoniou, A. (2000). Geoacoustic model inversion using artificial neural networks. Inverse Problems 16, 1627-1639.
Bianco, M. J., Gerstoft, P., Traer, J., Ozanich, E., Roch, M. A., Gannot, S., and Deledalle, C. A. (2019). Machine learning in acoustics: Theory and applications. The Journal of the Acoustical Society of America 146, 3590-3628.
Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer-Verlag, New York, NY.
Breiman, L. (2001). Random forests. Machine Learning 45, 5-32.
Chakrabarty, S., and Habets, E. A. P. (2019). Multi-speaker DOA estimation using deep convolutional networks trained with noise signals. IEEE Journal of Selected Topics in Signal Processing 13, 8-21.
Cowan, J. D., and Sharp, D. H. (1988). Neural nets and artificial intelligence. Daedalus 117, 85-121.
Davis, J., and Goadrich, M. (2006). The relationship between precision-recall and ROC curves. In Proceedings of the International Conference on Machine Learning, Pittsburgh, PA, June 25-29, 2006, pp. 233-240.
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters 27, 861-874.
Frederick, C., Villar, S., and Michalopoulou, Z.-H. (2020). Seabed classification using localized forward modeling and deep learning. The Journal of the Acoustical Society of America 148, 2730.
Godino-Llorente, J. I., and Gomez-Vilda, P. (2004). Automatic detection of voice impairments by means of short-term cepstral parameters and neural network based detectors. IEEE Transactions on Biomedical Engineering 51, 380-384.
Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. The MIT Press, Cambridge, MA.
Hastie, T., Tibshirani, R., and Friedman, J. H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer-Verlag, New York, NY.
Healy, E. W., Delfarah, M., Johnson, E. M., and Wang, D. (2019). A deep learning algorithm to increase intelligibility for hearing-impaired listeners in the presence of a competing talker and reverberation. The Journal of the Acoustical Society of America 145, 1378.
Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T. N., and Kingsbury, B. (2012). Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine 29, 82-97.
Hubert, L., and Arabie, P. (1985). Comparing partitions. Journal of Classification 2, 193-218.
LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature 521, 436-444.
Lewis, J. M., Ackerman, M., and de Sa, V. R. (2012). Human cluster evaluation and formal quality measures: A comparative study. In Proceedings of the 34th Annual Meeting of the Cognitive Science Society, Sapporo, Japan, August 1-4, 2012, pp. 1870-1875.
Linardatos, P., Papastefanopoulos, V., and Kotsiantis, S. (2021). Explainable AI: A review of machine learning interpretability methods. Entropy 23, 18.
Madhusudhana, S., Shiu, Y., Klinck, H., Fleishman, E., Liu, X., Nosal, E. M., Helble, T., Cholewiak, D., Gillespie, D., Širović, A., and Roch, M. A. (2021). Improve automatic detection of animal call sequences with temporal context. Journal of the Royal Society Interface 18, 20210297.
Martin, A., Doddington, G., Kamm, T., Ordowski, M., and Przybocki, M. (1997). The DET curve in assessment of detection task performance. In Proceedings of the 5th European Conference on Speech Communication and Technology, Rhodes, Greece, September 22-25, 1997, vol. 4, pp. 1895-1898.
McInnes, L., Healy, J., and Melville, J. (2018). UMAP: Uniform manifold approximation and projection for dimension reduction. Available at Accessed June 1, 2020.
Niu, H., Reeves, E., and Gerstoft, P. (2017). Source localization in an ocean waveguide using supervised machine learning. The Journal of the Acoustical Society of America 142, 1176.
              Winter 2021 • Acoustics Today 55
