Page 29 - Winter Issue 2018

Miyoshi, H., Saito, Y., Takamichi, S., and Saruwatari, H. (2017). Voice conversion using sequence-to-sequence learning of context posterior probabilities. Proceedings of the International Speech Communication Association (Interspeech 2017), Stockholm, Sweden, August 20-24, 2017, pp. 1268-1272.
Morise, M., Yokomori, F., and Ozawa, K. (2016). WORLD: A vocoder-based high-quality speech synthesis system for real-time applications. IEICE Transactions on Information and Systems 7, 1877-1884.
Neumeyer, L., Franco, H., Digalakis, V., and Weintraub, M. (2000). Automatic scoring of pronunciation quality. Speech Communication 30, 83-93.
Pickett, J. M., and Pollack, I. (1963). Intelligibility of excerpts from fluent speech: Effects of rate of utterance and duration of excerpt. Language and Speech 6, 151-164.
Prabhavalkar, R., Rao, K., Sainath, T. N., Li, B., Johnson, L., and Jaitly, N. (2017). A comparison of sequence-to-sequence models for speech recognition. Proceedings of the International Speech Communication Association (Interspeech 2017), Stockholm, Sweden, August 20-24, 2017, pp. 939-943.
Sakoe, H., and Chiba, S. (1978). Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech and Signal Processing 26, 43-49.
Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks 61, 85-117.
Salvador, S., and Chan, P. (2007). Toward accurate dynamic time warping in linear time and space. Journal of Intelligent Data Analysis 11, 561-580.
Stevens, K. N. (2002). Toward a model for lexical access based on acoustic landmarks and distinctive features. The Journal of the Acoustical Society of America 111, 1872-1891.
Su, P.-H., Wang, Y.-B., Yu, T.-H., and Lee, L.-S. (2013). A dialogue game framework with personalized training using reinforcement learning for computer-assisted language learning. Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, BC, Canada, May 26-31, 2013, pp. 8213-8217.
Sun, L., Kang, S., Li, K., and Meng, H. (2015). Voice conversion using deep bidirectional long short-term memory based recurrent neural networks. Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, QLD, Australia, April 19-24, 2015, pp. 4869-4873.
Toda, T., Chen, L.-H., Saito, D., Villavicencio, F., Wester, M., Wu, Z., and Yamagishi, J. (2016). The voice conversion challenge 2016. Proceedings of the International Speech Communication Association (Interspeech 2016), San Francisco, CA, September 8-12, 2016, pp. 1632-1636.
Van den Oord, A., Li, Y., Babuschkin, I., Simonyan, K., Vinyals, O., Kavukcuoglu, K., Driessche, G. V. D., Lockhart, E., Cobo, L. C., Stimberg, F., Casagrande, N., Grewe, D., Noury, S., Dieleman, S., Elsen, E., Kalchbrenner, N., Zen, H., Graves, A., King, H., Walters, T., Belov, T., and Hassabis, D. (2017). Parallel WaveNet: Fast high-fidelity speech synthesis. Proceedings of the 30th International Conference on Neural Information Processing Systems, Long Beach, CA, December 4-9, 2017.
van Doremalen, J., Boves, L., Colpaert, J., Cucchiarini, C., and Strik, H. (2016). Evaluating automatic speech recognition-based language learning systems: A case study. Computer Assisted Language Learning 29, 833-851.
Warschauer, M., and Healey, D. (1998). Computers and language learning: An overview. Language Teaching 31, 57-71.
Wik, P. (2011). The Virtual Language Teacher: Models and Applications for Language Learning Using Embodied Conversational Agents. Doctoral Dissertation, KTH Royal Institute of Technology, Stockholm, Sweden.
Witt, S. M. (2012). Automatic error detection in pronunciation training: Where we are and where we need to go. Proceedings of the International Symposium on the Automatic Detection of Errors in Pronunciation Training, Stockholm, Sweden, June 6-8, 2012, pp. 1-8.
Witt, S. M., and Young, S. J. (2000). Phone-level pronunciation scoring and assessment for interactive language learning. Speech Communication 30, 98-108.
Yu, D., and Li, D. (2015). Automatic Speech Recognition: A Deep Learning Approach. Springer-Verlag, London.
Zeng, Y. (2000). Dynamic Time Warping Digit Recognizer. MS Thesis, University of Mississippi, Oxford.
Zue, V. W., and Seneff, S. (1988). Transcription and alignment of the TIMIT database. Recent Research Towards Advanced Man-Machine Interface Through Spoken Language, pp. 515-525.

Biosketch

Steven Greenberg worked on SRI's Autograder Project in the early 1990s. More recently, he has collaborated on the development of Transparent Language's EveryVoice™ technology. He has been a visiting professor in the Center for Applied Hearing Research at the Technical University of Denmark, Kongens Lyngby, as well as a senior scientist and research faculty at the International Computer Science Institute in Berkeley, CA. He was a research professor in the Department of Neurophysiology, University of Wisconsin, Madison, and headed a speech laboratory in the Department of Linguistics, University of California-Berkeley. He is President of Silicon Speech, a consulting company based in northern California.
Winter 2018 | Acoustics Today | 27



















































