Publications
Y. Kubo and M. Bacchiani, ``Joint Phoneme-Grapheme Model for End-To-End Speech Recognition,'' In The International Conference on Acoustics, Speech and Signal Processing 2020.
R. Haeb-Umbach, S. Watanabe, T. Nakatani, M. Bacchiani, B. Hoffmeister, M. L. Seltzer, H. Zen and M. Souden, ``Speech Processing for Digital Home Assistants: Combining signal processing with deep-learning techniques,'' In IEEE Signal Processing Magazine, Volume 36, Issue 6, pp. 111-124, October 2019.
S. Watanabe, S. Araki, M. Bacchiani, R. Haeb-Umbach and M. L. Seltzer ``Introduction to the Issue on Far-Field Speech Processing in the Era of Deep Learning: Speech Enhancement, Separation, and Recognition,'' In IEEE Journal of Selected Topics in Signal Processing, Volume 13, pp. 785-786, August 2019.
J. Shen, P. Nguyen, Y. Wu, Z. Chen, M. X. Chen, Y. Jia, A. Kannan, T. Sainath, Y. Cao, C-C. Chiu, Y. He, J. Chorowski, S. Hinsu, S. Laurenzo, J. Qin, O. Firat, W. Macherey, S. Gupta, A. Bapna, S. Zhang, R. Pang, R. J. Weiss, R. Prabhavalkar, Q. Liang, B. Jacob, B. Liang, H. Lee, C. Chelba, S. Jean, B. Li, M. Johnson, R. Anil, R. Tibrewal, X. Liu, A. Eriguchi, N. Jaitly, N. Ari, C. Cherry, P. Haghani, O. Good, Y. Cheng, R. Alvarez, I. Caswell, W-N. Hsu, Z. Yang, K-C. Wang, E. Gonina, K. Tomanek, B. Vanik, Z. Wu, L. Jones, M. Schuster, Y. Huang, D. Chen, K. Irie, G. Foster, J. Richardson, K. Macherey, A. Bruguier, H. Zen, C. Raffel, S. Kumar, K. Rao, D. Rybach, M. Murray, V. Peddinti, M. Krikun, M. Bacchiani, T. B. Jablin, R. Suderman, I. Williams, B. Lee, D. Bhatia, J. Carlson, S. Yavuz, Y. Zhang, I. McGraw, M. Galkin, Q. Ge, G. Pundak, C. Whipkey, T. Wang, U. Alon, D. Lepikhin, Y. Tian, S. Sabour, W. Chan, S. Toshniwal, B. Liao, M. Nirschl and P. Rondon, ``Lingvo: a modular and scalable framework for sequence-to-sequence modeling,'' In arXiv preprint arXiv:1902.08295, February 2019.
P. Haghani, A. Narayanan, M. Bacchiani, G. Chuang, N. Gaur, P. Moreno, R. Prabhavalkar, Z. Qu and A. Waters, ``From Audio to Semantics: Approaches to end-to-end spoken language understanding,'' In IEEE Spoken Language Technology Workshop (SLT), pp. 720-726, December 2018.
A. Narayanan, A. Misra, K-C. Sim, G. Pundak, A. Tripathi, M. Elfeky, P. Haghani, T. Strohman and M. Bacchiani, ``Toward domain-invariant speech recognition via large scale training,'' In IEEE Spoken Language Technology Workshop (SLT), pp. 441-447, December 2018.
M. Bacchiani and E. Fosler-Lussier, ``An Overview of the IEEE SPS Speech and Language Technical Committee,'' In IEEE Signal Processing Magazine, Volume 35, Issue 6, pp. 125-126, November 2018.
J. Heymann, M. Bacchiani and T. Sainath, ``Performance of mask based statistical beamforming in a smart home scenario,'' In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6722-6726, April 2018.
B. Li, T. Sainath, K. Sim, M. Bacchiani, E. Weinstein, P. Nguyen, Z. Chen, Y. Wu and K. Rao, ``Multi-Dialect Speech Recognition With a Single Sequence-to-Sequence Model,'' Submitted to The International Conference on Acoustics, Speech and Signal Processing 2018.
C. Chiu, T. Sainath, Y. Wu, R. Prabhavalkar, P. Nguyen, Z. Chen, A. Kannan, R. Weiss, K. Rao, K. Gonina, N. Jaitly, B. Li, J. Chorowski and M. Bacchiani, ``State-of-the-art Speech Recognition With Sequence-to-Sequence Models,'' Submitted to The International Conference on Acoustics, Speech and Signal Processing 2018.
E. Variani, T. Bagby, K. Lahouel, E. McDermott and M. Bacchiani, ``Sampled Connectionist Temporal Classification,'' Submitted to The International Conference on Acoustics, Speech and Signal Processing 2018.
C. Kim, T. Sainath, A. Narayanan, A. Misra, R. Nongpiur and M. Bacchiani, ``Spectral Distortion Model for Training Phase-Sensitive Deep-Neural Networks for Far-Field Speech Recognition,'' Submitted to The International Conference on Acoustics, Speech and Signal Processing 2018.
C. Kim, A. Menon, M. Bacchiani and R. Stern, ``Sound Source Separation Using Phase Difference and Reliable Mask Selection,'' Submitted to The International Conference on Acoustics, Speech and Signal Processing 2018.
M. Bacchiani, F. Beaufays, A. Gruenstein, P. Moreno, J. Schalkwyk, T. Strohman and H. Zen, ``Speech Research at Google to Enable Universal Speech Interfaces,'' chapter in New Era for Robust Speech Recognition: Exploiting Deep Learning, S. Watanabe, M. Delcroix, F. Metze and J. Hershey eds. 2017.
T. Sainath, R. Weiss, K. Wilson, B. Li, A. Narayanan, E. Variani, M. Bacchiani, I. Shafran, A. Senior, K. Chin, A. Misra and C. Kim, ``Raw Multichannel Processing Using Deep Neural Networks,'' chapter in New Era for Robust Speech Recognition: Exploiting Deep Learning, S. Watanabe, M. Delcroix, F. Metze and J. Hershey eds. 2017.
K. Sim, A. Narayanan, T. Bagby, T. Sainath and M. Bacchiani, ``Improving the Efficiency of Forward-Backward Algorithm Using Batched Computation in TensorFlow,'' In Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop December 2017, Okinawa, Japan. [pdf]
E. Variani, T. Bagby, E. McDermott and M. Bacchiani, ``End-to-End Training of Acoustic Models for Large Vocabulary Continuous Speech Recognition with TensorFlow,'' In Proceedings of Interspeech September 2017, Stockholm, Sweden. [pdf]
B. Li, T. Sainath, A. Narayanan, J. Caroselli, M. Bacchiani, A. Misra, I. Shafran, H. Sak, G. Pundak, K. Chin, K. Sim, R. Weiss, K. Wilson, E. Variani, C. Kim, O. Siohan, M. Weintraub, E. McDermott, R. Rose and M. Shannon, ``Acoustic Modeling for Google Home,'' In Proceedings of Interspeech September 2017, Stockholm, Sweden. [pdf]
C. Kim, A. Misra, K. Chin, T. Hughes, A. Narayanan, T. Sainath and M. Bacchiani, ``Generation of large-scale simulated utterances in virtual rooms to train deep-neural networks for far-field speech recognition in Google Home,'' In Proceedings of Interspeech September 2017, Stockholm, Sweden. [pdf]
T. Sainath, R. Weiss, K. Wilson, B. Li, A. Narayanan, E. Variani, M. Bacchiani, I. Shafran, A. Senior, K. Chin, A. Misra and C. Kim, ``Multichannel Signal Processing with Deep Neural Networks for Automatic Speech Recognition,'' In IEEE Transactions on Audio, Speech, and Language Processing, vol. 25 (2017), pp. 965-979. [preprint pdf]
T. Sainath, A. Narayanan, R. Weiss, E. Variani, K. Wilson, M. Bacchiani and I. Shafran, ``Reducing the Computational Complexity of Multimicrophone Acoustic Models with Integrated Feature Extraction,'' In Proceedings of Interspeech September 2016, San Francisco, CA. [pdf]
B. Li, T. Sainath, R. Weiss, K. Wilson and M. Bacchiani, ``Neural Network Adaptive Beamforming for Robust Multichannel Speech Recognition,'' In Proceedings of Interspeech September 2016, San Francisco, CA. [pdf]
E. Variani, T. Sainath, I. Shafran and M. Bacchiani, ``Complex Linear Projection (CLP): A Discriminative Approach to Joint Feature Extraction and Acoustic Modeling,'' In Proceedings of Interspeech September 2016, San Francisco, CA. [pdf]
T. Sainath, R. Weiss, K. Wilson, A. Narayanan and M. Bacchiani, ``Factored Spatial and Spectral Multichannel Raw Waveform CLDNNs,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing April 2016, Shanghai, China. [pdf]
T. Sainath, R. Weiss, K. Wilson, A. Narayanan, M. Bacchiani and A. Senior, ``Speaker Location and Microphone Spacing Invariant Acoustic Modeling from Raw Multichannel Waveforms,'' In Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop December 2015, Scottsdale AZ. [pdf]
H. Liao, G. Pundak, O. Siohan, M. Carroll, N. Coccaro, Q. Jiang, T. Sainath, A. Senior, F. Beaufays and M. Bacchiani, ``Large Vocabulary Automatic Speech Recognition for Children,'' In Proceedings of Interspeech September 2015, Dresden, Germany. [pdf]
M. Bacchiani, A. Senior and G. Heigold, ``Asynchronous, Online, GMM-free Training of a Context Dependent Acoustic Model for Speech Recognition,'' In Proceedings of the European Conference on Speech Communication and Technology September 2014, Singapore. [pdf]
E. McDermott, G. Heigold, P. Moreno, A. Senior and M. Bacchiani, ``Asynchronous Stochastic Optimization for Sequence Training of Deep Neural Networks: Towards Big Data,'' In Proceedings of the European Conference on Speech Communication and Technology September 2014, Singapore. [pdf]
C. Kim, K. Chin, M. Bacchiani and R. Stern, ``Robust speech recognition using temporal masking and thresholding algorithm,'' In Proceedings of the European Conference on Speech Communication and Technology September 2014, Singapore. [pdf]
M. Bacchiani and D. Rybach, ``Context Dependent State Tying for Speech Recognition using Deep Neural Network Acoustic Models,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing May 2014, Florence, Italy. [pdf]
G. Heigold, E. McDermott, V. Vanhoucke, A. Senior and M. Bacchiani, ``Asynchronous Stochastic Optimization for Sequence Training of Deep Neural Networks,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing May 2014, Florence, Italy. [pdf]
A. Senior, G. Heigold, M. Bacchiani and H. Liao, ``GMM-Free DNN Training,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing May 2014, Florence, Italy. [pdf]
O. Siohan and M. Bacchiani, ``iVector-based Acoustic Data Selection,'' In Proceedings of the European Conference on Speech Communication and Technology September 2013, Lyon, France. [pdf]
M. Bacchiani, ``Rapid Adaptation for Mobile Speech Applications,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing May 2013, Vancouver, Canada. [pdf]
C. Alberti and M. Bacchiani, ``Discriminative Features for Language Identification,'' In Proceedings of the European Conference on Speech Communication and Technology September 2011, Florence, Italy. [pdf]
Z. Liu and M. Bacchiani, ``TechWare: Mobile Media Search Resources [Best of the Web],'' In IEEE Signal Processing Magazine, Volume 28, Issue 4, pp. 142-145, July 2011. [pdf]
H. Liao, C. Alberti, M. Bacchiani and O. Siohan, ``Decision Tree State Clustering with Word and Syllable Features,'' In Proceedings of the European Conference on Speech Communication and Technology September 2010, Chiba, Japan. [pdf]
C. Alberti, M. Bacchiani, A. Bezman, C. Chelba, A. Dorfa, H. Liao, P. Moreno, T. Power, A. Sahuguet, M. Shugrina and O. Siohan, ``An Audio Indexing System for Election Video Material,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing May 2009, Taipei, Taiwan. [pdf]
A. Gravano, M. Jansche and M. Bacchiani, ``Restoring Punctuation and Capitalization in Transcribed Speech,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing May 2009, Taipei, Taiwan. [pdf]
M. Bacchiani, F. Beaufays, J. Schalkwyk, M. Schuster and B. Strope, ``Deploying GOOG-411: Early Lessons in Data, Measurement and Testing,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing May 2008, Las Vegas, USA. [pdf]
C. Gollan and M. Bacchiani, ``Confidence Scores for Acoustic Model Adaptation,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing May 2008, Las Vegas, USA. [pdf]
M. Bacchiani, M. Riley, B. Roark and R. Sproat, ``MAP adaptation of stochastic grammars,'' In Computer Speech and Language Vol. 20, Issue 1, January 2006, pp. 41-68. [pdf]
O. Siohan and M. Bacchiani, ``Fast Vocabulary-Independent Audio Search Using Path-Based Graph Indexing,'' to appear in Proceedings of the European Conference on Speech Communication and Technology September 2005, Lisbon, Portugal. [pdf]
M. Bacchiani, B. Roark and M. Saraclar, ``Language model adaptation with MAP estimation and the perceptron algorithm,'' In Proceedings of the HLT-NAACL May 2004, Boston, USA. [pdf]
M. Bacchiani and B. Roark, ``Meta-data Conditional Language Modeling,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing May 2004, Montreal, Canada. [pdf]
S. Maskey, M. Bacchiani, B. Roark and R. Sproat, ``Improved name recognition with meta-data dependent name networks,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing May 2004, Montreal, Canada. [pdf]
B. Roark and M. Bacchiani, ``Supervised and unsupervised PCFG adaptation to novel domains,'' In Proceedings of the HLT-NAACL pp. 205-212, July 2003, Edmonton, Canada. [pdf]
M. Bacchiani and B. Roark, ``Unsupervised Language Model Adaptation,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing 2003. [pdf]
M. Bacchiani, ``Combining Maximum Likelihood and Maximum A Posteriori Estimation for Detailed Acoustic Modeling of Context Dependency,'' In Proceedings of the International conference on Spoken Language Processing pp. 2593-2596, September 2002, Denver, USA. [pdf]
S. Whittaker, J. Hirschberg, B. Amento, L. Stark, M. Bacchiani, P. Isenhour, L. Stead, G. Zamchick and A. Rosenberg, ``SCANMail: a voicemail interface that makes speech browsable, readable and searchable,'' In Proceedings of the conference on Computer Human Interaction April 2002, Minneapolis, USA. [pdf]
Julia Hirschberg, Michiel Bacchiani, Phil Isenhour, Aaron Rosenberg, Larry Stead, Steve Whittaker, Gary Zamchick, ``Audio Browsing and Search in the Voicemail Domain,'' In Proceedings of NLPRS-2001 November 2001, Tokyo, Japan. [pdf]
J. Hirschberg, M. Bacchiani, D. Hindle, P. Isenhour, A. Rosenberg, L. Stark, L. Stead, S. Whittaker and G. Zamchick, ``SCANMail: Browsing and Searching Speech Data by Content,'' In Proceedings of the European Conference on Speech Communication and Technology September 2001, Aalborg, Denmark. [pdf]
A. Rosenberg, J. Hirschberg, M. Bacchiani, S. Parthasarathy, P. Isenhour and L. Stead, ``Caller Identification for the SCANMail Voicemail Browser,'' In Proceedings of the European Conference on Speech Communication and Technology September 2001, Aalborg, Denmark. [pdf]
M. Bacchiani, ``Automatic Transcription of Voicemail at AT&T,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing May 2001, Salt Lake City, USA. [pdf]
M. Bacchiani, J. Hirschberg, A. Rosenberg, S. Whittaker, D. Hindle, P. Isenhour, M. Jones and L. Stark, ``SCANMail: Audio Navigation in the Voicemail Domain,'' In Proceedings of the workshop on Human Language Technology March 2001, San Diego, USA. [pdf]
M. Bacchiani, ``Using Maximum Likelihood Linear Regression for Segment Clustering and Speaker Identification,'' In Proceedings of the International conference on Spoken Language Processing Vol. IV, pp. 536-539, October 2000, Beijing, China. [pdf]
A. Singhal, S. Abney, M. Bacchiani, M. Collins, D. Hindle and F. Pereira, ``AT&T at TREC-8,'' In Proceedings of the Eighth Text REtrieval Conference (TREC-8), pp. 317-330, November 1999, Gaithersburg, USA. [pdf]
M. Bacchiani and M. Ostendorf, ``Joint Lexicon, Acoustic Unit Inventory and Model Design,'' In Speech Communication no. 29, pp. 99-114, 1999. [pdf]
M. Bacchiani and M. Ostendorf, ``Using Automatically-Derived Acoustic Sub-word Units in Large Vocabulary Speech Recognition,'' In Proceedings of the International conference on Spoken Language Processing November 1998, Sydney, Australia. [pdf]
M. Bacchiani and M. Ostendorf, ``Joint Acoustic Unit Design and Lexicon Generation,'' In Proceedings ESCA Workshop on Modeling Pronunciation Variation for Automatic Speech Recognition pp. 7-12, May 1998, Kerkrade, The Netherlands. [pdf]
M. Ostendorf, B. Byrne, M. Bacchiani, M. Finke, A. Gunawardana, K. Ross, S. Roweis, E. Shriberg, D. Talkin, A. Waibel, B. Wheatley and T. Zeppenfeld, ``Modeling Systematic Variations in Pronunciation via a Language-Dependent Hidden Speaking Mode,'' In Proceedings of the International conference on Spoken Language Processing October 1996, Philadelphia, USA. [pdf]
T. Fukada, M. Bacchiani, K. Paliwal and Y. Sagisaka, ``Speech Recognition Based on Acoustically Derived Segment Units,'' In Proceedings of the International conference on Spoken Language Processing pp. 1077-1080, October 1996, Philadelphia, USA. [pdf]
M. Bacchiani, M. Ostendorf, Y. Sagisaka and K. Paliwal, ``Design of a Speech Recognition System based on Non-Uniform Segmental Units,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing pp. 443-446, May 1996, Atlanta, USA. [pdf]
M. Bacchiani, M. Ostendorf, Y. Sagisaka and K. Paliwal, ``Unsupervised Learning of Non-Uniform Segmental Units for Acoustic Modeling in Speech Recognition,'' In Proceedings of the IEEE workshop on Automatic Speech Recognition pp. 141-142, December 1995, Snowbird, USA. [pdf]
K.K. Paliwal, M. Bacchiani and Y. Sagisaka, ``Minimum Classification Error Training Algorithm for Feature Extractor and Pattern Classifier in Speech Recognition,'' Eurospeech '95, Vol. 1, pp. 541-544, September 1995, Madrid, Spain. [pdf]
K.K. Paliwal, M. Bacchiani and Y. Sagisaka, ``Simultaneous Design of Feature Extractor and Pattern Classifier using the Minimum Classification Error Training Algorithm,'' In Proceedings of the IEEE workshop on Neural Networks for Signal Processing pp. 67-76, September 1995, Boston, USA. [pdf]
Michiel Bacchiani and Kiyoaki Aikawa, ``Optimization of time-frequency masking filters using the minimum error classification criterion,'' In Proceedings of the International Conference on Acoustics, Speech and Signal Processing Vol. 1, pp. 485-488, April 1994, Adelaide, Australia. [pdf]