
Tara N. Sainath
Person information
- affiliation: Google Inc., New York, NY, USA
- affiliation: IBM T. J. Watson Research Center, Yorktown Heights, NY, USA
2020 – today
- 2021
- [i38]Ke Hu, Ruoming Pang, Tara N. Sainath, Trevor Strohman:
Transformer Based Deliberation for Two-Pass Speech Recognition. CoRR abs/2101.11577 (2021) - [i37]Harsh Shrivastava, Ankush Garg, Yuan Cao, Yu Zhang, Tara N. Sainath:
Echo State Speech Recognition. CoRR abs/2102.09114 (2021)
- 2020
- [c113]Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-Yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alexander Gruenstein, Ke Hu, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirkó Visontai, Yonghui Wu, Yu Zhang, Ding Zhao:
A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency. ICASSP 2020: 6059-6063 - [c112]Bo Li, Shuo-Yiin Chang, Tara N. Sainath, Ruoming Pang, Yanzhang He, Trevor Strohman, Yonghui Wu:
Towards Fast and Accurate Streaming End-To-End ASR. ICASSP 2020: 6069-6073 - [c111]Tara N. Sainath, Ruoming Pang, Ron J. Weiss, Yanzhang He, Chung-Cheng Chiu, Trevor Strohman:
An Attention-Based Joint Acoustic and Text on-Device End-To-End Model. ICASSP 2020: 7039-7043 - [c110]Cal Peyser, Tara N. Sainath, Golan Pundak:
Improving Proper Noun Recognition in End-To-End ASR by Customization of the MWER Loss Criterion. ICASSP 2020: 7789-7793 - [c109]Ke Hu, Tara N. Sainath, Ruoming Pang, Rohit Prabhavalkar:
Deliberation Model Based Two-Pass End-To-End Speech Recognition. ICASSP 2020: 7799-7803 - [c108]Zelin Wu, Bo Li, Yu Zhang, Petar S. Aleksic, Tara N. Sainath:
Multistate Encoding with End-To-End Speech RNN Transducer Network. ICASSP 2020: 7819-7823 - [c107]Shuo-Yiin Chang, Bo Li, David Rybach, Yanzhang He, Wei Li, Tara N. Sainath, Trevor Strohman:
Low Latency Speech Recognition Using End-to-End Prefetching. INTERSPEECH 2020: 1962-1966 - [c106]Tara N. Sainath, Ruoming Pang, David Rybach, Basi García, Trevor Strohman:
Emitting Word Timings with End-to-End Models. INTERSPEECH 2020: 3615-3619 - [c105]Cal Peyser, Sepand Mavandadi, Tara N. Sainath, James Apfel, Ruoming Pang, Shankar Kumar:
Improving Tail Performance of a Deliberation E2E ASR Model Using a Large Text Corpus. INTERSPEECH 2020: 4921-4925 - [i36]Ke Hu, Tara N. Sainath, Ruoming Pang, Rohit Prabhavalkar:
Deliberation Model Based Two-Pass End-to-End Speech Recognition. CoRR abs/2003.07962 (2020) - [i35]Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-Yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alexander Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirkó Visontai, Yonghui Wu, Yu Zhang, Ding Zhao:
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency. CoRR abs/2003.12710 (2020) - [i34]Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu:
RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions. CoRR abs/2005.03271 (2020) - [i33]Cal Peyser, Tara N. Sainath, Golan Pundak:
Improving Proper Noun Recognition in End-to-End ASR By Customization of the MWER Loss Criterion. CoRR abs/2005.09756 (2020) - [i32]Cal Peyser, Sepand Mavandadi, Tara N. Sainath, James Apfel, Ruoming Pang, Shankar Kumar:
Improving Tail Performance of a Deliberation E2E ASR Model Using a Large Text Corpus. CoRR abs/2008.10491 (2020) - [i31]Jiahui Yu, Wei Han, Anmol Gulati, Chung-Cheng Chiu, Bo Li, Tara N. Sainath, Yonghui Wu, Ruoming Pang:
Universal ASR: Unify and Improve Streaming ASR with Full-context Modeling. CoRR abs/2010.06030 (2020) - [i30]Jiahui Yu, Chung-Cheng Chiu, Bo Li, Shuo-Yiin Chang, Tara N. Sainath, Yanzhang He, Arun Narayanan, Wei Han, Anmol Gulati, Yonghui Wu, Ruoming Pang:
FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization. CoRR abs/2010.11148 (2020) - [i29]Peidong Wang, Tara N. Sainath, Ron J. Weiss:
Multitask Training with Text Data for End-to-End Speech Recognition. CoRR abs/2010.14318 (2020) - [i28]Arun Narayanan, Tara N. Sainath, Ruoming Pang, Jiahui Yu, Chung-Cheng Chiu, Rohit Prabhavalkar, Ehsan Variani, Trevor Strohman:
Cascaded encoders for unifying streaming and non-streaming ASR. CoRR abs/2010.14606 (2020) - [i27]Bo Li, Anmol Gulati, Jiahui Yu, Tara N. Sainath, Chung-Cheng Chiu, Arun Narayanan, Shuo-Yiin Chang, Ruoming Pang, Yanzhang He, James Qin, Wei Han, Qiao Liang, Yu Zhang, Trevor Strohman, Yonghui Wu:
A Better and Faster End-to-End Model for Streaming ASR. CoRR abs/2011.10798 (2020) - [i26]Rohit Prabhavalkar, Yanzhang He, David Rybach, Sean Campbell, Arun Narayanan, Trevor Strohman, Tara N. Sainath:
Less Is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging. CoRR abs/2012.06749 (2020)
2010 – 2019
- 2019
- [j6]Hendrik Purwins, Bo Li, Tuomas Virtanen, Jan Schlüter, Shuo-Yiin Chang, Tara N. Sainath:
Deep Learning for Audio Signal Processing. IEEE J. Sel. Top. Signal Process. 13(2): 206-219 (2019) - [c104]Chung-Cheng Chiu, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara N. Sainath, Yonghui Wu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang:
A Comparison of End-to-End Models for Long-Form Speech Recognition. ASRU 2019: 889-896 - [c103]Arun Narayanan, Rohit Prabhavalkar, Chung-Cheng Chiu, David Rybach, Tara N. Sainath, Trevor Strohman:
Recognizing Long-Form Speech Using Streaming End-to-End Models. ASRU 2019: 920-927 - [c102]Bo Li, Tara N. Sainath, Ruoming Pang, Zelin Wu:
Semi-supervised Training for End-to-end Models via Weak Distillation. ICASSP 2019: 2837-2841 - [c101]Bo Li, Yu Zhang, Tara N. Sainath, Yonghui Wu, William Chan:
Bytes Are All You Need: End-to-end Multilingual Speech Recognition and Synthesis with Bytes. ICASSP 2019: 5621-5625 - [c100]Shuo-Yiin Chang, Rohit Prabhavalkar, Yanzhang He, Tara N. Sainath, Gabor Simko:
Joint Endpointing and Decoding with End-to-end Models. ICASSP 2019: 5626-5630 - [c99]Jinxi Guo, Tara N. Sainath, Ron J. Weiss:
A Spelling Correction Model for End-to-end Speech Recognition. ICASSP 2019: 5651-5655 - [c98]Antoine Bruguier, Rohit Prabhavalkar, Golan Pundak, Tara N. Sainath:
Phoebe: Pronunciation-aware Contextualization for End-to-end Speech Recognition. ICASSP 2019: 6171-6175 - [c97]Yanzhang He, Tara N. Sainath, Rohit Prabhavalkar, Ian McGraw, Raziel Alvarez, Ding Zhao, David Rybach, Anjuli Kannan, Yonghui Wu, Ruoming Pang, Qiao Liang, Deepti Bhatia, Yuan Shangguan, Bo Li, Golan Pundak, Khe Chai Sim, Tom Bagby, Shuo-Yiin Chang, Kanishka Rao, Alexander Gruenstein:
Streaming End-to-end Speech Recognition for Mobile Devices. ICASSP 2019: 6381-6385 - [c96]Uri Alon, Golan Pundak, Tara N. Sainath:
Contextual Speech Recognition with Difficult Negative Training Examples. ICASSP 2019: 6440-6444 - [c95]Ding Zhao, Tara N. Sainath, David Rybach, Pat Rondon, Deepti Bhatia, Bo Li, Ruoming Pang:
Shallow-Fusion End-to-End Contextual Biasing. INTERSPEECH 2019: 1418-1422 - [c94]Anjuli Kannan, Arindrima Datta, Tara N. Sainath, Eugene Weinstein, Bhuvana Ramabhadran, Yonghui Wu, Ankur Bapna, Zhifeng Chen, Seungji Lee:
Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model. INTERSPEECH 2019: 2130-2134 - [c93]Ke Hu, Antoine Bruguier, Tara N. Sainath, Rohit Prabhavalkar, Golan Pundak:
Phoneme-Based Contextualization for Cross-Lingual Speech Recognition in End-to-End Models. INTERSPEECH 2019: 2155-2159 - [c92]Cal Peyser, Hao Zhang, Tara N. Sainath, Zelin Wu:
Improving Performance of End-to-End ASR on Numeric Sequences. INTERSPEECH 2019: 2185-2189 - [c91]Tara N. Sainath, Ruoming Pang, David Rybach, Yanzhang He, Rohit Prabhavalkar, Wei Li, Mirkó Visontai, Qiao Liang, Trevor Strohman, Yonghui Wu, Ian McGraw, Chung-Cheng Chiu:
Two-Pass End-to-End Speech Recognition. INTERSPEECH 2019: 2773-2777 - [i25]Jinxi Guo, Tara N. Sainath, Ron J. Weiss:
A spelling correction model for end-to-end speech recognition. CoRR abs/1902.07178 (2019) - [i24]Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia Xu Chen, Ye Jia, Anjuli Kannan, Tara N. Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George F. Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel Bacchiani, Thomas B. Jablin, Robert Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon:
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling. CoRR abs/1902.08295 (2019) - [i23]Hendrik Purwins, Bo Li, Tuomas Virtanen, Jan Schlüter, Shuo-Yiin Chang, Tara N. Sainath:
Deep Learning for Audio Signal Processing. CoRR abs/1905.00078 (2019) - [i22]Ke Hu, Antoine Bruguier, Tara N. Sainath, Rohit Prabhavalkar, Golan Pundak:
Phoneme-Based Contextualization for Cross-Lingual Speech Recognition in End-to-End Models. CoRR abs/1906.09292 (2019) - [i21]Cal Peyser, Hao Zhang, Tara N. Sainath, Zelin Wu:
Improving Performance of End-to-End ASR on Numeric Sequences. CoRR abs/1907.01372 (2019) - [i20]Tara N. Sainath, Ruoming Pang, David Rybach, Yanzhang He, Rohit Prabhavalkar, Wei Li, Mirkó Visontai, Qiao Liang, Trevor Strohman, Yonghui Wu, Ian McGraw, Chung-Cheng Chiu:
Two-Pass End-to-End Speech Recognition. CoRR abs/1908.10992 (2019) - [i19]Anjuli Kannan, Arindrima Datta, Tara N. Sainath, Eugene Weinstein, Bhuvana Ramabhadran, Yonghui Wu, Ankur Bapna, Zhifeng Chen, Seungji Lee:
Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model. CoRR abs/1909.05330 (2019) - [i18]Arun Narayanan, Rohit Prabhavalkar, Chung-Cheng Chiu, David Rybach, Tara N. Sainath, Trevor Strohman:
Recognizing long-form speech using streaming end-to-end models. CoRR abs/1910.11455 (2019) - [i17]Chung-Cheng Chiu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara N. Sainath, Yonghui Wu:
A comparison of end-to-end models for long-form speech recognition. CoRR abs/1911.02242 (2019)
- 2018
- [c90]Bo Li, Tara N. Sainath, Khe Chai Sim, Michiel Bacchiani, Eugene Weinstein, Patrick Nguyen, Zhifeng Chen, Yonghui Wu, Kanishka Rao:
Multi-Dialect Speech Recognition with a Single Sequence-to-Sequence Model. ICASSP 2018: 4749-4753 - [c89]Chung-Cheng Chiu, Tara N. Sainath, Yonghui Wu, Rohit Prabhavalkar, Patrick Nguyen, Zhifeng Chen, Anjuli Kannan, Ron J. Weiss, Kanishka Rao, Ekaterina Gonina, Navdeep Jaitly, Bo Li, Jan Chorowski, Michiel Bacchiani:
State-of-the-Art Speech Recognition with Sequence-to-Sequence Models. ICASSP 2018: 4774-4778 - [c88]Rohit Prabhavalkar, Tara N. Sainath, Yonghui Wu, Patrick Nguyen, Zhifeng Chen, Chung-Cheng Chiu, Anjuli Kannan:
Minimum Word Error Rate Training for Attention-Based Sequence-to-Sequence Models. ICASSP 2018: 4839-4843 - [c87]Shubham Toshniwal, Tara N. Sainath, Ron J. Weiss, Bo Li, Pedro J. Moreno, Eugene Weinstein, Kanishka Rao:
Multilingual Speech Recognition with a Single End-to-End Model. ICASSP 2018: 4904-4908 - [c86]Shuo-Yiin Chang, Bo Li, Gabor Simko, Tara N. Sainath, Anshuman Tripathi, Aäron van den Oord, Oriol Vinyals:
Temporal Modeling Using Dilated Convolution and Gating for Voice-Activity-Detection. ICASSP 2018: 5549-5553 - [c85]Chanwoo Kim, Tara N. Sainath, Arun Narayanan, Ananya Misra, Rajeev C. Nongpiur, Michiel Bacchiani:
Spectral Distortion Model for Training Phase-Sensitive Deep-Neural Networks for Far-Field Speech Recognition. ICASSP 2018: 5729-5733 - [c84]Anjuli Kannan, Yonghui Wu, Patrick Nguyen, Tara N. Sainath, Zhifeng Chen, Rohit Prabhavalkar:
An Analysis of Incorporating an External Language Model into a Sequence-to-Sequence Model. ICASSP 2018: 5824-5828 - [c83]Tara N. Sainath, Rohit Prabhavalkar, Shankar Kumar, Seungji Lee, Anjuli Kannan, David Rybach, Vlad Schogol, Patrick Nguyen, Bo Li, Yonghui Wu, Zhifeng Chen, Chung-Cheng Chiu:
No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models. ICASSP 2018: 5859-5863 - [c82]Tara N. Sainath, Chung-Cheng Chiu, Rohit Prabhavalkar, Anjuli Kannan, Yonghui Wu, Patrick Nguyen, Zhifeng Chen:
Improving the Performance of Online Neural Transducer Models. ICASSP 2018: 5864-5868 - [c81]Jahn Heymann, Michiel Bacchiani, Tara N. Sainath:
Performance of Mask Based Statistical Beamforming in a Smart Home Scenario. ICASSP 2018: 6722-6726 - [c80]Ruoming Pang, Tara N. Sainath, Rohit Prabhavalkar, Suyog Gupta, Yonghui Wu, Shuyuan Zhang, Chung-Cheng Chiu:
Compression of End-to-End Models. INTERSPEECH 2018: 27-31 - [c79]Khe Chai Sim, Arun Narayanan, Ananya Misra, Anshuman Tripathi, Golan Pundak, Tara N. Sainath, Parisa Haghani, Bo Li, Michiel Bacchiani:
Domain Adaptation Using Factorized Hidden Layer for Robust Automatic Speech Recognition. INTERSPEECH 2018: 892-896 - [c78]Ian Williams, Anjuli Kannan, Petar S. Aleksic, David Rybach, Tara N. Sainath:
Contextual Speech Recognition in End-to-end Neural Network Systems Using Beam Search. INTERSPEECH 2018: 2227-2231 - [c77]Shubham Toshniwal, Anjuli Kannan, Chung-Cheng Chiu, Yonghui Wu, Tara N. Sainath, Karen Livescu:
A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition. SLT 2018: 369-375 - [c76]Golan Pundak, Tara N. Sainath, Rohit Prabhavalkar, Anjuli Kannan, Ding Zhao:
Deep Context: End-to-end Contextual Speech Recognition. SLT 2018: 418-425 - [i16]Shubham Toshniwal, Anjuli Kannan, Chung-Cheng Chiu, Yonghui Wu, Tara N. Sainath, Karen Livescu:
A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition. CoRR abs/1807.10857 (2018) - [i15]Golan Pundak, Tara N. Sainath, Rohit Prabhavalkar, Anjuli Kannan, Ding Zhao:
Deep context: end-to-end contextual speech recognition. CoRR abs/1808.02480 (2018) - [i14]Uri Alon, Golan Pundak, Tara N. Sainath:
Contextual Speech Recognition with Difficult Negative Training Examples. CoRR abs/1810.12170 (2018) - [i13]Yanzhang He, Tara N. Sainath, Rohit Prabhavalkar, Ian McGraw, Raziel Alvarez, Ding Zhao, David Rybach, Anjuli Kannan, Yonghui Wu, Ruoming Pang, Qiao Liang, Deepti Bhatia, Yuan Shangguan, Bo Li, Golan Pundak, Khe Chai Sim, Tom Bagby, Shuo-Yiin Chang, Kanishka Rao, Alexander Gruenstein:
Streaming End-to-end Speech Recognition For Mobile Devices. CoRR abs/1811.06621 (2018) - [i12]Bo Li, Yu Zhang, Tara N. Sainath, Yonghui Wu, William Chan:
Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes. CoRR abs/1811.09021 (2018)
- 2017
- [j5]Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Bo Li, Arun Narayanan, Ehsan Variani, Michiel Bacchiani, Izhak Shafran, Andrew W. Senior, Kean K. Chin, Ananya Misra, Chanwoo Kim:
Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 25(5): 965-979 (2017) - [j4]I-Hsin Chung, Tara N. Sainath, Bhuvana Ramabhadran, Michael Picheny, John A. Gunnels, Vernon Austel, Upendra V. Chaudhari, Brian Kingsbury:
Parallel Deep Neural Network Training for Big Data on Blue Gene/Q. IEEE Trans. Parallel Distributed Syst. 28(6): 1703-1714 (2017) - [c75]Khe Chai Sim, Arun Narayanan, Tom Bagby, Tara N. Sainath, Michiel Bacchiani:
Improving the efficiency of forward-backward algorithm using batched computation in TensorFlow. ASRU 2017: 258-264 - [c74]Chanwoo Kim, Ananya Misra, Kean K. Chin, Thad Hughes, Arun Narayanan, Tara N. Sainath, Michiel Bacchiani:
Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home. INTERSPEECH 2017: 379-383 - [c73]Bo Li, Tara N. Sainath, Arun Narayanan, Joe Caroselli, Michiel Bacchiani, Ananya Misra, Izhak Shafran, Hasim Sak, Golan Pundak, Kean K. Chin, Khe Chai Sim, Ron J. Weiss, Kevin W. Wilson, Ehsan Variani, Chanwoo Kim, Olivier Siohan, Mitchel Weintraub, Erik McDermott, Richard Rose, Matt Shannon:
Acoustic Modeling for Google Home. INTERSPEECH 2017: 399-403 - [c72]Rohit Prabhavalkar, Kanishka Rao, Tara N. Sainath, Bo Li, Leif Johnson, Navdeep Jaitly:
A Comparison of Sequence-to-Sequence Models for Speech Recognition. INTERSPEECH 2017: 939-943 - [c71]Bo Li, Tara N. Sainath:
Reducing the Computational Complexity of Two-Dimensional LSTMs. INTERSPEECH 2017: 964-968 - [c70]Golan Pundak, Tara N. Sainath:
Highway-LSTM and Recurrent Highway Networks for Speech Recognition. INTERSPEECH 2017: 1303-1307 - [c69]Tara N. Sainath, Vijayaditya Peddinti, Olivier Siohan, Arun Narayanan:
Annealed f-Smoothing as a Mechanism to Speed up Neural Network Training. INTERSPEECH 2017: 3542-3546 - [c68]Rohit Prabhavalkar, Tara N. Sainath, Bo Li, Kanishka Rao, Navdeep Jaitly:
An Analysis of "Attention" in Sequence-to-Sequence Models. INTERSPEECH 2017: 3702-3706 - [c67]Shuo-Yiin Chang, Bo Li, Tara N. Sainath, Gabor Simko, Carolina Parada:
Endpoint Detection Using Grid Long Short-Term Memory Networks for Streaming Speech Recognition. INTERSPEECH 2017: 3812-3816 - [p1]Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Arun Narayanan, Michiel Bacchiani, Bo Li, Ehsan Variani, Izhak Shafran, Andrew W. Senior, Kean K. Chin, Ananya Misra, Chanwoo Kim:
Raw Multichannel Processing Using Deep Neural Networks. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 105-133 - [i11]Shubham Toshniwal, Tara N. Sainath, Ron J. Weiss, Bo Li, Pedro J. Moreno, Eugene Weinstein, Kanishka Rao:
Multilingual Speech Recognition With A Single End-To-End Model. CoRR abs/1711.01694 (2017) - [i10]Bo Li, Tara N. Sainath, Khe Chai Sim, Michiel Bacchiani, Eugene Weinstein, Patrick Nguyen, Zhifeng Chen, Yonghui Wu, Kanishka Rao:
Multi-Dialect Speech Recognition With A Single Sequence-To-Sequence Model. CoRR abs/1712.01541 (2017) - [i9]Chung-Cheng Chiu, Tara N. Sainath, Yonghui Wu, Rohit Prabhavalkar, Patrick Nguyen, Zhifeng Chen, Anjuli Kannan, Ron J. Weiss, Kanishka Rao, Katya Gonina, Navdeep Jaitly, Bo Li, Jan Chorowski, Michiel Bacchiani:
State-of-the-art Speech Recognition With Sequence-to-Sequence Models. CoRR abs/1712.01769 (2017) - [i8]Tara N. Sainath, Chung-Cheng Chiu, Rohit Prabhavalkar, Anjuli Kannan, Yonghui Wu, Patrick Nguyen, Zhifeng Chen:
Improving the Performance of Online Neural Transducer Models. CoRR abs/1712.01807 (2017) - [i7]Rohit Prabhavalkar, Tara N. Sainath, Yonghui Wu, Patrick Nguyen, Zhifeng Chen, Chung-Cheng Chiu, Anjuli Kannan:
Minimum Word Error Rate Training for Attention-based Sequence-to-Sequence Models. CoRR abs/1712.01818 (2017) - [i6]Tara N. Sainath, Rohit Prabhavalkar, Shankar Kumar, Seungji Lee, Anjuli Kannan, David Rybach, Vlad Schogol, Patrick Nguyen, Bo Li, Yonghui Wu, Zhifeng Chen, Chung-Cheng Chiu:
No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models. CoRR abs/1712.01864 (2017) - [i5]Anjuli Kannan, Yonghui Wu, Patrick Nguyen, Tara N. Sainath, Zhifeng Chen, Rohit Prabhavalkar:
An analysis of incorporating an external language model into a sequence-to-sequence model. CoRR abs/1712.01996 (2017)
- 2016
- [c66]Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Arun Narayanan, Michiel Bacchiani:
Factored spatial and spectral multichannel raw waveform CLDNNs. ICASSP 2016: 5075-5079 - [c65]Zhiyun Lu, Vikas Sindhwani, Tara N. Sainath:
Learning compact recurrent neural networks. ICASSP 2016: 5960-5964 - [c64]Golan Pundak, Tara N. Sainath:
Lower Frame Rate Neural Network Acoustic Models. INTERSPEECH 2016: 22-26 - [c63]Ehsan Variani, Tara N. Sainath, Izhak Shafran, Michiel Bacchiani:
Complex Linear Projection (CLP): A Discriminative Approach to Joint Feature Extraction and Acoustic Modeling. INTERSPEECH 2016: 808-812 - [c62]Tara N. Sainath, Bo Li:
Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks. INTERSPEECH 2016: 813-817 - [c61]Tara N. Sainath, Arun Narayanan, Ron J. Weiss, Ehsan Variani, Kevin W. Wilson, Michiel Bacchiani, Izhak Shafran:
Reducing the Computational Complexity of Multimicrophone Acoustic Models with Integrated Feature Extraction. INTERSPEECH 2016: 1971-1975 - [c60]Bo Li, Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Michiel Bacchiani:
Neural Network Adaptive Beamforming for Robust Multichannel Speech Recognition. INTERSPEECH 2016: 1976-1980 - [c59]Rubén Zazo, Tara N. Sainath, Gabor Simko, Carolina Parada:
Feature Learning with Raw-Waveform CLDNNs for Voice Activity Detection. INTERSPEECH 2016: 3668-3672 - [i4]Zhiyun Lu, Vikas Sindhwani, Tara N. Sainath:
Learning Compact Recurrent Neural Networks. CoRR abs/1604.02594 (2016)
- 2015
- [j3]Tara N. Sainath, Brian Kingsbury, George Saon, Hagen Soltau, Abdel-rahman Mohamed, George E. Dahl, Bhuvana Ramabhadran:
Deep Convolutional Neural Networks for Large-scale Speech Tasks. Neural Networks 64: 39-48 (2015) - [c58]Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Arun Narayanan, Michiel Bacchiani, Andrew W. Senior:
Speaker location and microphone spacing invariant acoustic modeling from raw multichannel waveforms. ASRU 2015: 30-36 - [c57]Andrew W. Senior, Hasim Sak, Felix de Chaumont Quitry, Tara N. Sainath, Kanishka Rao:
Acoustic modelling with CD-CTC-SMBR LSTM RNNS. ASRU 2015: 604-609 - [c56]Tara N. Sainath, Oriol Vinyals, Andrew W. Senior, Hasim Sak:
Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks. ICASSP 2015: 4580-4584 - [c55]Rohit Prabhavalkar, Raziel Alvarez, Carolina Parada, Preetum Nakkiran, Tara N. Sainath:
Automatic gain control and multi-style training for robust small-footprint keyword spotting with deep neural networks. ICASSP 2015: 4704-4708 - [c54]Guoguo Chen, Carolina Parada, Tara N. Sainath:
Query-by-example keyword spotting using long short-term memory networks. ICASSP 2015: 5236-5240 - [c53]Tara N. Sainath, Ron J. Weiss, Andrew W. Senior, Kevin W. Wilson, Oriol Vinyals:
Learning the speech front-end with raw waveform CLDNNs. INTERSPEECH 2015: 1-5 - [c52]Yu-hsin Chen, Ignacio Lopez-Moreno, Tara N. Sainath, Mirkó Visontai, Raziel Alvarez, Carolina Parada:
Locally-connected and convolutional neural networks for small footprint speaker recognition. INTERSPEECH 2015: 1136-1140 - [c51]Tara N. Sainath, Carolina Parada:
Convolutional neural networks for small-footprint keyword spotting. INTERSPEECH 2015: 1478-1482 - [c50]Hank Liao, Golan Pundak, Olivier Siohan, Melissa K. Carroll, Noah Coccaro, Qi-Ming Jiang, Tara N. Sainath, Andrew W. Senior, Françoise Beaufays, Michiel Bacchiani:
Large vocabulary automatic speech recognition for children. INTERSPEECH 2015: 1611-1615 - [c49]Vikas Sindhwani, Tara N. Sainath, Sanjiv Kumar:
Structured Transforms for Small-Footprint Deep Learning. NIPS 2015: 3088-3096 - [i3]Vikas Sindhwani, Tara N. Sainath, Sanjiv Kumar:
Structured Transforms for Small-Footprint Deep Learning. CoRR abs/1510.01722 (2015)
- 2014
- [c48]Po-Sen Huang, Haim Avron, Tara N. Sainath, Vikas Sindhwani, Bhuvana Ramabhadran:
Kernel methods match Deep Neural Networks on TIMIT. ICASSP 2014: 205-209 - [c47]Vijayaditya Peddinti, Tara N. Sainath, Shay Maymon, Bhuvana Ramabhadran, David Nahamoo, Vaibhava Goel:
Deep Scattering Spectrum with deep neural networks. ICASSP 2014: 210-214 - [c46]Hagen Soltau, George Saon, Tara N. Sainath:
Joint training of convolutional and non-convolutional neural networks. ICASSP 2014: 5572-5576 - [c45]Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, George Saon, Bhuvana Ramabhadran:
Improvements to filterbank and delta learning within a deep neural network framework. ICASSP 2014: 6839-6843 - [c44]Tara N. Sainath, Vijayaditya Peddinti, Brian Kingsbury, Petr Fousek, Bhuvana Ramabhadran, David Nahamoo:
Deep scattering spectra with deep neural networks for LVCSR tasks. INTERSPEECH 2014: 900-904 - [c43]Tara N. Sainath, I-Hsin Chung, Bhuvana Ramabhadran, Michael Picheny, John A. Gunnels, Brian Kingsbury, George Saon, Vernon Austel, Upendra V. Chaudhari:
Parallel deep neural network training for LVCSR tasks using blue gene/Q. INTERSPEECH 2014: 1048-1052 - [c42]I-Hsin Chung, Tara N. Sainath, Bhuvana Ramabhadran, Michael Picheny, John A. Gunnels, Vernon Austel, Upendra V. Chaudhari, Brian Kingsbury:
Parallel Deep Neural Network Training for Big Data on Blue Gene/Q. SC 2014: 745-753
- 2013
- [j2]Tara N. Sainath, Brian Kingsbury, Hagen Soltau, Bhuvana Ramabhadran:
Optimization Techniques to Improve Training Speed of Deep Neural Networks for Large Speech Tasks. IEEE Trans. Speech Audio Process. 21(11): 2267-2276 (2013) - [c41]Tara N. Sainath, Brian Kingsbury,