


Остановите войну!
for scientists:


default search action
George Saon
Person information

Refine list

refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
showing all ?? records
2020 – today
- 2022
- [c107]Thomas Bohnstingl, Ayush Garg, Stanislaw Wozniak, George Saon, Evangelos Eleftheriou, Angeliki Pantazi
:
Speech Recognition Using Biologically-Inspired Neural Networks. ICASSP 2022: 6992-6996 - [c106]Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Brian Kingsbury, George Saon:
Improving End-to-end Models for Set Prediction in Spoken Language Understanding. ICASSP 2022: 7162-7166 - [c105]Samuel Thomas, Hong-Kwang Jeff Kuo, Brian Kingsbury, George Saon:
Towards Reducing the Need for Speech Training Data to Build Spoken Language Understanding Systems. ICASSP 2022: 7932-7936 - [c104]Samuel Thomas, Brian Kingsbury, George Saon, Hong-Kwang Jeff Kuo:
Integrating Text Inputs for Training and Adapting RNN Transducer ASR Models. ICASSP 2022: 8127-8131 - [c103]Zvi Kons, Hagai Aronowitz, Edmilson da Silva Morais, Matheus Damasceno, Hong-Kwang Kuo, Samuel Thomas, George Saon:
Extending RNN-T-based speech recognition systems with emotion and language classification. INTERSPEECH 2022: 546-549 - [c102]Jiatong Shi, George Saon, David Haws, Shinji Watanabe, Brian Kingsbury:
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States. INTERSPEECH 2022: 1656-1660 - [c101]Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Kailash Gopalakrishnan:
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization. INTERSPEECH 2022: 2038-2042 - [c100]Xiaodong Cui, George Saon, Tohru Nagano, Masayuki Suzuki, Takashi Fukuda, Brian Kingsbury, Gakuto Kurata:
Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing. INTERSPEECH 2022: 2638-2642 - [c99]Takashi Fukuda, Samuel Thomas, Masayuki Suzuki, Gakuto Kurata, George Saon, Brian Kingsbury:
Global RNN Transducer Models For Multi-dialect Speech Recognition. INTERSPEECH 2022: 3138-3142 - [c98]Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Nobuyasu Itoh, George Saon:
Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems. INTERSPEECH 2022: 3919-3923 - [i31]Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Brian Kingsbury, George Saon:
Improving End-to-End Models for Set Prediction in Spoken Language Understanding. CoRR abs/2201.12105 (2022) - [i30]Samuel Thomas, Brian Kingsbury, George Saon, Hong-Kwang Jeff Kuo:
Integrating Text Inputs For Training and Adapting RNN Transducer ASR Models. CoRR abs/2202.13155 (2022) - [i29]Samuel Thomas, Hong-Kwang Jeff Kuo, Brian Kingsbury, George Saon:
Towards Reducing the Need for Speech Training Data To Build Spoken Language Understanding Systems. CoRR abs/2203.00006 (2022) - [i28]Xiaodong Cui, George Saon, Tohru Nagano, Masayuki Suzuki, Takashi Fukuda, Brian Kingsbury, Gakuto Kurata:
Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing. CoRR abs/2203.15176 (2022) - [i27]Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Nobuyasu Itoh, George Saon:
Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems. CoRR abs/2204.00212 (2022) - [i26]Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Kailash Gopalakrishnan:
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization. CoRR abs/2206.07882 (2022) - [i25]Zvi Kons, Hagai Aronowitz, Edmilson da Silva Morais, Matheus Damasceno, Hong-Kwang Kuo, Samuel Thomas, George Saon:
Extending RNN-T-based speech recognition systems with emotion and language classification. CoRR abs/2207.13965 (2022) - [i24]Jiatong Shi, George Saon, David Haws, Shinji Watanabe, Brian Kingsbury:
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States. CoRR abs/2208.01818 (2022) - 2021
- [j16]Xiaodong Cui
, Wei Zhang, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury
, George Saon, David S. Kung:
Asynchronous Decentralized Distributed Training of Acoustic Models. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3565-3576 (2021) - [c97]George Saon, Zoltán Tüske, Daniel Bolaños, Brian Kingsbury:
Advancing RNN Transducer Technology for Speech Recognition. ICASSP 2021: 5654-5658 - [c96]Samuel Thomas, Hong-Kwang Jeff Kuo, George Saon, Zoltán Tüske, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory:
RNN Transducer Models for Spoken Language Understanding. ICASSP 2021: 7493-7497 - [c95]Jatin Ganhotra, Samuel Thomas, Hong-Kwang Jeff Kuo, Sachindra Joshi, George Saon, Zoltán Tüske, Brian Kingsbury:
Integrating Dialog History into End-to-End Spoken Language Understanding Systems. Interspeech 2021: 1254-1258 - [c94]Xiaodong Cui, Brian Kingsbury, George Saon, David Haws, Zoltán Tüske:
Reducing Exposure Bias in Training Recurrent Neural Network Transducers. Interspeech 2021: 1802-1806 - [c93]Gakuto Kurata, George Saon, Brian Kingsbury, David Haws, Zoltán Tüske:
Improving Customization of Neural Transducers by Mitigating Acoustic Mismatch of Synthesized Audio. Interspeech 2021: 2027-2031 - [c92]Zoltán Tüske, George Saon, Brian Kingsbury:
On the Limit of English Conversational Speech Recognition. Interspeech 2021: 2062-2066 - [c91]Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei Zhang, Zoltán Tüske, Kailash Gopalakrishnan:
4-Bit Quantization of LSTM-Based Speech Recognition Models. Interspeech 2021: 2586-2590 - [i23]George Saon, Zoltán Tüske, Daniel Bolaños, Brian Kingsbury:
Advancing RNN Transducer Technology for Speech Recognition. CoRR abs/2103.09935 (2021) - [i22]Samuel Thomas, Hong-Kwang Jeff Kuo, George Saon, Zoltán Tüske, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory:
RNN Transducer Models For Spoken Language Understanding. CoRR abs/2104.03842 (2021) - [i21]Zoltán Tüske, George Saon, Brian Kingsbury:
On the limit of English conversational speech recognition. CoRR abs/2105.00982 (2021) - [i20]Jatin Ganhotra, Samuel Thomas, Hong-Kwang Jeff Kuo, Sachindra Joshi, George Saon, Zoltán Tüske, Brian Kingsbury:
Integrating Dialog History into End-to-End Spoken Language Understanding Systems. CoRR abs/2108.08405 (2021) - [i19]Xiaodong Cui, Brian Kingsbury, George Saon, David Haws, Zoltán Tüske:
Reducing Exposure Bias in Training Recurrent Neural Network Transducers. CoRR abs/2108.10803 (2021) - [i18]Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei Zhang, Zoltán Tüske, Kailash Gopalakrishnan:
4-bit Quantization of LSTM-based Speech Recognition Models. CoRR abs/2108.12074 (2021) - [i17]Thomas Bohnstingl, Ayush Garg, Stanislaw Wozniak, George Saon, Evangelos Eleftheriou, Angeliki Pantazi:
Towards efficient end-to-end speech recognition with biologically-inspired neural networks. CoRR abs/2110.02743 (2021) - [i16]Xiaodong Cui, Wei Zhang, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, David S. Kung:
Asynchronous Decentralized Distributed Training of Acoustic Models. CoRR abs/2110.11199 (2021) - 2020
- [j15]Xiaodong Cui
, Wei Zhang, Ulrich Finkler, George Saon, Michael Picheny, David S. Kung:
Distributed Training of Deep Neural Network Acoustic Models for Automatic Speech Recognition: A comparison of current training strategies. IEEE Signal Process. Mag. 37(3): 39-49 (2020) - [c90]Wei Zhang, Xiaodong Cui, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, Youssef Mroueh, Alper Buyuktosunoglu, Payel Das, David S. Kung, Michael Picheny:
Improving Efficiency in Large-Scale Decentralized Distributed Training. ICASSP 2020: 3022-3026 - [c89]George Saon, Zoltán Tüske, Kartik Audhkhasi:
Alignment-Length Synchronous Decoding for RNN Transducer. ICASSP 2020: 7804-7808 - [c88]Zoltán Tüske, George Saon, Kartik Audhkhasi, Brian Kingsbury:
Single Headed Attention Based Sequence-to-Sequence Model for State-of-the-Art Results on Switchboard. INTERSPEECH 2020: 551-555 - [c87]Gakuto Kurata, George Saon:
Knowledge Distillation from Offline to Streaming RNN Transducer for End-to-End Speech Recognition. INTERSPEECH 2020: 2117-2121 - [i15]Zoltán Tüske, George Saon, Kartik Audhkhasi, Brian Kingsbury:
Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard-300. CoRR abs/2001.07263 (2020) - [i14]Wei Zhang, Xiaodong Cui, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, Youssef Mroueh, Alper Buyuktosunoglu, Payel Das, David S. Kung, Michael Picheny:
Improving Efficiency in Large-Scale Decentralized Distributed Training. CoRR abs/2002.01119 (2020) - [i13]Xiaodong Cui, Wei Zhang, Ulrich Finkler, George Saon, Michael Picheny, David S. Kung:
Distributed Training of Deep Neural Network Acoustic Models for Automatic Speech Recognition. CoRR abs/2002.10502 (2020)
2010 – 2019
- 2019
- [c86]George Saon, Zoltán Tüske, Kartik Audhkhasi, Brian Kingsbury, Michael Picheny, Samuel Thomas:
Simplified LSTMS for Speech Recognition. ASRU 2019: 547-553 - [c85]Wei Zhang, Xiaodong Cui, Ulrich Finkler, Brian Kingsbury, George Saon, David S. Kung, Michael Picheny:
Distributed Deep Learning Strategies for Automatic Speech Recognition. ICASSP 2019: 5706-5710 - [c84]George Saon, Zoltán Tüske, Kartik Audhkhasi, Brian Kingsbury:
Sequence Noise Injected Training for End-to-end Speech Recognition. ICASSP 2019: 6261-6265 - [c83]Samuel Thomas, Masayuki Suzuki, Yinghui Huang, Gakuto Kurata, Zoltán Tüske, George Saon, Brian Kingsbury, Michael Picheny, Tom Dibert, Alice Kaiser-Schatzlein, Bern Samko:
English Broadcast News Speech Recognition by Humans and Machines. ICASSP 2019: 6455-6459 - [c82]Michael Picheny, Zoltán Tüske, Brian Kingsbury, Kartik Audhkhasi, Xiaodong Cui, George Saon:
Challenging the Boundaries of Speech Recognition: The MALACH Corpus. INTERSPEECH 2019: 326-330 - [c81]Kartik Audhkhasi, George Saon, Zoltán Tüske, Brian Kingsbury, Michael Picheny:
Forget a Bit to Learn Better: Soft Forgetting for CTC-Based Automatic Speech Recognition. INTERSPEECH 2019: 2618-2622 - [c80]Wei Zhang, Xiaodong Cui, Ulrich Finkler, George Saon, Abdullah Kayi, Alper Buyuktosunoglu, Brian Kingsbury, David S. Kung, Michael Picheny:
A Highly Efficient Distributed Deep Learning System for Automatic Speech Recognition. INTERSPEECH 2019: 2628-2632 - [c79]Zoltán Tüske, Kartik Audhkhasi, George Saon:
Advancing Sequence-to-Sequence Based Speech Recognition. INTERSPEECH 2019: 3780-3784 - [i12]Wei Zhang, Xiaodong Cui, Ulrich Finkler, Brian Kingsbury, George Saon, David S. Kung, Michael Picheny:
Distributed Deep Learning Strategies For Automatic Speech Recognition. CoRR abs/1904.04956 (2019) - [i11]Samuel Thomas, Masayuki Suzuki, Yinghui Huang, Gakuto Kurata, Zoltán Tüske, George Saon, Brian Kingsbury, Michael Picheny, Tom Dibert, Alice Kaiser-Schatzlein, Bern Samko:
English Broadcast News Speech Recognition by Humans and Machines. CoRR abs/1904.13258 (2019) - [i10]Wei Zhang, Xiaodong Cui, Ulrich Finkler, George Saon, Abdullah Kayi, Alper Buyuktosunoglu, Brian Kingsbury, David S. Kung, Michael Picheny:
A Highly Efficient Distributed Deep Learning System For Automatic Speech Recognition. CoRR abs/1907.05701 (2019) - [i9]Michael Picheny, Zoltán Tüske, Brian Kingsbury, Kartik Audhkhasi, Xiaodong Cui, George Saon:
Challenging the Boundaries of Speech Recognition: The MALACH Corpus. CoRR abs/1908.03455 (2019) - 2018
- [c78]Kartik Audhkhasi, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Michael Picheny:
Building Competitive Direct Acoustics-to-Word Models for English Conversational Speech Recognition. ICASSP 2018: 4759-4763 - 2017
- [j14]George Saon, Michael Picheny:
Recent advances in conversational speech recognition using convolutional and recurrent neural networks. IBM J. Res. Dev. 61(4-5): 1:1-1:10 (2017) - [j13]Kartik Audhkhasi, Andrew Rosenberg, George Saon, Abhinav Sethy, Bhuvana Ramabhadran, Stanley F. Chen, Michael Picheny:
Recent progress in deep end-to-end models for spoken language processing. IBM J. Res. Dev. 61(4-5): 2:1-2:10 (2017) - [c77]Gakuto Kurata, Bhuvana Ramabhadran, George Saon, Abhinav Sethy:
Language modeling with highway LSTM. ASRU 2017: 244-251 - [c76]Jia Cui, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Tom Sercu, Kartik Audhkhasi, Abhinav Sethy, Markus Nußbaum-Thom, Andrew Rosenberg:
Knowledge distillation across ensembles of multilingual models for low-resource languages. ICASSP 2017: 4825-4829 - [c75]Tom Sercu, George Saon, Jia Cui, Xiaodong Cui, Bhuvana Ramabhadran, Brian Kingsbury, Abhinav Sethy:
Network architectures for multilingual speech representation learning. ICASSP 2017: 5295-5299 - [c74]Xiaodong Cui, Vaibhava Goel, George Saon:
Embedding-Based Speaker Adaptive Training of Deep Neural Networks. INTERSPEECH 2017: 122-126 - [c73]George Saon, Gakuto Kurata, Tom Sercu, Kartik Audhkhasi, Samuel Thomas, Dimitrios Dimitriadis, Xiaodong Cui, Bhuvana Ramabhadran, Michael Picheny, Lynn-Li Lim, Bergul Roomi, Phil Hall:
English Conversational Telephone Speech Recognition by Humans and Machines. INTERSPEECH 2017: 132-136 - [c72]Gakuto Kurata, Abhinav Sethy, Bhuvana Ramabhadran, George Saon:
Empirical Exploration of Novel Architectures and Objectives for Language Models. INTERSPEECH 2017: 279-283 - [c71]Kartik Audhkhasi, Bhuvana Ramabhadran, George Saon, Michael Picheny, David Nahamoo:
Direct Acoustics-to-Word Models for English Conversational Speech Recognition. INTERSPEECH 2017: 959-963 - [c70]Guojing Cong, Brian Kingsbury, Soumyadip Gosh, George Saon, Fan Zhou:
Accelerating deep neural network learning for speech recognition on a cluster of GPUs. MLHPC@SC 2017: 3:1-3:8 - [i8]George Saon, Gakuto Kurata, Tom Sercu, Kartik Audhkhasi, Samuel Thomas, Dimitrios Dimitriadis, Xiaodong Cui, Bhuvana Ramabhadran, Michael Picheny, Lynn-Li Lim, Bergul Roomi, Phil Hall:
English Conversational Telephone Speech Recognition by Humans and Machines. CoRR abs/1703.02136 (2017) - [i7]Kartik Audhkhasi, Bhuvana Ramabhadran, George Saon, Michael Picheny, David Nahamoo:
Direct Acoustics-to-Word Models for English Conversational Speech Recognition. CoRR abs/1703.07754 (2017) - [i6]Gakuto Kurata, Bhuvana Ramabhadran, George Saon, Abhinav Sethy:
Language Modeling with Highway LSTM. CoRR abs/1709.06436 (2017) - [i5]Xiaodong Cui, Vaibhava Goel, George Saon:
Embedding-Based Speaker Adaptive Training of Deep Neural Networks. CoRR abs/1710.06937 (2017) - [i4]Kartik Audhkhasi, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Michael Picheny:
Building competitive direct acoustics-to-word models for English conversational speech recognition. CoRR abs/1712.03133 (2017) - 2016
- [c69]David Haws, Dimitrios Dimitriadis, George Saon, Samuel Thomas, Michael Picheny:
On the importance of event detection for ASR. ICASSP 2016: 5705-5709 - [c68]George Saon, Tom Sercu, Steven J. Rennie, Hong-Kwang Jeff Kuo:
The IBM 2016 English Conversational Telephone Speech Recognition System. INTERSPEECH 2016: 7-11 - [c67]Masayuki Suzuki, Ryuki Tachibana, Samuel Thomas, Bhuvana Ramabhadran, George Saon:
Domain Adaptation of CNN Based Acoustic Models Under Limited Resource Settings. INTERSPEECH 2016: 1588-1592 - [i3]George Saon, Tom Sercu, Steven J. Rennie, Hong-Kwang Jeff Kuo:
The IBM 2016 English Conversational Telephone Speech Recognition System. CoRR abs/1604.08242 (2016) - 2015
- [j12]Tara N. Sainath, Brian Kingsbury, George Saon, Hagen Soltau, Abdel-rahman Mohamed, George E. Dahl, Bhuvana Ramabhadran:
Deep Convolutional Neural Networks for Large-scale Speech Tasks. Neural Networks 64: 39-48 (2015) - [c66]Samuel Thomas, George Saon, Maarten Van Segbroeck, Shrikanth S. Narayanan:
Improvements to the IBM speech activity detection system for the DARPA RATS program. ICASSP 2015: 4500-4504 - [c65]Nitish Shirish Keskar, George Saon:
A nonmonotone learning rate strategy for SGD training of deep neural networks. ICASSP 2015: 4974-4978 - [c64]Lidia Mangu, George Saon, Michael Picheny, Brian Kingsbury:
Order-free spoken term detection. ICASSP 2015: 5331-5335 - [c63]George Saon, Hong-Kwang Jeff Kuo, Steven J. Rennie, Michael Picheny:
The IBM 2015 English conversational telephone speech recognition system. INTERSPEECH 2015: 3140-3144 - [c62]Samuel Thomas, George Saon, Hong-Kwang Jeff Kuo, Lidia Mangu:
The IBM BOLT speech transcription system. INTERSPEECH 2015: 3150-3153 - [c61]Jia Cui, George Saon, Bhuvana Ramabhadran, Brian Kingsbury:
A multi-region deep neural network model in speech recognition. INTERSPEECH 2015: 3244-3248 - [i2]George Saon, Hong-Kwang Jeff Kuo, Steven J. Rennie, Michael Picheny:
The IBM 2015 English Conversational Telephone Speech Recognition System. CoRR abs/1505.05899 (2015) - 2014
- [c60]Samuel Thomas, Sriram Ganapathy, George Saon, Hagen Soltau:
Analyzing convolutional neural networks for speech activity detection in mismatched acoustic conditions. ICASSP 2014: 2519-2523 - [c59]George Saon, Hagen Soltau:
A comparison of two optimization techniques for sequence discriminative training of deep neural networks. ICASSP 2014: 5567-5571 - [c58]Hagen Soltau, George Saon, Tara N. Sainath:
Joint training of convolutional and non-convolutional neural networks. ICASSP 2014: 5572-5576 - [c57]Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, George Saon, Bhuvana Ramabhadran:
Improvements to filterbank and delta learning within a deep neural network framework. ICASSP 2014: 6839-6843 - [c56]George Saon, Hagen Soltau, Ahmad Emami, Michael Picheny:
Unfolded recurrent neural networks for speech recognition. INTERSPEECH 2014: 343-347 - [c55]Tara N. Sainath, I-Hsin Chung, Bhuvana Ramabhadran, Michael Picheny, John A. Gunnels, Brian Kingsbury, George Saon, Vernon Austel, Upendra V. Chaudhari:
Parallel deep neural network training for LVCSR tasks using blue gene/Q. INTERSPEECH 2014: 1048-1052 - [c54]George Saon:
A distributed architecture for fast SGD sequence discriminative training of DNN acoustic models. SLT 2014: 183-188 - [p1]Hagen Soltau, George Saon, Lidia Mangu, Hong-Kwang Kuo, Brian Kingsbury, Stephen M. Chu, Fadi Biadsy:
Automatic Speech Recognition. NLP of Semitic Languages 2014: 409-459 - 2013
- [c53]George Saon, Hagen Soltau, David Nahamoo, Michael Picheny:
Speaker adaptation of neural network acoustic models using i-vectors. ASRU 2013: 55-59 - [c52]Lidia Mangu, Hagen Soltau, Hong-Kwang Kuo, George Saon:
The IBM keyword search system for the DARPA RATS program. ASRU 2013: 204-209 - [c51]Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, George E. Dahl, George Saon, Hagen Soltau, Tomás Beran, Aleksandr Y. Aravkin, Bhuvana Ramabhadran:
Improvements to Deep Convolutional Neural Networks for LVCSR. ASRU 2013: 315-320 - [c50]Lidia Mangu, Hagen Soltau, Hong-Kwang Kuo, Brian Kingsbury, George Saon:
Exploiting diversity for spoken term detection. ICASSP 2013: 8282-8286 - [c49]Hagen Soltau, Hong-Kwang Kuo, Lidia Mangu, George Saon, Tomás Beran:
Neural network acoustic models for the DARPA RATS program. INTERSPEECH 2013: 3092-3096 - [c48]George Saon, Samuel Thomas, Hagen Soltau, Sriram Ganapathy, Brian Kingsbury:
The IBM speech activity detection system for the DARPA RATS program. INTERSPEECH 2013: 3497-3501 - [i1]Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, George E. Dahl, George Saon, Hagen Soltau, Tomás Beran, Aleksandr Y. Aravkin, Bhuvana Ramabhadran:
Improvements to deep convolutional neural networks for LVCSR. CoRR abs/1309.1501 (2013) - 2012
- [j11]George Saon, Hagen Soltau:
Boosting systems for large vocabulary continuous speech recognition. Speech Commun. 54(2): 212-218 (2012) - [j10]George Saon, Jen-Tzung Chien
:
Large-Vocabulary Continuous Speech Recognition Systems: A Look at Some Recent Advances. IEEE Signal Process. Mag. 29(6): 18-33 (2012) - [j9]George Saon, Jen-Tzung Chien
:
Bayesian Sensing Hidden Markov Models. IEEE Trans. Speech Audio Process. 20(1): 43-54 (2012) - [c47]George Saon, Jen-Tzung Chien:
Recent developments in large vocabulary continuous speech recognition. APSIPA 2012: 1-6 - [c46]George Saon, Brian Kingsbury:
Discriminative feature-space transforms using deep neural networks. INTERSPEECH 2012: 14-17 - [c45]Xiaodong Cui, Mohamed Afify, George Saon, Vaibhava Goel:
Sparse Bayesian Factor Analysis for Stereo-based Stochastic Mapping. INTERSPEECH 2012: 795-798 - 2011
- [j8]Michael Picheny, David Nahamoo, Vaibhava Goel
, Brian Kingsbury, Bhuvana Ramabhadran, Steven J. Rennie, George Saon:
Trends and advances in speech recognition. IBM J. Res. Dev. 55(5): 2 (2011) - [c44]George Saon, Jen-Tzung Chien
:
Some properties of Bayesian sensing hidden Markov models. ASRU 2011: 65-70 - [c43]Hong-Kwang Jeff Kuo, Ebru Arisoy, Lidia Mangu, George Saon:
Minimum Bayes risk discriminative language models for Arabic speech recognition. ASRU 2011: 208-213 - [c42]Lidia Mangu, Hong-Kwang Kuo, Stephen M. Chu, Brian Kingsbury, George Saon, Hagen Soltau, Fadi Biadsy:
The IBM 2011 GALE Arabic speech transcription system. ASRU 2011: 272-277 - [c41]Brian Kingsbury, Hagen Soltau, George Saon, Stephen M. Chu, Hong-Kwang Kuo, Lidia Mangu, Suman V. Ravuri, Nelson Morgan, Adam Janin:
The IBM 2009 GALE Arabic speech transcription system. ICASSP 2011: 4672-4675 - [c40]George Saon, Jen-Tzung Chien
:
Bayesian sensing hidden Markov models for speech recognition. ICASSP 2011: 5056-5059 - [c39]George Saon, Jen-Tzung Chien
:
Discriminative training for Bayesian sensing hidden Markov models. ICASSP 2011: 5316-5319 - [c38]George Saon, Jen-Tzung Chien:
Bayesian sensing hidden Markov models for speech recognition. MLSLP 2011 - 2010
- [c37]George Saon, Hagen Soltau, Upendra V. Chaudhari, Stephen M. Chu, Brian Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Daniel Povey:
The IBM 2008 GALE Arabic speech transcription system. ICASSP 2010: 4378-4381 - [c36]George Saon, Hagen Soltau:
Boosting systems for LVCSR. INTERSPEECH 2010: 1341-1344 - [c35]Hagen Soltau, George Saon, Brian Kingsbury:
The IBM Attila speech recognition toolkit. SLT 2010: 97-102