default search action
Ruoming Pang
Person information
Refine list
refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
showing all ?? records
2020 – today
- 2024
- [c57]Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Sam Dodge, Bowen Zhang, Philipp Dufter, Dhruti Shah, Xianzhi Du, Futang Peng, Anton Belyi, Haotian Zhang, Karanjeet Singh, Doug Kang, Hongyu Hè, Max Schwarzer, Tom Gunter, Xiang Kong, Aonan Zhang, Jianyu Wang, Chong Wang, Nan Du, Tao Lei, Sam Wiseman, Mark Lee, Zirui Wang, Ruoming Pang, Peter Grasch, Alexander Toshev, Yinfei Yang:
MM1: Methods, Analysis and Insights from Multimodal LLM Pre-training. ECCV (29) 2024: 304-323 - [i54]Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Sam Dodge, Bowen Zhang, Philipp Dufter, Dhruti Shah, Xianzhi Du, Futang Peng, Floris Weers, Anton Belyi, Haotian Zhang, Karanjeet Singh, Doug Kang, Ankur Jain, Hongyu Hè, Max Schwarzer, Tom Gunter, Xiang Kong, Aonan Zhang, Jianyu Wang, Chong Wang, Nan Du, Tao Lei, Sam Wiseman, Guoli Yin, Mark Lee, Zirui Wang, Ruoming Pang, Peter Grasch, Alexander Toshev, Yinfei Yang:
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training. CoRR abs/2403.09611 (2024) - [i53]Xianzhi Du, Tom Gunter, Xiang Kong, Mark Lee, Zirui Wang, Aonan Zhang, Nan Du, Ruoming Pang:
Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training. CoRR abs/2405.15052 (2024) - [i52]Xiang Kong, Tom Gunter, Ruoming Pang:
Large Language Model-guided Document Selection. CoRR abs/2406.04638 (2024) - [i51]Guoli Yin, Haoping Bai, Shuang Ma, Feng Nan, Yanchao Sun, Zhaoyang Xu, Shen Ma, Jiarui Lu, Xiang Kong, Aonan Zhang, Dian Ang Yap, Yizhe Zhang, Karsten Ahnert, Vik Kamath, Mathias Berglund, Dominic Walsh, Tobias Gindele, Juergen Wiest, Zhengfeng Lai, Xiaoming Wang, Jiulong Shan, Meng Cao, Ruoming Pang, Zirui Wang:
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains. CoRR abs/2407.18961 (2024) - [i50]Tom Gunter, Zirui Wang, Chong Wang, Ruoming Pang, Andy Narayanan, Aonan Zhang, Bowen Zhang, Chen Chen, Chung-Cheng Chiu, David Qiu, Deepak Gopinath, Dian Ang Yap, Dong Yin, Feng Nan, Floris Weers, Guoli Yin, Haoshuo Huang, Jianyu Wang, Jiarui Lu, John Peebles, Ke Ye, Mark Lee, Nan Du, Qibin Chen, Quentin Keunebroek, Sam Wiseman, Syd Evans, Tao Lei, Vivek Rathod, Xiang Kong, Xianzhi Du, Yanghao Li, Yongqiang Wang, Yuan Gao, Zaid Ahmed, Zhaoyang Xu, Zhiyun Lu, Al Rashid, Albin Madappally Jose, Alec Doane, Alfredo Bencomo, Allison Vanderby, Andrew Hansen, Ankur Jain, Anupama Mann Anupama, Areeba Kamal, Bugu Wu, Carolina Brum, Charlie Maalouf, Chinguun Erdenebileg, Chris Dulhanty, Dominik Moritz, Doug Kang, Eduardo Jimenez, Evan Ladd, Fangping Shi, Felix Bai, Frank Chu, Fred Hohman, Hadas Kotek, Hannah Gillis Coleman, Jane Li, Jeffrey P. Bigham, Jeffery Cao, Jeff Lai, Jessica Cheung, Jiulong Shan, Joe Zhou, John Li, Jun Qin, Karanjeet Singh, Karla Vega, Kelvin Zou, Laura Heckman, Lauren Gardiner, Margit Bowler, Maria Cordell, Meng Cao, Nicole Hay, Nilesh Shahdadpuri, Otto Godwin, Pranay Dighe, Pushyami Rachapudi, Ramsey Tantawi, Roman Frigg, Sam Davarnia, Sanskruti Shah, Saptarshi Guha, Sasha Sirovica, Shen Ma, Shuang Ma, Simon Wang, Sulgi Kim, Suma Jayaram, Vaishaal Shankar, Varsha Paidi, Vivek Kumar, Xin Wang, Xin Zheng, Walker Cheng:
Apple Intelligence Foundation Language Models. CoRR abs/2407.21075 (2024) - [i49]Jiarui Lu, Thomas Holleis, Yizhe Zhang, Bernhard Aumayer, Feng Nan, Felix Bai, Shuang Ma, Shen Ma, Mengyu Li, Guoli Yin, Zirui Wang, Ruoming Pang:
ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities. CoRR abs/2408.04682 (2024) - [i48]Shengyu Feng, Xiang Kong, Shuang Ma, Aonan Zhang, Dong Yin, Chong Wang, Ruoming Pang, Yiming Yang:
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo. CoRR abs/2410.01920 (2024) - [i47]Haotian Sun, Tao Lei, Bowen Zhang, Yanghao Li, Haoshuo Huang, Ruoming Pang, Bo Dai, Nan Du:
EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing. CoRR abs/2410.02098 (2024) - [i46]Ruohong Zhang, Bowen Zhang, Yanghao Li, Haotian Zhang, Zhiqing Sun, Zhe Gan, Yinfei Yang, Ruoming Pang, Yiming Yang:
Improve Vision Language Model Chain-of-thought Reasoning. CoRR abs/2410.16198 (2024) - 2023
- [c56]Sheng Li, Garrett Andersen, Tao Chen, Liqun Cheng, Julian Grady, Da Huang, Quoc V. Le, Andrew Li, Xin Li, Yang Li, Chen Liang, Yifeng Lu, Yun Ni, Ruoming Pang, Mingxing Tan, Martin Wicke, Gang Wu, Shengqi Zhu, Parthasarathy Ranganathan, Norman P. Jouppi:
Hyperscale Hardware Optimized Neural Architecture Search. ASPLOS (3) 2023: 343-358 - [c55]Chen Chen, Bowen Zhang, Liangliang Cao, Jiguang Shen, Tom Gunter, Albin Madappally Jose, Alexander Toshev, Yantao Zheng, Jonathon Shlens, Ruoming Pang, Yinfei Yang:
STAIR: Learning Sparse Text and Image Representation in Grounded Tokens. EMNLP 2023: 15079-15094 - [i45]Chen Chen, Bowen Zhang, Liangliang Cao, Jiguang Shen, Tom Gunter, Albin Madappally Jose, Alexander Toshev, Jonathon Shlens, Ruoming Pang, Yinfei Yang:
STAIR: Learning Sparse Text and Image Representation in Grounded Tokens. CoRR abs/2301.13081 (2023) - [i44]Rami Botros, Anmol Gulati, Tara N. Sainath, Krzysztof Choromanski, Ruoming Pang, Trevor Strohman, Weiran Wang, Jiahui Yu:
Practical Conformer: Optimizing size, speed and flops of Conformer for on-Device and cloud ASR. CoRR abs/2304.00171 (2023) - [i43]Erik A. Daxberger, Floris Weers, Bowen Zhang, Tom Gunter, Ruoming Pang, Marcin Eichner, Michael Emmersberger, Yinfei Yang, Alexander Toshev, Xianzhi Du:
Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts. CoRR abs/2309.04354 (2023) - [i42]Cheng-I Jeff Lai, Zhiyun Lu, Liangliang Cao, Ruoming Pang:
Instruction-Following Speech Recognition. CoRR abs/2309.09843 (2023) - 2022
- [j5]Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu:
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition. IEEE J. Sel. Top. Signal Process. 16(6): 1519-1532 (2022) - [c54]Bo Li, Ruoming Pang, Yu Zhang, Tara N. Sainath, Trevor Strohman, Parisa Haghani, Yun Zhu, Brian Farris, Neeraj Gaur, Manasa Prasad:
Massively Multilingual ASR: A Lifelong Learning Solution. ICASSP 2022: 6397-6401 - [c53]Ke Hu, Tara N. Sainath, Arun Narayanan, Ruoming Pang, Trevor Strohman:
Transducer-Based Streaming Deliberation for Cascaded Encoders. ICASSP 2022: 8107-8111 - [c52]Tara N. Sainath, Yanzhang He, Arun Narayanan, Rami Botros, Weiran Wang, David Qiu, Chung-Cheng Chiu, Rohit Prabhavalkar, Alexander Gruenstein, Anmol Gulati, Bo Li, David Rybach, Emmanuel Guzman, Ian McGraw, James Qin, Krzysztof Choromanski, Qiao Liang, Robert David, Ruoming Pang, Shuo-Yiin Chang, Trevor Strohman, W. Ronny Huang, Wei Han, Yonghui Wu, Yu Zhang:
Improving The Latency And Quality Of Cascaded Encoders. ICASSP 2022: 8112-8116 - [c51]Jiahui Yu, Xin Li, Jing Yu Koh, Han Zhang, Ruoming Pang, James Qin, Alexander Ku, Yuanzhong Xu, Jason Baldridge, Yonghui Wu:
Vector-quantized Image Modeling with Improved VQGAN. ICLR 2022 - [c50]W. Ronny Huang, Cal Peyser, Tara N. Sainath, Ruoming Pang, Trevor D. Strohman, Shankar Kumar:
Sentence-Select: Large-Scale Language Model Data Selection for Rare-Word Speech Recognition. INTERSPEECH 2022: 689-693 - [c49]Bo Li, Tara N. Sainath, Ruoming Pang, Shuo-Yiin Chang, Qiumin Xu, Trevor Strohman, Vince Chen, Qiao Liang, Heguang Liu, Yanzhang He, Parisa Haghani, Sameer Bidichandani:
A Language Agnostic Multilingual Streaming On-Device ASR System. INTERSPEECH 2022: 3188-3192 - [c48]Paul Barham, Aakanksha Chowdhery, Jeff Dean, Sanjay Ghemawat, Steven Hand, Dan Hurt, Michael Isard, Hyeontaek Lim, Ruoming Pang, Sudip Roy, Brennan Saeta, Parker Schuh, Ryan Sepassi, Laurent El Shafey, Chandramohan A. Thekkath, Yonghui Wu:
Pathways: Asynchronous Distributed Dataflow for ML. MLSys 2022 - [i41]W. Ronny Huang, Cal Peyser, Tara N. Sainath, Ruoming Pang, Trevor Strohman, Shankar Kumar:
Sentence-Select: Large-Scale Language Model Data Selection for Rare-Word Speech Recognition. CoRR abs/2203.05008 (2022) - [i40]Paul Barham, Aakanksha Chowdhery, Jeff Dean, Sanjay Ghemawat, Steven Hand, Dan Hurt, Michael Isard, Hyeontaek Lim, Ruoming Pang, Sudip Roy, Brennan Saeta, Parker Schuh, Ryan Sepassi, Laurent El Shafey, Chandramohan A. Thekkath, Yonghui Wu:
Pathways: Asynchronous Distributed Dataflow for ML. CoRR abs/2203.12533 (2022) - [i39]Bo Li, Tara N. Sainath, Ruoming Pang, Shuo-Yiin Chang, Qiumin Xu, Trevor Strohman, Vince Chen, Qiao Liang, Heguang Liu, Yanzhang He, Parisa Haghani, Sameer Bidichandani:
A Language Agnostic Multilingual Streaming On-Device ASR System. CoRR abs/2208.13916 (2022) - 2021
- [c47]Yu-An Chung, Yu Zhang, Wei Han, Chung-Cheng Chiu, James Qin, Ruoming Pang, Yonghui Wu:
w2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training. ASRU 2021: 244-250 - [c46]Bo Li, Ruoming Pang, Tara N. Sainath, Anmol Gulati, Yu Zhang, James Qin, Parisa Haghani, W. Ronny Huang, Min Ma, Junwen Bai:
Scaling End-to-End Models for Large-Scale Multilingual ASR. ASRU 2021: 1011-1018 - [c45]Sheng Li, Mingxing Tan, Ruoming Pang, Andrew Li, Liqun Cheng, Quoc V. Le, Norman P. Jouppi:
Searching for Fast Model Families on Datacenter Accelerators. CVPR 2021: 8085-8095 - [c44]Arun Narayanan, Tara N. Sainath, Ruoming Pang, Jiahui Yu, Chung-Cheng Chiu, Rohit Prabhavalkar, Ehsan Variani, Trevor Strohman:
Cascaded Encoders for Unifying Streaming and Non-Streaming ASR. ICASSP 2021: 5629-5633 - [c43]Bo Li, Anmol Gulati, Jiahui Yu, Tara N. Sainath, Chung-Cheng Chiu, Arun Narayanan, Shuo-Yiin Chang, Ruoming Pang, Yanzhang He, James Qin, Wei Han, Qiao Liang, Yu Zhang, Trevor Strohman, Yonghui Wu:
A Better and Faster end-to-end Model for Streaming ASR. ICASSP 2021: 5634-5638 - [c42]Jiahui Yu, Chung-Cheng Chiu, Bo Li, Shuo-Yiin Chang, Tara N. Sainath, Yanzhang He, Arun Narayanan, Wei Han, Anmol Gulati, Yonghui Wu, Ruoming Pang:
FastEmit: Low-Latency Streaming ASR with Sequence-Level Emission Regularization. ICASSP 2021: 6004-6008 - [c41]Zhaofeng Wu, Ding Zhao, Qiao Liang, Jiahui Yu, Anmol Gulati, Ruoming Pang:
Dynamic Sparsity Neural Networks for Automatic Speech Recognition. ICASSP 2021: 6014-6018 - [c40]Thibault Doutre, Wei Han, Min Ma, Zhiyun Lu, Chung-Cheng Chiu, Ruoming Pang, Arun Narayanan, Ananya Misra, Yu Zhang, Liangliang Cao:
Improving Streaming Automatic Speech Recognition with Non-Streaming Model Distillation on Unsupervised Data. ICASSP 2021: 6558-6562 - [c39]Jiahui Yu, Wei Han, Anmol Gulati, Chung-Cheng Chiu, Bo Li, Tara N. Sainath, Yonghui Wu, Ruoming Pang:
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling. ICLR 2021 - [c38]Tara N. Sainath, Yanzhang He, Arun Narayanan, Rami Botros, Ruoming Pang, David Rybach, Cyril Allauzen, Ehsan Variani, James Qin, Quoc-Nam Le-The, Shuo-Yiin Chang, Bo Li, Anmol Gulati, Jiahui Yu, Chung-Cheng Chiu, Diamantino Caseiro, Wei Li, Qiao Liang, Pat Rondon:
An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling. Interspeech 2021: 1777-1781 - [c37]Thibault Doutre, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Olivier Siohan, Liangliang Cao:
Bridging the Gap Between Streaming and Non-Streaming ASR Systems by Distilling Ensembles of CTC and RNN-T Models. Interspeech 2021: 1807-1811 - [c36]Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita:
Unsupervised Learning of Disentangled Speech Content and Style Representation. Interspeech 2021: 4089-4093 - [c35]Ke Hu, Ruoming Pang, Tara N. Sainath, Trevor Strohman:
Transformer Based Deliberation for Two-Pass Speech Recognition. SLT 2021: 68-74 - [c34]Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu:
RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions. SLT 2021: 873-880 - [i38]Ke Hu, Ruoming Pang, Tara N. Sainath, Trevor Strohman:
Transformer Based Deliberation for Two-Pass Speech Recognition. CoRR abs/2101.11577 (2021) - [i37]Sheng Li, Mingxing Tan, Ruoming Pang, Andrew Li, Liqun Cheng, Quoc V. Le, Norman P. Jouppi:
Searching for Fast Model Families on Datacenter Accelerators. CoRR abs/2102.05610 (2021) - [i36]Thibault Doutre, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Olivier Siohan, Liangliang Cao:
Bridging the gap between streaming and non-streaming ASR systems bydistilling ensembles of CTC and RNN-T models. CoRR abs/2104.14346 (2021) - [i35]Bo Li, Ruoming Pang, Tara N. Sainath, Anmol Gulati, Yu Zhang, James Qin, Parisa Haghani, W. Ronny Huang, Min Ma:
Scaling End-to-End Models for Large-Scale Multilingual ASR. CoRR abs/2104.14830 (2021) - [i34]Yuanzhong Xu, HyoukJoong Lee, Dehao Chen, Blake A. Hechtman, Yanping Huang, Rahul Joshi, Maxim Krikun, Dmitry Lepikhin, Andy Ly, Marcello Maggioni, Ruoming Pang, Noam Shazeer, Shibo Wang, Tao Wang, Yonghui Wu, Zhifeng Chen:
GSPMD: General and Scalable Parallelization for ML Computation Graphs. CoRR abs/2105.04663 (2021) - [i33]Yu-An Chung, Yu Zhang, Wei Han, Chung-Cheng Chiu, James Qin, Ruoming Pang, Yonghui Wu:
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training. CoRR abs/2108.06209 (2021) - [i32]Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu:
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition. CoRR abs/2109.13226 (2021) - [i31]Jiahui Yu, Xin Li, Jing Yu Koh, Han Zhang, Ruoming Pang, James Qin, Alexander Ku, Yuanzhong Xu, Jason Baldridge, Yonghui Wu:
Vector-quantized Image Modeling with Improved VQGAN. CoRR abs/2110.04627 (2021) - [i30]Bowen Zhang, Jiahui Yu, Christopher Fifty, Wei Han, Andrew M. Dai, Ruoming Pang, Fei Sha:
Co-training Transformer with Videos and Images Improves Action Recognition. CoRR abs/2112.07175 (2021) - 2020
- [c33]Mingxing Tan, Ruoming Pang, Quoc V. Le:
EfficientDet: Scalable and Efficient Object Detection. CVPR 2020: 10778-10787 - [c32]Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Thomas S. Huang, Xiaodan Song, Ruoming Pang, Quoc Le:
BigNAS: Scaling up Neural Architecture Search with Big Single-Stage Models. ECCV (7) 2020: 702-717 - [c31]Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-Yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alexander Gruenstein, Ke Hu, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirkó Visontai, Yonghui Wu, Yu Zhang, Ding Zhao:
A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency. ICASSP 2020: 6059-6063 - [c30]Bo Li, Shuo-Yiin Chang, Tara N. Sainath, Ruoming Pang, Yanzhang He, Trevor Strohman, Yonghui Wu:
Towards Fast and Accurate Streaming End-To-End ASR. ICASSP 2020: 6069-6073 - [c29]Tara N. Sainath, Ruoming Pang, Ron J. Weiss, Yanzhang He, Chung-Cheng Chiu, Trevor Strohman:
An Attention-Based Joint Acoustic and Text on-Device End-To-End Model. ICASSP 2020: 7039-7043 - [c28]Ke Hu, Tara N. Sainath, Ruoming Pang, Rohit Prabhavalkar:
Deliberation Model Based Two-Pass End-To-End Speech Recognition. ICASSP 2020: 7799-7803 - [c27]Wei Li, James Qin, Chung-Cheng Chiu, Ruoming Pang, Yanzhang He:
Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition. INTERSPEECH 2020: 2122-2126 - [c26]Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu:
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context. INTERSPEECH 2020: 3610-3614 - [c25]Tara N. Sainath, Ruoming Pang, David Rybach, Basi García, Trevor Strohman:
Emitting Word Timings with End-to-End Models. INTERSPEECH 2020: 3615-3619 - [c24]Cal Peyser, Sepand Mavandadi, Tara N. Sainath, James Apfel, Ruoming Pang, Shankar Kumar:
Improving Tail Performance of a Deliberation E2E ASR Model Using a Large Text Corpus. INTERSPEECH 2020: 4921-4925 - [c23]Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang:
Conformer: Convolution-augmented Transformer for Speech Recognition. INTERSPEECH 2020: 5036-5040 - [i29]Ke Hu, Tara N. Sainath, Ruoming Pang, Rohit Prabhavalkar:
Deliberation Model Based Two-Pass End-to-End Speech Recognition. CoRR abs/2003.07962 (2020) - [i28]Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Thomas S. Huang, Xiaodan Song, Ruoming Pang, Quoc V. Le:
BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models. CoRR abs/2003.11142 (2020) - [i27]Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-Yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alexander Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirkó Visontai, Yonghui Wu, Yu Zhang, Ding Zhao:
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency. CoRR abs/2003.12710 (2020) - [i26]Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu:
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context. CoRR abs/2005.03191 (2020) - [i25]Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu:
RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions. CoRR abs/2005.03271 (2020) - [i24]Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang:
Conformer: Convolution-augmented Transformer for Speech Recognition. CoRR abs/2005.08100 (2020) - [i23]Zhaofeng Wu, Ding Zhao, Qiao Liang, Jiahui Yu, Anmol Gulati, Ruoming Pang:
Dynamic Sparsity Neural Networks for Automatic Speech Recognition. CoRR abs/2005.10627 (2020) - [i22]Cal Peyser, Sepand Mavandadi, Tara N. Sainath, James Apfel, Ruoming Pang, Shankar Kumar:
Improving Tail Performance of a Deliberation E2E ASR Model Using a Large Text Corpus. CoRR abs/2008.10491 (2020) - [i21]Wei Li, James Qin, Chung-Cheng Chiu, Ruoming Pang, Yanzhang He:
Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition. CoRR abs/2008.13093 (2020) - [i20]Jiahui Yu, Wei Han, Anmol Gulati, Chung-Cheng Chiu, Bo Li, Tara N. Sainath, Yonghui Wu, Ruoming Pang:
Universal ASR: Unify and Improve Streaming ASR with Full-context Modeling. CoRR abs/2010.06030 (2020) - [i19]Yu Zhang, James Qin, Daniel S. Park, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Quoc V. Le, Yonghui Wu:
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition. CoRR abs/2010.10504 (2020) - [i18]Jiahui Yu, Chung-Cheng Chiu, Bo Li, Shuo-Yiin Chang, Tara N. Sainath, Yanzhang He, Arun Narayanan, Wei Han, Anmol Gulati, Yonghui Wu, Ruoming Pang:
FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization. CoRR abs/2010.11148 (2020) - [i17]Thibault Doutre, Wei Han, Min Ma, Zhiyun Lu, Chung-Cheng Chiu, Ruoming Pang, Arun Narayanan, Ananya Misra, Yu Zhang, Liangliang Cao:
Improving Streaming Automatic Speech Recognition With Non-Streaming Model Distillation On Unsupervised Data. CoRR abs/2010.12096 (2020) - [i16]Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita:
Unsupervised Learning of Disentangled Speech Content and Style Representation. CoRR abs/2010.12973 (2020) - [i15]Arun Narayanan, Tara N. Sainath, Ruoming Pang, Jiahui Yu, Chung-Cheng Chiu, Rohit Prabhavalkar, Ehsan Variani, Trevor Strohman:
Cascaded encoders for unifying streaming and non-streaming ASR. CoRR abs/2010.14606 (2020) - [i14]Bo Li, Anmol Gulati, Jiahui Yu, Tara N. Sainath, Chung-Cheng Chiu, Arun Narayanan, Shuo-Yiin Chang, Ruoming Pang, Yanzhang He, James Qin, Wei Han, Qiao Liang, Yu Zhang, Trevor Strohman, Yonghui Wu:
A Better and Faster End-to-End Model for Streaming ASR. CoRR abs/2011.10798 (2020)
2010 – 2019
- 2019
- [c22]Naveen Arivazhagan, Colin Cherry, Wolfgang Macherey, Chung-Cheng Chiu, Semih Yavuz, Ruoming Pang, Wei Li, Colin Raffel:
Monotonic Infinite Lookback Attention for Simultaneous Machine Translation. ACL (1) 2019: 1313-1323 - [c21]Chung-Cheng Chiu, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara N. Sainath, Yonghui Wu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang:
A Comparison of End-to-End Models for Long-Form Speech Recognition. ASRU 2019: 889-896 - [c20]Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, Quoc V. Le:
MnasNet: Platform-Aware Neural Architecture Search for Mobile. CVPR 2019: 2820-2828 - [c19]Bo Li, Tara N. Sainath, Ruoming Pang, Zelin Wu:
Semi-supervised Training for End-to-end Models via Weak Distillation. ICASSP 2019: 2837-2841 - [c18]Yanzhang He, Tara N. Sainath, Rohit Prabhavalkar, Ian McGraw, Raziel Alvarez, Ding Zhao, David Rybach, Anjuli Kannan, Yonghui Wu, Ruoming Pang, Qiao Liang, Deepti Bhatia, Yuan Shangguan, Bo Li, Golan Pundak, Khe Chai Sim, Tom Bagby, Shuo-Yiin Chang, Kanishka Rao, Alexander Gruenstein:
Streaming End-to-end Speech Recognition for Mobile Devices. ICASSP 2019: 6381-6385 - [c17]Andrew Howard, Ruoming Pang, Hartwig Adam, Quoc V. Le, Mark Sandler, Bo Chen, Weijun Wang, Liang-Chieh Chen, Mingxing Tan, Grace Chu, Vijay Vasudevan, Yukun Zhu:
Searching for MobileNetV3. ICCV 2019: 1314-1324 - [c16]Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang:
Hierarchical Generative Modeling for Controllable Speech Synthesis. ICLR (Poster) 2019 - [c15]Ding Zhao, Tara N. Sainath, David Rybach, Pat Rondon, Deepti Bhatia, Bo Li, Ruoming Pang:
Shallow-Fusion End-to-End Contextual Biasing. INTERSPEECH 2019: 1418-1422 - [c14]Tara N. Sainath, Ruoming Pang, David Rybach, Yanzhang He, Rohit Prabhavalkar, Wei Li, Mirkó Visontai, Qiao Liang, Trevor Strohman, Yonghui Wu, Ian McGraw, Chung-Cheng Chiu:
Two-Pass End-to-End Speech Recognition. INTERSPEECH 2019: 2773-2777 - [c13]Ruoming Pang, Ramón Cáceres, Mike Burrows, Zhifeng Chen, Pratik Dave, Nathan Germer, Alexander Golynski, Kevin Graney, Nina Kang, Lea Kissner, Jeffrey L. Korn, Abhishek Parmar, Christopher D. Richards, Mengzhi Wang:
Zanzibar: Google's Consistent, Global Authorization System. USENIX ATC 2019: 33-46 - [i13]Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia Xu Chen, Ye Jia, Anjuli Kannan, Tara N. Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George F. Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel Bacchiani, Thomas B. Jablin, Robert Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon:
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling. CoRR abs/1902.08295 (2019) - [i12]Golnaz Ghiasi, Tsung-Yi Lin, Ruoming Pang, Quoc V. Le:
NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection. CoRR abs/1904.07392 (2019) - [i11]Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, Quoc V. Le, Hartwig Adam:
Searching for MobileNetV3. CoRR abs/1905.02244 (2019) - [i10]Naveen Arivazhagan, Colin Cherry, Wolfgang Macherey, Chung-Cheng Chiu, Semih Yavuz, Ruoming Pang, Wei Li, Colin Raffel:
Monotonic Infinite Lookback Attention for Simultaneous Machine Translation. CoRR abs/1906.05218 (2019) - [i9]Tara N. Sainath, Ruoming Pang, David Rybach, Yanzhang He, Rohit Prabhavalkar, Wei Li, Mirkó Visontai, Qiao Liang, Trevor Strohman, Yonghui Wu, Ian McGraw, Chung-Cheng Chiu:
Two-Pass End-to-End Speech Recognition. CoRR abs/1908.10992 (2019) - [i8]Chung-Cheng Chiu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara N. Sainath, Yonghui Wu:
A comparison of end-to-end models for long-form speech recognition. CoRR abs/1911.02242 (2019) - [i7]Mingxing Tan, Ruoming Pang, Quoc V. Le:
EfficientDet: Scalable and Efficient Object Detection. CoRR abs/1911.09070 (2019) - 2018
- [c12]Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, R. J. Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu:
Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions. ICASSP 2018: 4779-4783 - [c11]Ruoming Pang, Tara N. Sainath, Rohit Prabhavalkar, Suyog Gupta, Yonghui Wu, Shuyuan Zhang, Chung-Cheng Chiu:
Compression of End-to-End Models. INTERSPEECH 2018: 27-31 - [c10]Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio López-Moreno, Yonghui Wu:
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis. NeurIPS 2018: 4485-4495 - [i6]Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio López-Moreno, Yonghui Wu:
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis. CoRR abs/1806.04558 (2018) - [i5]