




Yu Zhang 0033
Person information

- affiliation: Google
- affiliation (PhD 2017): Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
Other persons with the same name
- Yu Zhang — disambiguation page
- Yu Zhang 0001
— Loughborough University, Department of Aeronautical and Automotive Engineering, UK (and 2 more)
- Yu Zhang 0002 — Pennsylvania State University, University Park, PA, USA (and 2 more)
- Yu Zhang 0003 — Hainan Normal University, Haikou, China
- Yu Zhang 0004
— Southeast University, School of Computer Science and Engineering, Nanjing, China (and 1 more)
- Yu Zhang 0005
— University of California, Santa Cruz, CA, USA (and 2 more)
- Yu Zhang 0006
— Southern University of Science and Technology, Shenzhen, China (and 3 more)
- Yu Zhang 0007 — Microsoft Research Asia, China
- Yu Zhang 0008 — Zhejiang University, College of Computer Science, Hangzhou, China
- Yu Zhang 0009
— Lehigh University, Department of Bioengineering, Bethlehem, PA, USA (and 2 more)
- Yu Zhang 0010
— Northwestern University, Department of Chemistry & Center for Bio-inspired Energy Science (CBES), Evanston, IL, USA
- Yu Zhang 0011
— South China Normal University, Guangzhou, China
- Yu Zhang 0012
— Southeast University, National Mobile Communications Research Laboratory, Nanjing, China
- Yu Zhang 0013
— Pennsylvania State University, College of Information Sciences and Technology, State College, PA, USA
- Yu Zhang 0014
— Xidian University, State Key Laboratory of Integrated Services Networks, Xi'an, China
- Yu Zhang 0015
— Zhejiang University of Technology, College of Information Engineering, Hangzhou, China
- Yu Zhang 0016
— Jilin University, College of Computer Science and Technology, Changchun, China
- Yu Zhang 0017 — Sichuan University, West China Hospital, Department of Radiology, Chengdu, China
- Yu Zhang 0018
— Zhejiang University, College of Control Science and Engineering, Hangzhou, China (and 3 more)
- Yu Zhang 0019
— Wuhan University, Electronic and Information School, Signal Processing Lab, China
- Yu Zhang 0020
— Tongji University, School of Mathematics, Shanghai, China
- Yu Zhang 0021
— Guangdong Peizheng College, English Department, Guangzhou, China
- Yu Zhang 0022 — University of California Santa Barbara, Department of Education, CA, USA
- Yu Zhang 0023 — Hubei University for Nationalities, School of Information Engineering, Enshi, China
- Yu Zhang 0024 — Northeastern University, Software College, Shenyang, China
- Yu Zhang 0025 — Microsoft, Online Service Division, Sunnyvale, CA, USA (and 1 more)
- Yu Zhang 0026
— Tsinghua University, Department of Electronic Engineering, Beijing, China
- Yu Zhang 0027
— Huazhong University of Science and Technology, School of Computer Science and Technology, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, Wuhan, China
- Yu Zhang 0028
— Data Storage Institute, A-STAR, Singapore (and 1 more)
- Yu Zhang 0029
— Chinese Academy of Sciences, Northwest Institute of Eco-Environment and Resources, Lanzhou, China
- Yu Zhang 0030
— Harbin Institute of Technology, Research Center for Social Computing and Information Retrieval, China
- Yu Zhang 0031
— Xinyang Normal University, Department of Computer Science and Technology, China
- Yu Zhang 0032
— Shanghai Jiao Tong University, School of Electronic Information and Electrical Engineering, China
- Yu Zhang 0034
— RMIT University, Melbourne, VIC, Australia (and 1 more)
- Yu Zhang 0035
— SenseTime Group Limited, Beijing, China (and 1 more)
- Yu Zhang 0036
— Harbin Institute of Technology, School of Computer Science and Technology, China
- Yu Zhang 0037
— Beijing Institute of Technology, School of Mechanical Engineering, China
- Yu Zhang 0038
— Harbin Engineering University, College of Information and Communication Engineering, China
- Yu Zhang 0039
— Huazhong University of Science and Technology, School of Electronic Information and Communications, Wuhan, China
- Yu Zhang 0040
— Shaanxi Normal University, MOE Key Laboratory of Modern Teaching Technology, Xi'an, China
- Yu Zhang 0041
— University of North Carolina at Chapel Hill, NC, USA
- Yu Zhang 0042
— State Grid Energy Research Institute Co., Ltd., Beijing, China (and 2 more)
- Yu Zhang 0043
— University of Oxford, Department of Computer Science, UK (and 1 more)
- Yu Zhang 0044
— University of Illinois at Urbana-Champaign, Department of Computer Science, IL, USA
- Yu Zhang 0045
— Harbin Institute of Technology, State Key Laboratory of Robotics and System, Harbin, China
- Yu Zhang 0046
— Xiamen University, Key Laboratory of Underwater Acoustic Communication and Marine Information Technology, Xiamen, China
- Yu Zhang 0047
— Tsinghua University, Department of Automation, Beijing, China
- Yu Zhang 0048
— Northeast Electric Power University, College of Information Engineering, Jilin, China
- Yu Zhang 0049
— Macquarie University, Australian School of Advanced Medicine, Sydney, Australia
- Yu Zhang 0050 — Tsinghua University, Department of Electronic Engineering, Beijing, China (and 1 more)
- Yu Zhang 0051 — Chemnitz University of Technology, Germany
- Yu Zhang 0052 — École centrale de Lyon, France
- Yu Zhang 0053 — Cardiff University, UK
- Yu Zhang 0054 — Qualcomm Inc., Beijing, China (and 1 more)
- Yu Zhang 0055
— Arizona State University, Computer Science and Engineering Department, Tempe, AZ, USA (and 1 more)
- Yu Zhang 0056
— North China Electric Power University, Electrical and Electronic Department, Beijing, China
- Yu Zhang 0057
— Air Force Engineering University, Aeronautics and Astronautics Engineering College, Xi'an, China
- Yu Zhang 0058 — Chongqing University, College of Communication Engineering, China
- Yu Zhang 0059 — University of California, Los Angeles, USA
- Yu Zhang 0060
— National University of Defense Technology, Key Laboratory of Science and Technology on ATR, Changsha, China
- Yu Zhang 0061
— University of Tokyo, Graduate School of Agricultural and Life Sciences, Japan
- Yu Zhang 0062
— Xidian University, Video and Image Processing System Laboratory, China
- Yu Zhang 0063
— Chinese University of Hong Kong, Electronic Engineering Department
- Yu Zhang 0064
— Southern Medical University, School of Biomedical Engineering, Guangzhou, China
- Yu Zhang 0065
— Yanshan University, Institute of Electrical Engineering, Qinhuangdao, China
- Yu Zhang 0066
— China South Industries Group Corporation, Weapon Equipment Research Institute, Beijing, China
- Yu Zhang 0067
— Liaoning Technical University, School of Science, Fuxin, China
- Yu Zhang 0068
— Nanjing University of Aeronautics and Astronautics, Key Laboratory of Radar Imaging and Microwave Photonics, Nanjing, China
- Yu Zhang 0069
— Chinese Academy of Sciences and Ministry of Water Resources, Institute of Soil and Water Conservation, Yangling, China
- Yu Zhang 0070
— Shanghai Ocean University, College of Marine Sciences, China
- Yu Zhang 0071
— Hainan University, State Key Laboratory of Marine Resource Utilization in South China Sea, Haikou, China
- Yu Zhang 0072
— Fudan University, Shanghai Key Laboratory of Intelligent Information Processing, China
- Yu Zhang 0073
— Northeastern University, Department of Systems Engineering, State Key Lab of Synthetic Automation of Process Industries, Shenyang, China
- Yu Zhang 0074
— Pennsylvania State University, Department of Civil and Environmental Engineering, University Park, USA
- Yu Zhang 0075 — Institute of High Performance Computing, Singapore (and 1 more)
- Yu Zhang 0076
— Southwest Jiaotong University, School of Physical Science and Technology, Chengdu, China
- Yu Zhang 0077
— Hangzhou Dianzi University, School of Electronics and Information, China (and 1 more)
- Yu Zhang 0078
— Tianjin University, School of Precision Instrument and Opto-electronics Engineering, State Key Laboratory of Precision Measuring Technology and Instruments, China (and 1 more)
- Yu Zhang 0079
— Beijing Institute of Technology, School of Information and Electronics, China (and 1 more)
- Yu Zhang 0080
— Chinese Academy of Space Technology, Beijing Orient Institute of Measurement and Test, China (and 1 more)
- Yu Zhang 0081
— University of Science and Technology Beijing, Donlinks School of Economics and Management, China (and 1 more)
- Yu Zhang 0082
— Army Engineering University of PLA, College of Communication Engineering, Nanjing, China
- Yu Zhang 0083
— Tsinghua University, Institute of Education, Beijing, China (and 1 more)
- Yu Zhang 0084
— Nanyang Technological University, School of Computer Science and Engineering, Singapore
- Yu Zhang 0085
— Southwest University, College of Computer and Information Science, Chongqing, China (and 1 more)
- Yu Zhang 0086
— University of Science and Technology of China, Lab for Intelligent Networking and Knowledge Engineering, Hefei, China
- Yu Zhang 0087
— University of South Florida, Department of Civil and Environmental Engineering, Tampa, FL, USA (and 1 more)
- Yu Zhang 0088
— Delphi Automotive, Agoura Hills, CA, USA (and 1 more)
- Yu Zhang 0089
— Jiangsu Normal University, School of Geography, Geomatics and Planning, Department of Land Resource Management, Xuzhou, China (and 1 more)
- Yu Zhang 0090
— Beijing Normal University, College of Artificial Intelligence, China
- Yu Zhang 0091
— Harbin Institute of Technology, School of Astronautics, National Key Laboratory of Tunable Laser Technology, China
- Yu Zhang 0092
— Soochow University, China
- Yu Zhang 0093
— Macquarie University, Sydney, Australia (and 1 more)
- Yu Zhang 0094
— University of Kentucky, Department of Computer Science, Lexington, KY, USA
- Yu Zhang 0095
— Nankai University, College of Computer Science, Tianjin Key Laboratory of Network and Data Security Technology, Tianjin, China (and 1 more)
- Yu Zhang 0096
— Hong Kong Polytechnic University, Department of Mechanical Engineering, Hong Kong (and 2 more)
- Yu Zhang 0097
— City University of Hong Kong, Hong Kong (and 2 more)
- Yu Zhang 0098
— Fujian Jiangxia University, Department of Electronic Information Science, Fuzhou, China
- Yu Zhang 0099
— University of California Davis, Department of Electrical and Computer Engineering, CA, USA
- Yu Zhang 0100
— Chongqing University of Posts and Telecommunications, School of Economics and Management, China
- Yu Zhang 0101
— Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, China
- Yu Zhang 0102
— Chinese Academy of Sciences, Institute of Microelectronics, Beijing, China (and 1 more)
- Yu Zhang 0103
— Nanjing University of Aeronautics and Astronautics, Department of Mathematics, State Key Laboratory of Mechanics and Control of Mechanical Structures, China
- Yu Zhang 0104
— Chinese Academy of Sciences, Key Laboratory of Ecosystem Network Observation and Modeling, Beijing, China (and 1 more)
- Yu Zhang 0105
— Information Engineering University Zhengzhou, College of Cryptographic Engineering, China
- Yu Zhang 0106
— Changchun Institute of Technology, College of Computer Science and Engineering, China
- Yu Zhang 0107
— Hunan University, College of Computer Science and Electronic Engineering, Changsha, China
- Yu Zhang 0108
— East China University of Science and Technology, MOE Key Laboratory of Advanced Control and Optimization for Chemical Process, Shanghai, China (and 1 more)
- Yu Zhang 0109
— Tongji University, MOE Key Laboratory of Road and Traffic Engineering, Shanghai, China
- Yu Zhang 0110
— Jilin University, College of Electronic Science and Engineering, State Key Laboratory of Integrated Optoelectronics, Changchun, China
- Yu Zhang 0111
— Southern University of Science and Technology, Department of Computer Science and Engineering, Guangdong Key Laboratory of Brain-Inspired Intelligent Computation, Shenzhen, China
- Yu Zhang 0112
— East China Normal University, Department of Computer Science and Technology, Shanghai, China
- Yu Zhang 0113
— Dalian Medical University, Second Affiliated Hospital, China
- Yu Zhang 0114
— Catalonia Institute for Energy Research (IREC), Spain
- Yu Zhang 0115
— University of Montreal, Department of Psychology, QC, Canada
- Yu Zhang 0116
— Shenyang University of Technology, Department of Mechanical Engineering, China
- Yu Zhang 0117
— Beijing University of Posts and Telecommunications, State Key Laboratory of Networking and Switching Technology, China (and 1 more)
- Yu Zhang 0118
— Liaoning Technical University, School of Software, Fuxin, China
- Yu Zhang 0119
— Shenzhen University, College of Life Sciences and Oceanography, Guangdong Engineering Research Center for Marine Algal Biotechnology, China
- Yu Zhang 0120
— Shandong Normal University, School of Information Science and Engineering, China
- Yu Zhang 0121
— Zhejiang Normal University, Institute of Precision Machinery and Smart Structure, College of Engineering, Jinhua, China
- Yu Zhang 0122
— China Academy of Information and Communications Technology, Beijing, China
- Yu Zhang 0123
— Arizona State University, School of Electrical, Computer and Energy Engineering, Tempe, AZ, USA
- Yu Zhang 0124
— Southeast University, School of Computer Science and Engineering, Nanjing, China (and 2 more)
2020 – today
- 2023
- [c92]Yong Cheng, Yu Zhang, Melvin Johnson, Wolfgang Macherey, Ankur Bapna:
Mu2SLAM: Multitask, Multilingual Speech and Language Models. ICML 2023: 5504-5520 - [c91]Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani:
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations. WASPAA 2023: 1-5 - [i86]Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Rohit Prabhavalkar, Tara N. Sainath, Trevor Strohman:
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition. CoRR abs/2301.07851 (2023) - [i85]Bo Li, Dongseong Hwang, Zhouyuan Huo, Junwen Bai, Guru Prakash, Tara N. Sainath, Khe Chai Sim, Yu Zhang, Wei Han, Trevor Strohman, Françoise Beaufays:
Efficient Domain Adaptation for Speech Foundation Models. CoRR abs/2302.01496 (2023) - [i84]Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara N. Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran:
JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition. CoRR abs/2302.08583 (2023) - [i83]Ke Hu, Tara N. Sainath, Bo Li, Nan Du, Yanping Huang, Andrew M. Dai, Yu Zhang, Rodrigo Cabrera, Zhifeng Chen, Trevor Strohman:
Massively Multilingual Shallow Fusion with Large Language Models. CoRR abs/2302.08917 (2023) - [i82]Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara N. Sainath, Pedro J. Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu:
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages. CoRR abs/2303.01037 (2023) - [i81]Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani:
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations. CoRR abs/2303.01664 (2023) - [i80]Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna:
LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus. CoRR abs/2305.18802 (2023) - [i79]Nanxin Chen, Izhak Shafran, Yu Zhang, Chung-Cheng Chiu, Hagen Soltau, James Qin, Yonghui Wu:
Efficient Adapters for Giant Speech Models. CoRR abs/2306.08131 (2023) - [i78]Paul K. Rubenstein, Chulayuth Asawaroengchai, Duc Dung Nguyen, Ankur Bapna, Zalán Borsos, Félix de Chaumont Quitry, Peter Chen, Dalia El Badawy, Wei Han, Eugene Kharitonov, Hannah Muckenhirn, Dirk Padfield, James Qin, Danny Rozenberg, Tara N. Sainath, Johan Schalkwyk, Matthew Sharifi, Michelle Tadmor Ramanovich, Marco Tagliasacchi, Alexandru Tudor, Mihajlo Velimirovic, Damien Vincent, Jiahui Yu, Yongqiang Wang, Vicky Zayats, Neil Zeghidour, Yu Zhang, Zhishuai Zhang, Lukas Zilka, Christian Havnø Frank:
AudioPaLM: A Large Language Model That Can Speak and Listen. CoRR abs/2306.12925 (2023) - [i77]Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa:
Multimodal Modeling For Spoken Language Identification. CoRR abs/2309.10567 (2023) - 2022
- [j3]Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Pedro J. Moreno:
Ask2Mask: Guided Data Selection for Masked Speech Modeling. IEEE J. Sel. Top. Signal Process. 16(6): 1357-1366 (2022) - [j2]Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu:
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition. IEEE J. Sel. Top. Signal Process. 16(6): 1519-1532 (2022) - [c90]Joel Shor, Aren Jansen, Wei Han, Daniel S. Park, Yu Zhang:
Universal Paralinguistic Speech Representations Using self-Supervised Conformers. ICASSP 2022: 3169-3173 - [c89]Bo Li, Ruoming Pang, Yu Zhang, Tara N. Sainath, Trevor Strohman, Parisa Haghani, Yun Zhu, Brian Farris, Neeraj Gaur, Manasa Prasad:
Massively Multilingual ASR: A Lifelong Learning Solution. ICASSP 2022: 6397-6401 - [c88]Junwen Bai, Bo Li, Yu Zhang, Ankur Bapna, Nikhil Siddhartha, Khe Chai Sim, Tara N. Sainath:
Joint Unsupervised and Supervised Training for Multilingual ASR. ICASSP 2022: 6402-6406 - [c87]Qiujia Li, Yu Zhang, David Qiu, Yanzhang He, Liangliang Cao, Philip C. Woodland:
Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition. ICASSP 2022: 6537-6541 - [c86]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Gary Wang:
Tts4pretrain 2.0: Advancing the use of Text and Speech in ASR Pretraining with Consistency and Contrastive Losses. ICASSP 2022: 7677-7681 - [c85]Tara N. Sainath, Yanzhang He, Arun Narayanan, Rami Botros, Weiran Wang, David Qiu, Chung-Cheng Chiu, Rohit Prabhavalkar, Alexander Gruenstein, Anmol Gulati, Bo Li, David Rybach, Emmanuel Guzman, Ian McGraw, James Qin, Krzysztof Choromanski, Qiao Liang, Robert David, Ruoming Pang, Shuo-Yiin Chang, Trevor Strohman, W. Ronny Huang, Wei Han, Yonghui Wu, Yu Zhang:
Improving The Latency And Quality Of Cascaded Encoders. ICASSP 2022: 8112-8116 - [c84]Ye Jia, Yifan Ding, Ankur Bapna, Colin Cherry, Yu Zhang, Alexis Conneau, Nobu Morioka:
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation. INTERSPEECH 2022: 1721-1725 - [c83]Kuan-Po Huang, Yu-Kuan Fu, Yu Zhang, Hung-yi Lee:
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation. INTERSPEECH 2022: 2193-2197 - [c82]Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Nicolás Serrano:
Reducing Domain mismatch in Self-supervised speech pre-training. INTERSPEECH 2022: 3028-3032 - [c81]Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson:
XTREME-S: Evaluating Cross-lingual Speech Representations. INTERSPEECH 2022: 3248-3252 - [c80]Zhiyun Lu, Yongqiang Wang, Yu Zhang, Wei Han, Zhehuai Chen, Parisa Haghani:
Unsupervised Data Selection via Discrete Speech Representation for ASR. INTERSPEECH 2022: 3393-3397 - [c79]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Ankur Bapna, Heiga Zen:
MAESTRO: Matched Speech Text Representations through Modality Matching. INTERSPEECH 2022: 4093-4097 - [c78]Lev Finkelstein, Heiga Zen, Norman Casagrande, Chun-an Chan, Ye Jia, Tom Kenter, Alexey Petelin, Jonathan Shen, Vincent Wan, Yu Zhang, Yonghui Wu, Rob Clark:
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks. INTERSPEECH 2022: 4571-4575 - [c77]Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman:
JOIST: A Joint Speech and Text Streaming Model for ASR. SLT 2022: 52-59 - [c76]Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno, Nanxin Chen:
Maestro-U: Leveraging Joint Speech-Text Representation Learning for Zero Supervised Speech ASR. SLT 2022: 68-75 - [c75]Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno:
Modular Hybrid Autoregressive Transducer. SLT 2022: 197-204 - [c74]Alexis Conneau, Min Ma, Simran Khanuja, Yu Zhang, Vera Axelrod, Siddharth Dalmia, Jason Riesa, Clara Rivera, Ankur Bapna:
FLEURS: FEW-Shot Learning Evaluation of Universal Representations of Speech. SLT 2022: 798-805 - [c73]Kuan-Po Huang, Yu-Kuan Fu, Tsu-Yuan Hsu, Fabian Ritter Gutierrez, Fan-Lin Wang, Liang-Hsuan Tseng, Yu Zhang, Hung-yi Lee:
Improving Generalizability of Distilled Self-Supervised Speech Processing Models Under Distorted Settings. SLT 2022: 1112-1119 - [i76]Ankur Bapna, Colin Cherry, Yu Zhang, Ye Jia, Melvin Johnson, Yong Cheng, Simran Khanuja, Jason Riesa, Alexis Conneau:
mSLAM: Massively multilingual joint pre-training for speech and text. CoRR abs/2202.01374 (2022) - [i75]Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Pedro J. Moreno:
Ask2Mask: Guided Data Selection for Masked Speech Modeling. CoRR abs/2202.12719 (2022) - [i74]Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson:
XTREME-S: Evaluating Cross-lingual Speech Representations. CoRR abs/2203.10752 (2022) - [i73]Ye Jia, Yifan Ding, Ankur Bapna, Colin Cherry, Yu Zhang, Alexis Conneau, Nobuyuki Morioka:
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation. CoRR abs/2203.13339 (2022) - [i72]Kuan-Po Huang, Yu-Kuan Fu, Yu Zhang, Hung-yi Lee:
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation. CoRR abs/2203.16104 (2022) - [i71]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Ankur Bapna, Heiga Zen:
MAESTRO: Matched Speech Text Representations through Modality Matching. CoRR abs/2204.03409 (2022) - [i70]Alexis Conneau, Min Ma, Simran Khanuja, Yu Zhang, Vera Axelrod, Siddharth Dalmia, Jason Riesa, Clara Rivera, Ankur Bapna:
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech. CoRR abs/2205.12446 (2022) - [i69]Lev Finkelstein, Heiga Zen, Norman Casagrande, Chun-an Chan, Ye Jia, Tom Kenter, Alexey Petelin, Jonathan Shen, Vincent Wan, Yu Zhang, Yonghui Wu, Rob Clark:
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks. CoRR abs/2208.13183 (2022) - [i68]Dongseong Hwang, Khe Chai Sim, Yu Zhang, Trevor Strohman:
Comparison of Soft and Hard Target RNN-T Distillation for Large-scale ASR. CoRR abs/2210.05793 (2022) - [i67]Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman:
JOIST: A Joint Speech and Text Streaming Model For ASR. CoRR abs/2210.07353 (2022) - [i66]Kuan-Po Huang, Yu-Kuan Fu, Tsu-Yuan Hsu, Fabian Ritter Gutierrez, Fan-Lin Wang, Liang-Hsuan Tseng, Yu Zhang, Hung-yi Lee:
Improving generalizability of distilled self-supervised speech processing models under distorted settings. CoRR abs/2210.07978 (2022) - [i65]Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno, Nanxin Chen:
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR. CoRR abs/2210.10027 (2022) - [i64]Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran:
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech. CoRR abs/2210.15447 (2022) - [i63]Nobuyuki Morioka, Heiga Zen, Nanxin Chen, Yu Zhang, Yifan Ding:
Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation. CoRR abs/2210.15868 (2022) - [i62]Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno:
Modular Hybrid Autoregressive Transducer. CoRR abs/2210.17049 (2022) - [i61]Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Tara N. Sainath, Sabato Marco Siniscalchi, Chin-Hui Lee:
A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition. CoRR abs/2211.01263 (2022) - 2021
- [c72]Yu-An Chung, Yu Zhang, Wei Han, Chung-Cheng Chiu, James Qin, Ruoming Pang, Yonghui Wu:
w2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training. ASRU 2021: 244-250 - [c71]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro J. Moreno:
Injecting Text in Self-Supervised Speech Pretraining. ASRU 2021: 251-258 - [c70]Bo Li, Ruoming Pang, Tara N. Sainath, Anmol Gulati, Yu Zhang, James Qin, Parisa Haghani, W. Ronny Huang, Min Ma, Junwen Bai:
Scaling End-to-End Models for Large-Scale Multilingual ASR. ASRU 2021: 1011-1018 - [c69]Bo Li, Anmol Gulati, Jiahui Yu, Tara N. Sainath, Chung-Cheng Chiu, Arun Narayanan, Shuo-Yiin Chang, Ruoming Pang, Yanzhang He, James Qin, Wei Han, Qiao Liang, Yu Zhang, Trevor Strohman, Yonghui Wu:
A Better and Faster end-to-end Model for Streaming ASR. ICASSP 2021: 5634-5638 - [c68]Harsh Shrivastava, Ankush Garg, Yuan Cao, Yu Zhang, Tara N. Sainath:
Echo State Speech Recognition. ICASSP 2021: 5669-5673 - [c67]Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, Ron J. Weiss, Yonghui Wu:
Parallel Tacotron: Non-Autoregressive and Controllable TTS. ICASSP 2021: 5709-5713 - [c66]Qiujia Li, David Qiu, Yu Zhang, Bo Li, Yanzhang He, Philip C. Woodland, Liangliang Cao, Trevor Strohman:
Confidence Estimation for Attention-Based Sequence-to-Sequence Models for Speech Recognition. ICASSP 2021: 6388-6392 - [c65]David Qiu, Qiujia Li, Yanzhang He, Yu Zhang, Bo Li, Liangliang Cao, Rohit Prabhavalkar, Deepti Bhatia, Wei Li, Ke Hu, Tara N. Sainath, Ian McGraw:
Learning Word-Level Confidence for Subword End-To-End ASR. ICASSP 2021: 6393-6397 - [c64]Thibault Doutre, Wei Han, Min Ma, Zhiyun Lu, Chung-Cheng Chiu, Ruoming Pang, Arun Narayanan, Ananya Misra, Yu Zhang, Liangliang Cao:
Improving Streaming Automatic Speech Recognition with Non-Streaming Model Distillation on Unsupervised Data. ICASSP 2021: 6558-6562 - [c63]Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, William Chan:
WaveGrad: Estimating Gradients for Waveform Generation. ICLR 2021 - [c62]Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, R. J. Skerry-Ryan, Yonghui Wu:
Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling. Interspeech 2021: 141-145 - [c61]Ye Jia, Heiga Zen, Jonathan Shen, Yu Zhang, Yonghui Wu:
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS. Interspeech 2021: 151-155 - [c60]Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Heiga Zen, Mohammadreza Ghodsi, Yinghui Huang, Jesse Emond, Gary Wang, Bhuvana Ramabhadran, Pedro J. Moreno:
Semi-Supervision in ASR: Sequential MixMatch and Factorized TTS-Based Augmentation. Interspeech 2021: 736-740 - [c59]Zhiyun Lu, Wei Han, Yu Zhang, Liangliang Cao:
Exploring Targeted Universal Adversarial Perturbations to End-to-End ASR Models. Interspeech 2021: 3460-3464 - [c58]Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan:
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis. Interspeech 2021: 3765-3769 - [c57]Qiujia Li, Yu Zhang, Bo Li, Liangliang Cao, Philip C. Woodland:
Residual Energy-Based Models for End-to-End Speech Recognition. Interspeech 2021: 4069-4073 - [c56]David Qiu, Yanzhang He, Qiujia Li, Yu Zhang, Liangliang Cao, Ian McGraw:
Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction. Interspeech 2021: 4074-4078 - [c55]Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita:
Unsupervised Learning of Disentangled Speech Content and Style Representation. Interspeech 2021: 4089-4093 - [c54]Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu:
RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions. SLT 2021: 873-880 - [i60]Harsh Shrivastava, Ankush Garg, Yuan Cao, Yu Zhang, Tara N. Sainath:
Echo State Speech Recognition. CoRR abs/2102.09114 (2021) - [i59]David Qiu, Qiujia Li, Yanzhang He, Yu Zhang, Bo Li, Liangliang Cao, Rohit Prabhavalkar, Deepti Bhatia, Wei Li, Ke Hu, Tara N. Sainath, Ian McGraw:
Learning Word-Level Confidence For Subword End-to-End ASR. CoRR abs/2103.06716 (2021) - [i58]Qiujia Li, Yu Zhang, Bo Li, Liangliang Cao, Philip C. Woodland:
Residual Energy-Based Models for End-to-End Speech Recognition. CoRR abs/2103.14152 (2021) - [i57]Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Jia Ye, R. J. Skerry-Ryan, Yonghui Wu:
Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling. CoRR abs/2103.14574 (2021) - [i56]Ye Jia, Heiga Zen, Jonathan Shen, Yu Zhang, Yonghui Wu:
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS. CoRR abs/2103.15060 (2021) - [i55]William Chan, Daniel S. Park, Chris A. Lee, Yu Zhang, Quoc V. Le, Mohammad Norouzi:
SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network. CoRR abs/2104.02133 (2021) - [i54]Zhiyun Lu, Wei Han, Yu Zhang, Liangliang Cao:
Exploring Targeted Universal Adversarial Perturbations to End-to-end ASR Models. CoRR abs/2104.02757 (2021) - [i53]David Qiu, Yanzhang He, Qiujia Li, Yu Zhang, Liangliang Cao, Ian McGraw:
Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction. CoRR abs/2104.12870 (2021) - [i52]Bo Li, Ruoming Pang, Tara N. Sainath, Anmol Gulati, Yu Zhang, James Qin, Parisa Haghani, W. Ronny Huang, Min Ma:
Scaling End-to-End Models for Large-Scale Multilingual ASR. CoRR abs/2104.14830 (2021) - [i51]Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan:
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis. CoRR abs/2106.09660 (2021) - [i50]Yu-An Chung, Yu Zhang, Wei Han, Chung-Cheng Chiu, James Qin, Ruoming Pang, Yonghui Wu:
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training. CoRR abs/2108.06209 (2021) - [i49]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro J. Moreno:
Injecting Text in Self-Supervised Speech Pretraining. CoRR abs/2108.12226 (2021) - [i48]Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu:
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition. CoRR abs/2109.13226 (2021) - [i47]Qiujia Li, Yu Zhang, David Qiu, Yanzhang He, Liangliang Cao, Philip C. Woodland:
Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition. CoRR abs/2110.03327 (2021) - [i46]Ankur Bapna, Yu-An Chung, Nan Wu, Anmol Gulati, Ye Jia, Jonathan H. Clark, Melvin Johnson, Jason Riesa, Alexis Conneau, Yu Zhang:
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training. CoRR abs/2110.10329 (2021) - [i45]Junwen Bai, Bo Li, Yu Zhang, Ankur Bapna, Nikhil Siddhartha, Khe Chai Sim, Tara N. Sainath:
Joint Unsupervised and Supervised Training for Multilingual ASR. CoRR abs/2111.08137 (2021)
- 2020
- [c53]Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, Vijay Vasudevan, Wei Han, Jiquan Ngiam, Hang Zhao, Aleksei Timofeev, Scott Ettinger, Maxim Krivokon, Amy Gao, Aditya Joshi, Yu Zhang, Jonathon Shlens, Zhifeng Chen, Dragomir Anguelov:
Scalability in Perception for Autonomous Driving: Waymo Open Dataset. CVPR 2020: 2443-2451 - [c52]Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-Yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alexander Gruenstein, Ke Hu, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirkó Visontai, Yonghui Wu, Yu Zhang, Ding Zhao:
A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency. ICASSP 2020: 6059-6063 - [c51]Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu:
Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis. ICASSP 2020: 6264-6268 - [c50]Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu:
Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior. ICASSP 2020: 6699-6703 - [c49]Daniel S. Park, Yu Zhang, Chung-Cheng Chiu, Youzheng Chen, Bo Li, William Chan, Quoc V. Le, Yonghui Wu:
Specaugment on Large Scale Datasets. ICASSP 2020: 6879-6883 - [c48]Gary Wang, Andrew Rosenberg, Zhehuai Chen, Yu Zhang, Bhuvana Ramabhadran, Yonghui Wu, Pedro J. Moreno:
Improving Speech Recognition Using Consistent Predictions on Synthesized Speech. ICASSP 2020: 7029-7033 - [c47]Zhiyun Lu, Liangliang Cao, Yu Zhang, Chung-Cheng Chiu, James Fan:
Speech Sentiment Analysis via Pre-Trained Features from End-to-End ASR Models. ICASSP 2020: 7149-7153 - [c46]Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan:
Espnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit. ICASSP 2020: 7654-7658 - [c45]Zelin Wu, Bo Li, Yu Zhang, Petar S. Aleksic, Tara N. Sainath:
Multistate Encoding with End-To-End Speech RNN Transducer Network. ICASSP 2020: 7819-7823 - [c44]Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Gary Wang, Bhuvana Ramabhadran, Pedro J. Moreno:
Improving Speech Recognition Using GAN-Based Speech Synthesis and Contrastive Unspoken Text Selection. INTERSPEECH 2020: 556-560 - [c43]Daniel S. Park, Yu Zhang, Ye Jia, Wei Han, Chung-Cheng Chiu, Bo Li, Yonghui Wu, Quoc V. Le:
Improved Noisy Student Training for Automatic Speech Recognition. INTERSPEECH 2020: 2817-2821 - [c42]Gary Wang, Andrew Rosenberg, Zhehuai Chen, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno:
SCADA: Stochastic, Consistent and Adversarial Data Augmentation to Improve ASR. INTERSPEECH 2020: 2832-2836 - [c41]Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu:
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context. INTERSPEECH 2020: 3610-3614 - [c40]Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang:
Conformer: Convolution-augmented Transformer for Speech Recognition. INTERSPEECH 2020: 5036-5040 - [c39]Eric Chen, Zhiyun Lu, Hao Xu, Liangliang Cao, Yu Zhang, James Fan:
A Large Scale Speech Sentiment Corpus. LREC 2020: 6549-6555 - [i44]Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu:
Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis. CoRR abs/2002.03785 (2020) - [i43]Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu:
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior. CoRR abs/2002.03788 (2020) - [i42]