- Heming Wang, Yao Qian, Hemin Yang, Naoyuki Kanda, Peidong Wang, Takuya Yoshioka, Xiaofei Wang, Yiming Wang, Shujie Liu, Zhuo Chen, DeLiang Wang, Michael Zeng:
DATA2VEC-SG: Improving Self-Supervised Learning Representations for Speech Generation Tasks. ICASSP 2023: 1-5
- Kun Wei, Long Zhou, Ziqiang Zhang, Liping Chen, Shujie Liu, Lei He, Jinyu Li, Furu Wei:
Joint Pre-Training with Speech and Bilingual Text for Direct Speech to Speech Translation. ICASSP 2023: 1-5
- Haibin Yu, Yuxuan Hu, Yao Qian, Ma Jin, Linquan Liu, Shujie Liu, Yu Shi, Yanmin Qian, Edward Lin, Michael Zeng:
Code-Switching Text Generation and Injection in Mandarin-English ASR. ICASSP 2023: 1-5
- Qiu-Shi Zhu, Long Zhou, Jie Zhang, Shujie Liu, Yu-Chen Hu, Li-Rong Dai:
Robust Data2VEC: Noise-Robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning. ICASSP 2023: 1-5
- Xiaotong Zhou, Binglan Fu, Shujie Du, Chunlin Liu, Weijun Peng, Guo Chen, Qianrong Jiang:
SPLNet: An image denoising network with simplified channel attention computation. ICCIP 2023: 126-132
- Shujie Zhang, Tianyue Zheng, Zhe Chen, Jingzhi Hu, Abdelwahed Khamis, Jiajun Liu, Jun Luo:
OCHID-Fi: Occlusion-Robust Hand Pose Estimation in 3D via RF-Vision. ICCV 2023: 15066-15075
- Sanyuan Chen, Yu Wu, Chengyi Wang, Shujie Liu, Daniel Tompkins, Zhuo Chen, Wanxiang Che, Xiangzhan Yu, Furu Wei:
BEATs: Audio Pre-Training with Acoustic Tokenizers. ICML 2023: 5178-5193
- Jingtian Liu, Zhonghui Wu, Zhuoyuan Li, Menglin Li, Shujie Yang, Han Xiao, Changqiao Xu:
DLCCB: A Dynamic Labeling Based Covert Communication Method on Blockchain. IWCMC 2023: 168-173
- Chenyang Le, Yao Qian, Long Zhou, Shujie Liu, Yanmin Qian, Michael Zeng, Xuedong Huang:
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation. NeurIPS 2023
- Chengyi Wang, Sanyuan Chen, Yu Wu, Ziqiang Zhang, Long Zhou, Shujie Liu, Zhuo Chen, Yanqing Liu, Huaming Wang, Jinyu Li, Lei He, Sheng Zhao, Furu Wei:
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers. CoRR abs/2301.02111 (2023)
- Jiajun Deng, Xurong Xie, Tianzi Wang, Mingyu Cui, Boyang Xue, Zengrui Jin, Guinan Li, Shujie Hu, Xunying Liu:
Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems. CoRR abs/2302.07521 (2023)
- Shujie Hu, Xurong Xie, Zengrui Jin, Mengzhe Geng, Yi Wang, Mingyu Cui, Jiajun Deng, Xunying Liu, Helen Meng:
Exploring Self-supervised Pre-trained ASR Models For Dysarthric and Elderly Speech Recognition. CoRR abs/2302.14564 (2023)
- Eric Sun, Jinyu Li, Yuxuan Hu, Yimeng Zhu, Long Zhou, Jian Xue, Peidong Wang, Linquan Liu, Shujie Liu, Edward Lin, Yifan Gong:
Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training. CoRR abs/2303.00786 (2023)
- Ziqiang Zhang, Long Zhou, Chengyi Wang, Sanyuan Chen, Yu Wu, Shujie Liu, Zhuo Chen, Yanqing Liu, Huaming Wang, Jinyu Li, Lei He, Sheng Zhao, Furu Wei:
Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling. CoRR abs/2303.03926 (2023)
- Chenda Li, Yao Qian, Zhuo Chen, Dongmei Wang, Takuya Yoshioka, Shujie Liu, Yanmin Qian, Michael Zeng:
Target Sound Extraction with Variable Cross-modality Clues. CoRR abs/2303.08372 (2023)
- Haibin Yu, Yuxuan Hu, Yao Qian, Ma Jin, Linquan Liu, Shujie Liu, Yu Shi, Yanmin Qian, Edward Lin, Michael Zeng:
Code-Switching Text Generation and Injection in Mandarin-English ASR. CoRR abs/2303.10949 (2023)
- Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Jiajun Deng, Mingyu Cui, Guinan Li, Jianwei Yu, Xurong Xie, Xunying Liu:
Use of Speech Impairment Severity for Dysarthric Speech Recognition. CoRR abs/2305.10659 (2023)
- Chenyang Le, Yao Qian, Long Zhou, Shujie Liu, Michael Zeng, Xuedong Huang:
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation. CoRR abs/2305.14838 (2023)
- Tianrui Wang, Long Zhou, Ziqiang Zhang, Yu Wu, Shujie Liu, Yashesh Gaur, Zhuo Chen, Jinyu Li, Furu Wei:
VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation. CoRR abs/2305.16107 (2023)
- Jiajun Deng, Guinan Li, Xurong Xie, Zengrui Jin, Mingyu Cui, Tianzi Wang, Shujie Hu, Mengzhe Geng, Xunying Liu:
Factorised Speaker-environment Adaptive Training of Conformer Speech Recognition Systems. CoRR abs/2306.14608 (2023)
- Yuang Li, Yu Wu, Jinyu Li, Shujie Liu:
Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition. CoRR abs/2306.16007 (2023)
- Yuang Li, Yu Wu, Jinyu Li, Shujie Liu:
Accelerating Transducers through Adjacent Token Merging. CoRR abs/2306.16009 (2023)
- Guinan Li, Jiajun Deng, Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Mingyu Cui, Helen Meng, Xunying Liu:
Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition. CoRR abs/2307.02909 (2023)
- Jian Wu, Yashesh Gaur, Zhuo Chen, Long Zhou, Yimeng Zhu, Tianrui Wang, Jinyu Li, Shujie Liu, Bo Ren, Linquan Liu, Yu Wu:
On decoder-only architecture for speech-to-text and large language model integration. CoRR abs/2307.03917 (2023)
- Xiaofei Wang, Manthan Thakker, Zhuo Chen, Naoyuki Kanda, Sefik Emre Eskimez, Sanyuan Chen, Min Tang, Shujie Liu, Jinyu Li, Takuya Yoshioka:
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer. CoRR abs/2308.06873 (2023)
- Shujie Zhang, Tianyue Zheng, Zhe Chen, Jingzhi Hu, Abdelwahed Khamis, Jiajun Liu, Jun Luo:
OCHID-Fi: Occlusion-Robust Hand Pose Estimation in 3D via RF-Vision. CoRR abs/2308.10146 (2023)
- Guangyu Chen, Yu Wu, Shujie Liu, Tao Liu, Xiaoyong Du, Furu Wei:
WavMark: Watermarking for Audio Generation. CoRR abs/2308.12770 (2023)
- Leying Zhang, Yao Qian, Linfeng Yu, Heming Wang, Xinkai Wang, Hemin Yang, Long Zhou, Shujie Liu, Yanmin Qian, Michael Zeng:
Diffusion Conditional Expectation Model for Efficient and Robust Target Speech Extraction. CoRR abs/2309.13874 (2023)
- Jing Pan, Jian Wu, Yashesh Gaur, Sunit Sivasankaran, Zhuo Chen, Shujie Liu, Jinyu Li:
COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning. CoRR abs/2311.02248 (2023)
- Zengrui Jin, Xurong Xie, Tianzi Wang, Mengzhe Geng, Jiajun Deng, Guinan Li, Shujie Hu, Xunying Liu:
Towards Automatic Data Augmentation for Disordered Speech Recognition. CoRR abs/2312.08641 (2023)