Chao Weng
2020 – today
- 2024
- [j13] Dongchao Yang, Songxiang Liu, Rongjie Huang, Chao Weng, Helen Meng: InstructTTS: Modelling Expressive TTS in Discrete Latent Space With Natural Language Style Prompt. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2913-2925 (2024)
- [c65] Rongjie Huang, Chunlei Zhang, Yongqi Wang, Dongchao Yang, Jinchuan Tian, Zhenhui Ye, Luping Liu, Zehan Wang, Ziyue Jiang, Xuankai Chang, Jiatong Shi, Chao Weng, Zhou Zhao, Dong Yu: Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners. ACL (1) 2024: 10929-10942
- [c64] Haoxin Chen, Yong Zhang, Xiaodong Cun, Menghan Xia, Xintao Wang, Chao Weng, Ying Shan: VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models. CVPR 2024: 7310-7320
- [c63] Yuanyuan Wang, Hangting Chen, Dongchao Yang, Jianwei Yu, Chao Weng, Zhiyong Wu, Helen Meng: Consistent and Relevant: Rethink the Query Embedding in General Sound Separation. ICASSP 2024: 961-965
- [c62] Jianwei Cui, Yu Gu, Chao Weng, Jie Zhang, Liping Chen, Lirong Dai: Sifisinger: A High-Fidelity End-to-End Singing Voice Synthesizer Based on Source-Filter Model. ICASSP 2024: 11126-11130
- [c61] Yu Gu, Qiushi Zhu, Guangzhi Lei, Chao Weng, Dan Su: DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis. ICASSP 2024: 11266-11270
- [c60] Andong Li, Rilin Chen, Yu Gu, Chao Weng, Dan Su: Opine: Leveraging a Optimization-Inspired Deep Unfolding Method for Multi-Channel Speech Enhancement. ICASSP 2024: 11376-11380
- [c59] Hangting Chen, Jianwei Yu, Chao Weng: Complexity Scaling for Speech Denoising. ICASSP 2024: 12276-12280
- [i50] Haoxin Chen, Yong Zhang, Xiaodong Cun, Menghan Xia, Xintao Wang, Chao Weng, Ying Shan: VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models. CoRR abs/2401.09047 (2024)
- [i49] Yi Luo, Jianwei Yu, Hangting Chen, Rongzhi Gu, Chao Weng: Gull: A Generative Multifunctional Audio Codec. CoRR abs/2404.04947 (2024)
- 2023
- [j12] Jinchuan Tian, Jianwei Yu, Chao Weng, Yuexian Zou, Dong Yu: Integrating Lattice-Free MMI Into End-to-End Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 31: 25-38 (2023)
- [j11] Dongchao Yang, Jianwei Yu, Helin Wang, Wen Wang, Chao Weng, Yuexian Zou, Dong Yu: Diffsound: Discrete Diffusion Model for Text-to-Sound Generation. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1720-1733 (2023)
- [c58] Jianwei Yu, Hangting Chen, Yi Luo, Rongzhi Gu, Weihua Li, Chao Weng: TSpeech-AI System Description to the 5th Deep Noise Suppression (DNS) Challenge. ICASSP 2023: 1-2
- [c57] Jinchuan Tian, Brian Yan, Jianwei Yu, Chao Weng, Dong Yu, Shinji Watanabe: Bayes Risk CTC: Controllable CTC Alignment in Sequence-to-Sequence Tasks. ICLR 2023
- [c56] Sipan Li, Songxiang Liu, Luwen Zhang, Xiang Li, Yanyao Bian, Chao Weng, Zhiyong Wu, Helen Meng: SnakeGAN: A Universal Vocoder Leveraging DDSP Prior Knowledge and Periodic Inductive Bias. ICME 2023: 1703-1708
- [c55] Jianwei Yu, Hangting Chen, Yi Luo, Rongzhi Gu, Chao Weng: High Fidelity Speech Enhancement with Band-split RNN. INTERSPEECH 2023: 2483-2487
- [c54] Hangting Chen, Jianwei Yu, Yi Luo, Rongzhi Gu, Weihua Li, Zhuocheng Lu, Chao Weng: Ultra Dual-Path Compression For Joint Echo Cancellation And Noise Suppression. INTERSPEECH 2023: 2523-2527
- [c53] Dongchao Yang, Songxiang Liu, Helin Wang, Jianwei Yu, Chao Weng, Yuexian Zou: NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS. INTERSPEECH 2023: 4798-4802
- [c52] Xiang Li, Songxiang Liu, Max W. Y. Lam, Zhiyong Wu, Chao Weng, Helen Meng: Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model. INTERSPEECH 2023: 4858-4862
- [c51] Jinchuan Tian, Jianwei Yu, Hangting Chen, Brian Yan, Chao Weng, Dong Yu, Shinji Watanabe: Bayes Risk Transducer: Transducer with Controllable Alignment Prediction. INTERSPEECH 2023: 4968-4972
- [i48] Dongchao Yang, Songxiang Liu, Rongjie Huang, Guangzhi Lei, Chao Weng, Helen Meng, Dong Yu: InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt. CoRR abs/2301.13662 (2023)
- [i47] Dongchao Yang, Songxiang Liu, Rongjie Huang, Jinchuan Tian, Chao Weng, Yuexian Zou: HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec. CoRR abs/2305.02765 (2023)
- [i46] Xiang Li, Songxiang Liu, Max W. Y. Lam, Zhiyong Wu, Chao Weng, Helen Meng: Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model. CoRR abs/2305.16749 (2023)
- [i45] Rongjie Huang, Chunlei Zhang, Yongqi Wang, Dongchao Yang, Luping Liu, Zhenhui Ye, Ziyue Jiang, Chao Weng, Zhou Zhao, Dong Yu: Make-A-Voice: Unified Voice Synthesis With Discrete Representation. CoRR abs/2305.19269 (2023)
- [i44] Yingqing He, Menghan Xia, Haoxin Chen, Xiaodong Cun, Yuan Gong, Jinbo Xing, Yong Zhang, Xintao Wang, Chao Weng, Ying Shan, Qifeng Chen: Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation. CoRR abs/2307.06940 (2023)
- [i43] Jinchuan Tian, Jianwei Yu, Hangting Chen, Brian Yan, Chao Weng, Dong Yu, Shinji Watanabe: Bayes Risk Transducer: Transducer with Controllable Alignment Prediction. CoRR abs/2308.10107 (2023)
- [i42] Hangting Chen, Jianwei Yu, Yi Luo, Rongzhi Gu, Weihua Li, Zhuocheng Lu, Chao Weng: Ultra Dual-Path Compression For Joint Echo Cancellation And Noise Suppression. CoRR abs/2308.11053 (2023)
- [i41] Qiushi Zhu, Yu Gu, Chao Weng, Yuchen Hu, Lirong Dai, Jie Zhang: Rep2wav: Noise Robust text-to-speech Using self-supervised representations. CoRR abs/2308.14553 (2023)
- [i40] Hangting Chen, Jianwei Yu, Chao Weng: Complexity Scaling for Speech Denoising. CoRR abs/2309.07757 (2023)
- [i39] Sipan Li, Songxiang Liu, Luwen Zhang, Xiang Li, Yanyao Bian, Chao Weng, Zhiyong Wu, Helen Meng: SnakeGAN: A Universal Vocoder Leveraging DDSP Prior Knowledge and Periodic Inductive Bias. CoRR abs/2309.07803 (2023)
- [i38] Yu Gu, Yianrao Bian, Guangzhi Lei, Chao Weng, Dan Su: DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis. CoRR abs/2309.12792 (2023)
- [i37] Haoxin Chen, Menghan Xia, Yingqing He, Yong Zhang, Xiaodong Cun, Shaoshu Yang, Jinbo Xing, Yaofang Liu, Qifeng Chen, Xintao Wang, Chao Weng, Ying Shan: VideoCrafter1: Open Diffusion Models for High-Quality Video Generation. CoRR abs/2310.19512 (2023)
- [i36] Yuanyuan Wang, Hangting Chen, Dongchao Yang, Jianwei Yu, Chao Weng, Zhiyong Wu, Helen Meng: Consistent and Relevant: Rethink the Query Embedding in General Sound Separation. CoRR abs/2312.15463 (2023)
- 2022
- [j10] Jiatong Shi, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu: An investigation of neural uncertainty estimation for target speaker extraction equipped RNN transducer. Comput. Speech Lang. 73: 101327 (2022)
- [j9] Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu: Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition. Comput. Speech Lang. 75: 101360 (2022)
- [j8] Jinchuan Tian, Jianwei Yu, Chao Weng, Yuexian Zou, Dong Yu: Improving Mandarin End-to-End Speech Recognition With Word N-Gram Language Model. IEEE Signal Process. Lett. 29: 812-816 (2022)
- [c50] Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu: Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization. ICASSP 2022: 6412-6416
- [c49] Xiaoyi Qin, Na Li, Chao Weng, Dan Su, Ming Li: Simple Attention Module Based Speaker Verification with Iterative Noisy Label Detection. ICASSP 2022: 6722-6726
- [c48] Naijun Zheng, Na Li, Jianwei Yu, Chao Weng, Dan Su, Xunying Liu, Helen Meng: Multi-Channel Speaker Diarization Using Spatial Features for Meetings. ICASSP 2022: 7337-7341
- [c47] Jinchuan Tian, Jianwei Yu, Chao Weng, Shi-Xiong Zhang, Dan Su, Dong Yu, Yuexian Zou: Consistent Training and Decoding for End-to-End Speech Recognition Using Lattice-Free MMI. ICASSP 2022: 7782-7786
- [c46] Jingbei Li, Yi Meng, Chenyi Li, Zhiyong Wu, Helen Meng, Chao Weng, Dan Su: Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-Based Multi-Modal Context Modeling. ICASSP 2022: 7917-7921
- [c45] Chunlei Zhang, Jiatong Shi, Chao Weng, Meng Yu, Dong Yu: Towards end-to-end Speaker Diarization with Generalized Neural Speaker Clustering. ICASSP 2022: 8372-8376
- [c44] Naijun Zheng, Na Li, Xixin Wu, Lingwei Meng, Jiawen Kang, Haibin Wu, Chao Weng, Dan Su, Helen Meng: The CUHK-Tencent Speaker Diarization System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge. ICASSP 2022: 9161-9165
- [c43] Xiaoyi Qin, Na Li, Chao Weng, Dan Su, Ming Li: Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings. INTERSPEECH 2022: 1436-1440
- [c42] Helin Wang, Dongchao Yang, Chao Weng, Jianwei Yu, Yuexian Zou: Improving Target Sound Extraction with Timestamp Information. INTERSPEECH 2022: 1526-1530
- [i35] Jinchuan Tian, Jianwei Yu, Chao Weng, Yuexian Zou, Dong Yu: Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model. CoRR abs/2201.01995 (2022)
- [i34] Naijun Zheng, Na Li, Xixin Wu, Lingwei Meng, Jiawen Kang, Haibin Wu, Chao Weng, Dan Su, Helen Meng: The CUHK-TENCENT speaker diarization system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge. CoRR abs/2202.01986 (2022)
- [i33] Jinchuan Tian, Jianwei Yu, Chao Weng, Yuexian Zou, Dong Yu: Integrate Lattice-Free MMI into End-to-End Speech Recognition. CoRR abs/2203.15614 (2022)
- [i32] Helin Wang, Dongchao Yang, Chao Weng, Jianwei Yu, Yuexian Zou: Improving Target Sound Extraction with Timestamp Information. CoRR abs/2204.00821 (2022)
- [i31] Jinchuan Tian, Jianwei Yu, Chunlei Zhang, Chao Weng, Yuexian Zou, Dong Yu: LAE: Language-Aware Encoder for Monolingual and Multilingual ASR. CoRR abs/2206.02093 (2022)
- [i30] Xiaoyi Qin, Na Li, Chao Weng, Dan Su, Ming Li: Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings. CoRR abs/2207.05929 (2022)
- [i29] Dongchao Yang, Jianwei Yu, Helin Wang, Wen Wang, Chao Weng, Yuexian Zou, Dong Yu: Diffsound: Discrete Diffusion Model for Text-to-sound Generation. CoRR abs/2207.09983 (2022)
- [i28] Xiaoyi Qin, Na Li, Yuke Lin, Yiwei Ding, Chao Weng, Dan Su, Ming Li: The DKU-Tencent System for the VoxCeleb Speaker Recognition Challenge 2022. CoRR abs/2210.05092 (2022)
- [i27] Jinchuan Tian, Brian Yan, Jianwei Yu, Chao Weng, Dong Yu, Shinji Watanabe: Bayes risk CTC: Controllable CTC alignment in Sequence-to-Sequence tasks. CoRR abs/2210.07499 (2022)
- [i26] Dongchao Yang, Songxiang Liu, Jianwei Yu, Helin Wang, Chao Weng, Yuexian Zou: NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS. CoRR abs/2211.02448 (2022)
- 2021
- [c41] Xingchen Song, Zhiyong Wu, Yiheng Huang, Chao Weng, Dan Su, Helen M. Meng: Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input. ICASSP 2021: 5894-5898
- [c40] Xu Li, Na Li, Chao Weng, Xunying Liu, Dan Su, Dong Yu, Helen Meng: Replay and Synthetic Speech Detection with Res2Net Architecture. ICASSP 2021: 6354-6358
- [c39] Chunlei Zhang, Meng Yu, Chao Weng, Dong Yu: Towards Robust Speaker Verification with Target Speaker Enhancement. ICASSP 2021: 6693-6697
- [c38] Naijun Zheng, Na Li, Bo Wu, Meng Yu, Jianwei Yu, Chao Weng, Dan Su, Xunying Liu, Helen Meng: A Joint Training Framework of Multi-Look Separator and Speaker Embedding Extractor for Overlapped Speech. ICASSP 2021: 6698-6702
- [c37] Wei Xia, Chunlei Zhang, Chao Weng, Meng Yu, Dong Yu: Self-Supervised Text-Independent Speaker Verification Using Prototypical Momentum Contrastive Learning. ICASSP 2021: 6723-6727
- [c36] Jiatong Shi, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu: Improving RNN Transducer with Target Speaker Extraction and Neural Uncertainty Estimation. ICASSP 2021: 6908-6912
- [c35] Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Yong Xu, Shi-Xiong Zhang, Dong Yu: Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization. ICASSP 2021: 8433-8437
- [c34] Max W. Y. Lam, Jun Wang, Chao Weng, Dan Su, Dong Yu: Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition. Interspeech 2021: 316-320
- [c33] Helin Wang, Bo Wu, Lianwu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu: TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation. Interspeech 2021: 1109-1113
- [c32] Guoguo Chen, Shuzhou Chai, Guan-Bo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Zhao You, Zhiyong Yan: GigaSpeech: An Evolving, Multi-Domain ASR Corpus with 10,000 Hours of Transcribed Audio. Interspeech 2021: 3670-3674
- [c31] Jianming Liu, Meng Yu, Yong Xu, Chao Weng, Shi-Xiong Zhang, Lianwu Chen, Dong Yu: Neural Mask based Multi-channel Convolutional Beamforming for Joint Dereverberation, Echo Cancellation and Denoising. SLT 2021: 766-770
- [i25] Peng Liu, Yuewen Cao, Songxiang Liu, Na Hu, Guangzhi Li, Chao Weng, Dan Su: VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention. CoRR abs/2102.06431 (2021)
- [i24] Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu: Deep Learning based Multi-Source Localization with Source Splitting and its Effectiveness in Multi-Talker Speech Recognition. CoRR abs/2102.07955 (2021)
- [i23] Helin Wang, Bo Wu, Lianwu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu: TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation. CoRR abs/2103.16849 (2021)
- [i22] Max W. Y. Lam, Jun Wang, Chao Weng, Dan Su, Dong Yu: Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition. CoRR abs/2106.04275 (2021)
- [i21] Jingbei Li, Yi Meng, Chenyi Li, Zhiyong Wu, Helen Meng, Chao Weng, Dan Su: Spoken Style Learning with Multi-modal Hierarchical Context Encoding for Conversational Text-to-Speech Synthesis. CoRR abs/2106.06233 (2021)
- [i20] Guoguo Chen, Shuzhou Chai, Guanbo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Yujun Wang, Zhao You, Zhiyong Yan: GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio. CoRR abs/2106.06909 (2021)
- [i19] Xiaoyi Qin, Na Li, Chao Weng, Dan Su, Ming Li: Simple Attention Module based Speaker Verification with Iterative noisy label detection. CoRR abs/2110.06534 (2021)
- [i18] Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu: Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization. CoRR abs/2111.15016 (2021)
- [i17] Jinchuan Tian, Jianwei Yu, Chao Weng, Shi-Xiong Zhang, Dan Su, Dong Yu, Yuexian Zou: Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI. CoRR abs/2112.02498 (2021)
- [i16] Dongchao Yang, Helin Wang, Yuexian Zou, Chao Weng: Detect what you want: Target Sound Detection. CoRR abs/2112.10153 (2021)
- 2020
- [c30] Qiao Tian, Zewang Zhang, Ling-Hui Chen, Heng Lu, Chengzhu Yu, Chao Weng, Dong Yu: The Tencent speech synthesis system for Blizzard Challenge 2020. Blizzard Challenge / Voice Conversion Challenge 2020
- [c29] Aswin Shanmugam Subramanian, Chao Weng, Meng Yu, Shi-Xiong Zhang, Yong Xu, Shinji Watanabe, Dong Yu: Far-Field Location Guided Target Speech Extraction Using End-to-End Speech Recognition Objectives. ICASSP 2020: 7299-7303
- [c28] Zhao You, Dan Su, Jie Chen, Chao Weng, Dong Yu: Dfsmn-San with Persistent Memory Model for Automatic Speech Recognition. ICASSP 2020: 7704-7708
- [c27] Chengqi Deng, Chengzhu Yu, Heng Lu, Chao Weng, Dong Yu: Pitchnet: Unsupervised Singing Voice Conversion with Pitch Adversarial Network. ICASSP 2020: 7749-7753
- [c26] Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Chao Weng, Jianming Liu, Dong Yu: Neural Spatio-Temporal Beamformer for Target Speech Separation. INTERSPEECH 2020: 56-60
- [c25] Chao Weng, Chengzhu Yu, Jia Cui, Chunlei Zhang, Dong Yu: Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition. INTERSPEECH 2020: 966-970
- [c24] Yusong Wu, Shengchen Li, Chengzhu Yu, Heng Lu, Chao Weng, Liqiang Zhang, Dong Yu: Peking Opera Synthesis via Duration Informed Attention Network. INTERSPEECH 2020: 1226-1230
- [c23] Liqiang Zhang, Chengzhu Yu, Heng Lu, Chao Weng, Chunlei Zhang, Yusong Wu, Xiang Xie, Zijin Li, Dong Yu: DurIAN-SC: Duration Informed Attention Network Based Singing Voice Conversion System. INTERSPEECH 2020: 1231-1235
- [c22] Chengzhu Yu, Heng Lu, Na Hu, Meng Yu, Chao Weng, Kun Xu, Peng Liu, Deyi Tuo, Shiyin Kang, Guangzhi Lei, Dan Su, Dong Yu: DurIAN: Duration Informed Attention Network for Speech Synthesis. INTERSPEECH 2020: 2027-2031
- [i15] Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Chao Weng, Jianming Liu, Dong Yu: Neural Spatio-Temporal Beamformer for Target Speech Separation. CoRR abs/2005.03889 (2020)
- [i14] Liqiang Zhang, Chengzhu Yu, Heng Lu, Chao Weng, Chunlei Zhang, Yusong Wu, Xiang Xie, Zijin Li, Dong Yu: DurIAN-SC: Duration Informed Attention Network based Singing Voice Conversion System. CoRR abs/2008.03009 (2020)
- [i13] Yusong Wu, Shengchen Li, Chengzhu Yu, Heng Lu, Chao Weng, Liqiang Zhang, Dong Yu: Peking Opera Synthesis via Duration Informed Attention Network. CoRR abs/2008.03029 (2020)
- [i12] Xu Li, Na Li, Chao Weng, Xunying Liu, Dan Su, Dong Yu, Helen Meng: Replay and Synthetic Speech Detection with Res2net Architecture. CoRR abs/2010.15006 (2020)
- [i11] Xingchen Song, Zhiyong Wu, Yiheng Huang, Chao Weng, Dan Su, Helen Meng: Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input. CoRR abs/2010.15025 (2020)
- [i10] Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Yong Xu, Shi-Xiong Zhang, Dong Yu: Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization. CoRR abs/2011.00091 (2020)
- [i9] Jiatong Shi, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu: Improving RNN Transducer With Target Speaker Extraction and Neural Uncertainty Estimation. CoRR abs/2011.13393 (2020)
- [i8] Wei Xia, Chunlei Zhang, Chao Weng, Meng Yu, Dong Yu: Self-supervised Text-independent Speaker Verification using Prototypical Momentum Contrastive Learning. CoRR abs/2012.07178 (2020)
2010 – 2019
- 2019
- [j7] Yanmin Qian, Chao Weng, Xuankai Chang, Shuai Wang, Dong Yu: Erratum to: Past review, current progress, and challenges ahead on the cocktail party problem. Frontiers Inf. Technol. Electron. Eng. 20(3): 438 (2019)
- [c21] Changhao Shan, Chao Weng, Guangsen Wang, Dan Su, Min Luo, Dong Yu, Lei Xie: Component Fusion: Learning Replaceable Language Model Component for End-to-end Speech Recognition System. ICASSP 2019: 5631-5635
- [c20] Changhao Shan, Chao Weng, Guangsen Wang, Dan Su, Min Luo, Dong Yu, Lei Xie: Investigating End-to-end Speech Recognition for Mandarin-english Code-switching. ICASSP 2019: 6056-6060
- [c19] Peidong Wang, Jia Cui, Chao Weng, Dong Yu: Token-wise Training for Attention Based End-to-end Speech Recognition. ICASSP 2019: 6276-6280
- [c18] Chao Weng, Dong Yu: A Comparison of Lattice-free Discriminative Training Criteria for Purely Sequence-trained Neural Network Acoustic Models. ICASSP 2019: 6430-6434
- [c17] Ozlem Kalinli, Gautam Bhattacharya, Chao Weng: Parametric Cepstral Mean Normalization for Robust Speech Recognition. ICASSP 2019: 6735-6739
- [c16] Yong Xu, Chao Weng, Like Hui, Jianming Liu, Meng Yu, Dan Su, Dong Yu: Joint Training of Complex Ratio Mask Based Beamformer and Acoustic Model for Noise Robust Asr. ICASSP 2019: 6745-6749
- [c15] Peidong Wang, Jia Cui, Chao Weng, Dong Yu: Large Margin Training for Attention Based End-to-End Speech Recognition. INTERSPEECH 2019: 246-250
- [i7] Chengzhu Yu, Heng Lu, Na Hu, Meng Yu, Chao Weng, Kun Xu, Peng Liu, Deyi Tuo, Shiyin Kang, Guangzhi Lei, Dan Su, Dong Yu: DurIAN: Duration Informed Attention Network For Multimodal Synthesis. CoRR abs/1909.01700 (2019)
- [i6] Zhao You, Dan Su, Jie Chen, Chao Weng, Dong Yu: DFSMN-SAN with Persistent Memory Model for Automatic Speech Recognition. CoRR abs/1910.13282 (2019)
- [i5] Chao Weng, Chengzhu Yu, Jia Cui, Chunlei Zhang, Dong Yu: Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition. CoRR abs/1911.12487 (2019)
- [i4] Chengqi Deng, Chengzhu Yu, Heng Lu, Chao Weng, Dong Yu: PitchNet: Unsupervised Singing Voice Conversion with Pitch Adversarial Network. CoRR abs/1912.01852 (2019)
- [i3] Liqiang Zhang, Chengzhu Yu, Heng Lu, Chao Weng, Yusong Wu, Xiang Xie, Zijin Li, Dong Yu: Learning Singing From Speech. CoRR abs/1912.10128 (2019)
- [i2] Yusong Wu, Shengchen Li, Chengzhu Yu, Heng Lu, Chao Weng, Liqiang Zhang, Dong Yu: Synthesising Expressiveness in Peking Opera via Duration Informed Attention Network. CoRR abs/1912.12010 (2019)
- 2018
- [j6] Yanmin Qian, Chao Weng, Xuankai Chang, Shuai Wang, Dong Yu: Past review, current progress, and challenges ahead on the cocktail party problem. Frontiers Inf. Technol. Electron. Eng. 19(1): 40-63 (2018)
- [j5] Yanmin Qian, Chao Weng, Xuankai Chang, Shuai Wang, Dong Yu: Erratum to: Past review, current progress, and challenges ahead on the cocktail party problem. Frontiers Inf. Technol. Electron. Eng. 19(4): 582 (2018)
- [c14] Chao Weng, Jia Cui, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu: Improving Attention Based Sequence-to-Sequence Models for End-to-End English Conversational Speech Recognition. INTERSPEECH 2018: 761-765
- [c13] Chengzhu Yu, Chunlei Zhang, Chao Weng, Jia Cui, Dong Yu: A Multistage Training Framework for Acoustic-to-Word Model. INTERSPEECH 2018: 786-790
- [c12] Chunlei Zhang, Chengzhu Yu, Chao Weng, Jia Cui, Dong Yu: An Exploration of Directly Using Word as ACOUSTIC Modeling Unit for Speech Recognition. SLT 2018: 64-69
- [c11] Jia Cui, Chao Weng, Guangsen Wang, Jun Wang, Peidong Wang, Chengzhu Yu, Dan Su, Dong Yu: Improving Attention-Based End-to-End ASR Systems with Sequence-Based Loss Functions. SLT 2018: 353-360
- [i1] Chao Weng, Dong Yu: A Comparison of Lattice-free Discriminative Training Criteria for Purely Sequence-Trained Neural Network Acoustic Models. CoRR abs/1811.03700 (2018)
- 2016
- [j4] Chao Weng, Liang Zhou, Qiulin Ding: Primary and Secondary Webpage Information Hiding Algorithm Based on Invisible Characters (基于不可见字符的主副式网页信息隐藏算法). Computer Science (计算机科学) 43(7): 136-140 (2016)
- 2015
- [b1] Chao Weng: Towards robust conversational speech recognition and understanding. Georgia Institute of Technology, Atlanta, GA, USA, 2015
- [j3] Chao Weng, Biing-Hwang Fred Juang: Discriminative Training Using Non-Uniform Criteria for Keyword Spotting on Spontaneous Speech. IEEE ACM Trans. Audio Speech Lang. Process. 23(2): 300-312 (2015)
- [j2] Chao Weng, Dong Yu, Michael L. Seltzer, Jasha Droppo: Deep Neural Networks for Single-Channel Multi-Talker Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 23(10): 1670-1679 (2015)
- 2014
- [j1] Chao Weng, David L. Thomson, Patrick Haffner, Biing-Hwang Juang: Latent semantic rational kernels for topic spotting on conversational speech. IEEE ACM Trans. Audio Speech Lang. Process. 22(12): 1738-1749 (2014)
- [c10] Zhen Huang, Chao Weng, Kehuang Li, You-Chi Cheng, Chin-Hui Lee: Deep learning vector quantization for acoustic information retrieval. ICASSP 2014: 1350-1354
- [c9] Chao Weng, Dong Yu, Shinji Watanabe, Biing-Hwang Fred Juang: Recurrent deep neural networks for robust speech recognition. ICASSP 2014: 5532-5536
- [c8] Chao Weng, Dong Yu, Michael L. Seltzer, Jasha Droppo: Single-channel mixed speech recognition using deep neural networks. ICASSP 2014: 5632-5636
- [c7] Zhen Huang, Jinyu Li, Chao Weng, Chin-Hui Lee: Beyond cross-entropy: towards better frame-level objective functions for deep neural network training in automatic speech recognition. INTERSPEECH 2014: 1214-1218
- [c6] Zhen Huang, Jinyu Li, Sabato Marco Siniscalchi, I-Fan Chen, Chao Weng, Chin-Hui Lee: Feature space maximum a posteriori linear regression for adaptation of deep neural networks. INTERSPEECH 2014: 2992-2996
- 2013
- [c5] Chao Weng, Biing-Hwang Juang: Adaptive boosted non-uniform mce for keyword spotting on spontaneous speech. ICASSP 2013: 6960-6964
- [c4] Chao Weng, Biing-Hwang Juang: Latent semantic rational kernels for topic spotting on spontaneous conversational speech. ICASSP 2013: 8302-8306
- 2012
- [c3] Chao Weng, Biing-Hwang Juang: A comparative study of discriminative training using non-uniform criteria for cross-layer acoustic modeling. ICASSP 2012: 4089-4092
- [c2] Chao Weng, Biing-Hwang Juang, Daniel Povey: Discriminative Training Using Non-uniform Criteria for Keyword Spotting on Spontaneous Speech. INTERSPEECH 2012: 559-562
- 2011
- [c1] Chao Weng, Biing-Hwang Juang: Recent development of discriminative training using non-uniform criteria for cross-level acoustic modeling. ICASSP 2011: 5332-5335
last updated on 2024-10-17 21:33 CEST by the dblp team
all metadata released as open data under CC0 1.0 license