default search action
Haizhou Li 0001
李海洲
Person information
- unicode name: 李海洲
- affiliation: Chinese University of Hong Kong (Shenzhen), China
- affiliation: National University of Singapore, Department of Electrical and Computer Engineering, Singapore
- affiliation (2006 - 2016): Nanyang Technological University, Singapore
- affiliation (2003 - 2016): Institute for Infocomm Research, A*STAR, Singapore
- affiliation (2011): University of New South Wales, Sydney, Australia
- affiliation (2009): University of Eastern Finland, Kuopio, Finland
- affiliation (PhD 1990): South China University of Technology, Guangzhou, China
Other persons with the same name
- Haizhou Li
- Haizhou Li 0002 — Blaise Pascal University, Clermont-Ferrand, France
- Haizhou Li 0003 — City University of Hong Kong, Department of Computer Science, Hong Kong
Refine list
refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
showing all ?? records
2020 – today
- 2024
- [j180]Qianhui Liu, Meng Ge, Haizhou Li:
Intelligent event-based lip reading word classification with spiking neural networks using spatio-temporal attention features and triplet loss. Inf. Sci. 675: 120660 (2024) - [j179]Jiaqi Yan, Qianhui Liu, Malu Zhang, Lang Feng, De Ma, Haizhou Li, Gang Pan:
Efficient spiking neural network design via neural architecture search. Neural Networks 173: 106172 (2024) - [j178]Xinyi Chen, Qu Yang, Jibin Wu, Haizhou Li, Kay Chen Tan:
A Hybrid Neural Coding Approach for Pattern Recognition With Spiking Neural Networks. IEEE Trans. Pattern Anal. Mach. Intell. 46(5): 3064-3078 (2024) - [j177]Shuai Wang, Zhengyang Chen, Bing Han, Hongji Wang, Chengdong Liang, Binbin Zhang, Xu Xiang, Wen Ding, Johan Rohdin, Anna Silnova, Yanmin Qian, Haizhou Li:
Advancing speaker embedding learning: Wespeaker toolkit for research and production. Speech Commun. 162: 103104 (2024) - [j176]Jingru Lin, Meng Ge, Wupeng Wang, Haizhou Li, Mengling Feng:
Selective HuBERT: Self-Supervised Pre-Training for Target Speaker in Clean and Mixture Speech. IEEE Signal Process. Lett. 31: 1014-1018 (2024) - [j175]Duo Ma, Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li:
Text-Guided HuBERT: Self-Supervised Speech Pre-Training via Generative Adversarial Networks. IEEE Signal Process. Lett. 31: 2055-2059 (2024) - [j174]Xiaoxue Gao, Zexin Li, Yiming Chen, Cong Liu, Haizhou Li:
Transferable Adversarial Attacks Against ASR. IEEE Signal Process. Lett. 31: 2200-2204 (2024) - [j173]Koichiro Yoshino, Yun-Nung Chen, Paul A. Crook, Satwik Kottur, Jinchao Li, Behnam Hedayatnia, Seungwhan Moon, Zhengcong Fei, Zekang Li, Jinchao Zhang, Yang Feng, Jie Zhou, Seokhwan Kim, Yang Liu, Di Jin, Alexandros Papangelis, Karthik Gopalakrishnan, Dilek Hakkani-Tur, Babak Damavandi, Alborz Geramifard, Chiori Hori, Ankit Shah, Chen Zhang, Haizhou Li, João Sedoc, Luis F. D'Haro, Rafael E. Banchs, Alexander Rudnicky:
Overview of the Tenth Dialog System Technology Challenge: DSTC10. IEEE ACM Trans. Audio Speech Lang. Process. 32: 765-778 (2024) - [j172]Lei Liu, Li Liu, Haizhou Li:
Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1559-1572 (2024) - [j171]Xuehao Zhou, Mingyang Zhang, Yi Zhou, Zhizheng Wu, Haizhou Li:
Accented Text-to-Speech Synthesis With Limited Data. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1699-1711 (2024) - [j170]Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li:
Controllable Accented Text-to-Speech Synthesis With Fine and Coarse-Grained Intensity Rendering. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2188-2201 (2024) - [j169]Tianchi Liu, Kong Aik Lee, Qiongqiong Wang, Haizhou Li:
Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2324-2337 (2024) - [j168]Congcong Sun, Hui Tian, Peng Tian, Haizhou Li, Zhenxing Qian:
Multi-Agent Deep Learning for the Detection of Multiple Speech Steganography Methods. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2957-2972 (2024) - [j167]Mingyang Zhang, Yi Zhou, Yi Ren, Chen Zhang, Xiang Yin, Haizhou Li:
RefXVC: Cross-Lingual Voice Conversion With Enhanced Reference Leveraging. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4146-4156 (2024) - [j166]Wupeng Wang, Zexu Pan, Xinke Li, Shuai Wang, Haizhou Li:
Speech Separation With Pretrained Frontend to Minimize Domain Mismatch. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4184-4198 (2024) - [j165]Siqi Cai, Tanja Schultz, Haizhou Li:
Brain Topology Modeling With EEG-Graphs for Auditory Spatial Attention Detection. IEEE Trans. Biomed. Eng. 71(1): 171-182 (2024) - [j164]Miao Liu, Jing Wang, Xinyuan Qian, Haizhou Li:
Audio-Visual Temporal Forgery Detection Using Embedding-Level Fusion and Multi-Dimensional Contrastive Loss. IEEE Trans. Circuits Syst. Video Technol. 34(8): 6937-6948 (2024) - [j163]Zhenyu Weng, Huiping Zhuang, Fulin Luo, Haizhou Li, Zhiping Lin:
Few-Shot Contrastive Transfer Learning With Pretrained Model for Masked Face Verification. IEEE Trans. Multim. 26: 3871-3883 (2024) - [j162]Xinyuan Qian, Wei Xue, Qiquan Zhang, Ruijie Tao, Haizhou Li:
Deep Cross-Modal Retrieval Between Spatial Image and Acoustic Speech. IEEE Trans. Multim. 26: 4480-4489 (2024) - [j161]Ruihang Ji, Shuzhi Sam Ge, Kai Zhao, Haizhou Li:
Event-Triggered Tracking Control for Nonlinear Systems With Prescribed Performance. IEEE Trans. Syst. Man Cybern. Syst. 54(6): 3547-3557 (2024) - [c711]Shimin Zhang, Qu Yang, Chenxiang Ma, Jibin Wu, Haizhou Li, Kay Chen Tan:
TC-LIF: A Two-Compartment Spiking Neuron Model for Long-Term Sequential Modelling. AAAI 2024: 16838-16847 - [c710]Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li:
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling. AAAI 2024: 18698-18706 - [c709]Jiadong Wang, Zexu Pan, Malu Zhang, Robby T. Tan, Haizhou Li:
Restoring Speaking Lips from Occlusion for Audio-Visual Speech Recognition. AAAI 2024: 19144-19152 - [c708]Chen Zhang, Luis Fernando D'Haro, Yiming Chen, Malu Zhang, Haizhou Li:
A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators. AAAI 2024: 19515-19524 - [c707]Yiming Chen, Chen Zhang, Danqing Luo, Luis Fernando D'Haro, Robby T. Tan, Haizhou Li:
Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models. ACL (Findings) 2024: 1359-1375 - [c706]Feng Jiang, Weihao Liu, Xiaomin Chu, Peifeng Li, Qiaoming Zhu, Haizhou Li:
Advancing Topic Segmentation and Outline Generation in Chinese Texts: The Paragraph-level Topic Representation, Corpus, and Benchmark. LREC/COLING 2024: 495-506 - [c705]Danqing Luo, Chen Zhang, Yan Zhang, Haizhou Li:
CrossTune: Black-Box Few-Shot Classification with Label Enhancement. LREC/COLING 2024: 4185-4197 - [c704]Yaxin Fan, Feng Jiang, Peifeng Li, Haizhou Li:
Uncovering the Potential of ChatGPT for Discourse Analysis in Dialogue: An Empirical Study. LREC/COLING 2024: 16998-17010 - [c703]Qu Yang, Qianhui Liu, Nan Li, Meng Ge, Zeyang Song, Haizhou Li:
SVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks. ICASSP 2024: 221-225 - [c702]Zeyang Song, Jibin Wu, Malu Zhang, Mike Zheng Shou, Haizhou Li:
Spiking-Leaf: A Learnable Auditory Front-End for Spiking Neural Networks. ICASSP 2024: 226-230 - [c701]Qiquan Zhang, Meng Ge, Hongxu Zhu, Eliathamby Ambikairajah, Qi Song, Zhaoheng Ni, Haizhou Li:
An Empirical Study on the Impact of Positional Encoding in Transformer-Based Monaural Speech Enhancement. ICASSP 2024: 1001-1005 - [c700]Siqi Cai, Ran Zhang, Haizhou Li:
Robust Decoding of the Auditory Attention from EEG Recordings Through Graph Convolutional Networks. ICASSP 2024: 2320-2324 - [c699]Yu Chen, Xinyuan Qian, Zexu Pan, Kainan Chen, Haizhou Li:
LOCSELECT: Target Speaker Localization with an Auditory Selective Hearing Mechanism. ICASSP 2024: 8696-8700 - [c698]Sho Inoue, Kun Zhou, Shuai Wang, Haizhou Li:
Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis. ICASSP 2024: 10601-10605 - [c697]Junjie Li, Ruijie Tao, Zexu Pan, Meng Ge, Shuai Wang, Haizhou Li:
Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-Talker Speech. ICASSP 2024: 10666-10670 - [c696]Shuai Wang, Qibing Bai, Qi Liu, Jianwei Yu, Zhengyang Chen, Bing Han, Yanmin Qian, Haizhou Li:
Leveraging in-the-wild Data for Effective Self-supervised Pretraining in Speaker Recognition. ICASSP 2024: 10901-10905 - [c695]Yidi Jiang, Zhengyang Chen, Ruijie Tao, Liqun Deng, Yanmin Qian, Haizhou Li:
Prompt-Driven Target Speech Diarization. ICASSP 2024: 11086-11090 - [c694]Yi Ma, Kong Aik Lee, Ville Hautamäki, Meng Ge, Haizhou Li:
Gradient Weighting for Speaker Verification in Extremely Low Signal-to-Noise Ratio. ICASSP 2024: 11311-11315 - [c693]Wenxuan Wu, Xueyuan Chen, Xixin Wu, Haizhou Li, Helen Meng:
Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy. IJCNN 2024: 1-8 - [c692]Chuang Li, Yan Zhang, Min-Yen Kan, Haizhou Li:
UNO-DST: Leveraging Unlabelled Data in Zero-Shot Dialogue State Tracking. NAACL-HLT (Findings) 2024: 2972-2983 - [c691]Xidong Wang, Guiming Chen, Dingjie Song, Zhiyi Zhang, Zhihong Chen, Qingying Xiao, Junying Chen, Feng Jiang, Jianquan Li, Xiang Wan, Benyou Wang, Haizhou Li:
CMB: A Comprehensive Medical Benchmark in Chinese. NAACL-HLT 2024: 6184-6205 - [c690]Huang Huang, Fei Yu, Jianqing Zhu, Xuening Sun, Hao Cheng, Dingjie Song, Zhihong Chen, Mosen Alharthi, Bang An, Juncai He, Ziche Liu, Junying Chen, Jianquan Li, Benyou Wang, Lian Zhang, Ruoyu Sun, Xiang Wan, Haizhou Li, Jinchao Xu:
AceGPT, Localizing Large Language Models in Arabic. NAACL-HLT 2024: 8139-8163 - [c689]Kun Zhou, Berrak Sisman, Carlos Busso, Bin Ma, Haizhou Li:
Mixed-EVC: Mixed Emotion Synthesis and Control in Voice Conversion. Odyssey 2024: 180-186 - [c688]Ganjun Liu, Xiaohui Hou, Meng Ge, Tao Zhang, Haizhou Li:
A Non-Intrusive Approach to Assessing Dysarthria Severity: Advancing Clinical Diagnosis. WWW (Companion Volume) 2024: 1134-1137 - [i202]Yi Ma, Kong Aik Lee, Ville Hautamäki, Meng Ge, Haizhou Li:
Gradient weighting for speaker verification in extremely low Signal-to-Noise Ratio. CoRR abs/2401.02626 (2024) - [i201]Feng Jiang, Kuang Wang, Haizhou Li:
Bridging Research and Readers: A Multi-Modal Automated Academic Papers Interpretation System. CoRR abs/2401.09150 (2024) - [i200]Qiquan Zhang, Meng Ge, Hongxu Zhu, Eliathamby Ambikairajah, Qi Song, Zhaoheng Ni, Haizhou Li:
An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement. CoRR abs/2401.09686 (2024) - [i199]Xianghu Yue, Xiaohai Tian, Malu Zhang, Zhizheng Wu, Haizhou Li:
CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing. CoRR abs/2401.12264 (2024) - [i198]Qianhui Liu, Jiaqi Yan, Malu Zhang, Gang Pan, Haizhou Li:
LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization. CoRR abs/2401.14652 (2024) - [i197]Lei Liu, Li Liu, Haizhou Li:
Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition. CoRR abs/2401.17604 (2024) - [i196]Wenjie Wei, Malu Zhang, Jilin Zhang, Ammar Belatreche, Jibin Wu, Zijing Xu, Xuerui Qiu, Hong Chen, Yang Yang, Haizhou Li:
Event-Driven Learning for Spiking Neural Networks. CoRR abs/2403.00270 (2024) - [i195]Sho Inoue, Kun Zhou, Shuai Wang, Haizhou Li:
Fine-Grained Quantitative Emotion Editing for Speech Generation. CoRR abs/2403.02002 (2024) - [i194]Xidong Wang, Nuo Chen, Junyin Chen, Yan Hu, Yidong Wang, Xiangbo Wu, Anningzhe Gao, Xiang Wan, Haizhou Li, Benyou Wang:
Apollo: An Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People. CoRR abs/2403.03640 (2024) - [i193]Qu Yang, Qianhui Liu, Nan Li, Meng Ge, Zeyang Song, Haizhou Li:
sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks. CoRR abs/2403.05772 (2024) - [i192]Danqing Luo, Chen Zhang, Yan Zhang, Haizhou Li:
CrossTune: Black-Box Few-Shot Classification with Label Enhancement. CoRR abs/2403.12468 (2024) - [i191]Wenxuan Wu, Xueyuan Chen, Xixin Wu, Haizhou Li, Helen Meng:
Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy. CoRR abs/2403.16078 (2024) - [i190]Yicheng Gu, Xueyao Zhang, Liumeng Xue, Haizhou Li, Zhizheng Wu:
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder. CoRR abs/2404.17161 (2024) - [i189]Ruijie Tao, Xinyuan Qian, Yidi Jiang, Junjie Li, Jiadong Wang, Haizhou Li:
Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention. CoRR abs/2404.18501 (2024) - [i188]Chuang Li, Yang Deng, Hengchang Hu, Min-Yen Kan, Haizhou Li:
Incorporating External Knowledge and Goal Guidance for LLM-based Conversational Recommender Systems. CoRR abs/2405.01868 (2024) - [i187]Sho Inoue, Kun Zhou, Shuai Wang, Haizhou Li:
Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis. CoRR abs/2405.09171 (2024) - [i186]Xiangyu Zhang, Qiquan Zhang, Hexin Liu, Tianyi Xiao, Xinyuan Qian, Beena Ahmed, Eliathamby Ambikairajah, Haizhou Li, Julien Epps:
Mamba in Speech: Towards an Alternative to Self-Attention. CoRR abs/2405.12609 (2024) - [i185]Yiming Chen, Chen Zhang, Danqing Luo, Luis Fernando D'Haro, Robby T. Tan, Haizhou Li:
Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models. CoRR abs/2405.14646 (2024) - [i184]Jiahui Xu, Feng Jiang, Anningzhe Gao, Haizhou Li:
Unsupervised Mutual Learning of Dialogue Discourse Parsing and Topic Segmentation. CoRR abs/2405.19799 (2024) - [i183]Chen Zhang, Chengguang Tang, Dading Chong, Ke Shi, Guohua Tang, Feng Jiang, Haizhou Li:
TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models. CoRR abs/2405.20215 (2024) - [i182]Tianchi Liu, Lin Zhang, Rohan Kumar Das, Yi Ma, Ruijie Tao, Haizhou Li:
How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio? CoRR abs/2406.02483 (2024) - [i181]Zhijun Liu, Shuai Wang, Sho Inoue, Qibing Bai, Haizhou Li:
Autoregressive Diffusion Transformer for Text-to-Speech Synthesis. CoRR abs/2406.05551 (2024) - [i180]Yidi Jiang, Ruijie Tao, Zhengyang Chen, Yanmin Qian, Haizhou Li:
Target Speech Diarization with Multimodal Prompts. CoRR abs/2406.07198 (2024) - [i179]Xuehao Zhou, Mingyang Zhang, Yi Zhou, Zhiwu Li, Haizhou Li:
Multi-Scale Accent Modeling with Disentangling for Multi-Speaker Multi-Accent TTS Synthesis. CoRR abs/2406.10844 (2024) - [i178]Zeyang Song, Qianhui Liu, Qu Yang, Yizhou Peng, Haizhou Li:
ED-sKWS: Early-Decision Spiking Neural Networks for Rapid,and Energy-Efficient Keyword Spotting. CoRR abs/2406.12726 (2024) - [i177]Junyi Ao, Yuancheng Wang, Xiaohai Tian, Dekun Chen, Jun Zhang, Lu Lu, Yuxuan Wang, Haizhou Li, Zhizheng Wu:
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words. CoRR abs/2406.13340 (2024) - [i176]Ziche Liu, Rui Ke, Feng Jiang, Haizhou Li:
Take the essence and discard the dross: A Rethinking on Data Selection for Fine-Tuning Large Language Models. CoRR abs/2406.14115 (2024) - [i175]Jiabao Pan, Yan Zhang, Chen Zhang, Zuozhu Liu, Hongwei Wang, Haizhou Li:
DynaThink: Fast or Slow? A Dynamic Decision-Making Framework for Large Language Models. CoRR abs/2407.01009 (2024) - [i174]Rui Liu, Haolin Zuo, Zheng Lian, Xiaofen Xing, Björn W. Schuller, Haizhou Li:
Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset. CoRR abs/2407.02751 (2024) - [i173]Yang Wang, Haiyang Mei, Qirui Bao, Ziqi Wei, Mike Zheng Shou, Haizhou Li, Bo Dong, Xin Yang:
Apprenticeship-Inspired Elegance: Synergistic Knowledge Distillation Empowers Spiking Neural Networks for Efficient Single-Eye Emotion Recognition. CoRR abs/2407.09521 (2024) - [i172]Weizhi Liu, Yue Li, Dongdong Lin, Hui Tian, Haizhou Li:
GROOT: Generating Robust Watermark for Diffusion-Model-Based Audio Synthesis. CoRR abs/2407.10471 (2024) - [i171]Shuai Wang, Zhengyang Chen, Kong Aik Lee, Yanmin Qian, Haizhou Li:
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning. CoRR abs/2407.15188 (2024) - [i170]Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li:
Generative Expressive Conversational Speech Synthesis. CoRR abs/2407.21491 (2024) - [i169]Dashanka De Silva, Siqi Cai, Saurav Pahuja, Tanja Schultz, Haizhou Li:
NeuroSpex: Neuro-Guided Speaker Extraction with Cross-Modal Attention. CoRR abs/2409.02489 (2024) - 2023
- [j160]Tao Luo, Weng-Fai Wong, Rick Siow Mong Goh, Anh Tuan Do, Zhixian Chen, Haizhou Li, Wenyu Jiang, Weiyun Yau:
Achieving Green AI with Energy-Efficient Deep Learning Using Neuromorphic Computing. Commun. ACM 66(7): 52-57 (2023) - [j159]Buddhi Wickramasinghe, Eliathamby Ambikairajah, Vidhyasaharan Sethu, Julien Epps, Haizhou Li, Ting Dang:
DNN controlled adaptive front-end for replay attack detection systems. Speech Commun. 154: 102973 (2023) - [j158]Tingting Wang, Zexu Pan, Meng Ge, Zhen Yang, Haizhou Li:
Time-Domain Speech Separation Networks With Graph Encoding Auxiliary. IEEE Signal Process. Lett. 30: 110-114 (2023) - [j157]Yi Zhou, Zhizheng Wu, Mingyang Zhang, Xiaohai Tian, Haizhou Li:
TTS-Guided Training for Accent Conversion Without Parallel Data. IEEE Signal Process. Lett. 30: 533-537 (2023) - [j156]Mingyang Zhang, Xuehao Zhou, Zhizheng Wu, Haizhou Li:
Towards Zero-Shot Multi-Speaker Multi-Accent Text-to-Speech Synthesis. IEEE Signal Process. Lett. 30: 947-951 (2023) - [j155]Kun Zhou, Berrak Sisman, Rajib Rana, Björn W. Schuller, Haizhou Li:
Emotion Intensity and its Control for Emotional Voice Conversion. IEEE Trans. Affect. Comput. 14(1): 31-48 (2023) - [j154]Kun Zhou, Berrak Sisman, Rajib Rana, Björn W. Schuller, Haizhou Li:
Speech Synthesis With Mixed Emotions. IEEE Trans. Affect. Comput. 14(4): 3120-3134 (2023) - [j153]Hui Tian, Yiqin Qiu, Wojciech Mazurczyk, Haizhou Li, Zhenxing Qian:
STFF-SM: Steganalysis Model Based on Spatial and Temporal Feature Fusion for Speech Streams. IEEE ACM Trans. Audio Speech Lang. Process. 31: 277-289 (2023) - [j152]Qiquan Zhang, Xinyuan Qian, Zhaoheng Ni, Aaron Nicolson, Eliathamby Ambikairajah, Haizhou Li:
A Time-Frequency Attention Module for Neural Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 31: 462-475 (2023) - [j151]Xinyuan Qian, Zhengdong Wang, Jiadong Wang, Guohui Guan, Haizhou Li:
Audio-Visual Cross-Attention Network for Robotic Speaker Tracking. IEEE ACM Trans. Audio Speech Lang. Process. 31: 550-562 (2023) - [j150]Chen Zhang, Luis Fernando D'Haro, Qiquan Zhang, Thomas Friedrichs, Haizhou Li:
PoE: A Panel of Experts for Generalized Automatic Dialogue Assessment. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1234-1250 (2023) - [j149]Ruijie Tao, Kong Aik Lee, Rohan Kumar Das, Ville Hautamäki, Haizhou Li:
Self-Supervised Training of Speaker Encoder With Multi-Modal Diverse Positive Pairs. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1706-1719 (2023) - [j148]Yi Zhou, Zhizheng Wu, Xiaohai Tian, Haizhou Li:
Optimization of Cross-Lingual Voice Conversion With Linguistics Losses to Reduce Foreign Accents. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1916-1926 (2023) - [j147]Xiaoxue Gao, Chitralekha Gupta, Haizhou Li:
PoLyScriber: Integrated Fine-Tuning of Extractor and Lyrics Transcriber for Polyphonic Music. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1968-1981 (2023) - [j146]Zhenyu Weng, Huiping Zhuang, Haizhou Li, Balakrishnan Ramalingam, Rajesh Elara Mohan, Zhiping Lin:
Online Multi-Face Tracking With Multi-Modality Cascaded Matching. IEEE Trans. Circuits Syst. Video Technol. 33(6): 2738-2752 (2023) - [j145]Yiqin Qiu, Hui Tian, Haizhou Li, Chin-Chen Chang, Athanasios V. Vasilakos:
Separable Convolution Network With Dual-Stream Pyramid Enhanced Strategy for Speech Steganalysis. IEEE Trans. Inf. Forensics Secur. 18: 2737-2750 (2023) - [j144]Jibin Wu, Yansong Chua, Malu Zhang, Guoqi Li, Haizhou Li, Kay Chen Tan:
A Tandem Learning Rule for Effective Training and Rapid Inference of Deep Spiking Neural Networks. IEEE Trans. Neural Networks Learn. Syst. 34(1): 446-460 (2023) - [c687]Yiming Chen, Simin Chen, Zexin Li, Wei Yang, Cong Liu, Robby T. Tan, Haizhou Li:
Dynamic Transformers Provide a False Sense of Efficiency. ACL (1) 2023: 7164-7180 - [c686]Mingyang Zhang, Yi Zhou, Zhizheng Wu, Haizhou Li:
Zero-shot multi-speaker accent TTS with limited accent data. APSIPA ASC 2023: 1931-1936 - [c685]Jiawei Du, Yidi Jiang, Vincent Y. F. Tan, Joey Tianyi Zhou, Haizhou Li:
Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation. CVPR 2023: 3749-3758 - [c684]Jiadong Wang, Xinyuan Qian, Malu Zhang, Robby T. Tan, Haizhou Li:
Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert. CVPR 2023: 14653-14662 - [c683]Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li:
ADD 2023: the Second Audio Deepfake Detection Challenge. DADA@IJCAI 2023: 125-130 - [c682]Siqi Cai, Jia Li, Hongmeng Yang, Haizhou Li:
RGCnet: An Efficient Recursive Gated Convolutional Network for EEG-based Auditory Attention Detection. EMBC 2023: 1-4 - [c681]Chen Zhang, Luis F. D'Haro, Chengguang Tang, Ke Shi, Guohua Tang, Haizhou Li:
xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark. EMNLP (Findings) 2023: 5579-5601 - [c680]Yan Zhang, Zhaopeng Feng, Zhiyang Teng, Zuozhu Liu, Haizhou Li:
How Well Do Text Embedding Models Understand Syntax? EMNLP (Findings) 2023: 9717-9728 - [c679]Hongbo Zhang, Junying Chen, Feng Jiang, Fei Yu, Zhihong Chen, Guiming Chen, Jianquan Li, Xiangbo Wu, Zhiyi Zhang, Qingying Xiao, Xiang Wan, Benyou Wang, Haizhou Li:
HuatuoGPT, Towards Taming Language Model to Be a Doctor. EMNLP (Findings) 2023: 10859-10885 - [c678]Marvin Borsdorf, Saurav Pahuja, Gabriel Ivucic, Siqi Cai, Haizhou Li, Tanja Schultz:
Multi-Head Attention and GRU for Improved Match-Mismatch Classification of Speech Stimulus and EEG Response. ICASSP 2023: 1-2 - [c677]Xiaoxue Gao, Xianghu Yue, Haizhou Li:
Self-Transriber: Few-Shot Lyrics Transcription With Self-Training. ICASSP 2023: 1-5 - [c676]Zexu Pan, Wupeng Wang, Marvin Borsdorf, Haizhou Li:
ImagineNet: Target Speaker Extraction with Intermittent Visual Cue Through Embedding Inpainting. ICASSP 2023: 1-5 - [c675]