Xize Cheng

2020 – today

2025
[c34]
  persistent URL:
  - https://dblp.org/rec/conf/aaai/GuoCWJ25
Zirun Guo, Xize Cheng, Yangyang Wu, Tao Jin:
A Wander Through the Multimodal Landscape: Efficient Transfer Learning via Low-rank Sequence Multimodal Adapter. AAAI 2025: 16996-17004
[c33]
  persistent URL:
  - https://dblp.org/rec/conf/coling/0003BCZJJ0YYZ25
Wenrui Liu, Jionghao Bai, Xize Cheng, Jialong Zuo, Ziyue Jiang, Shengpeng Ji, Minghui Fang, Xiaoda Yang, Qian Yang, Zhou Zhao:
VoxpopuliTTS: a large-scale multilingual TTS corpus for zero-shot speech generation. COLING 2025: 10293-10297
[c32]
  persistent URL:
  - https://dblp.org/rec/conf/iclr/0001ZHZL0CJ0Z025
Zehan Wang, Ziang Zhang, Minjie Hong, Hang Zhang, Luping Liu, Rongjie Huang, Xize Cheng, Shengpeng Ji, Tao Jin, Hengshuang Zhao, Zhou Zhao:
OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces. ICLR 2025
[c31]
  persistent URL:
  - https://dblp.org/rec/conf/iclr/ChengHYLF0J0Z0025
Xize Cheng, Ruofan Hu, Xiaoda Yang, Jingyu Lu, Dongjie Fu, Zehan Wang, Shengpeng Ji, Rongjie Huang, Boyang Zhang, Tao Jin, Zhou Zhao:
VoxDialogue: Can Spoken Dialogue Systems Understand Information Beyond Words? ICLR 2025
[c30]
  persistent URL:
  - https://dblp.org/rec/conf/iclr/ChengZW0Z0JZ0025
Xize Cheng, Siqi Zheng, Zehan Wang, Minghui Fang, Ziang Zhang, Rongjie Huang, Shengpeng Ji, Jialong Zuo, Tao Jin, Zhou Zhao:
OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup. ICLR 2025
[i26]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2501-01384
Xize Cheng, Dongjie Fu, Xiaoda Yang, Minghui Fang, Ruofan Hu, Jingyu Lu, Jionghao Bai, Zehan Wang, Shengpeng Ji, Rongjie Huang, Linjun Li, Yu Chen, Tao Jin, Zhou Zhao:
OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios. CoRR abs/2501.01384 (2025)
[i25]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2502-05471
Jialong Zuo, Shengpeng Ji, Minghui Fang, Ziyue Jiang, Xize Cheng, Qian Yang, Wenrui Liu, Guangyan Zhang, Zehai Tu, Yiwen Guo, Zhou Zhao:
Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model. CoRR abs/2502.05471 (2025)
[i24]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2504-17782
Xize Cheng, Slytherin Wang, Zehan Wang, Rongjie Huang, Tao Jin, Zhou Zhao:
Unleashing the Power of Natural Audio Featuring Multiple Sound Sources. CoRR abs/2504.17782 (2025)
2024
[c29]
  persistent URL:
  - https://dblp.org/rec/conf/acl/LiuHHSSCZ24
Huadai Liu, Rongjie Huang, Jinzheng He, Gang Sun, Ran Shen, Xize Cheng, Zhou Zhao:
Wav2SQL: Direct Generalizable Speech-To-SQL Parsing. ACL (Findings) 2024: 4230-4242
[c28]
  persistent URL:
  - https://dblp.org/rec/conf/acl/JinLWLCZ24
Tao Jin, Wang Lin, Ye Wang, Linjun Li, Xize Cheng, Zhou Zhao:
Rethinking the Multimodal Correlation of Multimodal Sequential Learning via Generalizable Attentional Results Alignment. ACL (1) 2024: 5247-5265
[c27]
  persistent URL:
  - https://dblp.org/rec/conf/acl/HongHCWLYZZ24
Zhiqing Hong, Rongjie Huang, Xize Cheng, Yongqi Wang, Ruiqi Li, Fuming You, Zhou Zhao, Zhimeng Zhang:
Text-to-Song: Towards Controllable Music Generation Incorporating Vocal and Accompaniment. ACL (1) 2024: 6248-6261
[c26]
  persistent URL:
  - https://dblp.org/rec/conf/acl/ChengHLWJYCDHZ24
Xize Cheng, Rongjie Huang, Linjun Li, Zehan Wang, Tao Jin, Aoxiong Yin, Feiyang Chen, Xinyu Duan, Baoxing Huai, Zhou Zhao:
TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation. ACL (Findings) 2024: 9973-9986
[c25]
  persistent URL:
  - https://dblp.org/rec/conf/acl/LeiCLHTLX0LZ24
Songju Lei, Xize Cheng, Mengjiao Lyu, Jianqiao Hu, Jintao Tan, Runlin Liu, Lingyu Xiong, Tao Jin, Xiandong Li, Zhou Zhao:
Uni-Dubbing: Zero-Shot Speech Synthesis from Visual Articulation. ACL (1) 2024: 10082-10099
[c24]
  persistent URL:
  - https://dblp.org/rec/conf/emnlp/YangCDQH0JZHZ024
Xiaoda Yang, Xize Cheng, Jiaqi Duan, Hongshun Qiu, Minjie Hong, Minghui Fang, Shengpeng Ji, Jialong Zuo, Zhiqing Hong, Zhimeng Zhang, Tao Jin:
AudioVSR: Enhancing Video Speech Recognition with Audio Data. EMNLP 2024: 15352-15361
[c23]
  persistent URL:
  - https://dblp.org/rec/conf/icmcs/TanCXZLWGLC24
Jintao Tan, Xize Cheng, Lingyu Xiong, Lei Zhu, Xiandong Li, Xianjia Wu, Kai Gong, Minglei Li, Yi Cai:
Landmark-guided Diffusion Model for High-fidelity and Temporally Coherent Talking Head Generation. ICME 2024: 1-6
[c22]
  persistent URL:
  - https://dblp.org/rec/conf/icml/0001ZCHLYHZ0GZ24
Zehan Wang, Ziang Zhang, Xize Cheng, Rongjie Huang, Luping Liu, Zhenhui Ye, Haifeng Huang, Yang Zhao, Tao Jin, Peng Gao, Zhou Zhao:
FreeBind: Free Lunch in Unified Multimodal Space via Knowledge Fusion. ICML 2024
[c21]
  persistent URL:
  - https://dblp.org/rec/conf/icml/HuangHW0C0YYLGZ24
Rongjie Huang, Ruofan Hu, Yongqi Wang, Zehan Wang, Xize Cheng, Ziyue Jiang, Zhenhui Ye, Dongchao Yang, Luping Liu, Peng Gao, Zhou Zhao:
InstructSpeech: Following Speech Editing Instructions via Large Language Models. ICML 2024
[c20]
  persistent URL:
  - https://dblp.org/rec/conf/mm/XiongCTWLZMLXH24
Lingyu Xiong, Xize Cheng, Jintao Tan, Xianjia Wu, Xiandong Li, Lei Zhu, Fei Ma, Minglei Li, Huang Xu, Zhihui Hu:
SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing. ACM Multimedia 2024: 3170-3179
[c19]
  persistent URL:
  - https://dblp.org/rec/conf/mm/FuCYWZJ24
Dongjie Fu, Xize Cheng, Xiaoda Yang, Hanting Wang, Zhou Zhao, Tao Jin:
Boosting Speech Recognition Robustness to Modality-Distortion with Contrast-Augmented Prompts. ACM Multimedia 2024: 3838-3847
[c18]
  persistent URL:
  - https://dblp.org/rec/conf/mm/LiuHLCWCZZ24
Huadai Liu, Rongjie Huang, Yang Liu, Hengyuan Cao, Jialei Wang, Xize Cheng, Siqi Zheng, Zhou Zhao:
AudioLCM: Efficient and High-Quality Text-to-Audio Generation with Minimal Inference Steps. ACM Multimedia 2024: 7008-7017
[c17]
  persistent URL:
  - https://dblp.org/rec/conf/mm/YangCF0ZJZ024
Xiaoda Yang, Xize Cheng, Dongjie Fu, Minghui Fang, Jialong Zuo, Shengpeng Ji, Zhou Zhao, Tao Jin:
SyncTalklip: Highly Synchronized Lip-Readable Speaker Generation with Multi-Task Learning. ACM Multimedia 2024: 8149-8158
[c16]
  persistent URL:
  - https://dblp.org/rec/conf/mm/HuangWHXHYC00YL24
Rongjie Huang, Yongqi Wang, Ruofan Hu, Xiaoshan Xu, Zhiqing Hong, Dongchao Yang, Xize Cheng, Zehan Wang, Ziyue Jiang, Zhenhui Ye, Luping Liu, Siqi Zheng, Zhou Zhao:
VoiceTuner: Self-Supervised Pre-training and Efficient Fine-tuning For Voice Generation. ACM Multimedia 2024: 10630-10639
[c15]
  persistent URL:
  - https://dblp.org/rec/conf/nips/HuangC0HXWLCZPZ24
Haifeng Huang, Yilun Chen, Zehan Wang, Rongjie Huang, Runsen Xu, Tai Wang, Luping Liu, Xize Cheng, Yang Zhao, Jiangmiao Pang, Zhou Zhao:
Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers. NeurIPS 2024
[c14]
  persistent URL:
  - https://dblp.org/rec/conf/nips/YeZR00HLH00C0Z24
Zhenhui Ye, Tianyun Zhong, Yi Ren, Ziyue Jiang, Jiawei Huang, Rongjie Huang, Jinglin Liu, Jinzheng He, Chen Zhang, Zehan Wang, Xize Cheng, Xiang Yin, Zhou Zhao:
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes. NeurIPS 2024
[c13]
  persistent URL:
  - https://dblp.org/rec/conf/nips/Zhang0LHCYLLHZJ24
Ziang Zhang, Zehan Wang, Luping Liu, Rongjie Huang, Xize Cheng, Zhenhui Ye, Wang Lin, Huadai Liu, Haifeng Huang, Yang Zhao, Tao Jin, Siqi Zheng, Zhou Zhao:
Extending Multi-modal Contrastive Representations. NeurIPS 2024
[i23]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2404-09313
Zhiqing Hong, Rongjie Huang, Xize Cheng, Yongqi Wang, Ruiqi Li, Fuming You, Zhou Zhao, Zhimeng Zhang:
Text-to-Song: Towards Controllable Music Generation Incorporating Vocals and Accompaniment. CoRR abs/2404.09313 (2024)
[i22]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2405-04883
Zehan Wang, Ziang Zhang, Xize Cheng, Rongjie Huang, Luping Liu, Zhenhui Ye, Haifeng Huang, Yang Zhao, Tao Jin, Peng Gao, Zhou Zhao:
FreeBind: Free Lunch in Unified Multimodal Space via Knowledge Fusion. CoRR abs/2405.04883 (2024)
[i21]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-00356
Huadai Liu, Rongjie Huang, Yang Liu, Hengyuan Cao, Jialei Wang, Xize Cheng, Siqi Zheng, Zhou Zhao:
AudioLCM: Text-to-Audio Generation with Latent Consistency Models. CoRR abs/2406.00356 (2024)
[i20]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-01205
Shengpeng Ji, Jialong Zuo, Minghui Fang, Siqi Zheng, Qian Chen, Wen Wang, Ziyue Jiang, Hai Huang, Xize Cheng, Rongjie Huang, Zhou Zhao:
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec. CoRR abs/2406.01205 (2024)
[i19]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-17507
Minghui Fang, Shengpeng Ji, Jialong Zuo, Hai Huang, Yan Xia, Jieming Zhu, Xize Cheng, Xiaoda Yang, Wenrui Liu, Gang Wang, Zhenhua Dong, Zhou Zhao:
ACE: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling. CoRR abs/2406.17507 (2024)
[i18]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2407-11895
Zehan Wang, Ziang Zhang, Hang Zhang, Luping Liu, Rongjie Huang, Xize Cheng, Hengshuang Zhao, Zhou Zhao:
OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces. CoRR abs/2407.11895 (2024)
[i17]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2408-01732
Jintao Tan, Xize Cheng, Lingyu Xiong, Lei Zhu, Xiandong Li, Xianjia Wu, Kai Gong, Minglei Li, Yi Cai:
Landmark-guided Diffusion Model for High-fidelity and Temporally Coherent Talking Head Generation. CoRR abs/2408.01732 (2024)
[i16]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2408-16532
Shengpeng Ji, Ziyue Jiang, Xize Cheng, Yifu Chen, Minghui Fang, Jialong Zuo, Qian Yang, Ruiqi Li, Ziang Zhang, Xiaoda Yang, Rongjie Huang, Yidi Jiang, Qian Chen, Siqi Zheng, Wen Wang, Zhou Zhao:
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling. CoRR abs/2408.16532 (2024)
[i15]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2409-03605
Lingyu Xiong, Xize Cheng, Jintao Tan, Xianjia Wu, Xiandong Li, Lei Zhu, Fei Ma, Minglei Li, Huang Xu, Zhihu Hu:
SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing. CoRR abs/2409.03605 (2024)
[i14]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2410-12957
Ruiqi Li, Siqi Zheng, Xize Cheng, Ziang Zhang, Shengpeng Ji, Zhou Zhao:
MuVi: Video-to-Music Generation with Semantic Alignment and Rhythmic Synchronization. CoRR abs/2410.12957 (2024)
[i13]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2410-21269
Xize Cheng, Siqi Zheng, Zehan Wang, Minghui Fang, Ziang Zhang, Rongjie Huang, Ziyang Ma, Shengpeng Ji, Jialong Zuo, Tao Jin, Zhou Zhao:
OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup. CoRR abs/2410.21269 (2024)
[i12]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2411-13577
Shengpeng Ji, Yifu Chen, Minghui Fang, Jialong Zuo, Jingyu Lu, Hanting Wang, Ziyue Jiang, Long Zhou, Shujie Liu, Xize Cheng, Xiaoda Yang, Zehan Wang, Qian Yang, Jian Li, Yidi Jiang, Jingzhen He, Yunfei Chu, Jin Xu, Zhou Zhao:
WavChat: A Survey of Spoken Dialogue Models. CoRR abs/2411.13577 (2024)
[i11]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2412-08979
Zirun Guo, Xize Cheng, Yangyang Wu, Tao Jin:
A Wander Through the Multimodal Landscape: Efficient Transfer Learning via Low-rank Sequence Multimodal Adapter. CoRR abs/2412.08979 (2024)
2023
[c12]
  persistent URL:
  - https://dblp.org/rec/conf/acl/ChengJLLDZ23
Xize Cheng, Tao Jin, Linjun Li, Wang Lin, Xinyu Duan, Zhou Zhao:
OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment. ACL (1) 2023: 6592-6607
[c11]
  persistent URL:
  - https://dblp.org/rec/conf/acl/HuangLC0LYHZLYZ23
Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren, Linjun Li, Zhenhui Ye, Jinzheng He, Lichao Zhang, Jinglin Liu, Xiang Yin, Zhou Zhao:
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation. ACL (1) 2023: 8590-8604
[c10]
  persistent URL:
  - https://dblp.org/rec/conf/acl/WangJLCLZ23
Ye Wang, Tao Jin, Wang Lin, Xize Cheng, Linjun Li, Zhou Zhao:
Semantic-conditioned Dual Adaptation for Cross-domain Query-based Visual Segmentation. ACL (Findings) 2023: 9797-9815
[c9]
  persistent URL:
  - https://dblp.org/rec/conf/acl/WangLZJLCZ23
Ye Wang, Wang Lin, Shengyu Zhang, Tao Jin, Linjun Li, Xize Cheng, Zhou Zhao:
Weakly-Supervised Spoken Video Grounding via Semantic Interaction Learning. ACL (1) 2023: 10914-10932
[c8]
  persistent URL:
  - https://dblp.org/rec/conf/acl/LiJCWLHZ23
Linjun Li, Tao Jin, Xize Cheng, Ye Wang, Wang Lin, Rongjie Huang, Zhou Zhao:
Contrastive Token-Wise Meta-Learning for Unseen Performer Visual Temporal-Aligned Translation. ACL (Findings) 2023: 10993-11007
[c7]
  persistent URL:
  - https://dblp.org/rec/conf/acl/LinJPLCWZ23
Wang Lin, Tao Jin, Wenwen Pan, Linjun Li, Xize Cheng, Ye Wang, Zhou Zhao:
TAVT: Towards Transferable Audio-Visual Text Generation. ACL (1) 2023: 14983-14999
[c6]
  persistent URL:
  - https://dblp.org/rec/conf/emnlp/WangHZLCZYZ23
Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao:
3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding. EMNLP 2023: 10612-10625
[c5]
  persistent URL:
  - https://dblp.org/rec/conf/iccv/WangHZLCZYZ23
Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao:
Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding. ICCV 2023: 2662-2671
[c4]
  persistent URL:
  - https://dblp.org/rec/conf/iccv/LinJWPLCZ23
Wang Lin, Tao Jin, Ye Wang, Wenwen Pan, Linjun Li, Xize Cheng, Zhou Zhao:
Exploring Group Video Captioning with Efficient Relational Approximation. ICCV 2023: 15235-15244
[c3]
  persistent URL:
  - https://dblp.org/rec/conf/iccv/ChengJHLLWWLYZ23
Xize Cheng, Tao Jin, Rongjie Huang, Linjun Li, Wang Lin, Zehan Wang, Ye Wang, Huadai Liu, Aoxiong Yin, Zhou Zhao:
MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition. ICCV 2023: 15689-15699
[c2]
  persistent URL:
  - https://dblp.org/rec/conf/mm/JinCLLWZ23
Tao Jin, Xize Cheng, Linjun Li, Wang Lin, Ye Wang, Zhou Zhao:
Rethinking Missing Modality Learning from a Decoding Perspective. ACM Multimedia 2023: 4431-4439
[c1]
  persistent URL:
  - https://dblp.org/rec/conf/nips/WangZCHLYTLWZZ23
Zehan Wang, Yang Zhao, Xize Cheng, Haifeng Huang, Jiageng Liu, Aoxiong Yin, Li Tang, Linjun Li, Yongqi Wang, Ziang Zhang, Zhou Zhao:
Connecting Multi-modal Contrastive Representations. NeurIPS 2023
[i10]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-05309
Xize Cheng, Linjun Li, Tao Jin, Rongjie Huang, Wang Lin, Zehan Wang, Huangdai Liu, Ye Wang, Aoxiong Yin, Zhou Zhao:
MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition. CoRR abs/2303.05309 (2023)
[i9]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-12552
Huadai Liu, Rongjie Huang, Jinzheng He, Gang Sun, Ran Shen, Xize Cheng, Zhou Zhao:
Wav2SQL: Direct Generalizable Speech-To-SQL Parsing. CoRR abs/2305.12552 (2023)
[i8]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-14381
Zehan Wang, Yang Zhao, Xize Cheng, Haifeng Huang, Jiageng Liu, Li Tang, Linjun Li, Yongqi Wang, Aoxiong Yin, Ziang Zhang, Zhou Zhao:
Connecting Multi-modal Contrastive Representations. CoRR abs/2305.14381 (2023)
[i7]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-15403
Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren, Linjun Li, Zhenhui Ye, Jinzheng He, Lichao Zhang, Jinglin Liu, Xiang Yin, Zhou Zhao:
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation. CoRR abs/2305.15403 (2023)
[i6]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-06410
Xize Cheng, Tao Jin, Linjun Li, Wang Lin, Xinyu Duan, Zhou Zhao:
OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment. CoRR abs/2306.06410 (2023)
[i5]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2307-09267
Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao:
Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding. CoRR abs/2307.09267 (2023)
[i4]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2307-13363
Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao:
3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding. CoRR abs/2307.13363 (2023)
[i3]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-08168
Haifeng Huang, Zehan Wang, Rongjie Huang, Luping Liu, Xize Cheng, Yang Zhao, Tao Jin, Zhou Zhao:
Chat-3D v2: Bridging 3D Scene and Large Language Models with Object Identifiers. CoRR abs/2312.08168 (2023)
[i2]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-15197
Xize Cheng, Rongjie Huang, Linjun Li, Tao Jin, Zehan Wang, Aoxiong Yin, Minglei Li, Xinyu Duan, Changpeng Yang, Zhou Zhao:
TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation. CoRR abs/2312.15197 (2023)
2022
[i1]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2211-11255
Luping Liu, Yi Ren, Xize Cheng, Zhou Zhao:
Diffusion Denoising Process for Perceptron Bias in Out-of-distribution Detection. CoRR abs/2211.11255 (2022)
