default search action
Search dblp
Full-text search
- > Home
Please enter a search query
- case-insensitive prefix search: default
e.g., sig matches "SIGIR" as well as "signal" - exact word search: append dollar sign ($) to word
e.g., graph$ matches "graph", but not "graphics" - boolean and: separate words by space
e.g., codd model - boolean or: connect words by pipe symbol (|)
e.g., graph|network
Update May 7, 2017: Please note that we had to disable the phrase search operator (.) and the boolean not operator (-) due to technical problems. For the time being, phrase search queries will yield regular prefix search result, and search terms preceded by a minus will be interpreted as regular (positive) search terms.
Author search results
Venue search results
no matches
Refine list
refine by author
- no options
- temporarily not available
refine by venue
- no options
- temporarily not available
refine by type
- no options
- temporarily not available
refine by access
- no options
- temporarily not available
refine by year
- no options
- temporarily not available
Publication search results
found 90 matches
- 2024
- Zhe Chen, Jiannan Wu, Wenhai Wang, Weijie Su, Guo Chen, Sen Xing, Muyan Zhong, Qinglong Zhang, Xizhou Zhu, Lewei Lu, Bin Li, Ping Luo, Tong Lu, Yu Qiao, Jifeng Dai:
Intern VL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks. CVPR 2024: 24185-24198 - Hao Li, Xue Yang, Zhaokai Wang, Xizhou Zhu, Jie Zhou, Yu Qiao, Xiaogang Wang, Hongsheng Li, Lewei Lu, Jifeng Dai:
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft. CVPR 2024: 16426-16435 - Yuwen Xiong, Zhiqi Li, Yuntao Chen, Feng Wang, Xizhou Zhu, Jiapeng Luo, Wenhai Wang, Tong Lu, Hongsheng Li, Yu Qiao, Lewei Lu, Jie Zhou, Jifeng Dai:
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications. CVPR 2024: 5652-5661 - Weiyun Wang, Yiming Ren, Haowen Luo, Tiantong Li, Chenxiang Yan, Zhe Chen, Wenhai Wang, Qingyun Li, Lewei Lu, Xizhou Zhu, Yu Qiao, Jifeng Dai:
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World. ECCV (33) 2024: 471-490 - Changyao Tian, Chenxin Tao, Jifeng Dai, Hao Li, Ziheng Li, Lewei Lu, Xiaogang Wang, Hongsheng Li, Gao Huang, Xizhou Zhu:
ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process. ICLR 2024 - Weiyun Wang, Min Shi, Qingyun Li, Wenhai Wang, Zhenhang Huang, Linjie Xing, Zhe Chen, Hao Li, Xizhou Zhu, Zhiguo Cao, Yushi Chen, Tong Lu, Jifeng Dai, Yu Qiao:
The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World. ICLR 2024 - Yuwen Xiong, Zhiqi Li, Yuntao Chen, Feng Wang, Xizhou Zhu, Jiapeng Luo, Wenhai Wang, Tong Lu, Hongsheng Li, Yu Qiao, Lewei Lu, Jie Zhou, Jifeng Dai:
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications. CoRR abs/2401.06197 (2024) - Chongzhi Zhang, Mingyuan Zhang, Zhiyang Teng, Jiayi Li, Xizhou Zhu, Lewei Lu, Ziwei Liu, Aixin Sun:
Multi-scale 2D Temporal Map Diffusion Models for Natural Language Video Localization. CoRR abs/2401.08232 (2024) - Changyao Tian, Xizhou Zhu, Yuwen Xiong, Weiyun Wang, Zhe Chen, Wenhai Wang, Yuntao Chen, Lewei Lu, Tong Lu, Jie Zhou, Hongsheng Li, Yu Qiao, Jifeng Dai:
MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer. CoRR abs/2401.10208 (2024) - Weiyun Wang, Yiming Ren, Haowen Luo, Tiantong Li, Chenxiang Yan, Zhe Chen, Wenhai Wang, Qingyun Li, Lewei Lu, Xizhou Zhu, Yu Qiao, Jifeng Dai:
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World. CoRR abs/2402.19474 (2024) - Yuchen Duan, Weiyun Wang, Zhe Chen, Xizhou Zhu, Lewei Lu, Tong Lu, Yu Qiao, Hongsheng Li, Jifeng Dai, Wenhai Wang:
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures. CoRR abs/2403.02308 (2024) - Zhe Chen, Weiyun Wang, Hao Tian, Shenglong Ye, Zhangwei Gao, Erfei Cui, Wenwen Tong, Kongzhi Hu, Jiapeng Luo, Zheng Ma, Ji Ma, Jiaqi Wang, Xiaoyi Dong, Hang Yan, Hewei Guo, Conghui He, Botian Shi, Zhenjiang Jin, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, Licheng Wen, Xiangchao Yan, Min Dou, Lewei Lu, Xizhou Zhu, Tong Lu, Dahua Lin, Yu Qiao, Jifeng Dai, Wenhai Wang:
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites. CoRR abs/2404.16821 (2024) - Xizhou Zhu, Xue Yang, Zhaokai Wang, Hao Li, Wenhan Dou, Junqi Ge, Lewei Lu, Yu Qiao, Jifeng Dai:
Parameter-Inverted Image Pyramid Networks. CoRR abs/2406.04330 (2024) - Chenxin Tao, Xizhou Zhu, Shiqian Su, Lewei Lu, Changyao Tian, Xuan Luo, Gao Huang, Hongsheng Li, Yu Qiao, Jie Zhou, Jifeng Dai:
Learning 1D Causal Visual Representation with De-focus Attention Networks. CoRR abs/2406.04342 (2024) - Weiyun Wang, Shuibo Zhang, Yiming Ren, Yuchen Duan, Tiantong Li, Shuo Liu, Mengkang Hu, Zhe Chen, Kaipeng Zhang, Lewei Lu, Xizhou Zhu, Ping Luo, Yu Qiao, Jifeng Dai, Wenqi Shao, Wenhai Wang:
Needle In A Multimodal Haystack. CoRR abs/2406.07230 (2024) - Chenyu Yang, Xizhou Zhu, Jinguo Zhu, Weijie Su, Junjie Wang, Xuan Dong, Wenhai Wang, Lewei Lu, Bin Li, Jie Zhou, Yu Qiao, Jifeng Dai:
Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning. CoRR abs/2406.07543 (2024) - Jiannan Wu, Muyan Zhong, Sen Xing, Zeqiang Lai, Zhaoyang Liu, Wenhai Wang, Zhe Chen, Xizhou Zhu, Lewei Lu, Tong Lu, Ping Luo, Yu Qiao, Jifeng Dai:
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks. CoRR abs/2406.08394 (2024) - Qingyun Li, Zhe Chen, Weiyun Wang, Wenhai Wang, Shenglong Ye, Zhenjiang Jin, Guanzhou Chen, Yinan He, Zhangwei Gao, Erfei Cui, Jiashuo Yu, Hao Tian, Jiasheng Zhou, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, Licheng Wen, Xiangchao Yan, Zhenxiang Li, Pei Chu, Yi Wang, Min Dou, Changyao Tian, Xizhou Zhu, Lewei Lu, Yushi Chen, Junjun He, Zhongying Tu, Tong Lu, Yali Wang, Limin Wang, Dahua Lin, Yu Qiao, Botian Shi, Conghui He, Jifeng Dai:
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text. CoRR abs/2406.08418 (2024) - Renjie Liang, Li Li, Chongzhi Zhang, Jing Wang, Xizhou Zhu, Aixin Sun:
TVR-Ranking: A Dataset for Ranked Video Moment Retrieval with Imprecise Queries. CoRR abs/2407.06597 (2024) - Yangzhou Liu, Yue Cao, Zhangwei Gao, Weiyun Wang, Zhe Chen, Wenhai Wang, Hao Tian, Lewei Lu, Xizhou Zhu, Tong Lu, Yu Qiao, Jifeng Dai:
MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity. CoRR abs/2407.15838 (2024) - Fanqing Meng, Jin Wang, Chuanhao Li, Quanfeng Lu, Hao Tian, Jiaqi Liao, Xizhou Zhu, Jifeng Dai, Yu Qiao, Ping Luo, Kaipeng Zhang, Wenqi Shao:
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models. CoRR abs/2408.02718 (2024) - 2023
- Weijie Su, Xizhou Zhu, Chenxin Tao, Lewei Lu, Bin Li, Gao Huang, Yu Qiao, Xiaogang Wang, Jie Zhou, Jifeng Dai:
Towards All-in-One Pre-Training via Maximizing Multi-Modal Mutual Information. CVPR 2023: 15888-15899 - Yihan Hu, Jiazhi Yang, Li Chen, Keyu Li, Chonghao Sima, Xizhou Zhu, Siqi Chai, Senyao Du, Tianwei Lin, Wenhai Wang, Lewei Lu, Xiaosong Jia, Qiang Liu, Jifeng Dai, Yu Qiao, Hongyang Li:
Planning-oriented Autonomous Driving. CVPR 2023: 17853-17862 - Hao Li, Jinguo Zhu, Xiaohu Jiang, Xizhou Zhu, Hongsheng Li, Chun Yuan, Xiaohua Wang, Yu Qiao, Xiaogang Wang, Wenhai Wang, Jifeng Dai:
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks. CVPR 2023: 2691-2700 - Chenxin Tao, Xizhou Zhu, Weijie Su, Gao Huang, Bin Li, Jie Zhou, Yu Qiao, Xiaogang Wang, Jifeng Dai:
Siamese Image Modeling for Self-Supervised Vision Representation Learning. CVPR 2023: 2132-2141 - Wenhai Wang, Jifeng Dai, Zhe Chen, Zhenhang Huang, Zhiqi Li, Xizhou Zhu, Xiaowei Hu, Tong Lu, Lewei Lu, Hongsheng Li, Xiaogang Wang, Yu Qiao:
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions. CVPR 2023: 14408-14419 - Chenyu Yang, Yuntao Chen, Hao Tian, Chenxin Tao, Xizhou Zhu, Zhaoxiang Zhang, Gao Huang, Hongyang Li, Yu Qiao, Lewei Lu, Jie Zhou, Jifeng Dai:
BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision. CVPR 2023: 17830-17839 - Wenhai Wang, Zhe Chen, Xiaokang Chen, Jiannan Wu, Xizhou Zhu, Gang Zeng, Ping Luo, Tong Lu, Jie Zhou, Yu Qiao, Jifeng Dai:
VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks. NeurIPS 2023 - Zhaoyang Liu, Yinan He, Wenhai Wang, Weiyun Wang, Yi Wang, Shoufa Chen, Qinglong Zhang, Zeqiang Lai, Yang Yang, Qingyun Li, Jiashuo Yu, Kunchang Li, Zhe Chen, Xue Yang, Xizhou Zhu, Yali Wang, Limin Wang, Ping Luo, Jifeng Dai, Yu Qiao:
InternGPT: Solving Vision-Centric Tasks by Interacting with Chatbots Beyond Language. CoRR abs/2305.05662 (2023) - Wenhai Wang, Zhe Chen, Xiaokang Chen, Jiannan Wu, Xizhou Zhu, Gang Zeng, Ping Luo, Tong Lu, Jie Zhou, Yu Qiao, Jifeng Dai:
VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks. CoRR abs/2305.11175 (2023)
skipping 60 more matches
loading more results
failed to load more results, please try again later
manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from , , and to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
retrieved on 2024-11-11 21:00 CET from data curated by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint