


18th ECCV 2024: Milan, Italy - Part XXIII
- Ales Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol: Computer Vision - ECCV 2024 - 18th European Conference, Milan, Italy, September 29-October 4, 2024, Proceedings, Part XXIII. Lecture Notes in Computer Science 15081, Springer 2025, ISBN 978-3-031-73336-9
- Kwanyong Park, Kuniaki Saito, Donghyun Kim: Weak-to-Strong Compositional Learning from Generative Models for Language-Based Object Detection. 1-19
- Wanting Zhang, Huisi Wu, Jing Qin: Domesticating SAM for Breast Ultrasound Image Segmentation via Spatial-Frequency Fusion and Uncertainty Correction. 20-37
- Jisu Shin, Junmyeong Lee, Seongmin Lee, Min-Gyu Park, Ju-Mi Kang, Ju Hong Yoon, Hae-Gon Jeon: CanonicalFusion: Generating Drivable 3D Human Avatars from Multiple Images. 38-56
- Genki Kinoshita, Ko Nishino: Camera Height Doesn't Change: Unsupervised Training for Metric Monocular Road-Scene Depth Estimation. 57-73
- Xiang Li, Jian Ding, Zhaoyang Chen, Mohamed Elhoseiny: Uni3DL: A Unified Model for 3D Vision-Language Understanding. 74-92
- Yunyi Gao, Lin Gu, Qiankun Liu, Ying Fu: Object-Aware NIR-to-Visible Translation. 93-109
- Tanvir Mahmud, Burhaneddin Yaman, Chun-Hao Liu, Diana Marculescu: PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for Faster Inference. 110-128
- Henry Hengyuan Zhao, Pan Zhou, Mike Zheng Shou: GENIXER: Empowering Multimodal Large Language Model as a Powerful Data Generator. 129-147
- Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A. Smith, Wei-Chiu Ma, Ranjay Krishna: BLINK: Multimodal Large Language Models Can See but Not Perceive. 148-166
- Lorenzo Mur-Labadia, Ruben Martinez-Cantin, José J. Guerrero, Giovanni Maria Farinella, Antonino Furnari: AFF-ttention! Affordances and Attention Models for Short-Term Object Interaction Anticipation. 167-184
- Lixuan Zhang, Meina Kan, Shiguang Shan, Xilin Chen: PreLAR: World Model Pre-training with Learnable Action Representation. 185-201
- Fabien Baradel, Matthieu Armando, Salma Galaaoui, Romain Brégier, Philippe Weinzaepfel, Grégory Rogez, Thomas Lucas: Multi-HMR: Multi-person Whole-Body Human Mesh Recovery in a Single Shot. 202-218
- Ziyang Liang, Yiwei Bao, Feng Lu: De-confounded Gaze Estimation. 219-235
- Fabio Tosi, Pierluigi Zama Ramirez, Matteo Poggi: Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions. 236-257
- Hao Li, Yanhao Jia, Peng Jin, Zesen Cheng, Kehan Li, Jialu Sui, Chang Liu, Li Yuan: FreestyleRet: Retrieving Images from Style-Diversified Queries. 258-274
- Phillip Y. Lee, Minhyuk Sung: ReGround: Improving Textual and Spatial Grounding at No Cost. 275-292
- Jiewen Yang, Yiqun Lin, Bin Pu, Jiarong Guo, Xiaowei Xu, Xiaomeng Li: CardiacNet: Learning to Reconstruct Abnormalities for Cardiac Disease Assessment from Echocardiogram Videos. 293-311
- Penghui Du, Yu Wang, Yifan Sun, Luting Wang, Yue Liao, Gang Zhang, Errui Ding, Yan Wang, Jingdong Wang, Si Liu: LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction. 312-328
- Lingyu Zhu, Wenhan Yang, Baoliang Chen, Hanwei Zhu, Zhangkai Ni, Qi Mao, Shiqi Wang: Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement. 329-347
- Alexandre Eymaël, Renaud Vandeghen, Anthony Cioppa, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck: Efficient Image Pre-training with Siamese Cropped Masked Autoencoders. 348-366
- Zhixue Fang, Yuzhi Liu, Huisi Wu, Jing Qin: VP-SAM: Taming Segment Anything Model for Video Polyp Segmentation via Disentanglement and Spatio-Temporal Side Network. 367-383
- Orest Kupyn, Christian Rupprecht: Dataset Enhancement with Instance-Level Augmentations. 384-402
- Zhikai Zhang, Yitang Li, Haofeng Huang, Mingxian Lin, Li Yi: FreeMotion: MoCap-Free Human Motion Synthesis with Multimodal Large Language Models. 403-421
- Donggyun Kim, Seongwoong Cho, Semin Kim, Chong Luo, Seunghoon Hong: Chameleon: A Data-Efficient Generalist for Dense Visual Prediction in the Wild. 422-441
- Thibaut Loiseau, Tuan-Hung Vu, Mickaël Chen, Patrick Pérez, Matthieu Cord: Reliability in Semantic Segmentation: Can We Use Synthetic Data? 442-459
- Runmin Zhang, Jun Ma, Si-Yuan Cao, Lun Luo, Beinan Yu, Shu-Jie Chen, Junwei Li, Hui-Liang Shen: SCPNet: Unsupervised Cross-Modal Homography Estimation via Intra-modal Self-supervised Learning. 460-477
- Yujia Liang, Zixuan Ye, Wenze Liu, Hao Lu: SCAPE: A Simple and Strong Category-Agnostic Pose Estimator. 478-494
