Gedas Bertasius
2020 – today
- 2024
- [c39] Xiyao Wang, Yuhang Zhou, Xiaoyu Liu, Hongjin Lu, Yuancheng Xu, Feihong He, Jaehong Yoon, Taixi Lu, Fuxiao Liu, Gedas Bertasius, Mohit Bansal, Huaxiu Yao, Furong Huang: Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences. ACL (1) 2024: 416-442
- [c38] Jun Myeong Choi, Johnathan Leung, Noah Frahm, Max Christman, Gedas Bertasius, Roni Sengupta: Building Secure and Engaging Video Communication by Using Monitor Illumination. CVPR Workshops 2024: 4377-4386
- [c37] Md Mohaiminul Islam, Ngan Ho, Xitong Yang, Tushar Nagarajan, Lorenzo Torresani, Gedas Bertasius: Video ReCap: Recursive Captioning of Hour-Long Videos. CVPR 2024: 18198-18208
- [c36] Xizi Wang, Feng Cheng, Gedas Bertasius: LoCoNet: Long-Short Context Network for Active Speaker Detection. CVPR 2024: 18462-18472
- [c35] Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zachary Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, María Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Dutt Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J. Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina González, Prince Gupta, Jiabo Hu, Yifei Huang, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbeláez, Gedas Bertasius, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard A. Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray: Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives. CVPR 2024: 19383-19400
- [c34] Tanveer Hannan, Md Mohaiminul Islam, Thomas Seidl, Gedas Bertasius: RGNet: A Unified Clip Retrieval and Grounding Network for Long Videos. ECCV (21) 2024: 352-369
- [c33] Feng Cheng, Mi Luo, Huiyu Wang, Alex Dimakis, Lorenzo Torresani, Gedas Bertasius, Kristen Grauman: 4DIFF: 3D-Aware Diffusion Model for Third-to-First Viewpoint Translation. ECCV (24) 2024: 409-427
- [c32] Ce Zhang, Taixi Lu, Md Mohaiminul Islam, Ziyang Wang, Shoubin Yu, Mohit Bansal, Gedas Bertasius: A Simple LLM Framework for Long-Range Video Question-Answering. EMNLP 2024: 21715-21737
- [i43] Xiyao Wang, Yuhang Zhou, Xiaoyu Liu, Hongjin Lu, Yuancheng Xu, Feihong He, Jaehong Yoon, Taixi Lu, Gedas Bertasius, Mohit Bansal, Huaxiu Yao, Furong Huang: Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences. CoRR abs/2401.10529 (2024)
- [i42] Md Mohaiminul Islam, Ngan Ho, Xitong Yang, Tushar Nagarajan, Lorenzo Torresani, Gedas Bertasius: Video ReCap: Recursive Captioning of Hour-Long Videos. CoRR abs/2402.13250 (2024)
- [i41] Feng Cheng, Ziyang Wang, Yi-Lin Sung, Yan-Bo Lin, Mohit Bansal, Gedas Bertasius: DAM: Dynamic Adapter Merging for Continual Video QA Learning. CoRR abs/2403.08755 (2024)
- [i40] Yue Yang, Bryce Ikeda, Gedas Bertasius, Daniel Szafir: Augmented Reality Demonstrations for Scalable Robot Imitation Learning. CoRR abs/2403.13910 (2024)
- [i39] Yan-Bo Lin, Gedas Bertasius: Siamese Vision Transformers are Scalable Audio-visual Learners. CoRR abs/2403.19638 (2024)
- [i38] Ziyang Wang, Shoubin Yu, Elias Stengel-Eskin, Jaehong Yoon, Feng Cheng, Gedas Bertasius, Mohit Bansal: VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos. CoRR abs/2405.19209 (2024)
- [i37] Yan-Bo Lin, Yu Tian, Linjie Yang, Gedas Bertasius, Heng Wang: VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos. CoRR abs/2409.07450 (2024)
- [i36] Md Mohaiminul Islam, Tushar Nagarajan, Huiyu Wang, Fu-Jen Chu, Kris Kitani, Gedas Bertasius, Xitong Yang: Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos. CoRR abs/2409.20557 (2024)
- 2023
- [j1] Avinash Madasu, Estelle Aflalo, Gabriela Ben Melech Stan, Shachar Rosenman, Shao-Yen Tseng, Gedas Bertasius, Vasudev Lal: MuMUR: Multilingual Multimodal Universal Retrieval. Inf. Retr. J. 26(1): 5 (2023)
- [c31] Yan-Bo Lin, Yi-Lin Sung, Jie Lei, Mohit Bansal, Gedas Bertasius: Vision Transformers are Parameter-Efficient Audio-Visual Learners. CVPR 2023: 2299-2309
- [c30] Feng Cheng, Xizi Wang, Jie Lei, David J. Crandall, Mohit Bansal, Gedas Bertasius: VindLU: A Recipe for Effective Video-and-Language Pretraining. CVPR 2023: 10739-10750
- [c29] Md Mohaiminul Islam, Mahmudul Hasan, Kishan Shamsundar Athrey, Tony Braskich, Gedas Bertasius: Efficient Movie Scene Detection using State-Space Transformers. CVPR 2023: 18749-18758
- [c28] Avinash Madasu, Estelle Aflalo, Gabriela Ben Melech Stan, Shao-Yen Tseng, Gedas Bertasius, Vasudev Lal: Improving Video Retrieval Using Multilingual Knowledge Transfer. ECIR (1) 2023: 669-684
- [c27] Ziyang Wang, Yi-Lin Sung, Feng Cheng, Gedas Bertasius, Mohit Bansal: Unified Coarse-to-Fine Alignment for Video-Text Retrieval. ICCV 2023: 2804-2815
- [c26] Qin Liu, Zhenlin Xu, Gedas Bertasius, Marc Niethammer: SimpleClick: Interactive Image Segmentation with Simple Vision Transformers. ICCV 2023: 22233-22243
- [i35] Xizi Wang, Feng Cheng, Gedas Bertasius, David Crandall: LoCoNet: Long-Short Context Network for Active Speaker Detection. CoRR abs/2301.08237 (2023)
- [i34] Ziyang Wang, Yi-Lin Sung, Feng Cheng, Gedas Bertasius, Mohit Bansal: Unified Coarse-to-Fine Alignment for Video-Text Retrieval. CoRR abs/2309.10091 (2023)
- [i33] Tanveer Hannan, Md Mohaiminul Islam, Thomas Seidl, Gedas Bertasius: RGNet: A Unified Retrieval and Grounding Network for Long Videos. CoRR abs/2312.06729 (2023)
- [i32] Ce Zhang, Taixi Lu, Md Mohaiminul Islam, Ziyang Wang, Shoubin Yu, Mohit Bansal, Gedas Bertasius: A Simple LLM Framework for Long-Range Video Question-Answering. CoRR abs/2312.17235 (2023)
- 2022
- [c25] Xudong Lin, Fabio Petroni, Gedas Bertasius, Marcus Rohrbach, Shih-Fu Chang, Lorenzo Torresani: Learning To Recognize Procedural Activities with Distant Supervision. CVPR 2022: 13843-13853
- [c24] Jue Wang, Gedas Bertasius, Du Tran, Lorenzo Torresani: Long-Short Temporal Contrastive Learning of Video Transformers. CVPR 2022: 13990-14000
- [c23] Md Mohaiminul Islam, Gedas Bertasius: Long Movie Clip Classification with State-Space Video Models. ECCV (35) 2022: 87-104
- [c22] Yan-Bo Lin, Jie Lei, Mohit Bansal, Gedas Bertasius: EclipSE: Efficient Long-Range Video Retrieval Using Sight and Sound. ECCV (34) 2022: 413-430
- [c21] Feng Cheng, Gedas Bertasius: TallFormer: Temporal Action Localization with a Long-Memory Transformer. ECCV (34) 2022: 503-521
- [c20] Avinash Madasu, Junier Oliva, Gedas Bertasius: Learning to Retrieve Videos by Asking Questions. ACM Multimedia 2022: 356-365
- [i31] Xudong Lin, Fabio Petroni, Gedas Bertasius, Marcus Rohrbach, Shih-Fu Chang, Lorenzo Torresani: Learning To Recognize Procedural Activities with Distant Supervision. CoRR abs/2201.10990 (2022)
- [i30] Feng Cheng, Gedas Bertasius: TALLFormer: Temporal Action Localization with Long-memory Transformer. CoRR abs/2204.01680 (2022)
- [i29] Md Mohaiminul Islam, Gedas Bertasius: Long Movie Clip Classification with State-Space Video Models. CoRR abs/2204.01692 (2022)
- [i28] Yan-Bo Lin, Jie Lei, Mohit Bansal, Gedas Bertasius: ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound. CoRR abs/2204.02874 (2022)
- [i27] Avinash Madasu, Junier Oliva, Gedas Bertasius: Learning to Retrieve Videos by Asking Questions. CoRR abs/2205.05739 (2022)
- [i26] Md Mohaiminul Islam, Gedas Bertasius: Object State Change Classification in Egocentric Videos using the Divided Space-Time Attention Mechanism. CoRR abs/2207.11814 (2022)
- [i25] Avinash Madasu, Estelle Aflalo, Gabriela Ben Melech Stan, Shao-Yen Tseng, Gedas Bertasius, Vasudev Lal: Improving video retrieval using multilingual knowledge transfer. CoRR abs/2208.11553 (2022)
- [i24] Qin Liu, Zhenlin Xu, Gedas Bertasius, Marc Niethammer: SimpleClick: Interactive Image Segmentation with Simple Vision Transformers. CoRR abs/2210.11006 (2022)
- [i23] Feng Cheng, Xizi Wang, Jie Lei, David J. Crandall, Mohit Bansal, Gedas Bertasius: VindLU: A Recipe for Effective Video-and-Language Pretraining. CoRR abs/2212.05051 (2022)
- [i22] Yan-Bo Lin, Yi-Lin Sung, Jie Lei, Mohit Bansal, Gedas Bertasius: Vision Transformers are Parameter-Efficient Audio-Visual Learners. CoRR abs/2212.07983 (2022)
- [i21] Md Mohaiminul Islam, Mahmudul Hasan, Kishan Shamsundar Athrey, Tony Braskich, Gedas Bertasius: Efficient Movie Scene Detection using State-Space Transformers. CoRR abs/2212.14427 (2022)
- 2021
- [c19] Xudong Lin, Gedas Bertasius, Jue Wang, Shih-Fu Chang, Devi Parikh, Lorenzo Torresani: Vx2Text: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs. CVPR 2021: 7005-7015
- [c18] Gedas Bertasius, Heng Wang, Lorenzo Torresani: Is Space-Time Attention All You Need for Video Understanding? ICML 2021: 813-824
- [c17] Yang Wang, Gedas Bertasius, Tae-Hyun Oh, Abhinav Gupta, Minh Hoai, Lorenzo Torresani: Supervoxel Attention Graphs for Long-Range Video Modeling. WACV 2021: 155-166
- [i20] Xudong Lin, Gedas Bertasius, Jue Wang, Shih-Fu Chang, Devi Parikh, Lorenzo Torresani: VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs. CoRR abs/2101.12059 (2021)
- [i19] Gedas Bertasius, Heng Wang, Lorenzo Torresani: Is Space-Time Attention All You Need for Video Understanding? CoRR abs/2102.05095 (2021)
- [i18] Jue Wang, Gedas Bertasius, Du Tran, Lorenzo Torresani: Long-Short Temporal Contrastive Learning of Video Transformers. CoRR abs/2106.09212 (2021)
- 2020
- [c16] Yang Wang, Vinh Tran, Gedas Bertasius, Lorenzo Torresani, Minh Hoai Nguyen: Attentive Action and Context Factorization. BMVC 2020
- [c15] Gedas Bertasius, Lorenzo Torresani: Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation. CVPR 2020: 9736-9745
- [c14] Gedas Bertasius, Lorenzo Torresani: COBE: Contextualized Object Embeddings from Narrated Instructional Video. NeurIPS 2020
- [i17] Gedas Bertasius, Lorenzo Torresani: COBE: Contextualized Object Embeddings from Narrated Instructional Video. CoRR abs/2007.07306 (2020)
2010 – 2019
- 2019
- [c13] Gedas Bertasius, Christoph Feichtenhofer, Du Tran, Jianbo Shi, Lorenzo Torresani: Learning Temporal Pose Estimation from Sparsely-Labeled Videos. NeurIPS 2019: 3021-3032
- [i16] Yang Wang, Vinh Tran, Gedas Bertasius, Lorenzo Torresani, Minh Hoai: Attentive Action and Context Factorization. CoRR abs/1904.05410 (2019)
- [i15] Gedas Bertasius, Christoph Feichtenhofer, Du Tran, Jianbo Shi, Lorenzo Torresani: Learning Temporal Pose Estimation from Sparsely-Labeled Videos. CoRR abs/1906.04016 (2019)
- [i14] Gedas Bertasius, Lorenzo Torresani: Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation. CoRR abs/1912.04573 (2019)
- 2018
- [c12] Gedas Bertasius, Aaron Chan, Jianbo Shi: Egocentric Basketball Motion Planning From a Single First-Person Image. CVPR 2018: 5889-5898
- [c11] Gedas Bertasius, Lorenzo Torresani, Jianbo Shi: Object Detection in Video with Spatiotemporal Sampling Networks. ECCV (12) 2018: 342-357
- [i13] Gedas Bertasius, Aaron Chan, Jianbo Shi: Egocentric Basketball Motion Planning from a Single First-Person Image. CoRR abs/1803.01413 (2018)
- [i12] Gedas Bertasius, Lorenzo Torresani, Jianbo Shi: Object Detection in Video with Spatiotemporal Sampling Networks. CoRR abs/1803.05549 (2018)
- [i11] Gedas Bertasius, Christoph Feichtenhofer, Du Tran, Jianbo Shi, Lorenzo Torresani: Learning Discriminative Motion Features Through Detection. CoRR abs/1812.04172 (2018)
- 2017
- [c10] Gedas Bertasius, Qiang Liu, Lorenzo Torresani, Jianbo Shi: Local Perturb-and-MAP for Structured Prediction. AISTATS 2017: 585-594
- [c9] Gedas Bertasius, Lorenzo Torresani, Stella X. Yu, Jianbo Shi: Convolutional Random Walk Networks for Semantic Image Segmentation. CVPR 2017: 6137-6145
- [c8] Gedas Bertasius, Hyun Soo Park, Stella X. Yu, Jianbo Shi: Unsupervised Learning of Important Objects from First-Person Videos. ICCV 2017: 1974-1982
- [c7] Gedas Bertasius, Hyun Soo Park, Stella X. Yu, Jianbo Shi: Am I a Baller? Basketball Performance Assessment from First-Person Videos. ICCV 2017: 2196-2204
- [c6] Jianbo Shi, Gedas Bertasius: Using Cross-Model EgoSupervision to Learn Cooperative Basketball Intention. ICCV Workshops 2017: 2355-2363
- [c5] Gedas Bertasius, Hyun Soo Park, Stella X. Yu, Jianbo Shi: First-Person Action-Object Detection with EgoNet. Robotics: Science and Systems 2017
- [i10] Gedas Bertasius, Jianbo Shi: Using Cross-Model EgoSupervision to Learn Cooperative Basketball Intention. CoRR abs/1709.01630 (2017)
- 2016
- [c4] Gedas Bertasius, Jianbo Shi, Lorenzo Torresani: Semantic Segmentation with Boundary Neural Fields. CVPR 2016: 3602-3610
- [c3] Isabella Nogues, Le Lu, Xiaosong Wang, Holger Roth, Gedas Bertasius, Nathan Lay, Jianbo Shi, Yohannes Tsehay, Ronald M. Summers: Automatic Lymph Node Cluster Segmentation Using Holistically-Nested Neural Networks and Structured Optimization in CT Images. MICCAI (2) 2016: 388-397
- [i9] Gedas Bertasius, Hyun Soo Park, Stella X. Yu, Jianbo Shi: First Person Action-Object Detection with EgoNet. CoRR abs/1603.04908 (2016)
- [i8] Gedas Bertasius, Lorenzo Torresani, Stella X. Yu, Jianbo Shi: Convolutional Random Walk Networks for Semantic Image Segmentation. CoRR abs/1605.07681 (2016)
- [i7] Gedas Bertasius, Qiang Liu, Lorenzo Torresani, Jianbo Shi: Local Perturb-and-MAP for Structured Prediction. CoRR abs/1605.07686 (2016)
- [i6] Gedas Bertasius, Stella X. Yu, Jianbo Shi: Exploiting Visual-Spatial First-Person Co-Occurrence for Action-Object Detection without Labels. CoRR abs/1611.05335 (2016)
- [i5] Gedas Bertasius, Stella X. Yu, Hyun Soo Park, Jianbo Shi: Am I a Baller? Basketball Skill Assessment using First-Person Cameras. CoRR abs/1611.05365 (2016)
- 2015
- [c2] Gedas Bertasius, Jianbo Shi, Lorenzo Torresani: DeepEdge: A multi-scale bifurcated deep network for top-down contour detection. CVPR 2015: 4380-4389
- [c1] Gedas Bertasius, Jianbo Shi, Lorenzo Torresani: High-for-Low and Low-for-High: Efficient Boundary Detection from Deep Object Features and Its Applications to High-Level Vision. ICCV 2015: 504-512
- [i4] Gedas Bertasius, Jianbo Shi, Lorenzo Torresani: High-for-Low and Low-for-High: Efficient Boundary Detection from Deep Object Features and its Applications to High-Level Vision. CoRR abs/1504.06201 (2015)
- [i3] Gedas Bertasius, Jianbo Shi, Lorenzo Torresani: Semantic Segmentation with Boundary Neural Fields. CoRR abs/1511.02674 (2015)
- [i2] Gedas Bertasius, Hyun Soo Park, Jianbo Shi: Exploiting Egocentric Object Prior for 3D Saliency Detection. CoRR abs/1511.02682 (2015)
- 2014
- [i1] Gedas Bertasius, Jianbo Shi, Lorenzo Torresani: DeepEdge: A Multi-Scale Bifurcated Deep Network for Top-Down Contour Detection. CoRR abs/1412.1123 (2014)
last updated on 2024-11-15 20:38 CET by the dblp team
all metadata released as open data under CC0 1.0 license