


ICCV 2021: Montreal, QC, Canada
- 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, October 10-17, 2021. IEEE 2021, ISBN 978-1-6654-2812-5
- Abdullah Hamdi, Silvio Giancola, Bernard Ghanem:
MVTN: Multi-View Transformation Network for 3D Shape Recognition. 1-11 - Boyu Chen, Peixia Li, Chuming Li, Baopu Li, Lei Bai, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang:
GLiT: Neural Architecture Search for Global and Local Image Transformer. 12-21 - Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang:
CvT: Introducing Convolutions to Vision Transformers. 22-31 - Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou:
Going deeper with Image Transformers. 32-42 - Bin Xiao, Haifeng Wu, Xiuli Bi:
DTMNet: A Discrete Tchebichef Moments-based Deep Neural Network for Multi-focus Image Fusion. 43-51 - Zhiqiang Tang, Yunhe Gao, Yi Zhu, Zhi Zhang, Mu Li, Dimitris N. Metaxas:
CrossNorm and SelfNorm for Generalization under Distribution Shifts. 52-61 - Zhi-Fan Wu, Tong Wei, Jianwen Jiang, Chaojie Mao, Mingqian Tang, Yu-Feng Li:
NGC: A Unified Framework for Learning with Open-World Noisy Data. 62-71 - Xiong Zhou, Xianming Liu, Chenyang Wang, Deming Zhai, Junjun Jiang, Xiangyang Ji:
Learning with Noisy Labels via Sparse Regularization. 72-81 - Tal Ridnik, Emanuel Ben Baruch, Nadav Zamir, Asaf Noy, Itamar Friedman, Matan Protter, Lihi Zelnik-Manor:
Asymmetric Loss For Multi-Label Classification. 82-91 - Han-Jia Ye, De-Chuan Zhan, Wei-Lun Chao:
Procrustean Training for Imbalanced Deep Learning. 92-102 - Yunrui Guo, Guglielmo Camporese, Wenjing Yang, Alessandro Sperduti, Lamberto Ballan:
Conditional Variational Capsule Network for Open Set Recognition. 103-111 - Jiarui Cai, Yizhou Wang, Jenq-Neng Hwang:
ACE: Ally Complementary Experts for Solving Long-Tailed Recognition in One-Shot. 112-121 - Shiming Chen, Wenjie Wang, Beihao Xia, Qinmu Peng, Xinge You, Feng Zheng, Ling Shao:
FREE: Feature Refinement for Generalized Zero-Shot Learning. 122-131 - Jinheng Xie, Cheng Luo, Xiangping Zhu, Ziqi Jin, Weizeng Lu, Linlin Shen:
Online Refinement of Low-level Feature Based Activation Map for Weakly Supervised Object Localization. 132-141 - Nanyi Fei, Yizhao Gao, Zhiwu Lu, Tao Xiang:
Z-Score Normalization, Hubness, and Few-Shot Learning. 142-151 - Abhishek Aich, Meng Zheng, Srikrishna Karanam, Terrence Chen, Amit K. Roy-Chowdhury, Ziyan Wu:
Spatio-Temporal Representation Factorization for Video-based Person Re-Identification. 152-162 - Jiawei Zhao, Ke Yan, Yifan Zhao, Xiaowei Guo, Feiyue Huang, Jia Li:
Transformer-based Dual Relation Graph for Multi-label Image Recognition. 163-172 - Didik Purwanto, Yie-Tarng Chen, Wen-Hsien Fang:
Dance with Self-Attention: A New Look of Conditional Random Fields on Anomaly Detection in Videos. 173-183 - Ke Zhu, Jianxin Wu:
Residual Attention: A Simple but Effective Method for Multi-Label Recognition. 184-193 - Ming Li, Xinming Huang, Ziming Zhang:
Self-supervised Geometric Features Discovery via Interpretable Attention for Vehicle Re-Identification and Beyond. 194-204 - Jiajian Zhao, Yifan Zhao, Jia Li, Ke Yan, Yonghong Tian:
Heterogeneous Relational Complement for Vehicle Re-identification. 205-214 - Yukun Huang, Xueyang Fu, Zheng-Jun Zha:
Attack-Guided Perceptual Data Generation for Real-world Re-Identification. 215-224 - Ziyu Wei, Xi Yang, Nannan Wang, Xinbo Gao:
Syncretic Modality Collaborative Learning for Visible Infrared Person Re-Identification. 225-234 - Yin-Yin He, Jianxin Wu, Xiu-Shen Wei:
Distilling Virtual Examples for Long-tailed Recognition. 235-244 - Florian Strohm, Ekta Sood, Sven Mayer, Philipp Müller, Mihai Bâce, Andreas Bulling:
Neural Photofit: Gaze-based Mental Image Reconstruction. 245-254 - Philipp Bomatter, Mengmi Zhang, Dimitar Karev, Spandan Madan, Claire Tseng, Gabriel Kreiman:
When Pigs Fly: Contextual Reasoning in Synthetic and Natural Scenes. 255-264 - Juan León Alcázar, Fabian Caba Heilbron, Ali K. Thabet, Bernard Ghanem:
MAAS: Multi-modal Assignation for Active Speaker Detection. 265-274 - Sagnik Majumder, Ziad Al-Halah, Kristen Grauman:
Move2Hear: Active Audio-Visual Source Separation. 275-285 - Nikhil Singh, Jeff Mentch, Jerry Ng, Matthew Beveridge, Iddo Drori:
Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis. 286-295 - Minsu Kim, Joanna Hong, Se Jin Park, Yong Man Ro:
Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video. 296-306 - Boyu Chen, Peixia Li, Baopu Li, Chen Lin, Chuming Li, Ming Sun, Junjie Yan, Wanli Ouyang:
BN-NAS: Neural Architecture Search with Batch Normalization. 307-316 - Kun Yuan, Quanquan Li, Shaopeng Guo, Dapeng Chen, Aojun Zhou, Fengwei Yu, Ziwei Liu:
Differentiable Dynamic Wirings for Neural Networks. 317-326 - Daquan Zhou, Xiaojie Jin, Xiaochen Lian, Linjie Yang, Yujing Xue, Qibin Hou, Jiashi Feng:
AutoSpace: Neural Architecture Search with Less Human Interference. 327-336 - Ming Lin, Pichao Wang, Zhenhong Sun, Hesen Chen, Xiuyu Sun, Qi Qian, Hao Li, Rong Jin:
Zen-NAS: A Zero-Shot NAS for High-Performance Image Recognition. 337-346 - Chun-Fu (Richard) Chen, Quanfu Fan, Rameswar Panda:
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification. 347-356 - Zhiliang Peng, Wei Huang, Shanzhi Gu, Lingxi Xie, Yaowei Wang, Jianbin Jiao, Qixiang Ye:
Conformer: Local Features Coupling Global Representations for Visual Recognition. 357-366 - Zizheng Pan, Bohan Zhuang, Jing Liu, Haoyu He, Jianfei Cai:
Scalable Vision Transformers with Hierarchical Pooling. 367-376 - Xiaoyu Yue, Shuyang Sun, Zhanghui Kuang, Meng Wei, Philip H. S. Torr, Wayne Zhang, Dahua Lin:
Vision Transformer with Progressive Sampling. 377-386 - Hila Chefer, Shir Gur, Lior Wolf:
Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers. 387-396 - Xin Wei, Yifei Gong, Fudong Wang, Xing Sun, Jian Sun:
Learning Canonical View Representation for 3D Shape Recognition with Arbitrary Views. 397-406 - Cheng Zhang, Tai-Yu Pan, Yandong Li, Hexiang Hu, Dong Xuan, Soravit Changpinyo, Boqing Gong, Wei-Lun Chao:
MosaicOS: A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection. 407-417 - Xiaoshi Wu, Hadar Averbuch-Elor, Jin Sun, Noah Snavely:
Towers of Babel: Combining Images, Language, and 3D Geometry for Learning Multimodal Vision. 418-427 - Bo Xu, Han Huang, Cheng Lu, Ziwen Li, Yandong Guo:
Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction. 428-437 - Ziwei Wang, Yonhon Ng, Cedric Scheerlinck, Robert E. Mahony:
An Asynchronous Kalman Filter for Hybrid Event Cameras. 438-447 - Guangyao Chen, Peixi Peng, Li Ma, Jia Li, Lin Du, Yonghong Tian:
Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain. 448-457 - Yunsheng Li, Yinpeng Chen, Xiyang Dai, Dongdong Chen, Mengchen Liu, Lu Yuan, Zicheng Liu, Lei Zhang, Nuno Vasconcelos:
MicroNet: Improving Image Recognition with Extremely Low FLOPs. 458-467 - Haozhe Liu, Haoqian Wu, Weicheng Xie, Feng Liu, Linlin Shen:
Group-wise Inhibition based Feature Regularization for Robust Classification. 468-476 - Yanfu Zhang, Shangqian Gao, Heng Huang:
Exploration and Estimation for Model Compression. 477-486 - Hossein Talebi, Peyman Milanfar:
Learning to Resize Images for Computer Vision Tasks. 487-496 - Zhonghua Wu, Xiangxi Shi, Guosheng Lin, Jianfei Cai:
Learning Meta-class Memory for Few-Shot Semantic Segmentation. 497-506 - Shuyang Sun, Xiaoyu Yue, Xiaojuan Qi, Wanli Ouyang, Victor Prisacariu, Philip H. S. Torr:
Aggregation with Feature Detection. 507-516 - Chris Dongjoo Kim, Jinseo Jeong, Sangwoo Moon, Gunhee Kim:
Continual Learning on Noisy Data Streams via Self-Purified Replay. 517-527 - Sihyeon Kim, Sanghyeok Lee, Dasol Hwang, Jaewon Lee, Seong Jae Hwang, Hyunwoo J. Kim:
Point Cloud Augmentation with Weighted Local Transformations. 528-537 - Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis E. H. Tay, Jiashi Feng, Shuicheng Yan:
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet. 538-547 - Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao:
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions. 548-558 - Kun Yuan, Shaopeng Guo, Ziwei Liu, Aojun Zhou, Fengwei Yu, Wei Wu:
Incorporating Convolution Designs into Visual Transformers. 559-568 - Zhengsu Chen, Lingxi Xie, Jianwei Niu, Xuefeng Liu, Longhui Wei, Qi Tian:
Visformer: The Vision-friendly Transformer. 569-578 - Bichen Wu, Chenfeng Xu, Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Zhicheng Yan, Masayoshi Tomizuka, Joseph Gonzalez, Kurt Keutzer, Peter Vajda:
Visual Transformers: Where Do Transformers Really Belong in Vision Models? 579-589 - Xuhui Jia, Kai Han, Yukun Zhu, Bradley Green:
Joint Representation Learning and Novel Category Discovery on Single- and Multi-modal Data. 590-599 - Shaoli Huang, Xinchao Wang, Dacheng Tao:
Stochastic Partial Swap: Enhanced Model Generalization and Interpretability for Fine-grained Recognition. 600-609 - Tianhao Li, Limin Wang, Gangshan Wu:
Self Supervision to Distillation for Long-Tailed Visual Recognition. 610-619 - Avi Ben-Cohen, Nadav Zamir, Emanuel Ben Baruch, Itamar Friedman, Lihi Zelnik-Manor:
Semantic Diversity Learning for Zero-Shot Multi-label Classification. 620-630 - Xueting Zhang, Debin Meng, Henry Gouk, Timothy M. Hospedales:
Shallow Bayesian Meta Learning for Real-World Few-Shot Recognition. 631-640 - Chengzhi Mao, Mia Chiquier, Hao Wang, Junfeng Yang, Carl Vondrick:
Adversarial Attacks are Reversible with Natural Supervision. 641-651 - Jie Hu, Liujuan Cao, Tong Tong, Qixiang Ye, Shengchuan Zhang, Ke Li, Feiyue Huang, Ling Shao, Rongrong Ji:
Architecture Disentanglement for Deep Neural Networks. 652-661 - Xuejun Zhao, Wencan Zhang, Xiaokui Xiao, Brian Y. Lim:
Exploiting Explanations for Model Inversion Attacks. 662-672 - Oran Lang, Yossi Gandelsman, Michal Yarom, Yoav Wald, Gal Elidan, Avinatan Hassidim, William T. Freeman, Phillip Isola, Amir Globerson, Michal Irani, Inbar Mosseri:
Explaining in Style: Training a GAN to explain a classifier in StyleSpace. 673-682 - Stephan J. Lemmer, Jason J. Corso:
Ground-truth or DAER: Selective Re-query of Secondary Information. 683-694 - Jiequan Cui, Zhisheng Zhong, Shu Liu, Bei Yu, Jiaya Jia:
Parametric Contrastive Learning. 695-704 - Zizhao Zhang, Tomas Pfister:
Learning Fast Sample Re-weighting Without Reward Data. 705-714 - Seulki Park, Jongin Lim, Younghan Jeon, Jin Young Choi:
Influence-Balanced Loss for Imbalanced Visual Classification. 715-724 - Shunyan Luo, Emre Barut, Fang Jin:
Statistically Consistent Saliency Estimation. 725-733 - Yunze Liu, Qingnan Fan, Shanghang Zhang, Hao Dong, Thomas A. Funkhouser, Li Yi:
Contrastive Multimodal Fusion with TupleInfoNCE. 734-743 - Xiaofeng Liu, Site Li, Yubin Ge, Pengyi Ye, Jane You, Jun Lu:
Recursively Conditional Gaussian for Ordinal Unsupervised Domain Adaptation. 744-753 - Samuel G. Müller, Frank Hutter:
TrivialAugment: Tuning-free Yet State-of-the-Art Data Augmentation. 754-762 - Zequn Qin, Pengyi Zhang, Fei Wu, Xi Li:
FcaNet: Frequency Channel Attention Networks. 763-772 - Md. Amirul Islam, Matthew Kowal, Sen Jia, Konstantinos G. Derpanis, Neil D. B. Bruce:
Global Pooling, More than Meets the Eye: Position Information is Encoded Channel-Wise in CNNs. 773-781 - Longwen Zhang, Qixuan Zhang, Minye Wu, Jingyi Yu, Lan Xu:
Neural Video Portrait Relighting in Real-time via Consistency Modeling. 782-792 - Shu Kong, Deva Ramanan:
OpenGAN: Open-Set Recognition via Open Data Generation. 793-802 - Alexandre Ramé, Rémy Sun, Matthieu Cord:
MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks. 803-813 - Zijian Wang, Yadan Luo, Ruihong Qiu, Zi Huang, Mahsa Baktashmotlagh:
Learning to Diversify for Single Domain Generalization. 814-823 - Hongjoon Ahn, Jihwan Kwak, Subin Lim, Hyeonsu Bang, Hyojun Kim, Taesup Moon:
SS-IL: Separated Softmax for Incremental Learning. 824-833 - Zihui Xue, Sucheng Ren, Zhengqi Gao, Hang Zhao:
Multimodal Knowledge Expansion. 834-843 - Shihua Huang, Zhichao Lu, Ran Cheng, Cheng He:
FaPN: Feature-aligned Pyramid Network for Dense Image Prediction. 844-853 - Hugo Touvron, Alexandre Sablayrolles, Matthijs Douze, Matthieu Cord, Hervé Jégou:
Grafit: Learning fine-grained image representations with coarse labels. 854-864 - Guohao Peng, Jun Zhang, Heshan Li, Danwei Wang:
Attentional Pyramid Pooling of Salient Visual Residuals for Place Recognition. 865-874 - Jiaqi Wang, Huafeng Liu, Xinyue Wang, Liping Jing:
Interpretable Image Recognition by Constructing Transparent Embedding Space. 875-884 - Adria Ruiz, Antonio Agudo, Francesc Moreno-Noguer:
Generating Attribution Maps with Disentangled Masked Backpropagation. 885-894 - Tiange Xiang, Chaoyi Zhang, Yang Song, Jianhui Yu, Weidong Cai:
Walk in the Cloud: Learning Curves for Point Clouds Shape Analysis. 895-904 - Byeong-Ju Han, Kuhyeun Ko, Jae-Young Sim:
End-to-End Trainable Trident Person Search Network Using Adaptive Gradient Propagation. 905-913 - Yijin Li, Han Zhou, Bangbang Yang, Ye Zhang, Zhaopeng Cui, Hujun Bao, Guofeng Zhang:
Graph-based Asynchronous Event Processing for Rapid Object Recognition. 914-923 - Rujiao Long, Wen Wang, Nan Xue, Feiyu Gao, Zhibo Yang, Yongpan Wang, Gui-Song Xia:
Parsing Table Structures in the Wild. 924-932 - Yonggang Qi, Guoyao Su, Pinaki Nath Chowdhury, Mingkang Li, Yi-Zhe Song:
SketchLattice: Latticed Representation for Sketch Manipulation. 933-941 - Jian Jia, Xiaotang Chen, Kaiqi Huang:
Spatial and Semantic Consistency Regularizations for Pedestrian Attribute Recognition. 942-951 - Meiqi Guo, Rebecca Hwa, Adriana Kovashka:
Detecting Persuasive Atypicality by Modeling Contextual Compatibility. 952-962 - Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song:
Text is Text, No Matter What: Unifying Text Recognition using Knowledge Distillation. 963-972 - Srikar Appalaraju, Bhavan Jasani, Bhargava Urala Kota, Yusheng Xie, R. Manmatha:
DocFormer: End-to-End Transformer for Document Understanding. 973-983 - Kamal Gupta, Justin Lazarow, Alessandro Achille, Larry Davis, Vijay Mahadevan, Abhinav Shrivastava:
LayoutTransformer: Layout Generation and Completion with Self-attention. 984-994 - Samarth Mishra, Zhongping Zhang, Yuan Shen, Ranjitha Kumar, Venkatesh Saligrama, Bryan A. Plummer:
Effectively Leveraging Attributes for Visual Similarity. 995-1004 - Yongming Rao, Guangyi Chen, Jiwen Lu, Jie Zhou:
Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification. 1005-1014 - Sunghun Joung, Seungryong Kim, Minsu Kim, Ig-Jae Kim, Kwanghoon Sohn:
Learning Canonical 3D Object Representation for Fine-Grained Recognition. 1015-1025 - Liangzhi Li, Bowen Wang, Manisha Verma, Yuta Nakashima, Ryo Kawasaki, Hajime Nagahara:
SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition. 1026-1035 - Pau Rodríguez, Massimo Caccia, Alexandre Lacoste, Lee Zamparo, Issam H. Laradji, Laurent Charlin, David Vázquez:
Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations. 1036-1045 - Wei-Lin Hsiao, Kristen Grauman:
From Culture to Clothing: Discovering the World Events Behind A Century of Fashion Images. 1046-1055 - Wataru Shimoda, Daichi Haraguchi, Seiichi Uchida, Kota Yamaguchi:
De-rendering Stylized Texts. 1056-1065 - Ankan Kumar Bhunia, Salman H. Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Mubarak Shah:
Handwriting Transformers. 1066-1074 - Xin Wang, Shuyun Lin, Hao Zhang, Yufei Zhu, Quanshi Zhang:
Interpreting Attributions and Interactions of Adversarial Attacks. 1075-1084 - Thanh-Dat Truong, Chi Nhan Duong, The De Vu, Hoang Anh Pham, Bhiksha Raj, Ngan Le, Khoa Luu:
The Right to Talk: An Audio-Visual Transformer Approach. 1085-1094 - Yue Song, Nicu Sebe, Wei Wang:
Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance Pooling? 1095-1103 - Guile Wu, Shaogang Gong, Pan Li:
Striking a Balance between Stability and Plasticity for Class-Incremental Learning. 1104-1113 - Devin Guillory, Vaishaal Shankar, Sayna Ebrahimi, Trevor Darrell, Ludwig Schmidt:
Predicting with Confidence on Unseen Distributions. 1114-1124 - Canyi Lu:
Transforms based Tensor Robust PCA: Corrupted Low-Rank Tensors Recovery via Convex Optimization. 1125-1132 - Keke Tang, Dingruibo Miao, Weilong Peng, Jianpeng Wu, Yawen Shi, Zhaoquan Gu, Zhihong Tian, Wenping Wang:
CODEs: Chamfer Out-of-Distribution Examples against Overconfidence Issue. 1133-1142 - Song Xue, Runqi Wang, Baochang Zhang, Tian Wang, Guodong Guo, David S. Doermann:
IDARTS: Interactive Differentiable Architecture Search. 1143-1152 - Alexander Richard, Michael Zollhöfer, Yandong Wen, Fernando De la Torre, Yaser Sheikh:
MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement. 1153-1162 - Senthil Purushwalkam, Sebastia Vicenc Amengual Gari, Vamsi Krishna Ithapu, Carl Schissler, Philip W. Robinson, Abhinav Gupta, Kristen Grauman:
Audio-Visual Floorplan Reconstruction. 1163-1172 - Okan Köpüklü, Maja Taseska, Gerhard Rigoll:
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild. 1173-1183