30th ACM Multimedia 2022: Lisboa, Portugal
- João Magalhães, Alberto Del Bimbo, Shin'ichi Satoh, Nicu Sebe, Xavier Alameda-Pineda, Qin Jin, Vincent Oria, Laura Toni:
MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10 - 14, 2022. ACM 2022, ISBN 978-1-4503-9203-7
Keynote Talks
- Yoelle Maarek:
Alexa, let's work together! How Alexa Helps Customers Complete Tasks with Verbal and Visual Guidance in the Alexa Prize TaskBot Challenge. 1-2
- Nuria Oliver:
Data Science against COVID-19: The Valencian Experience. 3-4
- Douwe Kiela:
Grounding, Meaning and Foundation Models: Adventures in Multimodal Machine Learning. 5
Oral Session I: Engaging Users with Multimedia -- Emotional and Social Signals
- Rui Li, Yiting Wang, Wei-Long Zheng, Bao-Liang Lu:
A Multi-view Spectral-Spatial-Temporal Masked Autoencoder for Decoding Emotions with Self-supervised Learning. 6-14
- Teng Sun, Wenjie Wang, Liqiang Jing, Yiran Cui, Xuemeng Song, Liqiang Nie:
Counterfactual Reasoning for Out-of-distribution Multimodal Sentiment Analysis. 15-23
- Yuanyuan Liu, Wei Dai, Chuanxu Feng, Wenbin Wang, Guanghao Yin, Jiabei Zeng, Shiguang Shan:
MAFW: A Large-scale, Multi-modal, Compound Affective Database for Dynamic Facial Expression Recognition in the Wild. 24-32
- Shengzhe Liu, Xin Zhang, Jufeng Yang:
SER30K: A Large-Scale Dataset for Sticker Emotion Recognition. 33-41
Poster Session I: Engaging Users with Multimedia -- Emotional and Social Signals
- Jicai Pan, Shangfei Wang, Lin Fang:
Representation Learning through Multimodal Attention and Time-Sync Comments for Affective Video Content Analysis. 42-50
- Xujin Li, Wei Wei, Shuang Qiu, Huiguang He:
TFF-Former: Temporal-Frequency Fusion Transformer for Zero-training Decoding of Two BCI Tasks. 51-59
- Yuedong Chen, Xu Yang, Tat-Jen Cham, Jianfei Cai:
Towards Unbiased Visual Emotion Recognition via Causal Intervention. 60-69
- Michal Balazia, Philipp Müller, Ákos Levente Tánczos, August von Liechtenstein, François Brémond:
Bodily Behaviors in Social Interaction: Novel Annotations and State-of-the-Art Evaluation. 70-79
- Niki Maria Foteinopoulou, Ioannis Patras:
Learning from Label Relationships in Human Affect. 80-89
- Ziyi Ye, Xiaohui Xie, Yiqun Liu, Zhihong Wang, Xuesong Chen, Min Zhang, Shaoping Ma:
Brain Topography Adaptive Network for Satisfaction Modeling in Interactive Information Access System. 90-100
- Yan Wang, Yixuan Sun, Wei Song, Shuyong Gao, Yiwen Huang, Zhaoyu Chen, Weifeng Ge, Wenqiang Zhang:
DPCNet: Dual Path Multi-Excitation Collaborative Network for Facial Expression Representation Learning in Videos. 101-110
- Yingjie Chen, Chong Chen, Xiao Luo, Jianqiang Huang, Xian-Sheng Hua, Tao Wang, Yun Liang:
Pursuing Knowledge Consistency: Supervised Hierarchical Contrastive Learning for Facial Action Unit Recognition. 111-119
- Shiqing Zhang, Ruixin Liu, Yijiao Yang, Xiaoming Zhao, Jun Yu:
Unsupervised Domain Adaptation Integrating Transformer and Mutual Information for Cross-Corpus Speech Emotion Recognition. 120-129
- Zhen Xing, Weimin Tan, Ruian He, Yangle Lin, Bo Yan:
Co-Completion for Occluded Facial Expression Recognition. 130-140
- Weichen Yu, Hongyuan Yu, Yan Huang, Liang Wang:
Generalized Inter-class Loss for Gait Recognition. 141-150
- Fan Qi, Zixin Zhang, Xianshan Yang, Huaiwen Zhang, Changsheng Xu:
Feeling Without Sharing: A Federated Video Emotion Recognition Framework Via Privacy-Agnostic Hybrid Aggregation. 151-160
- Jianjian Shao, Zhenqian Wu, Yuanyan Luo, Shudong Huang, Xiaorong Pu, Yazhou Ren:
Self-Paced Label Distribution Learning for In-The-Wild Facial Expression Recognition. 161-169
- Yong Zhao, Haifeng Chen, Hichem Sahli, Ke Lu, Dongmei Jiang:
Uncertainty-Aware Semi-Supervised Learning of 3D Face Rigging from Single Image. 170-179
- Junyu Chen, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang:
A Unified Framework against Topology and Class Imbalance. 180-188
- Yang Yu, Dong Zhang, Shoushan Li:
Unified Multi-modal Pre-training for Few-shot Sentiment Analysis with Prompt-based Learning. 189-198
- Zhicheng Zhang, Jufeng Yang:
Temporal Sentiment Localization: Listen and Look in Untrimmed Videos. 199-208
- Xinyu Cheng, Wei Wei, Changde Du, Shuang Qiu, Sanli Tian, Xiaojun Ma, Huiguang He:
VigilanceNet: Decouple Intra- and Inter-Modality Learning for Multimodal Vigilance Estimation in RSVP-Based BCI. 209-217
- Lijuan Wang, Guoli Jia, Ning Jiang, Haiying Wu, Jufeng Yang:
EASE: Robust Facial Expression Recognition via Emotion Ambiguity-SEnsitive Cooperative Networks. 218-227
- Bo-Kai Ruan, Ling Lo, Hong-Han Shuai, Wen-Huang Cheng:
Mimicking the Annotation Process for Recognizing the Micro Expressions. 228-236
Oral Session II: Engaging User with Multimedia -- Multimedia Search and Recommendation
- Peng-Fei Zhang, Guangdong Bai, Zi Huang, Xin-Shun Xu:
Machine Unlearning for Image Retrieval: A Generative Scrubbing Approach. 237-245
- Jianfeng Dong, Xianke Chen, Minsong Zhang, Xun Yang, Shujie Chen, Xirong Li, Xun Wang:
Partially Relevant Video Retrieval. 246-257
- Fangxiong Xiao, Lixi Deng, Jingjing Chen, Houye Ji, Xiaorui Yang, Zhuoye Ding, Bo Long:
From Abstract to Details: A Generative Multimodal Fusion Framework for Recommendation. 258-267
- Weili Guan, Xuemeng Song, Haoyu Zhang, Meng Liu, Chung-Hsing Yeh, Xiaojun Chang:
Bi-directional Heterogeneous Graph Hashing towards Efficient Outfit Recommendation. 268-276
- MeiYu Liang, Junping Du, Xiaowen Cao, Yang Yu, Kangkang Lu, Zhe Xue, Min Zhang:
Semantic Structure Enhanced Contrastive Adversarial Hash Network for Cross-media Representation Learning. 277-285
- Dan Song, Yue Yang, Weizhi Nie, Xuanya Li, An-An Liu:
Cross-Domain 3D Model Retrieval Based On Contrastive Learning And Label Propagation. 286-295
- Zhixin Ma, Chong-Wah Ngo:
Interactive Video Corpus Moment Retrieval using Reinforcement Learning. 296-306
- Chao Huang, Yabo Liu, Zheng Zhang, Chengliang Liu, Jie Wen, Yong Xu, Yaowei Wang:
Hierarchical Graph Embedded Pose Regularity Learning via Spatio-Temporal Transformer for Abnormal Behavior Detection. 307-315
- Yue Zhao, Weizhi Nie, Zan Gao, Anan Liu:
HMTN: Hierarchical Multi-scale Transformer Network for 3D Shape Recognition. 316-324
- Peng-Fei Zhang, Zi Huang, Guangdong Bai, Xin-Shun Xu:
IDEAL: High-Order-Ensemble Adaptation Network for Learning with Noisy Labels. 325-333
- Yu Zheng, Chen Gao, Jingtao Ding, Lingling Yi, Depeng Jin, Yong Li, Meng Wang:
DVR: Micro-Video Recommendation Optimizing Watch-Time-Gain under Duration Bias. 334-345
Poster Session II: Engaging User with Multimedia -- Multimedia Search and Recommendation
- Bolin Zhang, Chao Yang, Bin Jiang, Xiaokang Zhou:
Video Moment Retrieval with Hierarchical Contrastive Learning. 346-355
- Avinash Madasu, Junier Oliva, Gedas Bertasius:
Learning to Retrieve Videos by Asking Questions. 356-365
- Jinan Sun, Haixin Wang, Xiao Luo, Shikun Zhang, Wei Xiang, Chong Chen, Xian-Sheng Hua:
HEART: Towards Effective Hash Codes under Label Noise. 366-375
- Zongshen Mu, Yueting Zhuang, Jie Tan, Jun Xiao, Siliang Tang:
Learning Hybrid Behavior Patterns for Multimedia Recommendation. 376-384
- Feiyu Chen, Junjie Wang, Yinwei Wei, Hai-Tao Zheng, Jie Shao:
Breaking Isolation: Multimodal Graph Fusion for Multimedia Recommendation by Edge-wise Modulation. 385-394
- Jianwei Zhu, Zhixin Li, Yufei Zeng, Jiahui Wei, Huifang Ma:
Image-Text Matching with Fine-Grained Relational Dependency and Bidirectional Attention-Based Generative Networks. 395-403
- Yuxi Sun, Shanshan Feng, Xutao Li, Yunming Ye, Jian Kang, Xu Huang:
Visual Grounding in Remote Sensing Images. 404-412
- Guolong Wang, Xun Wu, Zhaoyuan Liu, Junchi Yan:
Prompt-based Zero-shot Video Moment Retrieval. 413-421
- Yabing Wang, Jianfeng Dong, Tianxiang Liang, Minsong Zhang, Rui Cai, Xun Wang:
Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning. 422-433
- Ziyue Wang, Aozhu Chen, Fan Hu, Xirong Li:
Learn to Understand Negation in Video Retrieval. 434-443
- Yongjie Zhu, Chunhui Han, Yuefeng Zhan, Bochen Pang, Zhaoju Li, Hao Sun, Si Li, Boxin Shi, Nan Duan, Weiwei Deng, Ruofei Zhang, Liangjie Zhang, Qi Zhang:
AdsCVLR: Commercial Visual-Linguistic Representation Modeling in Sponsored Search. 444-452
- Junfeng Tu, Xueliang Liu, Zongxiang Lin, Richang Hong, Meng Wang:
Differentiable Cross-modal Hashing via Multimodal Transformers. 453-461
- Zhixin Ling, Zhen Xing, Jiangtong Li, Li Niu:
Multi-Level Region Matching for Fine-Grained Sketch-Based Image Retrieval. 462-470
- Xiaolin Zheng, Jiajie Su, Weiming Liu, Chaochao Chen:
DDGHM: Dual Dynamic Graph with Hybrid Metric Training for Cross-Domain Sequential Recommendation. 471-481
- Yong Zhuang, Tong Yu, Junda Wu, Shiqu Wu, Shuai Li:
Spatial-Temporal Aligned Multi-Agent Learning for Visual Dialog Systems. 482-490
- Huafeng Liu, Liping Jing, Dahai Yu, Mingjie Zhou, Michael Ng:
Learning Intrinsic and Extrinsic Intentions for Cold-start Recommendation with Neural Stochastic Processes. 491-500
- Pingting Hong, Dayan Wu, Bo Li, Weiping Wang:
Camera-specific Informative Data Augmentation Module for Unbalanced Person Re-identification. 501-510
- Zhiqiang Guo, Guohui Li, Jianjun Li, Huaicong Chen:
TopicVAE: Topic-aware Disentanglement Representation Learning for Enhanced Recommendation. 511-520
- Chao Huang, Chengliang Liu, Zheng Zhang, Zhihao Wu, Jie Wen, Qiuping Jiang, Yong Xu:
Pixel-Level Anomaly Detection via Uncertainty-aware Prototypical Transformer. 521-530
- Lei Tan, Pingyang Dai, Rongrong Ji, Yongjian Wu:
Dynamic Prototype Mask for Occluded Person Re-Identification. 531-540
- Nan Pu, Yu Liu, Wei Chen, Erwin M. Bakker, Michael S. Lew:
Meta Reconciliation Normalization for Lifelong Person Re-Identification. 541-549
- Lin Wang, Wanqian Zhang, Dayan Wu, Fei Zhu, Bo Li:
Attack is the Best Defense: Towards Preemptive-Protection Person Re-Identification. 550-559
- Kai Chen, Weihua Chen, Tao He, Rong Du, Fan Wang, Xiuyu Sun, Yuchen Guo, Guiguang Ding:
TAGPerson: A Target-Aware Generation Pipeline for Person Re-identification. 560-571
- Dayan Wu, Qinghang Su, Bo Li, Weiping Wang:
Efficient Hash Code Expansion by Recycling Old Bits. 572-580
- Desheng Cai, Shengsheng Qian, Quan Fang, Jun Hu, Changsheng Xu:
Adaptive Anti-Bottleneck Multi-Modal Graph Learning Network for Personalized Micro-video Recommendation. 581-590
- Uttaran Bhattacharya, Gang Wu, Stefano Petrangeli, Viswanathan Swaminathan, Dinesh Manocha:
Show Me What I Like: Detecting User-Specific Video Highlights Using Content-Based Multi-Head Attention. 591-600
- Kai Wang, Yifan Wang, Xing Xu, Xin Liu, Weihua Ou, Huimin Lu:
Prototype-based Selective Knowledge Distillation for Zero-Shot Sketch Based Image Retrieval. 601-609
- Siyuan Li, Xing Xu, Zailei Zhou, Yang Yang, Guoqing Wang, Heng Tao Shen:
ARRA: Absolute-Relative Ranking Attack against Image Retrieval. 610-618
- Xiaoyu Du, Zike Wu, Fuli Feng, Xiangnan He, Jinhui Tang:
Invariant Representation Learning for Multimedia Recommendation. 619-628
- Tianyuan Xu, Xueliang Liu, Zhen Huang, Dan Guo, Richang Hong, Meng Wang:
Early-Learning regularized Contrastive Learning for Cross-Modal Retrieval with Noisy Labels. 629-637
- Yiwei Ma, Guohai Xu, Xiaoshuai Sun, Ming Yan, Ji Zhang, Rongrong Ji:
X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval. 638-647
- Yi Zhong, Chengyao Wang, Shiyong Li, Zhu Zhou, Yaowei Wang, Wei-Shi Zheng:
Mixed Supervision for Instance Learning in Object Detection with Few-shot Annotation. 648-658
- Zeyu Ma, Wei Ju, Xiao Luo, Chong Chen, Xian-Sheng Hua, Guangming Lu:
Improved Deep Unsupervised Hashing via Prototypical Learning. 659-667
- Rui Wang, Feng Chen, Jun Tang, Pu Yan:
Adaptive Camera Margin for Mask-guided Domain Adaptive Person Re-identification. 668-677
- Shengshan Hu, Ziqi Zhou, Yechao Zhang, Leo Yu Zhang, Yifeng Zheng, Yuanyuan He, Hai Jin:
BadHash: Invisible Backdoor Attacks against Deep Hashing with Clean Label. 678-686
- Xiaohao Liu, Zhulin Tao, Jiahong Shao, Lifang Yang, Xianglin Huang:
EliMRec: Eliminating Single-modal Bias in Multimedia Recommendation. 687-695
- Zhicheng Sun, Yadong Mu:
Patch-based Knowledge Distillation for Lifelong Person Re-Identification. 696-707
Oral Session III: Engaging User with Multimedia -- Summarization, Analytics, and Storytelling
- Xiaodong Chen, Wu Liu, Xinchen Liu, Yongdong Zhang, Jungong Han, Tao Mei:
MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition. 708-718
- Xun Jiang, Xing Xu, Zhiguo Chen, Jingran Zhang, Jingkuan Song, Fumin Shen, Huimin Lu, Heng Tao Shen:
DHHN: Dual Hierarchical Hybrid Network for Weakly-Supervised Audio-Visual Video Parsing. 719-727
Poster Session III: Engaging User with Multimedia -- Summarization, Analytics, and Storytelling
- Dixin Luo, Yutong Wang, Angxiao Yue, Hongteng Xu:
Weakly-Supervised Temporal Action Alignment Driven by Unbalanced Spectral Fused Gromov-Wasserstein Distance. 728-739
- Jiehang Xie, Xuanbai Chen, Shao-Ping Lu, Yulu Yang:
A Knowledge Augmented and Multimodal-Based Framework for Video Summarization. 740-749
- Dizhan Xue, Shengsheng Qian, Quan Fang, Changsheng Xu:
MMT: Image-guided Story Ending Generation with Multimodal Memory Transformer. 750-758
- Niankai Zhang, Junli Zhao, Fuqing Duan, Zhenkuan Pan, Zhongke Wu, Mingquan Zhou, Xianfeng Gu:
An End-to-End Conditional Generative Adversarial Network Based on Depth Map for 3D Craniofacial Reconstruction. 759-768
- Bowen Li, Philip H. S. Torr, Thomas Lukasiewicz:
Clustering Generative Adversarial Networks for Story Visualization. 769-778
- Jiayin Cai, Changlin Li, Xin Tao, Chun Yuan, Yu-Wing Tai:
DeViT: Deformed Vision Transformers in Video Inpainting. 779-789
- Ming Yao, Yu Bai, Wei Du, Xuejun Zhang, Heng Quan, Fuli Cai, Hongwei Kang:
Multi-Level Spatiotemporal Network for Video Summarization. 790-798
Oral Session IV: Experience -- Interactions and Quality of Experience
- Li Yang, Mai Xu, Tie Liu, Liangyu Huo, Xinbo Gao:
TVFormer: Trajectory-guided Visual Quality Assessment on 360° Images with Transformers. 799-808
- Zheng Lin, Zheng-Peng Duan, Zhao Zhang, Chun-Le Guo, Ming-Ming Cheng:
KnifeCut: Refining Thin Part Segmentation with Cutting Lines. 809-817
- Minju Kim, Yuhyun Lee, Jungjin Lee:
Multi-view Layout Design for VR Concert Experience. 818-826
- Kui Jiang, Zhongyuan Wang, Chen Chen, Zheng Wang, Laizhong Cui, Chia-Wen Lin:
Magic ELF: Image Deraining Meets Association Learning and Transformer. 827-836
- Liang Liao, Kangmin Xu, Haoning Wu, Chaofeng Chen, Wenxiu Sun, Qiong Yan, Weisi Lin:
Exploring the Effectiveness of Video Perceptual Representation in Blind Video Quality Assessment. 837-846
- Mengshun Hu, Kui Jiang, Zhixiang Nie, Zheng Wang:
You Only Align Once: Bidirectional Interaction for Spatial-Temporal Video Super-Resolution. 847-855
- Wei Sun, Xiongkuo Min, Wei Lu, Guangtao Zhai:
A Deep Learning based No-reference Quality Assessment Model for UGC Videos. 856-865
Poster Session IV: Experience -- Interactions and Quality of Experience
- Szu-Wei Fu, Yaran Fan, Yasaman Hosseinkashi, Jayant Gupchup, Ross Cutler:
Improving Meeting Inclusiveness using Speech Interruption Analysis. 887-895
- Yaohui Li, Yuzhe Yang, Huaxiong Li, Haoxing Chen, Liwu Xu, Leida Li, Yaqian Li, Yandong Guo:
Transductive Aesthetic Preference Propagation for Personalized Image Aesthetics Assessment. 896-904
- Zheng Lin, Zhao Zhang, Linghao Han, Shao-Ping Lu:
Multi-Mode Interactive Image Segmentation. 905-914
- Nasim Jamshidi Avanaki, Steven Schmidt, Thilo Michael, Saman Zadtootaghaj, Sebastian Möller:
Deep-BVQM: A Deep-learning Bitstream-based Video Quality Model. 915-923
- Anton Ratnarajah, Zhenyu Tang, Rohith Aralikatti, Dinesh Manocha:
MESH2IR: Neural Acoustic Impulse Response Generator for Complex 3D Scenes. 924-933
- Wei Zhou, Zhou Wang:
Quality Assessment of Image Super-Resolution: Balancing Deterministic and Statistical Fidelity. 934-942
- Chaofan Zhang, Shiguang Liu:
No-reference Omnidirectional Image Quality Assessment Based on Joint Network. 943-951
- Abhishek Kumar, Lik-Hang Lee, Jagmohan Chauhan, Xiang Su, Mohammad Ashraful Hoque, Susanna Pirttikangas, Sasu Tarkoma, Pan Hui:
PassWalk: Spatial Authentication Leveraging Lateral Shift and Gaze on Mobile Headsets. 952-960
- Jun Fu, Chen Hou, Wei Zhou, Jiahua Xu, Zhibo Chen:
Adaptive Hypergraph Convolutional Network for No-Reference 360-degree Image Quality Assessment. 961-969
- Xingran Liao, Baoliang Chen, Hanwei Zhu, Shiqi Wang, Mingliang Zhou, Sam Kwong:
DeepWSD: Projecting Degradations in Perceptual Space to Wasserstein Distance in Deep Feature Space. 970-978
- Bohua Peng, Mobarakol Islam, Mei Tu:
Angular Gap: Reducing the Uncertainty of Image Difficulty through Model Calibration. 979-987
- Min Wang, Hao Yang, Qing Cheng:
GCL: Graph Calibration Loss for Trustworthy Graph Neural Network. 988-996
- Yixuan Gao, Xiongkuo Min, Yucheng Zhu, Jing Li, Xiao-Ping Zhang, Guangtao Zhai:
Image Quality Assessment: From Mean Opinion Score to Opinion Score Distribution. 997-1005
- Zihan Zhou, Yong Xu, Ruotao Xu, Yuhui Quan:
No-Reference Image Quality Assessment Using Dynamic Complex-Valued Neural Model. 1006-1015
- Tong Shao, Deming Zhai, Junjun Jiang, Xianming Liu:
Hybrid Conditional Deep Inverse Tone Mapping. 1016-1024
- Yili Jin, Junhua Liu, Fangxin Wang, Shuguang Cui:
Where Are You Looking?: A Large-Scale Dataset of Head and Gaze Behavior for 360-Degree Videos and a Pilot Study. 1025-1034
Oral Session V: Experience -- Art and Culture
- Zhengyan Tong, Xiaohang Wang, Shengchao Yuan, Xuanhong Chen, Junjie Wang, Xiangzhong Fang:
Im2Oil: Stroke-Based Oil Painting Rendering with Linearly Controllable Fineness Via Adaptive Sampling. 1035-1046
- Chen Zhang, LuChin Chang, Songruoyao Wu, Xu Tan, Tao Qin, Tie-Yan Liu, Kejun Zhang:
ReLyMe: Improving Lyric-to-Melody Generation by Incorporating Lyric-Melody Relationships. 1047-1056
- Zihao Wang, Kejun Zhang, Yuxing Wang, Chen Zhang, Qihao Liang, Pengfei Yu, Yongsheng Feng, Wenbo Liu, Yikai Wang, Yuntai Bao, Yiheng Yang:
SongDriver: Real-time Music Accompaniment Generation without Logical Latency nor Exposure Bias. 1057-1067
- Yijun Wang, Tao Liang, Jianxin Lin:
CACOLIT: Cross-domain Adaptive Co-learning for Imbalanced Image-to-Image Translation. 1068-1076
- Kyungwon Lee, Yu-Kyung Jang, Jaewoo Jung, Dong Hwan Kim, Hyun-Jean Lee, Seung Ah Lee:
EuglPollock: Rethinking Interspecies Collaboration through Art Making. 1077-1084
Poster Session V: Experience -- Art and Culture
- Nisha Huang, Fan Tang, Weiming Dong, Changsheng Xu:
Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion. 1085-1094
- Zhizhong Wang, Zhanjie Zhang, Lei Zhao,