Cross-modal retrieval between videos and texts has gained increasing interest because of the rapid emergence of videos on the web. Generally, a video contains rich instance and event information, while a query text describes only part of that information. Thus, a single video can correspond to multiple different text descriptions and queries. We call this the Video-Text Correspondence Ambiguity problem. Current techniques mostly concentrate on mining local or multi-level alignment between the contents of video and text (e.g., object to entity and action to verb). These methods struggle to alleviate video-text correspondence ambiguity because they describe a video using only one feature, which must be matched with multiple different text features at the same time. To address this problem, we propose a Text-Adaptive Multiple Visual Prototype Matching Model. It automatically captures multiple prototypes to describe a video by adaptive aggregation of video token features. Given a query text, the similarity is determined by the prototype most similar to the text, which locates the corresponding content in the video; we call this text-adaptive matching. To learn diverse prototypes that represent the rich information in videos, we propose a variance loss that encourages different prototypes to attend to different contents of the video. Our method outperforms the state-of-the-art methods on four public video retrieval datasets.
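The text-adaptive matching step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the softmax-weighted aggregation, the scoring matrix `proj`, the use of cosine similarity, and all function names are assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def l2_normalize(x, axis=-1, eps=1e-8):
    # Scale vectors to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def aggregate_prototypes(tokens, proj):
    # tokens: (N, D) video token features.
    # proj:   (D, K) hypothetical scoring weights, one column per prototype.
    # Each prototype is a weighted average of tokens (assumed aggregation form).
    weights = softmax(tokens @ proj, axis=0)  # (N, K), columns sum to 1 over tokens
    return weights.T @ tokens                 # (K, D)

def text_adaptive_similarity(prototypes, text):
    # The video-text score is the similarity of the best-matching prototype.
    sims = l2_normalize(prototypes) @ l2_normalize(text)  # (K,)
    best = int(np.argmax(sims))
    return float(sims[best]), best

rng = np.random.default_rng(0)
tokens = rng.normal(size=(12, 16))   # 12 tokens, 16-dim features (toy sizes)
proj = rng.normal(size=(16, 4))      # 4 prototypes (toy value)
prototypes = aggregate_prototypes(tokens, proj)

# A toy query text embedding lying close to one of the prototypes.
text = prototypes[2] + 0.01 * rng.normal(size=16)
score, idx = text_adaptive_similarity(prototypes, text)
print(f"best prototype: {idx}, similarity: {score:.4f}")
```

Taking the maximum over prototypes means a query only needs to agree with one part of the video's content, which is how a single video can be matched against several different descriptions.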
Author Information
Chengzhi Lin (SUN YAT-SEN UNIVERSITY)
Ancong Wu (SUN YAT-SEN UNIVERSITY)
Junwei Liang (Hong Kong University of Science and Technology (Guangzhou))

I am an assistant professor at The Hong Kong University of Science and Technology (Guangzhou campus) in the AI Thrust. I am interested in building AI systems that can understand and predict human behaviors. I received my Ph.D. from CMU. Please see these [projects](https://junweiliang.me/projects.html#projects) for an overview. My mission: develop AI technologies for social good.
Jun Zhang (Tencent Youtu Lab)
Wenhang Ge (SUN YAT-SEN UNIVERSITY)
Wei-Shi Zheng (SUN YAT-SEN UNIVERSITY)
Chunhua Shen (University of Adelaide)
More from the Same Authors
-
2022 Poster: Multi-dataset Training of Transformers for Robust Action Recognition »
Junwei Liang · Enwei Zhang · Jun Zhang · Chunhua Shen -
2022 Poster: SegViT: Semantic Segmentation with Plain Vision Transformers »
Bowen Zhang · Zhi Tian · Quan Tang · Xiangxiang Chu · Xiaolin Wei · Chunhua Shen · Yifan liu -
2022 Poster: Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images »
Zhi Tian · Xiangxiang Chu · Xiaoming Wang · Xiaolin Wei · Chunhua Shen -
2022 Poster: PyramidCLIP: Hierarchical Feature Alignment for Vision-language Model Pretraining »
Yuting Gao · Jinfeng Liu · Zihan Xu · Jun Zhang · Ke Li · Rongrong Ji · Chunhua Shen -
2023 Poster: Temporal Continual Learning with Prior Compensation for Human Motion Prediction »
Jianwei Tang · Jiangxin Sun · Xiaotong Lin · lifang zhang · Jian-Fang Hu · Wei-Shi Zheng -
2023 Poster: Diversifying Spatial-Temporal Perception for Video Domain Generalization »
Kun-Yu Lin · Jia-Run Du · Yipeng Gao · Jiaming Zhou · Wei-Shi Zheng -
2023 Poster: Inner-Outer Aware Reconstruction Model for Monocular 3D Scene Reconstruction »
Yu-Kun Qiu · Guo-Hao Xu · Wei-Shi Zheng -
2023 Poster: DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models »
威佳 吴 · Yuzhong Zhao · Hao Chen · Yuchao Gu · Rui Zhao · Yefei He · Hong Zhou · Mike Zheng Shou · Chunhua Shen -
2022 Spotlight: Lightning Talks 6A-4 »
Xiu-Shen Wei · Konstantina Dritsa · Guillaume Huguet · ABHRA CHAUDHURI · Zhenbin Wang · Kevin Qinghong Lin · Yutong Chen · Jianan Zhou · Yongsen Mao · Junwei Liang · Jinpeng Wang · Mao Ye · Yiming Zhang · Aikaterini Thoma · H.-Y. Xu · Daniel Sumner Magruder · Enwei Zhang · Jianing Zhu · Ronglai Zuo · Massimiliano Mancini · Hanxiao Jiang · Jun Zhang · Fangyun Wei · Faen Zhang · Ioannis Pavlopoulos · Zeynep Akata · Xiatian Zhu · Jingfeng ZHANG · Alexander Tong · Mattia Soldan · Chunhua Shen · Yuxin Peng · Liuhan Peng · Michael Wray · Tongliang Liu · Anjan Dutta · Yu Wu · Oluwadamilola Fasina · Panos Louridas · Angel Chang · Manik Kuchroo · Manolis Savva · Shujie LIU · Wei Zhou · Rui Yan · Gang Niu · Liang Tian · Bo Han · Eric Z. XU · Guy Wolf · Yingying Zhu · Brian Mak · Difei Gao · Masashi Sugiyama · Smita Krishnaswamy · Rong-Cheng Tu · Wenzhe Zhao · Weijie Kong · Chengfei Cai · WANG HongFa · Dima Damen · Bernard Ghanem · Wei Liu · Mike Zheng Shou -
2022 Spotlight: Multi-dataset Training of Transformers for Robust Action Recognition »
Junwei Liang · Enwei Zhang · Jun Zhang · Chunhua Shen -
2022 Spotlight: Lightning Talks 3A-4 »
Jinzhi Zhang · Hao Jiang · Hongrui Cai · Qi Yi · Yang Jin · Zhi Tian · Rui Zhang · Wanquan Feng · Xiangxiang Chu · Ruofan Tang · yongzhi li · Yadong Mu · Zehuan Yuan · shaohui peng · Zheng Cao · Xiaoming Wang · Xuetao Feng · Xiaolin Wei · Jiaming Guo · Yadong Mu · Yan Wang · Jing Xiao · Xing Hu · Chunhua Shen · Ruqi Huang · Juyong Zhang · Zidong Du · LU FANG · xishan zhang · Qi Guo · Yunji Chen -
2022 Spotlight: Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images »
Zhi Tian · Xiangxiang Chu · Xiaoming Wang · Xiaolin Wei · Chunhua Shen -
2022 Poster: Adv-Attribute: Inconspicuous and Transferable Adversarial Attack on Face Recognition »
Shuai Jia · Bangjie Yin · Taiping Yao · Shouhong Ding · Chunhua Shen · Xiaokang Yang · Chao Ma -
2022 Poster: DENSE: Data-Free One-Shot Federated Learning »
Jie Zhang · Chen Chen · Bo Li · Lingjuan Lyu · Shuang Wu · Shouhong Ding · Chunhua Shen · Chao Wu -
2022 Poster: Hierarchical Normalization for Robust Monocular Depth Estimation »
Chi Zhang · Wei Yin · Billzb Wang · Gang Yu · BIN FU · Chunhua Shen -
2021 Poster: Action-guided 3D Human Motion Prediction »
Jiangxin Sun · Zihang Lin · Xintong Han · Jian-Fang Hu · Jia Xu · Wei-Shi Zheng -
2014 Poster: Encoding High Dimensional Local Features by Sparse Coding Based Fisher Vectors »
Lingqiao Liu · Chunhua Shen · Lei Wang · Anton van den Hengel · Chao Wang -
2009 Poster: Positive Semidefinite Metric Learning with Boosting »
Chunhua Shen · Junae Kim · Lei Wang · Anton van den Hengel -
2008 Poster: PSDBoost: Matrix-Generation Linear Programming for Positive Semidefinite Matrices Learning »
Chunhua Shen · Alan Welsh · Lei Wang