Poster
Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark
Jiaxi Gu · Xiaojun Meng · Guansong Lu · Lu Hou · Niu Minzhe · Xiaodan Liang · Lewei Yao · Runhui Huang · Wei Zhang · Xin Jiang · Chunjing XU · Hang Xu
Vision-Language Pre-training (VLP) models have shown remarkable performance on various downstream tasks. Their success relies heavily on the scale of the cross-modal datasets used for pre-training. However, the lack of large-scale datasets and benchmarks in Chinese hinders the development of Chinese VLP models and broader multilingual applications. In this work, we release a large-scale Chinese cross-modal dataset named Wukong, which contains 100 million Chinese image-text pairs collected from the web. Wukong aims to benchmark different multi-modal pre-training methods to facilitate VLP research and community development. Furthermore, we release a group of models pre-trained with various image encoders (ViT-B/ViT-L/SwinT) and also apply advanced pre-training techniques to VLP, such as locked-image text tuning, token-wise similarity in contrastive learning, and reduced-token interaction. We also provide extensive experiments and benchmarks on different downstream tasks, including a new human-verified image-text test dataset that is the largest to date. Experiments show that Wukong can serve as a promising Chinese pre-training dataset and benchmark for different cross-modal learning methods. For the zero-shot image classification task on 10 datasets, $\text{Wukong}_\text{ViT-L}$ achieves an average accuracy of 73.03%. For the image-text retrieval task, it achieves a mean recall of 71.6% on AIC-ICC, which is 12.9% higher than WenLan 2.0. Our Wukong models are also benchmarked against other variants on downstream tasks across multiple datasets, e.g., Flickr8K-CN, Flickr-30K-CN, COCO-CN, etc. More information is available at https://wukong-dataset.github.io/wukong-dataset/.
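The abstract names token-wise similarity in contrastive learning without defining it. As a rough illustration only, the sketch below assumes the late-interaction formulation popularized by FILIP: each image token is matched to its most similar text token by cosine similarity, and the per-token maxima are averaged. The function names and toy vectors are hypothetical and are not taken from the Wukong code release.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def tokenwise_similarity(image_tokens, text_tokens):
    """Image-to-text token-wise similarity (assumed FILIP-style definition).

    For every image token, take its best cosine match among the text
    tokens, then average those maxima over all image tokens.
    """
    img = [l2_normalize(t) for t in image_tokens]
    txt = [l2_normalize(t) for t in text_tokens]
    best_per_image_token = [max(dot(i, t) for t in txt) for i in img]
    return sum(best_per_image_token) / len(best_per_image_token)


# Toy example: two image patch embeddings and three text token embeddings.
image_tokens = [[1.0, 0.0], [0.0, 2.0]]
text_tokens = [[3.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
score = tokenwise_similarity(image_tokens, text_tokens)
print(score)
```

In a contrastive setup, a score like this would replace the single global image-text dot product when building the similarity matrix for a batch; the text-to-image direction is computed the same way with the roles of the two token sets swapped.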
Author Information
Jiaxi Gu (Huawei Noah's Ark Lab)
Xiaojun Meng (Huawei Technologies Ltd.)
Guansong Lu (Huawei Noah’s Ark Lab)
Lu Hou (Huawei Technologies Co., Ltd)
Niu Minzhe (Huawei)
Xiaodan Liang (Sun Yat-sen University)
Lewei Yao (Harbin Institute of Technology)
Runhui Huang (Sun Yat-sen University)
Wei Zhang (Noah's Ark Lab, Huawei Inc.)
Xin Jiang (Noah’s Ark Lab, Huawei Technologies)
Chunjing XU (Huawei Technologies)
Hang Xu (Huawei Noah’s Ark Lab)
More from the Same Authors
-
2021 : One Million Scenes for Autonomous Driving: ONCE Dataset »
Jiageng Mao · Niu Minzhe · ChenHan Jiang · hanxue liang · Jingheng Chen · Xiaodan Liang · Yamin Li · Chaoqiang Ye · Wei Zhang · Zhenguo Li · Jie Yu · Hang Xu · Chunjing XU -
2021 Spotlight: SOFT: Softmax-free Transformer with Linear Complexity »
Jiachen Lu · Jinghan Yao · Junge Zhang · Xiatian Zhu · Hang Xu · Weiguo Gao · Chunjing XU · Tao Xiang · Li Zhang -
2021 : FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark »
Mingjie Li · Wenjia Cai · Rui Liu · Yuetian Weng · Xiaoyun Zhao · Cong Wang · Xin Chen · Zhong Liu · Caineng Pan · Mengke Li · yingfeng zheng · Yizhi Liu · Flora Salim · Karin Verspoor · Xiaodan Liang · Xiaojun Chang -
2021 : IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning »
Pan Lu · Liang Qiu · Jiaqi Chen · Tanglin Xia · Yizhou Zhao · Wei Zhang · Zhou Yu · Xiaodan Liang · Song-Chun Zhu -
2021 : SODA10M: A Large-Scale 2D Self/Semi-Supervised Object Detection Dataset for Autonomous Driving »
Jianhua Han · Xiwen Liang · Hang Xu · Kai Chen · Lanqing Hong · Jiageng Mao · Chaoqiang Ye · Wei Zhang · Zhenguo Li · Xiaodan Liang · Chunjing XU -
2021 : Theorem-Aware Geometry Problem Solving with Symbolic Reasoning and Theorem Prediction »
Pan Lu · Ran Gong · Shibiao Jiang · Liang Qiu · Siyuan Huang · Xiaodan Liang · Song-Chun Zhu -
2021 : Towards Diagram Understanding and Cognitive Reasoning in Icon Question Answering »
Pan Lu · Liang Qiu · Jiaqi Chen · Tanglin Xia · Yizhou Zhao · Wei Zhang · Zhou Yu · Xiaodan Liang · Song-Chun Zhu -
2021 : Geometric Question Answering Towards Multimodal Numerical Reasoning »
Jiaqi Chen · Jianheng Tang · Jinghui Qin · Xiaodan Liang · Lingbo Liu · Eric Xing · Liang Lin -
2022 Poster: Towards Hard-pose Virtual Try-on via 3D-aware Global Correspondence Learning »
Zaiyu Huang · Hanhui Li · Zhenyu Xie · Michael Kampffmeyer · qingling Cai · Xiaodan Liang -
2022 Spotlight: Lightning Talks 4B-3 »
Zicheng Zhang · Mancheng Meng · Antoine Guedon · Yue Wu · Wei Mao · Zaiyu Huang · Peihao Chen · Shizhe Chen · Yongwei Chen · Keqiang Sun · Yi Zhu · chen rui · Hanhui Li · Dongyu Ji · Ziyan Wu · miaomiao Liu · Pascal Monasse · Yu Deng · Shangzhe Wu · Pierre-Louis Guhur · Jiaolong Yang · Kunyang Lin · Makarand Tapaswi · Zhaoyang Huang · Terrence Chen · Jiabao Lei · Jianzhuang Liu · Vincent Lepetit · Zhenyu Xie · Richard I Hartley · Dinggang Shen · Xiaodan Liang · Runhao Zeng · Cordelia Schmid · Michael Kampffmeyer · Mathieu Salzmann · Ning Zhang · Fangyun Wei · Yabin Zhang · Fan Yang · Qifeng Chen · Wei Ke · Quan Wang · Thomas Li · qingling Cai · Kui Jia · Ivan Laptev · Mingkui Tan · Xin Tong · Hongsheng Li · Xiaodan Liang · Chuang Gan -
2022 Panel: Panel 4C-3: Wukong: A 100… & Addressing Resource Scarcity… »
Jiaxi Gu · Gokul NC -
2022 Spotlight: Towards Hard-pose Virtual Try-on via 3D-aware Global Correspondence Learning »
Zaiyu Huang · Hanhui Li · Zhenyu Xie · Michael Kampffmeyer · qingling Cai · Xiaodan Liang -
2022 Spotlight: CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation »
Zicheng Zhang · Yi Zhu · Jianzhuang Liu · Xiaodan Liang · Wei Ke -
2022 : Fine-grained Interactive Vision Language Pre-training »
Lu Hou -
2022 Poster: Towards Efficient Post-training Quantization of Pre-trained Language Models »
Haoli Bai · Lu Hou · Lifeng Shang · Xin Jiang · Irwin King · Michael R Lyu -
2022 Poster: Structure-Preserving 3D Garment Modeling with Neural Sewing Machines »
Xipeng Chen · Guangrun Wang · Dizhong Zhu · Xiaodan Liang · Philip Torr · Liang Lin -
2022 Poster: DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection »
Lewei Yao · Jianhua Han · Youpeng Wen · Xiaodan Liang · Dan Xu · Wei Zhang · Zhenguo Li · Chunjing XU · Hang Xu -
2022 Poster: Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving »
Xiwen Liang · Yangxin Wu · Jianhua Han · Hang Xu · Chunjing XU · Xiaodan Liang -
2022 Poster: CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation »
Zicheng Zhang · Yi Zhu · Jianzhuang Liu · Xiaodan Liang · Wei Ke -
2021 Workshop: Math AI for Education (MATHAI4ED): Bridging the Gap Between Research and Smart Education »
Pan Lu · Yuhuai Wu · Sean Welleck · Xiaodan Liang · Eric Xing · James McClelland -
2021 : Panel Discussion »
Pascal Poupart · Ali Ghodsi · Luke Zettlemoyer · Sameer Singh · Kevin Duh · Yejin Choi · Lu Hou -
2021 : Compression and Acceleration of Pre-trained Language Models »
Lu Hou -
2021 Poster: SOFT: Softmax-free Transformer with Linear Complexity »
Jiachen Lu · Jinghan Yao · Junge Zhang · Xiatian Zhu · Hang Xu · Weiguo Gao · Chunjing XU · Tao Xiang · Li Zhang -
2021 Poster: Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN »
Zhenyu Xie · Zaiyu Huang · Fuwei Zhao · Haoye Dong · Michael Kampffmeyer · Xiaodan Liang -
2021 Poster: Post-Training Quantization for Vision Transformer »
Zhenhua Liu · Yunhe Wang · Kai Han · Wei Zhang · Siwei Ma · Wen Gao -
2021 Poster: Transformer in Transformer »
Kai Han · An Xiao · Enhua Wu · Jianyuan Guo · Chunjing XU · Yunhe Wang -
2021 Poster: An Empirical Study of Adder Neural Networks for Object Detection »
Xinghao Chen · Chang Xu · Minjing Dong · Chunjing XU · Yunhe Wang -
2021 Poster: Learning Frequency Domain Approximation for Binary Neural Networks »
Yixing Xu · Kai Han · Chang Xu · Yehui Tang · Chunjing XU · Yunhe Wang -
2021 Poster: S$^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks »
Xinlin Li · Bang Liu · Yaoliang Yu · Wulong Liu · Chunjing XU · Vahid Partovi Nia -
2021 Oral: Learning Frequency Domain Approximation for Binary Neural Networks »
Yixing Xu · Kai Han · Chang Xu · Yehui Tang · Chunjing XU · Yunhe Wang -
2020 Poster: SCOP: Scientific Control for Reliable Neural Network Pruning »
Yehui Tang · Yunhe Wang · Yixing Xu · Dacheng Tao · Chunjing XU · Chao Xu · Chang Xu -
2020 Poster: Kernel Based Progressive Distillation for Adder Neural Networks »
Yixing Xu · Chang Xu · Xinghao Chen · Wei Zhang · Chunjing XU · Yunhe Wang -
2020 Poster: Model Rubik’s Cube: Twisting Resolution, Depth and Width for TinyNets »
Kai Han · Yunhe Wang · Qiulin Zhang · Wei Zhang · Chunjing XU · Tong Zhang -
2020 Spotlight: Kernel Based Progressive Distillation for Adder Neural Networks »
Yixing Xu · Chang Xu · Xinghao Chen · Wei Zhang · Chunjing XU · Yunhe Wang -
2020 Poster: Residual Distillation: Towards Portable Deep Neural Networks without Shortcuts »
Guilin Li · Junlei Zhang · Yunhe Wang · Chuanjian Liu · Matthias Tan · Yunfeng Lin · Wei Zhang · Jiashi Feng · Tong Zhang -
2020 Poster: Searching for Low-Bit Weights in Quantized Neural Networks »
Zhaohui Yang · Yunhe Wang · Kai Han · Chunjing XU · Chao Xu · Dacheng Tao · Chang Xu -
2020 Poster: AutoSync: Learning to Synchronize for Data-Parallel Distributed Deep Learning »
Hao Zhang · Yuan Li · Zhijie Deng · Xiaodan Liang · Lawrence Carin · Eric Xing -
2020 Poster: Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation »
Yangxin Wu · Gengwei Zhang · Hang Xu · Xiaodan Liang · Liang Lin -
2020 Poster: Towards Interpretable Natural Language Understanding with Explanations as Latent Variables »
Wangchunshu Zhou · Jinyi Hu · Hanlin Zhang · Xiaodan Liang · Maosong Sun · Chenyan Xiong · Jian Tang -
2020 Poster: DynaBERT: Dynamic BERT with Adaptive Width and Depth »
Lu Hou · Zhiqi Huang · Lifeng Shang · Xin Jiang · Xiao Chen · Qun Liu -
2020 Spotlight: DynaBERT: Dynamic BERT with Adaptive Width and Depth »
Lu Hou · Zhiqi Huang · Lifeng Shang · Xin Jiang · Xiao Chen · Qun Liu -
2019 Poster: Heterogeneous Graph Learning for Visual Commonsense Reasoning »
Weijiang Yu · Jingwen Zhou · Weihao Yu · Xiaodan Liang · Nong Xiao -
2019 Spotlight: Heterogeneous Graph Learning for Visual Commonsense Reasoning »
Weijiang Yu · Jingwen Zhou · Weihao Yu · Xiaodan Liang · Nong Xiao -
2019 Poster: Normalization Helps Training of Quantized LSTM »
Lu Hou · Jinhua Zhu · James Kwok · Fei Gao · Tao Qin · Tie-Yan Liu -
2019 Poster: Positive-Unlabeled Compression on the Cloud »
Yixing Xu · Yunhe Wang · Hanting Chen · Kai Han · Chunjing XU · Dacheng Tao · Chang Xu -
2018 Poster: Symbolic Graph Reasoning Meets Convolutions »
Xiaodan Liang · Zhiting Hu · Hao Zhang · Liang Lin · Eric Xing -
2018 Poster: Deep Generative Models with Learnable Knowledge Constraints »
Zhiting Hu · Zichao Yang · Russ Salakhutdinov · LIANHUI Qin · Xiaodan Liang · Haoye Dong · Eric Xing -
2018 Poster: Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation »
Yuan Li · Xiaodan Liang · Zhiting Hu · Eric Xing -
2018 Poster: Hybrid Knowledge Routed Modules for Large-scale Object Detection »
ChenHan Jiang · Hang Xu · Xiaodan Liang · Liang Lin -
2018 Poster: Soft-Gated Warping-GAN for Pose-Guided Person Image Synthesis »
Haoye Dong · Xiaodan Liang · Ke Gong · Hanjiang Lai · Jia Zhu · Jian Yin -
2018 Poster: Learning Versatile Filters for Efficient Convolutional Neural Networks »
Yunhe Wang · Chang Xu · Chunjing XU · Chao Xu · Dacheng Tao -
2017 Poster: Structured Generative Adversarial Networks »
Zhijie Deng · Hao Zhang · Xiaodan Liang · Luona Yang · Shizhen Xu · Jun Zhu · Eric Xing -
2016 Poster: Tree-Structured Reinforcement Learning for Sequential Object Localization »
Zequn Jie · Xiaodan Liang · Jiashi Feng · Xiaojie Jin · Wen Lu · Shuicheng Yan