Tensor-based multimodal fusion techniques have shown strong predictive performance. However, existing approaches consider only bilinear or trilinear pooling; restricting the order of interactions in this way fails to exploit the full expressive power of multilinear fusion. More importantly, fusing all features at once ignores complex local intercorrelations, which degrades prediction. In this work, we first propose a polynomial tensor pooling (PTP) block that integrates multimodal features by considering high-order moments, followed by a tensorized fully connected layer. Treating PTP as a building block, we further establish a hierarchical polynomial fusion network (HPFN) that recursively transmits local correlations into global ones. By stacking multiple PTPs, the expressive capacity of HPFN grows exponentially w.r.t. the number of layers, which is shown by its equivalence to a very deep convolutional arithmetic circuit. Various experiments demonstrate that it achieves state-of-the-art performance.
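To make the pooling idea in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the modality features are concatenated with a constant 1 (so that lower-order terms survive), that the high-order moments are formed by a repeated outer product, and that the "tensorized fully connected layer" is a dense weight tensor contracted against the pooled tensor. The function name `polynomial_tensor_pooling`, the argument names, and the dense (rather than low-rank) weight are all illustrative assumptions.

```python
import numpy as np

def polynomial_tensor_pooling(features, weight, order):
    """Sketch of an order-`order` polynomial pooling block (hypothetical API).

    features: list of 1-D arrays, one per modality.
    weight:   array of shape (out_dim,) + (d,) * order,
              where d = 1 + total length of all features.
    """
    # Prepend a constant 1 so monomials of degree < order are also present.
    f = np.concatenate([np.ones(1)] + [np.asarray(x, dtype=float) for x in features])

    # Repeated outer product: an order-`order` tensor holding every
    # degree-`order` monomial of the entries of f.
    pooled = f
    for _ in range(order - 1):
        pooled = np.multiply.outer(pooled, f)

    # Dense stand-in for the tensorized fully connected layer:
    # contract all feature axes of `weight` with the pooled tensor.
    return np.tensordot(weight, pooled, axes=order)

# Toy usage: two modalities, second-order pooling, all-ones weight.
a = np.array([1.0, 2.0])
b = np.array([3.0])
d = 1 + a.size + b.size                      # 4
w = np.ones((1, d, d))
z = polynomial_tensor_pooling([a, b], w, order=2)
print(z)                                     # (1 + 1 + 2 + 3) ** 2 = 49
```

The dense weight grows as d to the power of the order, so a practical version would presumably factorize it; that choice is left out here to keep the sketch short. Stacking such blocks over local groups of features, then pooling their outputs again, would give the hierarchical arrangement the abstract describes.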
Author Information
Ming Hou (RIKEN AIP)
Jiajia Tang (Hangzhou Dianzi University / RIKEN AIP)
Jianhai Zhang (Hangzhou Dianzi University)
Wanzeng Kong (Hangzhou Dianzi University)
Qibin Zhao (RIKEN AIP)
More from the Same Authors
- 2020 Workshop: First Workshop on Quantum Tensor Networks in Machine Learning
  Xiao-Yang Liu · Qibin Zhao · Jacob Biamonte · Cesar F Caiafa · Paul Pu Liang · Nadav Cohen · Stefan Leichenauer
- 2011 Poster: A Multilinear Subspace Regression Method Using Orthogonal Tensors Decompositions
  Qibin Zhao · Cesar F Caiafa · Danilo Mandic · Liqing Zhang · Tonio Ball · Andreas Schulze-Bonhage · Andrzej S Cichocki
- 2011 Spotlight: A Multilinear Subspace Regression Method Using Orthogonal Tensors Decompositions
  Qibin Zhao · Cesar F Caiafa · Danilo Mandic · Liqing Zhang · Tonio Ball · Andreas Schulze-Bonhage · Andrzej S Cichocki