
First Workshop on Quantum Tensor Networks in Machine Learning
Xiao-Yang Liu · Qibin Zhao · Jacob Biamonte · Cesar Caiafa · Paul Pu Liang · Nadav Cohen · Stefan Leichenauer

Fri Dec 11 08:00 AM -- 07:00 PM (PST)
Event URL: https://tensorworkshop.github.io/NeurIPS2020/

Quantum tensor networks in machine learning (QTNML) are envisioned to have great potential to advance AI technologies. Quantum machine learning promises quantum advantages (potentially exponential speedups in training, quadratic speedup in convergence, etc.) over classical machine learning, while tensor networks provide powerful simulations of quantum machine learning algorithms on classical computers. As a rapidly growing interdisciplinary area, QTNML may serve as an amplifier for computational intelligence, a transformer for machine learning innovations, and a propeller for AI industrialization.

Tensor networks, contracted networks of factor tensors, have arisen independently in several areas of science and engineering. Such networks appear in the description of physical processes, and an accompanying collection of numerical techniques has elevated quantum tensor networks into a variational model of machine learning. Underlying these algorithms is the compression of the high-dimensional data needed to represent quantum states of matter. These compression techniques have recently proven ripe for application to many traditional problems in deep learning. Quantum tensor networks have shown significant power in compactly representing deep neural networks, and in enabling their efficient training and theoretical understanding. More potential QTNML technologies are rapidly emerging, such as approximating probability functions and probabilistic graphical models. However, the topic of QTNML is relatively young, and many open problems remain to be explored.

Quantum algorithms are typically described by quantum circuits (quantum computational networks). These networks are indeed a class of tensor networks, creating an evident interplay between classical tensor network contraction algorithms and the execution of tensor contractions on quantum processors. The modern field of quantum-enhanced machine learning has started to utilize several tools from tensor network theory to create new quantum models of machine learning and to better understand existing ones.

The interplay between tensor networks, machine learning and quantum algorithms is rich. Indeed, this interplay is based not just on numerical methods but on the equivalence of tensor networks to various quantum circuits, rapidly developing algorithms from the mathematics and physics communities for optimizing and transforming tensor networks, and connections to low-rank methods for learning. A merger of tensor network algorithms with state-of-the-art approaches in deep learning is now taking place. A new community is forming, which this workshop aims to foster.

Fri 8:00 a.m. - 8:10 a.m.

A short introduction

Xiao-Yang Liu
Fri 8:10 a.m. - 8:50 a.m.


Amnon Shashua, Nadav Cohen
Fri 8:50 a.m. - 9:30 a.m.


Animashree Anandkumar
Fri 9:30 a.m. - 10:10 a.m.

In this talk, I will cover recent results in two areas: 1) Using quantum-inspired methods in machine learning, including the use of low-entanglement states (matrix product states/tensor-train decompositions) for different regression and classification tasks. 2) Using machine learning methods for efficient classical simulation of quantum systems. I will cover our results on simulating quantum circuits on parallel computers using graph-based algorithms, as well as efficient tensor-train-based numerical optimization methods for the computation of a large number (up to B=100) on GPUs. The code is a combination of classical linear algebra algorithms, Riemannian optimization methods, and an efficient software implementation in TensorFlow.

  1. Rakhuba, M., Novikov, A. and Oseledets, I., 2019. Low-rank Riemannian eigensolver for high-dimensional Hamiltonians. Journal of Computational Physics, 396, pp.718-737.
  2. Schutski, R., Lykov, D. and Oseledets, I., 2020. Adaptive algorithm for quantum circuit simulation. Physical Review A, 101(4), 042335.
  3. Khakhulin, T., Schutski, R. and Oseledets, I., 2019. Graph Convolutional Policy for Solving Tree Decomposition via Reinforcement Learning Heuristics. arXiv preprint arXiv:1910.08371.
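
The matrix product state / tensor-train format mentioned in the abstract above can be illustrated with a small NumPy sketch. This is not the speaker's code (the talk describes a TensorFlow implementation with Riemannian optimization); it only shows the standard TT-SVD construction by sequential truncated SVDs. The function names `tt_svd` and `tt_to_full` and the test tensor are assumptions made for illustration.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Split `tensor` into tensor-train cores using sequential truncated SVDs."""
    dims = tensor.shape
    cores = []
    rank = 1
    mat = tensor.reshape(rank * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        new_rank = min(max_rank, len(s))
        # Core k has shape (left rank, physical dimension, right rank).
        cores.append(u[:, :new_rank].reshape(rank, dims[k], new_rank))
        # Carry the remaining factor to the next step.
        mat = (s[:new_rank, None] * vt[:new_rank]).reshape(new_rank * dims[k + 1], -1)
        rank = new_rank
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract the cores back into a dense tensor (only feasible for small sizes)."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.reshape([c.shape[1] for c in cores])

# Hypothetical test tensor: a sum of two rank-one terms, so TT ranks are at most 2.
rng = np.random.default_rng(0)
vecs = [rng.standard_normal((2, 4)) for _ in range(4)]
tensor = sum(np.einsum('i,j,k,l->ijkl', *(v[t] for v in vecs)) for t in range(2))

cores = tt_svd(tensor, max_rank=2)
error = np.linalg.norm(tt_to_full(cores) - tensor) / np.linalg.norm(tensor)
print([c.shape for c in cores], error)
```

In this toy case the four cores hold 48 numbers in total, against 256 entries in the dense tensor; the saving grows quickly with the number of dimensions when the ranks stay small.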
Ivan Oseledets
Fri 10:10 a.m. - 10:50 a.m.

A hundred years have passed since the Ising model was proposed by Lenz in 1920. The square-lattice Ising model is already an example of a two-dimensional tensor network (TN), formed by contracting 4-leg tensors. In 1941, Kramers and Wannier assumed a variational state in the form of a matrix product state (MPS) and optimized it 'numerically'. Baxter arrived at the concept of the corner transfer matrix (CTM) and performed a variational computation in 1968. Independently of these statistical studies, the MPS was introduced by Affleck, Kennedy, Lieb and Tasaki (AKLT) in 1987 for the study of one-dimensional quantum spin chains, by Derrida for asymmetric exclusion processes, and also (implicitly) through the establishment of the density matrix renormalization group (DMRG) by White in 1992. After a brief (?) introduction to this prehistory, I'll speak about my contribution to this area: the application of DMRG and CTMRG methods to two-dimensional statistical models, including those on hyperbolic lattices, fractal systems, and random spin models. Analysis of the spin-glass state, which is related to learning processes, from the viewpoint of entanglement structure is a target for future studies in this direction.
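
The statement that the square-lattice Ising model is a network of 4-leg tensors can be checked on a tiny example. The sketch below is an illustration only, under assumed conventions: coupling J = 1, no external field, a 2x2 lattice with periodic boundaries, and one standard construction in which the bond Boltzmann matrix is split symmetrically between neighbouring sites. The function names are hypothetical.

```python
import itertools
import numpy as np

def ising_site_tensor(beta):
    """4-leg site tensor (up, left, down, right) for the Ising model, beta > 0."""
    bond = np.array([[np.exp(beta), np.exp(-beta)],
                     [np.exp(-beta), np.exp(beta)]])
    # Symmetric square root of the bond matrix: half @ half.T == bond.
    evals, evecs = np.linalg.eigh(bond)
    half = evecs @ np.diag(np.sqrt(evals)) @ evecs.T
    # Sum over the site spin s; each leg carries half of one bond weight.
    return np.einsum('su,sl,sd,sr->uldr', half, half, half, half)

def partition_function_tn(beta):
    """Contract four site tensors on a 2x2 torus."""
    t = ising_site_tensor(beta)
    # Every index letter appears on exactly two tensors (one shared bond).
    return np.einsum('abcd,edfb,cgah,fheg->', t, t, t, t)

def partition_function_brute(beta, size=2):
    """Sum Boltzmann weights over all spin configurations of a size x size torus."""
    total = 0.0
    for spins in itertools.product([-1, 1], repeat=size * size):
        s = np.array(spins).reshape(size, size)
        energy = -np.sum(s * np.roll(s, 1, axis=0)) - np.sum(s * np.roll(s, 1, axis=1))
        total += np.exp(-beta * energy)
    return total

beta = 0.4
print(partition_function_tn(beta), partition_function_brute(beta))
```

The two numbers agree because summing over each shared bond index reproduces the Boltzmann weight of that bond. On large lattices the brute-force sum is infeasible, which is where approximate contraction schemes such as the CTMRG method named in the abstract come in.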

Tomotoshi Nishino
Fri 10:50 a.m. - 11:30 a.m.

I will provide an overview of the tensor network formalism and its applications, and discuss the key operations, such as tensor contractions, required for building tensor network algorithms. I will also demonstrate the TensorTrace graphical interface, a software tool which is designed to allow users to implement and code tensor network routines easily and effectively. Finally, the utility of tensor networks towards tasks in machine learning will be briefly discussed.

Glen Evenbly
Fri 11:30 a.m. - 12:10 p.m.

In this talk, I will present uniform tensor network models (also known as translation-invariant tensor networks), which are particularly suited for modelling structured data such as sequences and trees. Uniform tensor networks are tensor networks in which the core tensors appearing in the decomposition of a given tensor are all equal, which can be seen as a weight-sharing mechanism in tensor networks. In the first part of the talk, I will show how uniform tensor networks are particularly suited to representing functions defined over sets of structured objects such as sequences and trees. I will then present how these models are related to classical computational models such as hidden Markov models, weighted automata, second-order recurrent neural networks, and context-free grammars. In the second part of the talk, I will present a classical learning algorithm for weighted automata and show how it can be interpreted as a means of converting non-uniform tensor networks into uniform ones. Lastly, I will present ongoing work leveraging the tensor network formalism to design efficient and versatile probabilistic models for sequence data.
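
As a minimal sketch of the link between uniform tensor networks and weighted automata, the example below uses a hypothetical two-state automaton (not taken from the talk) that counts occurrences of symbol 0 in a sequence. The same core tensor is reused at every position, and fixing the sequence length to 3 yields an ordinary matrix-product tensor with identical cores.

```python
import numpy as np

# Hypothetical weighted automaton over the alphabet {0, 1}.
alpha = np.array([1.0, 0.0])            # initial weights
omega = np.array([0.0, 1.0])            # final weights
core = np.zeros((2, 2, 2))              # (left state, symbol, right state)
core[:, 0, :] = [[1.0, 1.0], [0.0, 1.0]]  # reading symbol 0 adds one to the count
core[:, 1, :] = np.eye(2)                 # reading symbol 1 changes nothing

def wfa_value(sequence):
    """Evaluate alpha^T A[x1] A[x2] ... A[xk] omega for a sequence of any length."""
    state = alpha
    for symbol in sequence:
        state = state @ core[:, symbol, :]
    return float(state @ omega)

# The same function restricted to length-3 sequences, as a uniform tensor network:
# three copies of the shared core contracted with the boundary vectors.
length3 = np.einsum('i,iaj,jbk,kcl,l->abc', alpha, core, core, core, omega)

print(wfa_value([0, 1, 0, 0]))   # number of zeros in the sequence
print(length3[0, 1, 0])          # entry for the sequence (0, 1, 0)
```

Because the core is shared, the model has a fixed number of parameters yet defines a value for sequences of every length, which is the weight-sharing property described in the abstract.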

Guillaume Rabusseau
Fri 12:10 p.m. - 1:10 p.m.

Questions and Future Directions

Jacob Biamonte, Qibin Zhao, Paul Liang, Cesar Caiafa, Stefan Leichenauer, Xiao-Yang Liu
Fri 1:10 p.m. - 1:50 p.m.


Paul Springer
Fri 1:50 p.m. - 2:30 p.m.

TensorNetwork is an open-source Python package for tensor network computations. It has been designed to help researchers and engineers rapidly develop highly efficient tensor network algorithms for physics and machine learning applications. After a brief introduction to tensor networks, I will discuss some of the main design principles of the TensorNetwork package and show how one can use it to speed up tensor network algorithms by running them on accelerated hardware or by exploiting tensor sparsity.

Martin Ganahl
Fri 2:30 p.m. - 3:10 p.m.

Multivariate spatiotemporal data is ubiquitous in science and engineering, from climate science to sports analytics, to neuroscience. Such data contain higher-order correlations and can be represented as a tensor. Tensor latent factor models provide a powerful tool for reducing dimensionality and discovering higher-order structures. However, existing tensor models are often slow or fail to yield interpretable latent factors. In this talk, I will demonstrate advances in tensor methods to generate interpretable latent factors for high-dimensional spatiotemporal data. We provide theoretical guarantees and demonstrate their applications to real-world climate, basketball, and neuroscience data.

Rose Yu
Fri 3:10 p.m. - 3:50 p.m.

Recent years have seen significant interest in exploiting tensor networks in the context of machine learning, both as a tool for formulating new learning algorithms and for enhancing the mathematical understanding of existing methods. In this talk, we will explore two readings of such a connection. On the one hand, we will consider the task of identifying the underlying non-linear governing equations, which are required both for obtaining an understanding and for making future predictions. We will see that this problem can be addressed in a scalable way by making use of tensor-network-based parameterizations of the governing equations. On the other hand, we will investigate the expressive power of tensor networks in probabilistic modelling. Inspired by the connection between tensor networks and machine learning, and by the natural correspondence between tensor networks and probabilistic graphical models, we will provide a rigorous analysis of the expressive power of various tensor-network factorizations of discrete multivariate probability distributions. Joint work with A. Goeßmann, M. Götte, I. Roth, R. Sweke, G. Kutyniok, I. Glasser, N. Pancotti, J. I. Cirac.

Jens Eisert
Fri 3:50 p.m. - 4:30 p.m.

An overview will be given of counting problems on the lattice, such as the calculation of the hard square constant and of the residual entropy of ice. Whereas Monte Carlo techniques have difficulty calculating such quantities, we will demonstrate that tensor networks provide a natural framework for tackling these problems. We will also show that tensor networks reveal nonlocal hidden symmetries in those systems, and that the typical critical behaviour is witnessed by matrix product operators which form representations of tensor fusion categories.
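
To make the hard-square counting problem concrete, here is a minimal sketch that counts configurations on a small open grid with a row-to-row transfer matrix, the simplest instance of contracting a lattice network one row at a time. It is an illustration under assumed conventions (open boundaries, plain Python), not the method of the talk, and the function names are hypothetical.

```python
import itertools

def count_hard_squares(rows, cols):
    """Count 0/1 grids with no two horizontally or vertically adjacent ones."""
    # Valid row states: bit patterns without two neighbouring ones.
    states = [s for s in range(1 << cols) if s & (s >> 1) == 0]
    # Boundary vector: every valid first row counts once.
    counts = {s: 1 for s in states}
    for _ in range(rows - 1):
        # Transfer step: row b may follow row a only if they share no occupied column.
        counts = {b: sum(c for a, c in counts.items() if a & b == 0) for b in states}
    return sum(counts.values())

def count_brute_force(rows, cols):
    """Enumerate every grid directly; only feasible for very small sizes."""
    total = 0
    for bits in itertools.product([0, 1], repeat=rows * cols):
        grid = [bits[r * cols:(r + 1) * cols] for r in range(rows)]
        ok = all(
            not (grid[r][c] and ((c + 1 < cols and grid[r][c + 1])
                                 or (r + 1 < rows and grid[r + 1][c])))
            for r in range(rows) for c in range(cols)
        )
        total += ok
    return total

for n in (2, 3, 4):
    print(n, count_hard_squares(n, n))   # 7, 63, 1234
```

The hard square constant is the growth rate of these counts per site as the lattice becomes large. The transfer-matrix cost grows exponentially with the strip width, which is the limitation that approximate tensor network contraction is meant to address.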

Frank Verstraete
Fri 4:30 p.m. - 4:42 p.m.

To date, scalable methods for data-driven identification of non-linear governing equations do not exploit or offer insight into fundamental underlying physical structure. In this work, we show that various physical constraints can be captured via tensor-network-based parameterizations for the governing equation, which naturally ensures scalability. In addition to providing analytic results motivating the use of such models for realistic physical systems, we demonstrate that efficient rank-adaptive optimization algorithms can be used to learn optimal tensor network models without requiring a priori knowledge of the exact tensor ranks.

Alex Goeßmann
Fri 4:42 p.m. - 4:54 p.m.

Originating from condensed matter physics, tensor networks are compact representations of high-dimensional tensors. In this paper, the prowess of tensor networks is demonstrated on the particular task of one-class anomaly detection. We exploit the memory and computational efficiency of tensor networks to learn a linear transformation over a space with dimension exponential in the number of original features. The linearity of our model enables us to ensure a tight fit around training instances by penalizing the model's global tendency to predict normality via its Frobenius norm, a task that is infeasible for most deep learning models. Our method outperforms deep and classical algorithms on tabular datasets and produces competitive results on image datasets, despite not exploiting the locality of images.

Jensen Wang
Fri 4:54 p.m. - 5:06 p.m.

Neural networks have achieved state-of-the-art results in many areas, supposedly due to parameter sharing, locality, and depth. Tensor networks (TNs) are linear algebraic representations of quantum many-body states based on their entanglement structure. TNs have found use in machine learning. We devise a novel TN-based model called the Deep convolutional tensor network (DCTN) for image classification, which has parameter sharing, locality, and depth. It is based on the Entangled plaquette states (EPS) TN. We show how EPS can be implemented as a backpropagatable layer. We test DCTN on the MNIST, FashionMNIST, and CIFAR10 datasets. A shallow DCTN performs well on MNIST and FashionMNIST and has a small parameter count. Unfortunately, depth increases overfitting and thus decreases test accuracy. Also, DCTN of any depth performs badly on CIFAR10 due to overfitting. It is to be determined why. We discuss how the hyperparameters of DCTN affect its training and overfitting.

Philip Blagoveschensky
Fri 5:06 p.m. - 5:18 p.m.

Nonlocality is an important constituent of quantum physics that lies at the heart of many striking features of quantum states, such as entanglement. An important category of highly entangled quantum states is the Greenberger-Horne-Zeilinger (GHZ) states, which play key roles in various quantum-based technologies and are of particular interest for benchmarking noisy quantum hardware. A novel quantum-inspired generative model known as the Born Machine, which leverages the probabilistic nature of quantum physics, has shown great success in learning classical and quantum data over tensor network (TN) architectures. To this end, we investigate the task of training the Born Machine to learn the GHZ state over two different tensor network architectures. Our results indicate that gradient-based training schemes over the TN Born Machine fail to learn the non-local information of the coherent superposition (or parity) of the GHZ state. This raises the important question of what kind of architecture design, initialization, and optimization schemes would be more suitable for learning the non-local information hidden in the quantum state, and whether quantum-inspired training algorithms can be adapted to learn such quantum states.
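
For orientation, the GHZ state has an exact matrix product state representation with bond dimension 2. The sketch below builds it and also illustrates one reading of the "parity" information mentioned above: the states with a plus and a minus sign have identical Born probabilities in the computational basis and differ only in a global observable. It does not reproduce the training experiments of this work; the function names and the `sign` parameter are assumptions for illustration.

```python
import numpy as np

def ghz_mps(n, sign=1.0):
    """MPS for (|00...0> + sign * |11...1>) / sqrt(2) with bond dimension 2."""
    core = np.zeros((2, 2, 2))           # (left bond, physical, right bond)
    core[0, 0, 0] = 1.0                  # bond value 0 carries the all-zeros branch
    core[1, 1, 1] = 1.0                  # bond value 1 carries the all-ones branch
    left = np.array([1.0, 1.0])
    right = np.array([1.0, sign]) / np.sqrt(2)
    return left, [core] * n, right

def mps_to_state(left, cores, right):
    """Contract the MPS into a dense state vector of length 2**n."""
    psi = left
    for core in cores:
        psi = np.tensordot(psi, core, axes=([-1], [0]))
    psi = np.tensordot(psi, right, axes=([-1], [0]))
    return psi.reshape(-1)

n = 4
plus = mps_to_state(*ghz_mps(n, sign=1.0))
minus = mps_to_state(*ghz_mps(n, sign=-1.0))

# Born probabilities in the computational basis are the same for both signs.
print(np.allclose(plus ** 2, minus ** 2))

# X on every qubit flips all bits, which reverses the order of the state vector,
# so the expectation value of X...X is the overlap with the reversed vector.
print(plus @ plus[::-1], minus @ minus[::-1])
```

Samples drawn in the computational basis alone therefore cannot distinguish the two states; the relative sign only shows up in measurements that involve all qubits jointly.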

Khadijeh Najafi
Fri 5:18 p.m. - 5:30 p.m.

We consider high-order learning models in which the weight tensor is represented by a (symmetric) tensor network (TN) decomposition. Although such models have been widely used on various tasks, it is challenging to determine the optimal order in complex systems (e.g., deep neural networks). To tackle this issue, we introduce a new notion of fractional tensor network (FrTN) decomposition, which generalizes conventional TN models of integer order by allowing the order to be an arbitrary fraction. Because fractions are dense in the real numbers, the order of the model can be formulated as a learnable parameter and simply optimized by stochastic gradient descent (SGD) and its variants. Moreover, we find that FrTN is closely connected to well-known methods in deep learning such as ℓ_p-pooling (Gulcehre et al., 2014) and "squeeze-and-excitation" operations (Hu et al., 2018). On the numerical side, we apply the proposed model to enhance the classic ResNet-26/50 (He et al., 2016) and MobileNet-v2 (Sandler et al., 2018) on both the CIFAR-10 and ILSVRC-12 classification tasks, and the results demonstrate the effectiveness of the learnable order parameters in FrTN.

Chao Li
Fri 5:30 p.m. - 6:10 p.m.

We present a new approach to quantum process tomography, the reconstruction of an unknown quantum channel from measurement data. Specifically, we combine a tensor-network representation of the Choi matrix (a complete description of a quantum channel), with unsupervised machine learning of single-shot projective measurement data. We show numerical experiments for both unitary and noisy quantum circuits, for a number of qubits well beyond the reach of standard process tomography techniques.
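
As background for the Choi matrix mentioned above, the sketch below builds it densely for a single-qubit amplitude-damping channel and checks that it reproduces the channel. It only illustrates the definition, under an assumed convention (input index first, unnormalized); the talk's approach instead represents the Choi matrix as a tensor network and learns it from measurement data. The function names and the damping strength are hypothetical.

```python
import numpy as np

def apply_kraus(kraus_ops, rho):
    """Apply a channel given by Kraus operators: sum_i K_i rho K_i^dagger."""
    return sum(k @ rho @ k.conj().T for k in kraus_ops)

def choi_matrix(kraus_ops, dim):
    """Choi matrix J = sum_{jk} |j><k| (x) E(|j><k|), with the input factor first."""
    choi = np.zeros((dim * dim, dim * dim), dtype=complex)
    for j in range(dim):
        for k in range(dim):
            unit = np.zeros((dim, dim), dtype=complex)
            unit[j, k] = 1.0
            choi += np.kron(unit, apply_kraus(kraus_ops, unit))
    return choi

def apply_choi(choi, rho, dim):
    """Recover E(rho) from J: E(rho)[o, p] = sum_{a, b} rho[a, b] J[a, o, b, p]."""
    j4 = choi.reshape(dim, dim, dim, dim)
    return np.einsum('ab,aobp->op', rho, j4)

# Amplitude damping with an assumed damping probability gamma.
gamma = 0.3
kraus = [np.array([[1.0, 0.0], [0.0, np.sqrt(1 - gamma)]]),
         np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])]
choi = choi_matrix(kraus, dim=2)

rho = np.array([[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]])
print(np.allclose(apply_choi(choi, rho, 2), apply_kraus(kraus, rho)))

# Complete positivity: J is positive semidefinite.
print(np.linalg.eigvalsh(choi).min() > -1e-12)
# Trace preservation: tracing out the output factor gives the identity.
print(np.allclose(np.einsum('aobo->ab', choi.reshape(2, 2, 2, 2)), np.eye(2)))
```

The dense matrix has 4^n x 4^n entries for n qubits, which is why a compressed representation is needed to go beyond a handful of qubits.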

Giacomo Torlai
Fri 6:10 p.m. - 6:50 p.m.

In this talk, we study high performance computation for tensor networks to address time and space complexities that grow rapidly with the tensor size. We propose efficient primitives that exploit parallelism in tensor learning for efficient implementation on GPU.

Anwar Walid, Xiao-Yang Liu
Fri 6:50 p.m. - 7:00 p.m.


Xiao-Yang Liu

Author Information

Xiao-Yang Liu (Columbia University)

Xiao-Yang Liu is an SPC member of IJCAI; a TPC member of NeurIPS, ICML, ICLR, AAAI, and AISTATS; and a reviewer for IEEE PAMI, TNNLS, TIT, TPDS, TMC, TIP and TSP. He is a PhD candidate at Columbia University.

Qibin Zhao (RIKEN AIP)
Jacob Biamonte (Skolkovo Institute of Science and Technology)
Cesar Caiafa (CONICET/UBA)
Paul Pu Liang (Carnegie Mellon University)
Nadav Cohen (Tel Aviv University)
Stefan Leichenauer (X, The Moonshot Factory)
