Recent years have seen a surge in research at the intersection of differential geometry and deep learning, including techniques for stochastic optimization on curved spaces (e.g., hyperbolic or spherical manifolds), learning embeddings for non-Euclidean data, and generative modeling on Riemannian manifolds. Insights from differential geometry have led to new state-of-the-art approaches to modeling complex real-world data, such as graphs with hierarchical structure, 3D medical data, and meshes.
Thus, it is of critical importance to understand, through a geometric lens, the natural invariances, equivariances, and symmetries that reside within data.
To support the burgeoning interest in differential geometry for deep learning, the primary goal of this workshop is to facilitate community building and to work towards identifying the key challenges relative to regular deep learning, along with techniques to overcome them. With many new researchers beginning projects in this area, we hope to bring them together to consolidate this fast-growing area into a healthy and vibrant subfield. In particular, we aim to strongly promote novel and exciting applications of differential geometry for deep learning, with an emphasis on bridging theory and practice. This is reflected in our choice of invited speakers, who include both machine learning practitioners and researchers who are primarily geometers.
Fri 5:00 a.m. - 3:00 p.m.

gather.town
(Social)
Gather.Town for the poster sessions: https://gather.town/app/jXxVw7lqYrgIZ2zL/diffgeo4dl
Fri 5:45 a.m. - 6:00 a.m.

Opening Remarks
(Presentation)

Joey Bose
Fri 6:00 a.m. - 6:30 a.m.

Invited Talk 1: Geometric deep learning for 3D human body synthesis
(Talk by Michael Bronstein)
Geometric deep learning, a new class of ML methods trying to extend the basic building blocks of deep neural architectures to geometric data (point clouds, graphs, and meshes), has recently excelled in many challenging analysis tasks in computer vision and graphics such as deformable 3D shape correspondence. In this talk, I will present recent research efforts in 3D shape synthesis, focusing in particular on the human body, face, and hands.
Michael Bronstein
Fri 6:30 a.m. - 7:00 a.m.

Invited Talk 2: Gauge Theory in Geometric Deep Learning
(Talk by Taco Cohen)
It is often said that differential geometry is in essence the study of connections on a principal bundle. These notions have been discovered independently in gauge theory in physics, and over the last few years it has become clear that they also provide a very general and systematic way to model convolutional neural networks on homogeneous spaces and general manifolds. Specifically, representation spaces in these networks are described as fields of geometric quantities on a manifold (i.e., sections of associated vector bundles). These quantities can only be expressed numerically after making an arbitrary choice of frame / gauge (section of a principal bundle). Network layers map between representation spaces, and should be equivariant to symmetry transformations. In this talk I will discuss two results that have a bearing on geometric deep learning research. First, we discuss the “convolution is all you need” theorem, which states that any linear equivariant map between homogeneous representation spaces is a generalized convolution. Secondly, in the case of gauge symmetry (when all frames should be considered equivalent), we show that defining a nontrivial equivariant linear map between representation spaces requires the introduction of a principal connection which defines parallel transport. We will not assume familiarity with bundles or gauge theory, and use examples relevant to neural networks to illustrate the ideas.
Taco Cohen
Fri 7:00 a.m. - 7:05 a.m.

Contributed Talk 1: Learning Hyperbolic Representations for Unsupervised 3D Segmentation
(Contributed Talk)
There exists a need for unsupervised 3D segmentation on complex volumetric data, particularly when annotation ability is limited or discovery of new categories is desired. Using the observation that 3D data is innately hierarchical, we propose learning effective representations of 3D patches for unsupervised segmentation through a variational autoencoder with a hyperbolic latent space and a proposed gyroplane convolutional layer, which better models underlying hierarchical structure within a 3D image. We also introduce a hierarchical triplet loss and multi-scale patch sampling scheme to embed relationships across varying levels of granularity. We demonstrate the effectiveness of our hyperbolic representations for unsupervised 3D segmentation on a hierarchical toy dataset and the BraTS dataset.
Joy Hsu · Jeffrey Gu · Serena Yeung
Fri 7:06 a.m. - 7:11 a.m.

Contributed Talk 2: Witness Autoencoder: Shaping the Latent Space with Witness Complexes
(Contributed Talk 2)
We present a Witness Autoencoder (WAE), an autoencoder that captures geodesic distances of the data in the latent space. Our algorithm uses witness complexes to compute geodesic distance approximations on a mini-batch level, and leverages topological information from the entire dataset while performing batch-wise approximations. This way, our method can capture the global structure of the data even with a small batch size, which is beneficial for large-scale real-world data. We show that our method captures the structure of the manifold more accurately than the recently introduced topological autoencoder (TopoAE).
Anastasiia Varava · Danica Kragic · Simon Schönenberger · Jen Jen Chung · Roland Siegwart · Vladislav Polianskii
Fri 7:12 a.m. - 7:17 a.m.

Contributed Talk 3: A Riemannian gradient flow perspective on learning deep linear neural networks
(Contributed Talk 3)
We study the convergence of gradient flows related to learning deep linear neural networks from data. In this case, the composition of the network layers amounts to simply multiplying the weight matrices of all layers together, resulting in an overparameterized problem. The gradient flow with respect to these factors can be reinterpreted as a Riemannian gradient flow on the manifold of rank-$r$ matrices endowed with a suitable Riemannian metric. We show that the flow always converges to a critical point of the underlying functional. Moreover, we establish that, for almost all initializations, the flow converges to a global minimum on the manifold of rank-$k$ matrices for some $k\leq r$.

Ulrich Terstiege · Holger Rauhut · Bubacarr Bah · Michael Westdickenberg
Fri 7:18 a.m. - 7:23 a.m.

Contributed Talk 4: Directional Graph Networks
(Contributed Talk 4)
To overcome the expressive limitations of graph neural networks (GNNs), we propose the first method that exploits vector flows over graphs to develop globally consistent directional and asymmetric aggregation functions. We show that our directional graph networks (DGNs) generalize convolutional neural networks (CNNs) when applied on a grid. Whereas recent theoretical works focus on understanding local neighbourhoods, local structures and local isomorphism with no global information flow, our novel theoretical framework allows directional convolutional kernels in any graph. First, by defining a vector field in the graph, we develop a method of applying directional derivatives and smoothing by projecting node-specific messages into the field. Then we propose the use of the Laplacian eigenvectors as such a vector field. Finally, we bring the power of CNN data augmentation to graphs by providing a means of doing reflection and rotation on the underlying directional field.
Dominique Beaini · Saro Passaro · Vincent Létourneau · Will Hamilton · Gabriele Corso · Pietro Liò
Fri 7:24 a.m. - 7:29 a.m.

Contributed Talk 5: A New Neural Network Architecture Invariant to the Action of Symmetry Subgroups
(Contributed Talk 5)
We propose a computationally efficient $G$-invariant neural network that approximates functions invariant to the action of a given permutation subgroup $G \leq S_n$ of the symmetric group on input data. The key element of the proposed network architecture is a new $G$-invariant transformation module, which produces a $G$-invariant latent representation of the input data.
Theoretical considerations are supported by numerical experiments, which demonstrate the effectiveness and strong generalization properties of the proposed method in comparison to other $G$-invariant neural networks.

Mete Ozay · Piotr Kicki · Piotr Skrzypczynski
Fri 7:30 a.m. - 8:00 a.m.

Virtual Coffee Break on Gather.Town (Break)
Fri 8:00 a.m. - 8:30 a.m.

Invited Talk 3: Reparametrization invariance in representation learning
(Talk by Søren Hauberg)
Generative models learn a compressed representation of data that is often used for downstream tasks such as interpretation, visualization and prediction via transfer learning. Unfortunately, the learned representations are generally not statistically identifiable, leading to a high risk of arbitrariness in the downstream tasks. We propose to use differential geometry to construct representations that are invariant to reparametrizations, thereby solving the bulk of the identifiability problem. We demonstrate that the approach is deeply tied to the uncertainty of the representation, and that practical applications require high-quality uncertainty quantification. With the identifiability problem solved, we show how to construct better priors for generative models, and that the identifiable representations reveal signals in the data that were otherwise hidden.
Søren Hauberg
Fri 8:30 a.m. - 9:30 a.m.

Poster Session 1 on Gather.Town (Poster Session)
Joey Bose · Ines Chami
Fri 8:30 a.m. - 9:30 a.m.

Quaternion Graph Neural Networks
(Poster)
Recently, graph neural networks (GNNs) have become a principal research direction for learning low-dimensional continuous embeddings of nodes and graphs to predict node and graph labels, respectively. However, Euclidean embeddings have high distortion when using GNNs to model complex graphs such as social networks. Furthermore, existing GNNs are not very efficient, since the number of model parameters grows quickly as the number of hidden layers increases. Therefore, we move beyond the Euclidean space to a hypercomplex vector space to improve graph representation quality and reduce the number of model parameters. To this end, we propose quaternion graph neural networks (QGNN) to generalize GCNs within the Quaternion space to learn quaternion embeddings for nodes and graphs. The Quaternion space, a hypercomplex vector space, provides highly meaningful computations through the Hamilton product compared to the Euclidean and complex vector spaces. As a result, our QGNN can reduce the model size up to four times and learn better graph representations. Experimental results show that the proposed QGNN produces state-of-the-art accuracies on a range of well-known benchmark datasets for three downstream tasks, including graph classification, semi-supervised node classification, and text (node) classification.
Dai Quoc Nguyen · Tu Dinh Nguyen · Dinh Phung
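For readers unfamiliar with the Hamilton product mentioned in this abstract, the following is a minimal sketch of standard quaternion multiplication in plain Python. It is not taken from the QGNN authors' code; the tuple layout `(real, i, j, k)` and the function name `hamilton_product` are assumptions made for illustration.

```python
from typing import Tuple

# A quaternion a + b*i + c*j + d*k stored as (a, b, c, d).
# This layout is an illustrative assumption, not the authors' data structure.
Quaternion = Tuple[float, float, float, float]


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Return the Hamilton product p * q (non-commutative)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )
```

The product is not commutative: multiplying the unit quaternions i and j gives k, while j times i gives -k.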
Fri 8:30 a.m. - 9:30 a.m.

Universal Approximation Property of Neural Ordinary Differential Equations
(Poster)
Neural ordinary differential equations (NODEs) are an invertible neural network architecture that is promising for its free-form Jacobian and the availability of a tractable Jacobian determinant estimator. Recently, the representation power of NODEs has been partly uncovered: they form an $L^p$-universal approximator for continuous maps under certain conditions. However, $L^p$-universality may fail to guarantee an approximation for the entire input domain, as it may still hold even if the approximator largely differs from the target function on a small region of the input space. To further uncover the potential of NODEs, we show their stronger approximation property, namely $\sup$-universality for approximating a large class of diffeomorphisms. It is shown by leveraging a structure theorem of the diffeomorphism group, and the result complements the existing literature by establishing a fairly large set of mappings that NODEs can approximate with a stronger guarantee.

Takeshi Teshima · Koichi Tojo · Masahiro Ikeda · Isao Ishikawa · Kenta Oono
Fri 8:30 a.m. - 9:30 a.m.

Hermitian Symmetric Spaces for Graph Embeddings
(Poster)
Learning faithful graph representations as sets of vertex embeddings has become a fundamental intermediary step in a wide range of machine learning applications. The quality of the embeddings is usually determined by how well the geometry of the target space matches the structure of the data. In this work we learn continuous representations of graphs in spaces of symmetric matrices over C. These spaces offer a rich geometry that simultaneously admits hyperbolic and Euclidean subspaces, and are amenable to analysis and explicit computations. We implement an efficient method to learn embeddings and compute distances, and develop the tools to operate with such spaces. The proposed models are able to automatically adapt to very dissimilar arrangements without any a priori estimates of graph features. On various datasets with very diverse structural properties and reconstruction measures, our model matches the results of competitive baselines for geometrically pure graphs and outperforms them for graphs with mixed geometric features, showcasing the versatility of our approach.
Federico López · Beatrice Pozzetti · Steve Trettel · Anna Wienhard
Fri 8:30 a.m. - 9:30 a.m.

Isometric Gaussian Process Latent Variable Model
(Poster)
We propose a fully generative unsupervised model where the latent variable respects both the distances and the topology of the modeled data. The model leverages the Riemannian geometry of the generated manifold to endow the latent space with a well-defined stochastic distance measure, which is modeled as Nakagami distributions. These stochastic distances are sought to be as similar as possible to observed distances along a neighborhood graph through a censoring process. The model is inferred by variational inference. We demonstrate how the new model can encode invariances in the learned manifolds.
Martin Jørgensen · Søren Hauberg
Fri 8:30 a.m. - 9:30 a.m.

Grassmann Iterative Linear Discriminant Analysis with Proxy Matrix Optimization
(Poster)
Linear Discriminant Analysis (LDA) is one of the most common methods for dimensionality reduction in pattern recognition and statistics. It is a supervised method that aims to find the most discriminant space in the reduced dimensional space, which can be further used with a linear classifier for classification. In this work, we present an iterative optimization method called Proxy Matrix Optimization (PMO), which makes use of automatic differentiation and stochastic gradient descent (SGD) on the Grassmann manifold to arrive at the optimal projection matrix. We show that PMO outperforms the prevailing manifold optimization methods.
Navya Nagananda · Breton Minnehan · Andreas Savakis
Fri 8:30 a.m. - 9:30 a.m.

Tree Covers: An Alternative to Metric Embeddings
(Poster)
We study the problem of finding distance-preserving graph representations. Most previous approaches focus on learning continuous embeddings in metric spaces such as Euclidean or hyperbolic spaces. Based on the observation that embedding into a metric space is not necessary to produce faithful representations, we explore a new conceptual approach to represent graphs using a collection of trees, namely a tree cover. We show that with the same amount of storage, covers achieve lower distortion than learned metric embeddings. While the distance induced by covers is not a metric, we find that tree covers still have the desirable properties of graph representations, including efficiency in query and construction time.
Roshni Sahoo · Ines Chami · Christopher Ré
Fri 8:30 a.m. - 9:30 a.m.

Deep Networks and the Multiple Manifold Problem
(Poster)
We study the multiple manifold problem, a binary classification task modeled on applications in machine vision, in which a deep fully-connected neural network is trained to separate two low-dimensional submanifolds of the unit sphere. We provide an analysis of the one-dimensional case, proving for a simple manifold configuration that when the network depth $L$ is large relative to certain geometric and statistical properties of the data, the network width $n$ grows as a sufficiently large polynomial in $L$, and the number of i.i.d. samples from the manifolds is polynomial in $L$, randomly-initialized gradient descent rapidly learns to classify the two manifolds perfectly with high probability. Our analysis demonstrates concrete benefits of depth and width in the context of a practically-motivated model problem: the depth acts as a fitting resource, with larger depths corresponding to smoother networks that can more readily separate the class manifolds, and the width acts as a statistical resource, enabling concentration of the randomly-initialized network and its gradients. Along the way, we establish essentially optimal nonasymptotic rates of concentration for the neural tangent kernel of deep fully-connected ReLU networks using martingale techniques, requiring width $n \geq L \,\mathrm{poly}(d_0)$ to achieve uniform concentration of the initial kernel over a $d_0$-dimensional submanifold of the unit sphere. Our approach should be of use in establishing similar results for other network architectures.
Sam Buchanan · Dar Gilboa · John Wright
Fri 8:30 a.m. - 9:30 a.m.

GENNI: Visualising the Geometry of Equivalences for Neural Network Identifiability
(Poster)
In this paper, we propose an efficient algorithm to visualise symmetries in neural networks. Typically the models are defined with respect to a parameter space, where non-equal parameters can produce the same function. Our proposed tool, GENNI, allows us to identify parameters that are functionally equivalent and to then visualise the subspace of the resulting equivalence class. Specifically, we experiment on simple cases, to demonstrate how to identify and provide possible solutions for more complicated scenarios.
Arinbjörn Kolbeinsson · Nicholas Jennings · Marc Deisenroth · Daniel Lengyel · Janith Petangoda · Michalis Lazarou · Kate Highnam · Isak IF Falk
Fri 8:30 a.m. - 9:30 a.m.

Graph of Thrones: Adversarial Perturbations dismantle Aristocracy in Graphs
(Poster)
This paper investigates the effect of adversarial perturbations on the hyperbolicity of graphs. Learning low-dimensional embeddings of graph data in certain curved Riemannian manifolds has recently gained traction due to their desirable property of acting as useful geometrical inductive biases. More specifically, models of hyperbolic geometry such as the Poincaré ball and the hyperboloid model have found extensive applications for learning representations of discrete data such as graphs and trees with hierarchical anatomy. The hyperbolicity concept indicates whether the graph data under consideration is suitable for embedding in hyperbolic geometry. Lower values of hyperbolicity imply distortion-free embedding in hyperbolic space. We study adversarial perturbations that attempt to poison the graph structure, consequently rendering hyperbolic geometry an ineffective choice for learning representations. To circumvent this problem, we advocate for utilizing Lorentzian manifolds in machine learning pipelines and empirically show they are better suited to learn hierarchical relationships. Despite the recent proliferation of adversarial robustness methods for graph data, this is the first work that explores the relationship between adversarial attacks and the hyperbolicity property while also providing a resolution to navigate such vulnerabilities.
Adarsh Jamadandi · Uma Mudenagudi
Fri 8:30 a.m. - 9:30 a.m.

A Metric for Linear Symmetry-Based Disentanglement
(Poster)
The definition of Linear Symmetry-Based Disentanglement (LSBD) proposed by Higgins et al. outlines the properties that should characterize a disentangled representation that captures the symmetries of data. However, it is not clear how to measure the degree to which a data representation fulfills these properties. In this work, we propose a metric for the evaluation of the level of LSBD that a data representation achieves. We provide a practical method to evaluate this metric and use it to evaluate the disentanglement for the data representation obtained for three datasets with underlying SO(2) symmetries.
Luis Armando Pérez Rey · Loek Tonnaer · Vlado Menkovski · Mike Holenderski · Jim Portegies
Fri 9:30 a.m. - 10:15 a.m.

Panel Discussion
(Panel)

Joey Bose · Emile Mathieu · Charline Le Lan · Ines Chami
Fri 10:15 a.m. - 10:45 a.m.

Virtual Coffee Break on Gather.Town (Break)
Fri 10:45 a.m. - 11:30 a.m.

Focused Breakout Session
(Demonstration)
Ines Chami · Joey Bose
Fri 10:45 a.m. 

Focused Breakout Session Companion Notebook: Poincaré Embeddings
(Demonstration)
Link to Google Colab notebook on Poincaré embeddings.
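The notebook itself is not reproduced on this page. As background only, here is a minimal sketch of the geodesic distance on the Poincaré ball, the quantity that Poincaré embeddings are usually trained against. It assumes the unit ball with curvature -1; the function name `poincare_distance` is hypothetical and is not taken from the notebook.

```python
import math
from typing import Sequence


def poincare_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Geodesic distance between two points strictly inside the unit ball.

    Uses d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2))),
    assuming curvature -1.
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))
```

Distances grow without bound as either point approaches the boundary of the ball, which is what lets tree-like hierarchies embed with low distortion.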
Fri 10:45 a.m. 

Focused Breakout Session Companion Notebook: Wrapped Normal Distribution
(Demonstration)
Link to Google Colab notebook on plotting a Wrapped Normal Distribution.
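The notebook is not reproduced here. The sketch below only illustrates the general idea behind a wrapped normal distribution on the Poincaré ball: draw a Gaussian sample in the tangent space and push it onto the manifold with the exponential map. It assumes curvature -1, a distribution centered at the origin, and the convention exp_0(v) = tanh(|v|) v / |v|; conventions differ on how the Gaussian sample is scaled by the conformal factor. The function name and parameters are hypothetical, and plotting is left out.

```python
import math
import random
from typing import List


def sample_wrapped_normal_at_origin(
    n: int, sigma: float = 0.5, dim: int = 2, seed: int = 0
) -> List[List[float]]:
    """Draw n points on the Poincare ball from a wrapped normal centered at 0."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Step 1: isotropic Gaussian sample in the tangent space at the origin.
        v = [rng.gauss(0.0, sigma) for _ in range(dim)]
        r = math.sqrt(sum(x * x for x in v))
        if r == 0.0:
            samples.append([0.0] * dim)
            continue
        # Step 2: exponential map at the origin (assumed convention, see above).
        scale = math.tanh(r) / r
        samples.append([scale * x for x in v])
    return samples
```

Every returned point lies inside the unit ball, so the samples can be passed directly to a scatter plot of the disk.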
Fri 11:30 a.m. - 12:00 p.m.

Invited Talk 4: An introduction to the Calderón and Steklov inverse problems on Riemannian manifolds with boundary
(Talk by Niky Kamran)
Given a compact Riemannian manifold with boundary, the Dirichlet-to-Neumann operator is a non-local map which assigns to data prescribed on the boundary of the manifold the normal derivative of the unique solution of the Laplace-Beltrami equation determined by the given boundary data. Physically, it can be thought of, for example, as a voltage-to-current map in an anisotropic medium in which the conductivity is modeled geometrically through a Riemannian metric. The Calderón problem is the inverse problem of recovering the Riemannian metric from the Dirichlet-to-Neumann operator, while the Steklov inverse problem is to recover the metric from the knowledge of the spectrum of the Dirichlet-to-Neumann operator. These inverse problems are both severely ill-posed. We will give an overview of some of the main results known about these questions, and time permitting, we will discuss the question of stability for the inverse Steklov problem.
Niky Kamran
Fri 12:00 p.m. - 1:00 p.m.

Poster Session 2 on Gather.Town (Poster Session)
Charline Le Lan · Emile Mathieu
Fri 12:00 p.m. - 1:00 p.m.

The Intrinsic Dimension of Images and Its Impact on Learning
(Poster)
It is widely believed that natural image data exhibits low-dimensional structure despite being embedded in a high-dimensional pixel space. This idea underlies a common intuition for the success of deep learning and has been exploited for enhanced regularization and adversarial robustness. In this work, we apply dimension estimation tools to popular datasets and investigate the role of low-dimensional structure in neural network learning. We find that common natural image datasets indeed have very low intrinsic dimension relative to the high number of pixels in the images. Additionally, we find that low-dimensional datasets are easier for neural networks to learn. We validate our findings with carefully designed experiments that vary the intrinsic dimension of both synthetic and real data and evaluate its impact on sample complexity.
Chen Zhu · Micah Goldblum · Ahmed Abdelkader · Tom Goldstein · Phillip Pope
Fri 12:00 p.m. - 1:00 p.m.

Sparsifying networks by traversing Geodesics
(Poster)
The geometry of weight spaces and functional manifolds of neural networks plays an important role in understanding the intricacies of ML. In this paper, we attempt to solve certain open questions in ML by viewing them through the lens of geometry, ultimately relating them to the discovery of points or paths of equivalent function in these spaces. We propose a mathematical framework to evaluate geodesics in the functional space, to find high-performance paths from a dense network to its sparser counterpart. Our results are obtained on VGG-11 trained on CIFAR-10 and MLPs trained on MNIST. Broadly, we demonstrate that the framework is general, and can be applied to a wide variety of problems, ranging from sparsification to alleviating catastrophic forgetting.
Guruprasad Raghavan · Matt Thomson
Fri 12:00 p.m. - 1:00 p.m.

Convex Optimization for Blind Source Separation on a Statistical Manifold
(Poster)
We present a novel blind source separation (BSS) method using a hierarchical structure of sample space that is incorporated with a log-linear model. Our approach is formulated as a convex optimization with theoretical guarantees to uniquely recover a set of source signals by minimizing the KL divergence from a set of mixed signals. Source signals, received signals, and mixing matrices are realized as different layers in our hierarchical sample space. Our empirical results demonstrate superiority over well-established techniques.
Simon Luo · lamiae azizi · Mahito Sugiyama
Fri 12:00 p.m. - 1:00 p.m.

Unsupervised Orientation Learning Using Autoencoders
(Poster)
We present a method to learn the orientation of symmetric objects in real-world images in an unsupervised way. Our method explicitly maps in-plane relative rotations to the latent space of an autoencoder, by rotating both in the image domain and latent domain. This is achieved by adding a proposed crossing loss to a standard autoencoder training framework which enforces consistency between the image domain and latent domain rotations. This relative representation of rotation is made absolute, by using the symmetry of the observed object, resulting in an unsupervised method to learn the orientation. Furthermore, orientation is disentangled in latent space from other descriptive factors. We apply this method on two real-world datasets: aerial images of planes in the DOTA dataset and images of densely packed honeybees. We empirically show this method can learn orientation using no annotations with high accuracy compared to the same models trained with annotations.
Rembert Daems · Francis Wyffels
Fri 12:00 p.m. - 1:00 p.m.

Towards Geometric Understanding of Low-Rank Approximation
(Poster)
Rank reduction of matrices has been widely studied in linear algebra. However, its geometric understanding is limited and its theoretical connection to statistical models remains unclear. We tackle this problem using information geometry and present a unified geometric view of matrix rank reduction. Our key idea is to treat each matrix as a probability distribution represented by the log-linear model on a partially ordered set (poset), which enables us to formulate rank reduction as projection onto a statistical submanifold, which corresponds to the set of low-rank matrices. This geometric view enables us to derive a novel efficient rank-1 reduction method, called Legendre rank-1 reduction, which analytically solves mean-field approximation and minimizes the KL divergence from a given matrix.
Mahito Sugiyama · Kazu Ghalamkari
Fri 12:00 p.m. - 1:00 p.m.

Deep Riemannian Manifold Learning
(Poster)
We present a new class of learnable Riemannian manifolds with a metric parameterized by a deep neural network. The core manifold operations, specifically the Riemannian exponential and logarithmic maps, are solved using approximate numerical techniques. Input and parameter gradients are computed with an adjoint sensitivity analysis. This enables us to fit geodesics and distances with gradient-based optimization of both on-manifold values and the manifold itself. We demonstrate our method's capability to model smooth, flexible metric structures in graph and dynamical system embedding tasks.
Aaron Lou · Maximillian Nickel · Brandon Amos
Fri 12:00 p.m. - 1:00 p.m.

Leveraging Smooth Manifolds for Lexical Semantic Change Detection across Corpora
(Poster)
Comparing two bodies of text and detecting words with significant lexical semantic shift between them is an important part of digital humanities. Traditional approaches have relied on aligning the different embeddings in the Euclidean space using the Orthogonal Procrustes problem. This study presents a geometric framework that leverages optimization on smooth Riemannian manifolds for obtaining corpus-specific orthogonal rotations and a corpus-independent scaling to project the different vector spaces into a shared latent space. This enables us to capture any affine relationship between the embedding spaces while utilising the rich geometry of smooth manifolds.
Anmol Goel · Ponnurangam Kumaraguru
Fri 12:00 p.m. - 1:00 p.m.

Extendable and invertible manifold learning with geometry regularized autoencoders
(Poster)
A fundamental task in data exploration is to extract simplified low-dimensional representations that capture intrinsic geometry in data, especially for faithfully visualizing data in two or three dimensions. Common approaches to this task use kernel methods for manifold learning. However, these methods typically only provide an embedding of fixed input data and cannot extend to new data points. Autoencoders have also recently become popular for representation learning. But while they naturally compute feature extractors that are both extendable to new data and invertible (i.e., reconstructing original features from latent representation), they have limited capabilities to follow global intrinsic geometry compared to kernel-based manifold learning. We present a new method for integrating both approaches by incorporating a geometric regularization term in the bottleneck of the autoencoder. Our regularization, based on the diffusion potential distances from the recently proposed PHATE visualization method, encourages the learned latent representation to follow intrinsic data geometry, similar to manifold learning algorithms, while still enabling faithful extension to new data and reconstruction of data in the original feature space from latent coordinates. We compare our approach with leading kernel methods and autoencoder models for manifold learning to provide qualitative and quantitative evidence of our advantages in preserving intrinsic structure, out-of-sample extension, and reconstruction.
Andres F Duque · Sacha Morin · Guy Wolf · Kevin Moon
Fri 12:00 p.m. - 1:00 p.m.

QuatRE: Relation-Aware Quaternions for Knowledge Graph Embeddings
(Poster)
We propose an effective embedding model, named QuatRE, to learn quaternion embeddings for entities and relations in knowledge graphs. QuatRE aims to enhance correlations between head and tail entities given a relation within the Quaternion space with the Hamilton product. QuatRE achieves this goal by further associating each relation with two relation-aware quaternion vectors which are used to rotate the head and tail entities' quaternion embeddings, respectively. To obtain the triple score, QuatRE rotates the rotated embedding of the head entity using the normalized quaternion embedding of the relation, followed by a quaternion-inner product with the rotated embedding of the tail entity. Experimental results demonstrate that our QuatRE produces state-of-the-art performances on four well-known benchmark datasets for knowledge graph completion.
Dai Quoc Nguyen · Dinh Phung
Fri 12:00 p.m. - 1:00 p.m.

Affinity guided Geometric SemiSupervised Metric Learning
(Poster)
SlidesLive Video » In this paper, we revamp the forgotten classical Semi-Supervised Distance Metric Learning (SSDML) problem from a Riemannian geometric lens, to leverage stochastic optimization within an end-to-end deep framework. The motivation comes from the fact that, apart from a few classical SSDML approaches learning a linear Mahalanobis metric, deep SSDML has not been studied. We first extend existing SSDML methods to their deep counterparts and then propose a new method to overcome their limitations. Due to the nature of the constraints on our metric parameters, we leverage Riemannian optimization. Our deep SSDML method, with a novel affinity-propagation-based triplet mining strategy, outperforms its competitors.
Ujjal Dutta · Mehrtash Harandi · C Chandra Shekhar 🔗 
Fri 1:00 p.m. - 1:30 p.m.

Invited Talk 5: Disentangling Orientation and Camera Parameters from Cryo-Electron Microscopy Images Using Differential Geometry and Variational Autoencoders
(Talk by Nina Miolane)
SlidesLive Video » Cryo-electron microscopy (cryo-EM) is capable of producing reconstructed 3D images of biomolecules at near-atomic resolution. However, raw cryo-EM images are highly corrupted 2D projections of the target 3D biomolecules. Reconstructing the 3D molecular shape requires the estimation of the orientation of the biomolecule that has produced the given 2D image, and the estimation of camera parameters to correct for intensity defects. Current techniques performing these tasks are often computationally expensive, while the dataset sizes keep growing. There is a need for next-generation algorithms that preserve accuracy while improving speed and scalability. In this paper, we use variational autoencoders (VAEs) to learn a low-dimensional latent representation of cryo-EM images. Analyzing the latent space with differential geometry of shape spaces leads us to design a new estimation method for orientation and camera parameters of single-particle cryo-EM images that has the potential to accelerate the traditional reconstruction algorithm.
Nina Miolane 🔗 
Fri 1:30 p.m. - 2:00 p.m.

Invited Talk 6: Learning a robust classifier in hyperbolic space
(Talk by Melanie Weber)
SlidesLive Video » Recently, there has been a surge of interest in representing large-scale, hierarchical data in hyperbolic spaces to achieve better representation accuracy with lower dimensions. However, beyond representation learning, there are few empirical and theoretical results that develop performance guarantees for downstream machine learning and optimization tasks in hyperbolic spaces. In this talk we consider the task of learning a robust classifier in hyperbolic space. We start with algorithmic aspects of developing analogues of classical methods, such as the perceptron or support vector machines, in hyperbolic spaces. We also discuss more broadly the challenges of generalizing such methods to non-Euclidean spaces. Furthermore, we analyze the role of geometry in learning robust classifiers by evaluating the trade-off between low embedding dimensions and low distortion for both Euclidean and hyperbolic spaces.
Melanie Weber 🔗 
Author Information
Joey Bose (McGill/MILA)
I’m a PhD student at the RLLab at McGill/MILA, where I work on Adversarial Machine Learning on Graphs. Previously, I was a Master’s student at the University of Toronto, where I researched crafting Adversarial Attacks on Computer Vision models using GANs. I also interned at Borealis AI, where I worked on applying adversarial learning principles to learn better embeddings (e.g., word embeddings) for Machine Learning models.
Emile Mathieu (University of Oxford)
Charline Le Lan (University of Oxford)
Ines Chami (Stanford University)
Frederic Sala (U. Wisconsin-Madison)
Christopher De Sa (Cornell)
Maximilian Nickel (Facebook)
Christopher Ré (Stanford)
Will Hamilton (McGill)
More from the Same Authors

2021 : Personalized Benchmarking with the Ludwig Benchmarking Toolkit »
Avanika Narayan · Piero Molino · Karan Goel · Willie Neiswanger · Christopher Ré 
2021 : SKM-TEA: A Dataset for Accelerated MRI Reconstruction with Dense Image Labels for Quantitative Clinical Evaluation »
Arjun Desai · Andrew Schmidt · Elka Rubin · Christopher Sandino · Marianne Black · Valentina Mazzoli · Kathryn Stevens · Robert Boutin · Christopher Ré · Garry Gold · Brian Hargreaves · Akshay Chaudhari 
2021 : Combining Recurrent, Convolutional, and Continuous-Time Models with Structured Learnable Linear State-Space Layers »
Isys Johnson · Albert Gu · Karan Goel · Khaled Saab · Tri Dao · Atri Rudra · Christopher Ré 
2022 Poster: S4ND: Modeling Images and Videos as Multidimensional Signals with State Spaces »
Eric Nguyen · Karan Goel · Albert Gu · Gordon Downs · Preey Shah · Tri Dao · Stephen Baccus · Christopher Ré 
2022 Poster: On the Parameterization and Initialization of Diagonal State Space Models »
Albert Gu · Karan Goel · Ankit Gupta · Christopher Ré 
2022 Poster: Contrastive Adapters for Foundation Model Group Robustness »
Michael Zhang · Christopher Ré 
2022 Poster: Decentralized Training of Foundation Models in Heterogeneous Environments »
Binhang Yuan · Yongjun He · Tianyi Zhang · Jared Davis · Tri Dao · Beidi Chen · Percy Liang · Christopher Ré · Ce Zhang 
2022 Poster: Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees »
Jue WANG · Binhang Yuan · Luka Rimanic · Yongjun He · Tri Dao · Beidi Chen · Christopher Ré · Ce Zhang 
2022 Poster: Transform Once: Efficient Operator Learning in Frequency Domain »
Michael Poli · Stefano Massaroli · Federico Berto · Jinkyoo Park · Tri Dao · Christopher Ré · Stefano Ermon 
2022 Poster: Riemannian Diffusion Models »
Chin-Wei Huang · Milad Aghajohari · Joey Bose · Prakash Panangaden · Aaron Courville
2022 Poster: FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness »
Tri Dao · Daniel Fu · Stefano Ermon · Atri Rudra · Christopher Ré 
2022 Poster: Self-supervised learning of brain dynamics from broad neuroimaging data »
Armin Thomas · Christopher Ré · Russell Poldrack 
2022 Poster: Riemannian Score-Based Generative Modelling »
Valentin De Bortoli · Emile Mathieu · Michael Hutchinson · James Thornton · Yee Whye Teh · Arnaud Doucet 
2022 Poster: HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions »
Lingjiao Chen · Zhihua Jin · Evan Sabri Eyuboglu · Christopher Ré · Matei Zaharia · James Zou 
2021 Oral: Moser Flow: Divergence-based Generative Modeling on Manifolds »
Noam Rozen · Aditya Grover · Maximilian Nickel · Yaron Lipman 
2021 Poster: On Contrastive Representations of Stochastic Processes »
Emile Mathieu · Adam Foster · Yee Teh 
2021 Poster: Combining Recurrent, Convolutional, and Continuous-time Models with Linear State Space Layers »
Albert Gu · Isys Johnson · Karan Goel · Khaled Saab · Tri Dao · Atri Rudra · Christopher Ré 
2021 Poster: Representing Hyperbolic Space Accurately using Multi-Component Floats »
Tao Yu · Christopher De Sa 
2021 Poster: Hyperparameter Optimization Is Deceiving Us, and How to Stop It »
A. Feder Cooper · Yucheng Lu · Jessica Forde · Christopher De Sa 
2021 Poster: Equivariant Manifold Flows »
Isay Katsman · Aaron Lou · Derek Lim · Qingxuan Jiang · Ser Nam Lim · Christopher De Sa 
2021 Poster: Rethinking Neural Operations for Diverse Tasks »
Nicholas Roberts · Mikhail Khodak · Tri Dao · Liam Li · Christopher Ré · Ameet Talwalkar 
2021 Poster: Moser Flow: Divergence-based Generative Modeling on Manifolds »
Noam Rozen · Aditya Grover · Maximilian Nickel · Yaron Lipman 
2020 : Charline Le Lan - Perfect density models cannot guarantee anomaly detection »
Charline Le Lan 
2020 : Poster Session 2 on Gather.Town »
Charline Le Lan · Emile Mathieu 
2020 : Deep Riemannian Manifold Learning »
Aaron Lou · Maximilian Nickel · Brandon Amos 
2020 : Focused Breakout Session »
Ines Chami · Joey Bose 
2020 : Panel Discussion »
Joey Bose · Emile Mathieu · Charline Le Lan · Ines Chami 
2020 : Poster Session 1 on Gather.Town »
Joey Bose · Ines Chami 
2020 : Tree Covers: An Alternative to Metric Embeddings »
Roshni Sahoo · Ines Chami · Christopher Ré 
2020 : Contributed Talk 4: Directional Graph Networks »
Dominique Beaini · Saro Passaro · Vincent Létourneau · Will Hamilton · Gabriele Corso · Pietro Liò 
2020 : Opening Remarks »
Joey Bose 
2020 Poster: Riemannian Continuous Normalizing Flows »
Emile Mathieu · Maximilian Nickel 
2020 Poster: Random Reshuffling is Not Always Better »
Christopher De Sa 
2020 Poster: Asymptotically Optimal Exact Minibatch Metropolis-Hastings »
Ruqi Zhang · A. Feder Cooper · Christopher De Sa
2020 Spotlight: Asymptotically Optimal Exact Minibatch Metropolis-Hastings »
Ruqi Zhang · A. Feder Cooper · Christopher De Sa 
2020 Spotlight: Random Reshuffling is Not Always Better »
Christopher De Sa 
2020 Poster: Neural Manifold Ordinary Differential Equations »
Aaron Lou · Derek Lim · Isay Katsman · Leo Huang · Qingxuan Jiang · Ser Nam Lim · Christopher De Sa 
2020 Poster: HiPPO: Recurrent Memory with Optimal Polynomial Projections »
Albert Gu · Tri Dao · Stefano Ermon · Atri Rudra · Christopher Ré 
2020 Poster: Adversarial Example Games »
Joey Bose · Gauthier Gidel · Hugo Berard · Andre Cianflone · Pascal Vincent · Simon Lacoste-Julien · Will Hamilton
2020 Spotlight: HiPPO: Recurrent Memory with Optimal Polynomial Projections »
Albert Gu · Tri Dao · Stefano Ermon · Atri Rudra · Christopher Ré 
2020 Oral: Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent »
Benjamin Recht · Christopher Ré · Stephen Wright · Feng Niu 
2020 Poster: From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering »
Ines Chami · Albert Gu · Vaggos Chatziafratis · Christopher Ré 
2020 Poster: Learning Dynamic Belief Graphs to Generalize on Text-Based Games »
Ashutosh Adhikari · Xingdi Yuan · Marc-Alexandre Côté · Mikuláš Zelinka · Marc-Antoine Rondeau · Romain Laroche · Pascal Poupart · Jian Tang · Adam Trischler · Will Hamilton
2019 : Poster Session #2 »
Yunzhu Li · Peter Meltzer · Jianing Sun · Guillaume SALHA · Marin Vlastelica Pogančić · Chia-Cheng Liu · Fabrizio Frasca · Marc-Alexandre Côté · Vikas Verma · Abdulkadir CELIKKANAT · Pierluca D'Oro · Priyesh Vijayan · Maria Schuld · Petar Veličković · Kshitij Tayal · Yulong Pei · Hao Xu · Lei Chen · Pengyu Cheng · Ines Chami · Dongkwan Kim · Guilherme Gomes · Lukasz Maziarka · Jessica Hoffmann · Ron Levie · Antonia Gogoglou · Shunwang Gong · Federico Monti · Wenlin Wang · Yan Leng · Salvatore Vivona · Daniel Flam-Shepherd · Chester Holtz · Li Zhang · MAHMOUD KHADEMI · I-Chung Hsieh · Aleksandar Stanić · Ziqiao Meng · Yuhang Jiao
2019 : Poster Session #1 »
Adarsh Jamadandi · Sophia Sanborn · Huaxiu Yao · Chen Cai · Yu Chen · Jean-Marc Andreoli · Niklas Stoehr · Shih-Yang Su · Tony Duan · Fábio Ferreira · Davide Belli · Amit Boyarski · Ze Ye · Elahe Ghalebi · Arindam Sarkar · MAHMOUD KHADEMI · Evgeniy Faerman · Joey Bose · Jiaqi Ma · Lin Meng · Seyed Mehran Kazemi · Guangtao Wang · Tong Wu · Yuexin Wu · Chaitanya Joshi · Marc Brockschmidt · Daniele Zambon · Colin Graber · Rafaël Van Belle · Osman Asif Malik · Xavier Glorot · Mario Krenn · Chris Cameron · Binxuan Huang · George Stoica · Alexia Toumpa
2019 : Opening remarks »
Will Hamilton 
2019 Workshop: KR2ML - Knowledge Representation and Reasoning Meets Machine Learning »
Veronika Thost · Christian Muise · Kartik Talamadupula · Sameer Singh · Christopher Ré 
2019 Workshop: Graph Representation Learning »
Will Hamilton · Rianne van den Berg · Michael Bronstein · Stefanie Jegelka · Thomas Kipf · Jure Leskovec · Renjie Liao · Yizhou Sun · Petar Veličković 
2019 Poster: Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models »
Tao Yu · Christopher De Sa 
2019 Poster: On the Downstream Performance of Compressed Word Embeddings »
Avner May · Jian Zhang · Tri Dao · Christopher Ré 
2019 Spotlight: Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models »
Tao Yu · Christopher De Sa 
2019 Spotlight: On the Downstream Performance of Compressed Word Embeddings »
Avner May · Jian Zhang · Tri Dao · Christopher Ré 
2019 Poster: Multi-Resolution Weak Supervision for Sequential Data »
Paroma Varma · Frederic Sala · Shiori Sagawa · Jason A Fries · Daniel Fu · Saelig Khattar · Ashwini Ramamoorthy · Ke Xiao · Kayvon Fatahalian · James Priest · Christopher Ré 
2019 Poster: Slice-based Learning: A Programming Model for Residual Learning in Critical Data Slices »
Vincent Chen · Sen Wu · Alexander Ratner · Jen Weng · Christopher Ré 
2019 Poster: Hyperbolic Graph Convolutional Neural Networks »
Ines Chami · Zhitao Ying · Christopher Ré · Jure Leskovec 
2019 Poster: Hyperbolic Graph Neural Networks »
Qi Liu · Maximilian Nickel · Douwe Kiela 
2019 Poster: Dimension-Free Bounds for Low-Precision Training »
Zheng Li · Christopher De Sa
2019 Poster: Poisson-Minibatching for Gibbs Sampling with Convergence Rate Guarantees »
Ruqi Zhang · Christopher De Sa
2019 Spotlight: Poisson-Minibatching for Gibbs Sampling with Convergence Rate Guarantees »
Ruqi Zhang · Christopher De Sa 
2019 Poster: Channel Gating Neural Networks »
Weizhe Hua · Yuan Zhou · Christopher De Sa · Zhiru Zhang · G. Edward Suh 
2019 Poster: Efficient Graph Generation with Graph Recurrent Attention Networks »
Renjie Liao · Yujia Li · Yang Song · Shenlong Wang · Will Hamilton · David Duvenaud · Raquel Urtasun · Richard Zemel 
2019 Poster: Continuous Hierarchical Representations with Poincaré Variational Auto-Encoders »
Emile Mathieu · Charline Le Lan · Chris Maddison · Ryota Tomioka · Yee Whye Teh 
2018 : Invited Talk 4 »
Maximilian Nickel 
2018 : Spotlights »
Guangneng Hu · Ke Li · Aviral Kumar · Phi Vu Tran · Samuel Fadel · Rita Kuznetsova · Bong-Nam Kang · Behrouz Haji Soleimani · Jinwon An · Nathan de Lara · Anjishnu Kumar · Tillman Weyde · Melanie Weber · Kristen Altenburger · Saeed Amizadeh · Xiaoran Xu · Yatin Nandwani · Yang Guo · Maria Pacheco · William Fedus · Guillaume Jaume · Yuka Yoneda · Yunpu Ma · Yunsheng Bai · Berk Kapicioglu · Maximilian Nickel · Fragkiskos Malliaros · Beier Zhu · Aleksandar Bojchevski · Joshua Joseph · Gemma Roig · Esma Balkir · Xander Steenbrugge
2018 Workshop: Relational Representation Learning »
Aditya Grover · Paroma Varma · Frederic Sala · Christopher Ré · Jennifer Neville · Stefano Ermon · Steven Holtzen 
2018 Poster: Learning Compressed Transforms with Low Displacement Rank »
Anna Thomas · Albert Gu · Tri Dao · Atri Rudra · Christopher Ré 
2017 Workshop: Learning with Limited Labeled Data: Weak Supervision and Beyond »
Isabelle Augenstein · Stephen Bach · Eugene Belilovsky · Matthew Blaschko · Christoph Lampert · Edouard Oyallon · Emmanouil Antonios Platanios · Alexander Ratner · Christopher Ré 
2017 : Learning Hierarchical Representations of Relational Data »
Maximilian Nickel 
2017 Workshop: ML Systems Workshop @ NIPS 2017 »
Aparna Lakshmiratan · Sarah Bird · Siddhartha Sen · Christopher Ré · Li Erran Li · Joseph Gonzalez · Daniel Crankshaw 
2017 Demonstration: Babble Labble: Learning from Natural Language Explanations »
Braden Hancock · Paroma Varma · Percy Liang · Christopher Ré · Stephanie Wang 
2017 Poster: Learning to Compose Domain-Specific Transformations for Data Augmentation »
Alexander Ratner · Henry Ehrenberg · Zeshan Hussain · Jared Dunnmon · Christopher Ré 
2017 Poster: Gaussian Quadrature for Kernel Features »
Tri Dao · Christopher M De Sa · Christopher Ré 
2017 Poster: Poincaré Embeddings for Learning Hierarchical Representations »
Maximilian Nickel · Douwe Kiela 
2017 Spotlight: Gaussian Quadrature for Kernel Features »
Tri Dao · Christopher M De Sa · Christopher Ré 
2017 Spotlight: Poincaré Embeddings for Learning Hierarchical Representations »
Maximilian Nickel · Douwe Kiela 
2017 Poster: Inferring Generative Model Structure with Static Analysis »
Paroma Varma · Bryan He · Payal Bajaj · Nishith Khandwala · Imon Banerjee · Daniel Rubin · Christopher Ré 
2016 : Invited Talk: You've been using asynchrony wrong your whole life! (Chris Re, Stanford) »
Christopher Ré 
2016 Workshop: Learning with Tensors: Why Now and How? »
Anima Anandkumar · Rong Ge · Yan Liu · Maximilian Nickel · Qi (Rose) Yu 
2016 Poster: Cyclades: Conflict-free Asynchronous Machine Learning »
Xinghao Pan · Maximilian Lam · Stephen Tu · Dimitris Papailiopoulos · Ce Zhang · Michael Jordan · Kannan Ramchandran · Christopher Ré · Benjamin Recht 
2016 Poster: Subsampled Newton Methods with Nonuniform Sampling »
Peng Xu · Jiyan Yang · Farbod Roosta-Khorasani · Christopher Ré · Michael Mahoney
2015 Symposium: Brains, Minds and Machines »
Gabriel Kreiman · Tomaso Poggio · Maximilian Nickel 
2015 Poster: Asynchronous stochastic convex optimization: the noise is in the noise and SGD don't care »
Sorathan Chaturapruek · John Duchi · Christopher Ré 
2015 Poster: Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width »
Christopher M De Sa · Ce Zhang · Kunle Olukotun · Christopher Ré 
2015 Spotlight: Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width »
Christopher M De Sa · Ce Zhang · Kunle Olukotun · Christopher Ré
2015 Poster: Taming the Wild: A Unified Analysis of Hogwild-Style Algorithms »
Christopher M De Sa · Ce Zhang · Kunle Olukotun · Christopher Ré
2014 Workshop: 4th Workshop on Automated Knowledge Base Construction (AKBC) »
Sameer Singh · Fabian M Suchanek · Sebastian Riedel · Partha Pratim Talukdar · Kevin Murphy · Christopher Ré · William Cohen · Tom Mitchell · Andrew McCallum · Jason E Weston · Ramanathan Guha · Boyan Onyshkevych · Hoifung Poon · Oren Etzioni · Ari Kobren · Arvind Neelakantan · Peter Clark 
2014 Poster: Reducing the Rank in Relational Factorization Models by Including Observable Patterns »
Maximilian Nickel · Xueyan Jiang · Volker Tresp 
2014 Spotlight: Reducing the Rank in Relational Factorization Models by Including Observable Patterns »
Maximilian Nickel · Xueyan Jiang · Volker Tresp 
2014 Poster: Parallel Feature Selection Inspired by Group Testing »
Yingbo Zhou · Utkarsh Porwal · Ce Zhang · Hung Q Ngo · XuanLong Nguyen · Christopher Ré · Venu Govindaraju 
2013 Workshop: Big Learning : Advances in Algorithms and Data Management »
Xinghao Pan · Haijie Gu · Joseph Gonzalez · Sameer Singh · Yucheng Low · Joseph Hellerstein · Derek G Murray · Raghu Ramakrishnan · Michael Jordan · Christopher Ré