Recent years have seen a surge in research at the intersection of differential geometry and deep learning, including techniques for stochastic optimization on curved spaces (e.g., hyperbolic or spherical manifolds), learning embeddings for non-Euclidean data, and generative modeling on Riemannian manifolds. Insights from differential geometry have led to new state-of-the-art approaches to modeling complex real-world data, such as graphs with hierarchical structure, 3D medical data, and meshes.
Thus, it is of critical importance to understand, through a geometric lens, the natural invariances, equivariances, and symmetries that reside within data.
To support the burgeoning interest in differential geometry for deep learning, the primary goal of this workshop is to facilitate community building and to work towards identifying the key challenges that arise relative to regular deep learning, along with techniques to overcome them. With many new researchers beginning projects in this area, we hope to bring them together to consolidate this fast-growing area into a healthy and vibrant subfield. In particular, we aim to strongly promote novel and exciting applications of differential geometry for deep learning, with an emphasis on bridging theory and practice. This emphasis is reflected in our choice of invited speakers, who include both machine learning practitioners and researchers who are primarily geometers.
Fri 5:00 a.m. - 3:00 p.m.

gather.town
(Social)
For poster sessions on Gather.Town: https://gather.town/app/jXxVw7lqYrgIZ2zL/diffgeo4dl
Fri 5:45 a.m. - 6:00 a.m.

Opening Remarks
(Presentation)

Joey Bose
Fri 6:00 a.m. - 6:30 a.m.

Invited Talk 1: Geometric deep learning for 3D human body synthesis
(Talk by Michael Bronstein)
Geometric deep learning, a new class of ML methods trying to extend the basic building blocks of deep neural architectures to geometric data (point clouds, graphs, and meshes), has recently excelled in many challenging analysis tasks in computer vision and graphics such as deformable 3D shape correspondence. In this talk, I will present recent research efforts in 3D shape synthesis, focusing in particular on the human body, face, and hands.
Michael Bronstein
Fri 6:30 a.m. - 7:00 a.m.

Invited Talk 2: Gauge Theory in Geometric Deep Learning
(Talk by Taco Cohen)
It is often said that differential geometry is in essence the study of connections on a principal bundle. These notions have been discovered independently in gauge theory in physics, and over the last few years it has become clear that they also provide a very general and systematic way to model convolutional neural networks on homogeneous spaces and general manifolds. Specifically, representation spaces in these networks are described as fields of geometric quantities on a manifold (i.e. sections of associated vector bundles). These quantities can only be expressed numerically after making an arbitrary choice of frame / gauge (section of a principal bundle). Network layers map between representation spaces, and should be equivariant to symmetry transformations. In this talk I will discuss two results that have a bearing on geometric deep learning research. First, we discuss the “convolution is all you need” theorem, which states that any linear equivariant map between homogeneous representation spaces is a generalized convolution. Second, in the case of gauge symmetry (when all frames should be considered equivalent), we show that defining a non-trivial equivariant linear map between representation spaces requires the introduction of a principal connection, which defines parallel transport. We will not assume familiarity with bundles or gauge theory, and will use examples relevant to neural networks to illustrate the ideas.
Taco Cohen
Fri 7:00 a.m. - 7:05 a.m.

Contributed Talk 1: Learning Hyperbolic Representations for Unsupervised 3D Segmentation
(Contributed Talk)
There exists a need for unsupervised 3D segmentation on complex volumetric data, particularly when annotation ability is limited or discovery of new categories is desired. Using the observation that 3D data is innately hierarchical, we propose learning effective representations of 3D patches for unsupervised segmentation through a variational autoencoder with a hyperbolic latent space and a proposed gyroplane convolutional layer, which better models underlying hierarchical structure within a 3D image. We also introduce a hierarchical triplet loss and multi-scale patch sampling scheme to embed relationships across varying levels of granularity. We demonstrate the effectiveness of our hyperbolic representations for unsupervised 3D segmentation on a hierarchical toy dataset and the BraTS dataset.
Joy Hsu · Jeffrey Gu · Serena Yeung
Fri 7:06 a.m. - 7:11 a.m.

Contributed Talk 2: Witness Autoencoder: Shaping the Latent Space with Witness Complexes
(Contributed Talk 2)
We present a Witness Autoencoder (WAE), an autoencoder that captures geodesic distances of the data in the latent space. Our algorithm uses witness complexes to compute geodesic distance approximations on a mini-batch level, and leverages topological information from the entire dataset while performing batch-wise approximations. This way, our method can capture the global structure of the data even with a small batch size, which is beneficial for large-scale real-world data. We show that our method captures the structure of the manifold more accurately than the recently introduced topological autoencoder (TopoAE).
Anastasiia Varava · Danica Kragic · Simon Schönenberger · Jen Jen Chung · Roland Siegwart · Vladislav Polianskii
Fri 7:12 a.m. - 7:17 a.m.

Contributed Talk 3: A Riemannian gradient flow perspective on learning deep linear neural networks
(Contributed Talk 3)
We study the convergence of gradient flows related to learning deep linear neural networks from data. In this case, the composition of the network layers amounts to simply multiplying the weight matrices of all layers together, resulting in an overparameterized problem. The gradient flow with respect to these factors can be reinterpreted as a Riemannian gradient flow on the manifold of rank-$r$ matrices endowed with a suitable Riemannian metric. We show that the flow always converges to a critical point of the underlying functional. Moreover, we establish that, for almost all initializations, the flow converges to a global minimum on the manifold of rank-$k$ matrices for some $k\leq r$.
Ulrich Terstiege · Holger Rauhut · Bubacarr Bah · Michael Westdickenberg
Fri 7:18 a.m. - 7:23 a.m.

Contributed Talk 4: Directional Graph Networks
(Contributed Talk 4)
In order to overcome the expressive limitations of graph neural networks (GNNs), we propose the first method that exploits vector flows over graphs to develop globally consistent directional and asymmetric aggregation functions. We show that our directional graph networks (DGNs) generalize convolutional neural networks (CNNs) when applied on a grid. Whereas recent theoretical works focus on understanding local neighbourhoods, local structures and local isomorphism with no global information flow, our novel theoretical framework allows directional convolutional kernels in any graph. First, by defining a vector field in the graph, we develop a method of applying directional derivatives and smoothing by projecting node-specific messages into the field. Then we propose the use of the Laplacian eigenvectors as such a vector field. Finally, we bring the power of CNN data augmentation to graphs by providing a means of doing reflection and rotation on the underlying directional field.
Dominique Beaini · Saro Passaro · Vincent Létourneau · Will Hamilton · Gabriele Corso · Pietro Liò
Fri 7:24 a.m. - 7:29 a.m.

Contributed Talk 5: A New Neural Network Architecture Invariant to the Action of Symmetry Subgroups
(Contributed Talk 5)
We propose a computationally efficient $G$-invariant neural network that approximates functions invariant to the action of a given permutation subgroup $G \leq S_n$ of the symmetric group on input data. The key element of the proposed network architecture is a new $G$-invariant transformation module, which produces a $G$-invariant latent representation of the input data. Theoretical considerations are supported by numerical experiments, which demonstrate the effectiveness and strong generalization properties of the proposed method in comparison to other $G$-invariant neural networks.
Mete Ozay · Piotr Kicki · Piotr Skrzypczynski
Fri 7:30 a.m. - 8:00 a.m.

Virtual Coffee Break on Gather.Town
(Break)
Fri 8:00 a.m. - 8:30 a.m.

Invited Talk 3: Reparametrization invariance in representation learning
(Talk by Søren Hauberg)
Generative models learn a compressed representation of data that is often used for downstream tasks such as interpretation, visualization and prediction via transfer learning. Unfortunately, the learned representations are generally not statistically identifiable, leading to a high risk of arbitrariness in the downstream tasks. We propose to use differential geometry to construct representations that are invariant to reparametrizations, thereby solving the bulk of the identifiability problem. We demonstrate that the approach is deeply tied to the uncertainty of the representation, and that practical applications require high-quality uncertainty quantification. With the identifiability problem solved, we show how to construct better priors for generative models, and that the identifiable representations reveal signals in the data that were otherwise hidden.
Søren Hauberg
Fri 8:30 a.m. - 9:30 a.m.

Poster Session 1 on Gather.Town
(Poster Session)
Joey Bose · Ines Chami
Fri 8:30 a.m. - 9:30 a.m.

Quaternion Graph Neural Networks
(Poster)
Recently, graph neural networks (GNNs) have become a principal research direction for learning low-dimensional continuous embeddings of nodes and graphs to predict node and graph labels, respectively. However, Euclidean embeddings have high distortion when using GNNs to model complex graphs such as social networks. Furthermore, existing GNNs are not very efficient, as the number of model parameters grows quickly when increasing the number of hidden layers. Therefore, we move beyond the Euclidean space to a hyper-complex vector space to improve graph representation quality and reduce the number of model parameters. To this end, we propose quaternion graph neural networks (QGNN) to generalize GCNs within the Quaternion space to learn quaternion embeddings for nodes and graphs. The Quaternion space, a hyper-complex vector space, provides highly meaningful computations through the Hamilton product compared to the Euclidean and complex vector spaces. As a result, our QGNN can reduce the model size by up to four times and learn better graph representations. Experimental results show that the proposed QGNN produces state-of-the-art accuracies on a range of well-known benchmark datasets for three downstream tasks, including graph classification, semi-supervised node classification, and text (node) classification.
Dai Quoc Nguyen · Tu Dinh Nguyen · Dinh Phung
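The Hamilton product mentioned in the abstract above can be illustrated with a minimal Python sketch. The tuple layout `(a, b, c, d)` for `a + bi + cj + dk` and the function name `hamilton_product` are illustrative assumptions, not the authors' implementation.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions.

    Each quaternion is a tuple (a, b, c, d) meaning a + b*i + c*j + d*k.
    This layout is an assumption made for illustration.
    """
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


# Basis quaternions.
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

# The product is not commutative: i*j = k but j*i = -k.
ij = hamilton_product(I, J)
ji = hamilton_product(J, I)
```

One way to read the parameter saving: a single quaternion weight has four real parameters, yet multiplying by it mixes all four components of the input, which a general real linear map on four components would need sixteen parameters to do.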
Fri 8:30 a.m. - 9:30 a.m.

Universal Approximation Property of Neural Ordinary Differential Equations
(Poster)
Neural ordinary differential equations (NODEs) are an invertible neural network architecture promising for its free-form Jacobian and the availability of a tractable Jacobian determinant estimator. Recently, the representation power of NODEs has been partly uncovered: they form an $L^p$-universal approximator for continuous maps under certain conditions. However, the $L^p$-universality may fail to guarantee an approximation for the entire input domain as it may still hold even if the approximator largely differs from the target function on a small region of the input space. To further uncover the potential of NODEs, we show their stronger approximation property, namely the $\sup$-universality for approximating a large class of diffeomorphisms. It is shown by leveraging a structure theorem of the diffeomorphism group, and the result complements the existing literature by establishing a fairly large set of mappings that NODEs can approximate with a stronger guarantee.
Takeshi Teshima · Koichi Tojo · Masahiro Ikeda · Isao Ishikawa · Kenta Oono
Fri 8:30 a.m. - 9:30 a.m.

Hermitian Symmetric Spaces for Graph Embeddings
(Poster)
Learning faithful graph representations as sets of vertex embeddings has become a fundamental intermediary step in a wide range of machine learning applications. The quality of the embeddings is usually determined by how well the geometry of the target space matches the structure of the data. In this work we learn continuous representations of graphs in spaces of symmetric matrices over C. These spaces offer a rich geometry that simultaneously admits hyperbolic and Euclidean subspaces, and are amenable to analysis and explicit computations. We implement an efficient method to learn embeddings and compute distances, and develop the tools to operate with such spaces. The proposed models are able to automatically adapt to very dissimilar arrangements without any a priori estimates of graph features. On various datasets with very diverse structural properties and reconstruction measures, our model matches the results of competitive baselines for geometrically pure graphs and outperforms them for graphs with mixed geometric features, showcasing the versatility of our approach.
Federico López · Beatrice Pozzetti · Steve Trettel · Anna Wienhard
Fri 8:30 a.m. - 9:30 a.m.

Isometric Gaussian Process Latent Variable Model
(Poster)
We propose a fully generative unsupervised model where the latent variable respects both the distances and the topology of the modeled data. The model leverages the Riemannian geometry of the generated manifold to endow the latent space with a well-defined stochastic distance measure, which is modeled as Nakagami distributions. These stochastic distances are sought to be as similar as possible to observed distances along a neighborhood graph through a censoring process. The model is inferred by variational inference. We demonstrate how the new model can encode invariances in the learned manifolds.
Martin Jørgensen · Søren Hauberg
Fri 8:30 a.m. - 9:30 a.m.

Grassmann Iterative Linear Discriminant Analysis with Proxy Matrix Optimization
(Poster)
Linear Discriminant Analysis (LDA) is one of the most common methods for dimensionality reduction in pattern recognition and statistics. It is a supervised method that aims to find the most discriminant space in the reduced dimensional space, which can be further used with a linear classifier for classification. In this work, we present an iterative optimization method called Proxy Matrix Optimization (PMO), which makes use of automatic differentiation and stochastic gradient descent (SGD) on the Grassmann manifold to arrive at the optimal projection matrix. We show that PMO does better than the prevailing manifold optimization methods.
Navya Nagananda · Breton Minnehan · Andreas Savakis
Fri 8:30 a.m. - 9:30 a.m.

Tree Covers: An Alternative to Metric Embeddings
(Poster)
We study the problem of finding distance-preserving graph representations. Most previous approaches focus on learning continuous embeddings in metric spaces such as Euclidean or hyperbolic spaces. Based on the observation that embedding into a metric space is not necessary to produce faithful representations, we explore a new conceptual approach to represent graphs using a collection of trees, namely a tree cover. We show that with the same amount of storage, covers achieve lower distortion than learned metric embeddings. While the distance induced by covers is not a metric, we find that tree covers still have the desirable properties of graph representations, including efficiency in query and construction time.
Roshni Sahoo · Ines Chami · Christopher Ré
Fri 8:30 a.m. - 9:30 a.m.

Deep Networks and the Multiple Manifold Problem
(Poster)
We study the multiple manifold problem, a binary classification task modeled on applications in machine vision, in which a deep fully-connected neural network is trained to separate two low-dimensional submanifolds of the unit sphere. We provide an analysis of the one-dimensional case, proving for a simple manifold configuration that when the network depth $L$ is large relative to certain geometric and statistical properties of the data, the network width $n$ grows as a sufficiently large polynomial in $L$, and the number of i.i.d. samples from the manifolds is polynomial in $L$, randomly-initialized gradient descent rapidly learns to classify the two manifolds perfectly with high probability. Our analysis demonstrates concrete benefits of depth and width in the context of a practically-motivated model problem: the depth acts as a fitting resource, with larger depths corresponding to smoother networks that can more readily separate the class manifolds, and the width acts as a statistical resource, enabling concentration of the randomly-initialized network and its gradients. Along the way, we establish essentially optimal nonasymptotic rates of concentration for the neural tangent kernel of deep fully-connected ReLU networks using martingale techniques, requiring width $n \geq L\,\mathrm{poly}(d_0)$ to achieve uniform concentration of the initial kernel over a $d_0$-dimensional submanifold of the unit sphere. Our approach should be of use in establishing similar results for other network architectures.
Samuel Buchanan · Dar Gilboa · John Wright
Fri 8:30 a.m. - 9:30 a.m.

GENNI: Visualising the Geometry of Equivalences for Neural Network Identifiability
(Poster)
In this paper, we propose an efficient algorithm to visualise symmetries in neural networks. Typically the models are defined with respect to a parameter space, where non-equal parameters can produce the same function. Our proposed tool, GENNI, allows us to identify parameters that are functionally equivalent and to then visualise the subspace of the resulting equivalence class. Specifically, we experiment on simple cases, to demonstrate how to identify and provide possible solutions for more complicated scenarios.
Arinbjörn Kolbeinsson · Nicholas Jennings · Marc Deisenroth · Daniel Lengyel · Janith Petangoda · Michalis Lazarou · Kate Highnam · John IF Falk
Fri 8:30 a.m. - 9:30 a.m.

Graph of Thrones: Adversarial Perturbations dismantle Aristocracy in Graphs
(Poster)
This paper investigates the effect of adversarial perturbations on the hyperbolicity of graphs. Learning low-dimensional embeddings of graph data in certain curved Riemannian manifolds has recently gained traction due to their desirable property of acting as useful geometrical inductive biases. More specifically, models of hyperbolic geometry such as the Poincaré ball and the hyperboloid model have found extensive applications for learning representations of discrete data such as graphs and trees with hierarchical anatomy. The hyperbolicity concept indicates whether the graph data under consideration is suitable for embedding in hyperbolic geometry. Lower values of hyperbolicity imply distortion-free embedding in hyperbolic space. We study adversarial perturbations that attempt to poison the graph structure, consequently rendering hyperbolic geometry an ineffective choice for learning representations. To circumvent this problem, we advocate for utilizing Lorentzian manifolds in machine learning pipelines and empirically show they are better suited to learn hierarchical relationships. Despite the recent proliferation of adversarial robustness methods for graph data, this is the first work that explores the relationship between adversarial attacks and the hyperbolicity property while also providing a resolution to navigate such vulnerabilities.
Adarsh Jamadandi · Uma Mudenagudi
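The hyperbolicity measure referred to in the abstract above is commonly taken to be Gromov's delta, computed with the four-point condition. The following is a minimal brute-force sketch in Python under that assumption; it is not the authors' code, the adjacency-dict format and function names are hypothetical, and the graph is assumed to be connected and unweighted.

```python
from collections import deque
from itertools import combinations_with_replacement


def bfs_distances(adj, source):
    """Shortest-path (hop) distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def gromov_delta(adj):
    """Four-point Gromov hyperbolicity of a connected graph.

    For every quadruple (x, y, z, w), form the three pairwise distance sums;
    delta is the largest value of (largest sum - second largest sum) / 2.
    Brute force over all quadruples, so only suitable for small graphs.
    """
    d = {u: bfs_distances(adj, u) for u in adj}
    delta = 0.0
    for x, y, z, w in combinations_with_replacement(list(adj), 4):
        sums = sorted([
            d[x][y] + d[z][w],
            d[x][z] + d[y][w],
            d[x][w] + d[y][z],
        ])
        delta = max(delta, (sums[2] - sums[1]) / 2)
    return delta


# A path is a tree, so its delta is 0; a 4-cycle is the smallest non-tree-like case.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
delta_path = gromov_delta(path)
delta_cycle = gromov_delta(cycle)
```

Adding an edge that closes a long cycle in a tree-like graph raises delta, which is one way to picture how a structural perturbation could make a graph less suited to hyperbolic embedding.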
Fri 8:30 a.m. - 9:30 a.m.

A Metric for Linear Symmetry-Based Disentanglement
(Poster)
The definition of Linear Symmetry-Based Disentanglement (LSBD) proposed by Higgins et al. outlines the properties that should characterize a disentangled representation that captures the symmetries of data. However, it is not clear how to measure the degree to which a data representation fulfills these properties. In this work, we propose a metric for the evaluation of the level of LSBD that a data representation achieves. We provide a practical method to evaluate this metric and use it to evaluate the disentanglement for the data representation obtained for three datasets with underlying SO(2) symmetries.
Luis Armando Pérez Rey · Loek Tonnaer · Vlado Menkovski · Mike Holenderski · Jim Portegies
Fri 9:30 a.m. - 10:15 a.m.

Panel Discussion
(Panel)

Joey Bose · Emile Mathieu · Charline Le Lan · Ines Chami
Fri 10:15 a.m. - 10:45 a.m.

Virtual Coffee Break on Gather.Town
(Break)
Fri 10:45 a.m. - 11:30 a.m.

Focused Breakout Session
(Demonstration)
Ines Chami · Joey Bose
Fri 10:45 a.m.

Focused Breakout Session Companion Notebook: Poincaré Embeddings
(Demonstration)
Link to Google Colab notebook on Poincaré Embeddings.
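For readers without access to the notebook, the distance that Poincaré embeddings optimize can be written in a few lines. This is a minimal sketch of the standard Poincaré-ball distance for curvature -1, not the notebook's contents; the function name and the plain-list point format are assumptions.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points of the open unit ball (curvature -1).

    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    Both points must have Euclidean norm strictly below 1.
    """
    def sq_norm(x):
        return sum(xi * xi for xi in x)

    diff = [a - b for a, b in zip(u, v)]
    arg = 1.0 + 2.0 * sq_norm(diff) / ((1.0 - sq_norm(u)) * (1.0 - sq_norm(v)))
    # Guard against rounding that could push the argument just below 1.
    return math.acosh(max(1.0, arg))


origin = [0.0, 0.0]
near = [0.5, 0.0]
far = [0.9, 0.0]

# Distances grow quickly towards the boundary: the Euclidean step from 0.5 to 0.9
# is shorter than the step from 0.0 to 0.5, but the hyperbolic step is longer.
d_inner = poincare_distance(origin, near)
d_outer = poincare_distance(near, far)
```

This rapid growth of distance near the boundary is what leaves room for the exponentially many leaves of a tree-like hierarchy.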
Fri 10:45 a.m.

Focused Breakout Session Companion Notebook: Wrapped Normal Distribution
(Demonstration)
Link to Google Colab notebook on plotting a Wrapped Normal Distribution.
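As a rough idea of what sampling a wrapped normal involves, the sketch below draws Gaussian vectors in the tangent space at the origin of the Poincaré ball and pushes them onto the ball with the exponential map. It is an illustration only, not the notebook's code: the names are hypothetical, the mean is fixed at the origin (a general mean would also need parallel transport), and the scaling convention for tangent vectors (Euclidean norm, curvature -1) is an assumption.

```python
import math
import random


def exp_map_origin(v):
    """Exponential map at the origin of the Poincaré ball (curvature -1).

    Maps a tangent vector v to the point tanh(|v|) * v / |v| inside the unit ball,
    where |v| is the Euclidean norm (an assumed convention).
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    scale = math.tanh(norm) / norm
    return [scale * x for x in v]


def sample_wrapped_normal(n, sigma=0.5, dim=2, seed=0):
    """Draw n samples: Gaussian in the tangent space, then wrapped onto the ball."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        tangent = [rng.gauss(0.0, sigma) for _ in range(dim)]
        samples.append(exp_map_origin(tangent))
    return samples


points = sample_wrapped_normal(1000)
```

The resulting `points` can be passed to any scatter-plot routine; every sample lies strictly inside the unit disk.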
Fri 11:30 a.m. - 12:00 p.m.

Invited Talk 4: An introduction to the Calderón and Steklov inverse problems on Riemannian manifolds with boundary
(Talk by Niky Kamran)
Given a compact Riemannian manifold with boundary, the Dirichlet-to-Neumann operator is a non-local map which assigns to data prescribed on the boundary of the manifold the normal derivative of the unique solution of the Laplace-Beltrami equation determined by the given boundary data. Physically, it can be thought of, for example, as a voltage-to-current map in an anisotropic medium in which the conductivity is modeled geometrically through a Riemannian metric. The Calderón problem is the inverse problem of recovering the Riemannian metric from the Dirichlet-to-Neumann operator, while the Steklov inverse problem is to recover the metric from the knowledge of the spectrum of the Dirichlet-to-Neumann operator. These inverse problems are both severely ill-posed. We will give an overview of some of the main results known about these questions, and time permitting, we will discuss the question of stability for the inverse Steklov problem.
Niky Kamran
Fri 12:00 p.m. - 1:00 p.m.

Poster Session 2 on Gather.Town
(Poster Session)
Charline Le Lan · Emile Mathieu
Fri 12:00 p.m. - 1:00 p.m.

The Intrinsic Dimension of Images and Its Impact on Learning
(Poster)
It is widely believed that natural image data exhibits low-dimensional structure despite being embedded in a high-dimensional pixel space. This idea underlies a common intuition for the success of deep learning and has been exploited for enhanced regularization and adversarial robustness. In this work, we apply dimension estimation tools to popular datasets and investigate the role of low-dimensional structure in neural network learning. We find that common natural image datasets indeed have very low intrinsic dimension relative to the high number of pixels in the images. Additionally, we find that low-dimensional datasets are easier for neural networks to learn. We validate our findings with carefully designed experiments that vary the intrinsic dimension of both synthetic and real data and evaluate its impact on sample complexity.
Chen Zhu · Micah Goldblum · Ahmed Abdelkader · Tom Goldstein · Phillip Pope
Fri 12:00 p.m. - 1:00 p.m.

Sparsifying networks by traversing Geodesics
(Poster)
The geometry of the weight spaces and functional manifolds of neural networks plays an important role in 'understanding' the intricacies of ML. In this paper, we attempt to solve certain open questions in ML by viewing them through the lens of geometry, ultimately relating them to the discovery of points or paths of equivalent function in these spaces. We propose a mathematical framework to evaluate geodesics in the functional space, to find high-performance paths from a dense network to its sparser counterpart. Our results are obtained on VGG-11 trained on CIFAR-10 and MLPs trained on MNIST. Broadly, we demonstrate that the framework is general, and can be applied to a wide variety of problems, ranging from sparsification to alleviating catastrophic forgetting.
Guruprasad Raghavan · Matt Thomson
Fri 12:00 p.m. - 1:00 p.m.

Convex Optimization for Blind Source Separation on a Statistical Manifold
(Poster)
We present a novel blind source separation (BSS) method using a hierarchical structure of sample space that is incorporated with a log-linear model. Our approach is formulated as a convex optimization with theoretical guarantees to uniquely recover a set of source signals by minimizing the KL divergence from a set of mixed signals. Source signals, received signals, and mixing matrices are realized as different layers in our hierarchical sample space. Our empirical results have demonstrated superiority compared to well-established techniques.
Simon Luo · Lamiae Azizi · Mahito Sugiyama
Fri 12:00 p.m. - 1:00 p.m.

Unsupervised Orientation Learning Using Autoencoders
(Poster)
We present a method to learn the orientation of symmetric objects in real-world images in an unsupervised way. Our method explicitly maps in-plane relative rotations to the latent space of an autoencoder, by rotating both in the image domain and latent domain. This is achieved by adding a proposed crossing loss to a standard autoencoder training framework which enforces consistency between the image domain and latent domain rotations. This relative representation of rotation is made absolute, by using the symmetry of the observed object, resulting in an unsupervised method to learn the orientation. Furthermore, orientation is disentangled in latent space from other descriptive factors. We apply this method on two real-world datasets: aerial images of planes in the DOTA dataset and images of densely packed honeybees. We empirically show this method can learn orientation using no annotations with high accuracy compared to the same models trained with annotations.
Rembert Daems · Francis Wyffels
Fri 12:00 p.m. - 1:00 p.m.

Towards Geometric Understanding of Low-Rank Approximation
(Poster)
Rank reduction of matrices has been widely studied in linear algebra. However, its geometric understanding is limited and its theoretical connection to statistical models remains unrevealed. We tackle this problem using information geometry and present a unified geometric view of matrix rank reduction. Our key idea is to treat each matrix as a probability distribution represented by the log-linear model on a partially ordered set (poset), which enables us to formulate rank reduction as projection onto a statistical submanifold, which corresponds to the set of low-rank matrices. This geometric view enables us to derive a novel efficient rank-1 reduction method, called Legendre rank-1 reduction, which analytically solves mean-field approximation and minimizes the KL divergence from a given matrix.
Mahito Sugiyama · Kazu Ghalamkari
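A small worked example may help with the rank-1 case described above. For a matrix with positive entries, the rank-1 matrix that minimizes the KL divergence from the given matrix is the outer product of its row sums and column sums divided by the total sum, which is the classical mean-field (independence) approximation. The Python sketch below implements that closed form; whether it coincides in every detail with the authors' Legendre rank-1 reduction is an assumption, and the function names are hypothetical.

```python
import math


def rank1_kl_reduction(P):
    """Closed-form rank-1 approximation of a positive matrix P (list of rows).

    Returns Q with Q[i][j] = rowsum_i * colsum_j / total, the minimizer of the
    (generalized) KL divergence from P over rank-1 positive matrices.
    """
    row_sums = [sum(row) for row in P]
    col_sums = [sum(col) for col in zip(*P)]
    total = sum(row_sums)
    return [[r * c / total for c in col_sums] for r in row_sums]


def generalized_kl(P, Q):
    """Generalized KL divergence sum(p * log(p / q) - p + q) for positive matrices."""
    return sum(
        p * math.log(p / q) - p + q
        for row_p, row_q in zip(P, Q)
        for p, q in zip(row_p, row_q)
    )


P = [[1.0, 2.0], [3.0, 4.0]]
Q = rank1_kl_reduction(P)

# Q keeps the row and column sums of P, and any other rank-1 candidate,
# such as the all-ones matrix scaled to the same total, is further from P.
candidate = [[2.5, 2.5], [2.5, 2.5]]
kl_best = generalized_kl(P, Q)
kl_other = generalized_kl(P, candidate)
```

No iterative optimization is needed in this case, which matches the abstract's description of an analytic solution.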
Fri 12:00 p.m. - 1:00 p.m.

Deep Riemannian Manifold Learning
(Poster)
We present a new class of learnable Riemannian manifolds with a metric parameterized by a deep neural network. The core manifold operations, specifically the Riemannian exponential and logarithmic maps, are solved using approximate numerical techniques. Input and parameter gradients are computed with an adjoint sensitivity analysis. This enables us to fit geodesics and distances with gradient-based optimization of both on-manifold values and the manifold itself. We demonstrate our method's capability to model smooth, flexible metric structures in graph and dynamical system embedding tasks.
Aaron Lou · Maximilian Nickel · Brandon Amos
Fri 12:00 p.m. - 1:00 p.m.

Leveraging Smooth Manifolds for Lexical Semantic Change Detection across Corpora
(Poster)
Comparing two bodies of text and detecting words with significant lexical semantic shift between them is an important part of digital humanities. Traditional approaches have relied on aligning the different embeddings in the Euclidean space using the Orthogonal Procrustes problem. This study presents a geometric framework that leverages optimization on smooth Riemannian manifolds for obtaining corpus-specific orthogonal rotations and a corpus-independent scaling to project the different vector spaces into a shared latent space. This enables us to capture any affine relationship between the embedding spaces while utilising the rich geometry of smooth manifolds.
Anmol Goel · Ponnurangam Kumaraguru
Fri 12:00 p.m. - 1:00 p.m.

Extendable and invertible manifold learning with geometry regularized autoencoders
(Poster)
A fundamental task in data exploration is to extract simplified low-dimensional representations that capture intrinsic geometry in data, especially for faithfully visualizing data in two or three dimensions. Common approaches to this task use kernel methods for manifold learning. However, these methods typically only provide an embedding of fixed input data and cannot extend to new data points. Autoencoders have also recently become popular for representation learning. But while they naturally compute feature extractors that are both extendable to new data and invertible (i.e., reconstructing original features from latent representation), they have limited capabilities to follow global intrinsic geometry compared to kernel-based manifold learning. We present a new method for integrating both approaches by incorporating a geometric regularization term in the bottleneck of the autoencoder. Our regularization, based on the diffusion potential distances from the recently proposed PHATE visualization method, encourages the learned latent representation to follow intrinsic data geometry, similar to manifold learning algorithms, while still enabling faithful extension to new data and reconstruction of data in the original feature space from latent coordinates. We compare our approach with leading kernel methods and autoencoder models for manifold learning to provide qualitative and quantitative evidence of our advantages in preserving intrinsic structure, out-of-sample extension, and reconstruction.
Andres F Duque · Sacha Morin · Guy Wolf · Kevin Moon
Fri 12:00 p.m. - 1:00 p.m.

QuatRE: Relation-Aware Quaternions for Knowledge Graph Embeddings
(Poster)
We propose an effective embedding model, named QuatRE, to learn quaternion embeddings for entities and relations in knowledge graphs. QuatRE aims to enhance correlations between head and tail entities given a relation within the Quaternion space with the Hamilton product. QuatRE achieves this goal by further associating each relation with two relation-aware quaternion vectors which are used to rotate the head and tail entities' quaternion embeddings, respectively. To obtain the triple score, QuatRE rotates the rotated embedding of the head entity using the normalized quaternion embedding of the relation, followed by a quaternion-inner product with the rotated embedding of the tail entity. Experimental results demonstrate that our QuatRE produces state-of-the-art performances on four well-known benchmark datasets for knowledge graph completion.
Dai Quoc Nguyen · Dinh Phung
Fri 12:00 p.m. - 1:00 p.m.

Affinity guided Geometric Semi-Supervised Metric Learning
(Poster)
In this paper, we revamp the forgotten classical Semi-Supervised Distance Metric Learning (SSDML) problem through a Riemannian geometric lens, to leverage stochastic optimization within an end-to-end deep framework. The motivation comes from the fact that, apart from a few classical SSDML approaches learning a linear Mahalanobis metric, deep SSDML has not been studied. We first extend existing SSDML methods to their deep counterparts and then propose a new method to overcome their limitations. Due to the nature of constraints on our metric parameters, we leverage Riemannian optimization. Our deep SSDML method with a novel affinity-propagation-based triplet mining strategy outperforms its competitors.
Ujjal Dutta · Mehrtash Harandi · C Chandra Shekhar
Fri 1:00 p.m. - 1:30 p.m.

Invited Talk 5: Disentangling Orientation and Camera Parameters from Cryo-Electron Microscopy Images Using Differential Geometry and Variational Autoencoders
(Talk by Nina Miolane)
Cryo-electron microscopy (cryo-EM) is capable of producing reconstructed 3D images of biomolecules at near-atomic resolution. However, raw cryo-EM images are highly corrupted 2D projections of the target 3D biomolecules. Reconstructing the 3D molecular shape requires the estimation of the orientation of the biomolecule that has produced the given 2D image, and the estimation of camera parameters to correct for intensity defects. Current techniques performing these tasks are often computationally expensive, while the dataset sizes keep growing. There is a need for next-generation algorithms that preserve accuracy while improving speed and scalability. In this paper, we combine variational autoencoders (VAEs) to learn a low-dimensional latent representation of cryo-EM images. Analyzing the latent space with differential geometry of shape spaces leads us to design a new estimation method for orientation and camera parameters of single-particle cryo-EM images, that has the potential to accelerate the traditional reconstruction algorithm.
Nina Miolane
Fri 1:30 p.m. - 2:00 p.m.

Invited Talk 6: Learning a robust classifier in hyperbolic space
(Talk by Melanie Weber)
Recently, there has been a surge of interest in representing large-scale, hierarchical data in hyperbolic spaces to achieve better representation accuracy with lower dimensions. However, beyond representation learning, there are few empirical and theoretical results that develop performance guarantees for downstream machine learning and optimization tasks in hyperbolic spaces. In this talk we consider the task of learning a robust classifier in hyperbolic space. We start with algorithmic aspects of developing analogues of classical methods, such as the perceptron or support vector machines, in hyperbolic spaces. We also discuss more broadly the challenges of generalizing such methods to non-Euclidean spaces. Furthermore, we analyze the role of geometry in learning robust classifiers by evaluating the trade-off between low embedding dimensions and low distortion for both Euclidean and hyperbolic spaces.
Melanie Weber