Timezone: »

Workshop
New Frontiers in Graph Learning
Jiaxuan You · Marinka Zitnik · Rex Ying · Yizhou Sun · Hanjun Dai · Stefanie Jegelka

Fri Dec 02 06:40 AM -- 03:00 PM (PST) @ Theater A

Background. In recent years, graph learning has quickly grown into an established sub-field of machine learning. Researchers have been focusing on developing novel model architectures, theoretical understandings, scalable algorithms and systems, and successful applications across industry and science regarding graph learning. In fact, more than 5000 research papers related to graph learning have been published over the past year alone.

Challenges. Despite the success, existing graph learning paradigms have not captured the full spectrum of relationships in the physical and the virtual worlds. For example, in terms of applicability of graph learning algorithms, current graph learning paradigms are often restricted to datasets with explicit graph representations, whereas recent works have shown promise of graph learning methods for applications without explicit graph representations. In terms of usability, while popular graph learning libraries greatly facilitate the implementation of graph learning techniques, finding the right graph representation and model architecture for a given use case still requires heavy expert knowledge. Furthermore, in terms of generalizability, unlike domains such as computer vision and natural language processing where large-scale pre-trained models generalize across downstream applications with little to no fine-tuning and demonstrate impressive performance, such a paradigm has yet to succeed in the graph learning domain.

Goal. The primary goal of this workshop is to expand the impact of graph learning beyond the current boundaries. We believe that graph, or relation data, is a universal language that can be used to describe the complex world. Ultimately, we hope graph learning will become a generic tool for learning and understanding any type of (structured) data. We aim to present and discuss the new frontiers in graph learning with researchers and practitioners within and outside the graph learning community. New understandings of the current challenges, new perspectives regarding the future directions, and new solutions and applications as proof of concepts are highly welcomed.

Scope and Topics. We welcome submissions regarding the new frontiers of graph learning, including but not limited to:
- Graphs in the wild: Graph learning for datasets and applications without explicit relational structure (e.g., images, text, audios, code). Novel ways of modeling structured/unstructured data as graphs are highly welcomed.
- Graphs in ML: Graph representations in general machine learning problems (e.g., neural architectures as graphs, relations among input data and learning tasks, graphs in large language models, etc.)
- New oasis: Graph learning methods that are significantly different from the current paradigms (e.g., large-scale pre-trained models, multi-task models, super scalable algorithms, etc.)
- New capabilities: Graph representation for knowledge discovery, optimization, causal inference, explainable ML, ML fairness, etc.
- Novel applications: Novel applications of graph learning in real-world industry and scientific domains. (e.g., graph learning for missing data imputation, program synthesis, etc.)

Call for papers

Submission deadline: Thursday, Sept 22, 2022 (16:59 PDT)

Submission site (OpenReview): NeurIPS 2022 GLFrontiers Workshop

Author notification: Thursday, Oct 6, 2022

Workshop (in person): Friday, Dec 2, 2022

The workshop will be held fully in person at the New Orleans Convention Center, as part of the NeurIPS 2022 conference. We also plan to offer livestream for the event, and more details will come soon.

We welcome both short research papers of up to 4 pages (excluding references and supplementary materials), and full-length research papers of up to 8 pages (excluding references and supplementary materials). All accepted papers will be presented as posters. We plan to select around 6 papers for oral presentations and 2 papers for the outstanding paper awards with potential cash incentives.

All submissions must use the NeurIPS template. We do not require the authors to include the checklist in the template. Submissions should be in .pdf format, and the review process is double-blind—therefore the papers should be appropriately anonymized. Previously published work (or under-review) is acceptable.

Should you have any questions, please reach out to us via email:

 Fri 6:40 a.m. - 7:00 a.m. Jiaxuan You (Opening remarks) Jiaxuan You 🔗 Fri 7:00 a.m. - 7:30 a.m. Invited talk: Kelsey Allen (Invited talk) Kelsey Allen 🔗 Fri 7:30 a.m. - 8:00 a.m. Invited talk: Azalia Mirhoseini (Invited talk) Azalia Mirhoseini 🔗 Fri 8:30 a.m. - 9:00 a.m. Oral presenters (Contributed talks: Part 1) 🔗 Fri 9:00 a.m. - 10:00 a.m. All participants (Poster sessions: Morning) 🔗 Fri 10:00 a.m. - 11:00 a.m. All participants (Lunch break) 🔗 Fri 11:00 a.m. - 11:30 a.m. Invited talk: Francesco Di Giovanni (Invited talk) Francesco Di Giovanni · Francesco Di Giovanni 🔗 Fri 11:30 a.m. - 12:00 p.m. Invited talk: Matej Balog (Invited talk) Matej Balog · Matej Balog 🔗 Fri 12:00 p.m. - 12:30 p.m. Invited talk: Xavier Bresson (Invited talk) Xavier Bresson 🔗 Fri 12:30 p.m. - 1:00 p.m. Contributed talks: Part 2 (Oral presenters) 🔗 Fri 1:00 p.m. - 1:30 p.m. All participants (Coffee break) 🔗 Fri 1:30 p.m. - 2:15 p.m. Panelists (Panel Discussions: The Future of Graph Learning) 🔗 Fri 2:15 p.m. - 3:00 p.m. All participants (Poster sessions: Afternoon) 🔗 - Spectrum Guided Topology Augmentation for Graph Contrastive Learning (Poster)  link » Graph contrastive learning (GCL) is a major self-supervised graph learning technique that aims to capture invariant properties of graphs via instance discrimination. Its performance heavily relies on the construction of multiple graph views yet it still remains unclear about what makes effective topology augmentations. Recent studies mainly perform topology augmentations in a uniformly random manner without considering graph properties. In this work, we aim to find principled ways for topology augmentations by exploring the invariance of graphs from the graph spectral perspective. Specifically, we propose a novel topology augmentation method guided by spectral change. Extensive experiments on both graph and node classification tasks demonstrate the effectiveness of our method in capturing the structural essence of graphs for self-supervised learning. The proposed method also brings promising performance in transfer learning and adversarial attack settings. We envision this work to provide a principled way for graph augmentation. Link » Lu Lin · Jinghui Chen · Hongning Wang 🔗 - Sketch-GNN: Scalable Graph Neural Networks with Sublinear Training Complexity (Poster)  link » Graph Neural Networks (GNNs) are widely applied to graph learning problems such as node classification. When scaling up the underlying graphs of GNNs to a larger size, we are forced to either train on the complete graph and keep the full graph adjacency and node embeddings in memory (which is often infeasible) or mini-batch sample the graph (which results in exponentially growing computational complexities with respect to the number of GNN layers). Various sampling-based and historical-embedding-based methods are proposed to avoid this exponential growth of complexities. However, none of these solutions eliminates the linear dependence on graph size. This paper proposes a sketch-based algorithm whose training time and memory grow sublinearly with respect to graph size by training GNNs atop a few compact sketches of graph adjacency and node embeddings. Based on polynomial tensor-sketch (PTS) theory, our framework provides a novel protocol for sketching non-linear activations and graph convolution matrices in GNNs, as opposed to existing methods that sketch linear weights or gradients in neural networks. In addition, we develop a locality sensitive hashing (LSH) technique that can be trained to improve the quality of sketches. Experiments on large-graph benchmarks demonstrate the scalability and competitive performance of our Sketch-GNNs versus their full-size GNN counterparts. Link » Mucong Ding · Tahseen Rabbani · Bang An · Evan Wang · Furong Huang 🔗 - Graph Neural Networks for Selection of Preconditioners and Krylov Solvers (Poster)  link » Solving large sparse linear systems is ubiquitous in science and engineering, generally requiring iterative solvers and preconditioners, as many problems cannot be solved efficiently by using direct solvers. However, the practical performance of solvers and preconditioners is sometimes beyond theoretical analysis and an optimal choice calls for intuitions from domain experts, knowledge of the hardware, as well as trial and error. In this work, we propose a new method for optimal solver-preconditioner selection using Graph Neural Networks (GNNs), as a complementary solution to laborious expert efforts. The method is based upon the graph representation of the problem and the idea of integrating node features with graph features via graph convolutions. We show that our models perform favorably well compared with traditional machine learning models investigated by the prior literature (with an improvement of 25\% in selected evaluation metrics). Implementation details and possible limitations and improvements will be discussed. Link » Ziyuan Tang · Hong Zhang · Jie Chen 🔗 - Faster Hyperparameter Search on Graphs via Calibrated Dataset Condensation (Poster)  link » Dataset condensation aims to reduce the computational cost of training multiple models on a large dataset by condensing the training set into a small synthetic one. State-of-the-art approaches rely on matching the gradients between the real and synthetic data and are recently applied to condense large-scale graphs for node classification tasks. Although dataset condensation may be efficient when we need to train multiple models for hyperparameter optimization, there is no theoretical guarantee on the generalizability of the condensed data, and it can generalize poorly across hyperparameters/architectures in practice; while on graphs, we find and prove this overfitting is much more severe. This paper considers a different condensation objective specifically for hyperparameter search. We aim to generate the synthetic dataset so that the validation-performance ranking of different models under different hyperparameters on the condensed and original datasets are comparable. We propose a novel hyperparameter-calibrated dataset condensation (HCDC) algorithm, which learns the synthetic validation data by matching the hyperparameter gradients computed by implicit differentiation and efficient inverse Hessian approximation. HCDC employs a supernet with differentiable hyperparameters, making it suitable for modeling GNNs with widely different convolution filters. Experiments demonstrate that the proposed framework effectively maintains the validation-performance rankings of GNNs and speeds up hyperparameter/architecture search on graphs. Link » Mucong Ding · Xiaoyu Liu · Tahseen Rabbani · Furong Huang 🔗 - Sequence-Graph Duality: Unifying User Modeling with Self-Attention for Sequential Recommendation (Poster)  link » User modeling is of great importance in personalization services. Many existing methods treat users as sequences of interacted items to encode item transition patterns. Another line of research models users as graphs in which interacted items are modelled as nodes, and item-item relations are modelled as edges. The graph-based user modeling provides more flexibility to encode complex item relationships of different types (e.g. co-occurrence, similarity) but usually overlooks sequential patterns. Here, we introduce a novel user representation, Heterogeneous User Graph (HUG), which unifies sequence and graph-based user modeling to take advantage of both methods. An HUG is associated with two types of edges: sequential edges that preserve the sequential information and collaborative edges that store different item-item relationship. To learn users' latent representations from their HUGs, we propose a multi-head attention based architecture called Heterogeneous User Graph Transformer (HUGT). HUGT is developed on the basis of SASRec and can concurrently capture the sequential transition pattern and complex graph topology. We conduct experiments on four real world datasets from three different application domains. Experimental results show that (1) jointly modeling users as sequences and graphs with HUG provides better recommendation performance over sequence-only and graph-only user modeling; (2) HUGT is effective in learning user latent representations from HUGs; (3) HUGT outperforms the baselines by up to 10\% on datasets with long sequences and aligns with the state-of-the-art performance on datasets with short sequences. Link » Zeren Shui · Ge Liu · Anoop Deoras · George Karypis 🔗 - How Powerful is Implicit Denoising in Graph Neural Networks (Poster)  link » Graph Neural Networks (GNNs), which aggregate features from neighbors, are widely used for processing graph-structured data due to their powerful representation learning capabilities. It is generally believed that GNNs can implicitly remove feature noises and thus still obtain generalizable models. This point of view motivates some existing works to derive GNN models from the graph signal denoising (GSD) problem. However, few works have rigorously analyzed the implicit denoising effect in graph neural networks. In this work, we conduct a comprehensive theoretical study and analyze when and why implicit denoising happens in GNNs. Our theoretical analysis suggests that the implicit denoising largely depends on the connectivity and size of the graph, as well as the GNN architectures. Moreover, extensive empirical evaluations verify our theoretical analyses and the effectiveness of GNNs in eliminating noise in the feature matrix compared with Multi-Layer Perceptron (MLP). Moreover, motivated by adversarial machine learning in improving the robustness of neural networks and the correlation of node features and graph structure in GSD, we propose the adversarial graph signal denoising (AGSD) problem. By solving such a problem, we derive a robust graph convolution, where the smoothness of the node representations and the implicit denoising effect can be enhanced. Link » Songtao Liu · Rex Ying · Hanze Dong · Lu Lin · Jinghui Chen · Dinghao Wu 🔗 - PolarMOT: How far can geometric relations take us in 3D multi-object tracking? (Poster)  link » Most (3D) multi-object tracking methods rely on object-level information, e.g. appearance, for data association. By contrast, we investigate how far we can get by considering only geometric relationships and interactions between objects over time. We represent each tracking sequence as a multiplex graph where 3D object detections are nodes, and spatial and temporal pairwise relations among them are encoded via two types of edges. This structure allows our graph neural network to consider all types of interactions and distinguish temporal, contextual and motion cues to obtain final scene interpretation by posing tracking as edge classification. The model outputs classification results after multiple rounds of neural message passing, during which it is able to reason about long-term object trajectories, influences and motion based solely on initial pairwise relationships. To enable our method for online (streaming) scenarios, we introduce a technique to continuously evolve our graph over long tracking sequences to achieve good performance while maintaining sparsity with linear complexity for the number of edges. We establish a new state-of-the-art on the nuScenes dataset. Link » Aleksandr Kim · Guillem Braso · Aljosa Osep · Laura Leal-Taixé 🔗 - Agent-based Graph Neural Networks (Poster)  link » We present a novel graph neural network we call AgentNet, which is designed specifically for graph-level tasks. AgentNet is inspired by sublinear algorithms, featuring a computational complexity that is independent of the graph size. The architecture of AgentNet differs fundamentally from the architectures of traditional graph neural networks. In AgentNet, some trained \textit{neural agents} intelligently walk the graph, and then collectively decide on the output. We provide an extensive theoretical analysis of AgentNet: We show that the agents can learn to systematically explore their neighborhood and that AgentNet can distinguish some structures that are even indistinguishable by 3-WL. Moreover, AgentNet is able to separate any two graphs which are sufficiently different in terms of subgraphs. We confirm these theoretical results with synthetic experiments on hard-to-distinguish graphs and real-world graph classification tasks. In both cases, we compare favorably not only to standard GNNs but also to computationally more expensive GNN extensions. Link » Karolis Martinkus · Pál András Papp · Benedikt Schesch · Roger Wattenhofer 🔗 - Dissimilar Nodes Improve Graph Active Learning (Poster)  link » Training labels for graph embedding algorithms could be costly to obtain in many practical scenarios. Information score-based active learning (AL) algorithms are frameworks that could obtain the most useful labels for training while keeping the total label queries under a certain budget. The existing Active Graph Embedding framework has been shown to be capable of bringing some improvement to the node classification tasks of Graph Convolutional Networks, however, when evaluating the importance of unlabeled nodes, they fail to consider the influence of existing labeled nodes on the value of unlabeled nodes. With the aim of addressing this limitation, in this work, we introduce 3 dissimilarity-based information scores for active learning: feature dissimilarity score (FDS), structure dissimilarity score (SDS), and embedding dissimilarity score (EDS). According to experiments, our newly proposed scores boost the classification accuracy by 2.1$\%$ on average and are capable of generalizing to different Graph Neural Network architectures. Link » Zhicheng Ren · Yifu Yuan · Yuxin Wu · Xiaxuan Gao · Yewen Wang · Yizhou Sun 🔗 - Convolutional Neural Networks on Manifolds: From Graphs and Back (Poster)  link » In recent years, geometric deep learning has gained attraction due to both the need for machine learning on structured data (e.g., graphs) and the increasing availability of this type of data. Extensions of deep convolutional architectures to non-Euclidean domains in particular are a powerful technique in sensor network applications --- which can be seen as graphs --- and 3D model analysis --- which can be seen as manifolds. While recent works have provided a better theoretical understanding of why convolutional neural network architectures work well on graphs of moderate size, in the large-scale regime that is the setting of most problems of interest, their behavior is not as well understood. In this paper, we bridge this gap by modeling large graphs as samples from manifolds and studying manifold neural networks (MNNs). Our main contribution is to define a manifold convolution operation which, when discretized'' in both the space and time domains, is consistent with the practical implementation of a graph convolution. We then show that graph neural networks (GNNs) can be particularized from MNNs, which in turn are the limits of these GNNs. We conclude with numerical experiments showcasing an application of the MNN to point-cloud classification. Link » Zhiyang Wang · Luana Ruiz · Alejandro Ribeiro 🔗 - Hierarchical Abstraction for Combinatorial Generalization in Object Rearrangement (Poster)  link » Object rearrangement is a challenge for embodied agents because solving these tasks requires generalizing across a combinatorially large set of underlying entities that take the value of object states. Worse, these entities are often unknown and must be inferred from sensory percepts. We present a hierarchical abstraction approach to uncover these underlying entities and achieve combinatorial generalization from unstructured inputs. By constructing a factorized transition graph over clusters of object representations inferred from pixels, we show how to learn a correspondence between intervening on states of entities in the agent's model and acting on objects in the environment. We use this correspondence to develop a method for control that generalizes to different numbers and configurations of objects, which outperforms current offline deep RL methods when evaluated on a set of simulated rearrangement and stacking tasks. Link » Michael Chang · Alyssa L Dayan · Franziska Meier · Tom Griffiths · Sergey Levine · Amy Zhang 🔗 - Diffusion Models for Graphs Benefit From Discrete State Spaces (Poster)  link » Denoising diffusion probabilistic models and score matching models have proven to be very powerful for generative tasks. While these approaches have also been applied to the generation of discrete graphs, they have, so far, relied on continuous Gaussian perturbations. Instead, in this work, we suggest using discrete noise for the forward Markov process. This ensures that in every intermediate step the graph remains discrete. Compared to the previous approach, our experimental results on four datasets and multiple architectures show that using a discrete noising process results in higher quality generated samples indicated with an average MMDs reduced by a factor of 1.5. Furthermore, the number of denoising steps is reduced from 1000 to 32 steps leading to a 30 times faster sampling procedure. Link » Kilian Haefeli · Karolis Martinkus · Nathanael Perraudin · Roger Wattenhofer 🔗 - Multimodal Video Understanding using Graph Convolutional Network (Poster)  link » Majority of existing semantic video understanding methods process every video independently without considering the underlying inter-video relationships. However, videos uploaded by individuals on social media platforms like YouTube, Instagram etc. exhibit inter-video relationship which are a reflection of individual’s interest, geography, culture etc. In this work, we explicitly attempt to model this inter-video relationship, originating from the creators of these videos using Graph Neural Networks (GNN) in a multimodal setup. We perform video classification by leveraging the creators of the videos and semantic similarity between for creating edges between videos and observe improvements of 4% in accuracy Link » Ayush Singh · Vikram Gupta 🔗 - Condensing Graphs via One-Step Gradient Matching (Poster)  link » As training deep learning models on large dataset takes a lot of time and resources, it is desired to construct a small synthetic dataset with which we can train deep learning models sufficiently. There are recent works that have explored solutions on condensing image datasets through complex bi-level optimization. For instance, dataset condensation (DC) matches network gradients w.r.t. large-real data and small-synthetic data, where the network weights are optimized for multiple steps at each outer iteration. However, existing approaches have their inherent limitations: (1) they are not directly applicable to graphs where the data is discrete; and (2) the condensation process is computationally expensive due to the involved nested optimization. To bridge the gap, we investigate efficient dataset condensation tailored for graph datasets where we model the discrete graph structure as a probabilistic model. We further propose a one-step gradient matching scheme, which performs gradient matching for only one single step without training the network weights. Our theoretical analysis shows this strategy can generate synthetic graphs that lead to lower classification loss on real graphs. Extensive experiments on various graph datasets demonstrate the effectiveness and efficiency of the proposed method. In particular, we are able to reduce the dataset size by $90$\% while approximating up to $98$\% of the original performance and our method is significantly faster than multi-step gradient matching (e.g. $15$× in CIFAR10 for synthesizing $500$ graphs). Link » Wei Jin · Xianfeng Tang · Haoming Jiang · Zheng Li · Danqing Zhang · Jiliang Tang · Bing Yin 🔗 - From Local to Global: Spectral-Inspired Graph Neural Networks (Poster)  link » Graph Neural Networks (GNNs) are powerful deep learning methods for Non-Euclidean data. Popular GNNs are message-passing algorithms (MPNNs) that aggregate and combine signals in a local graph neighborhood. However, shallow MPNNs tend to miss long-range signals and perform poorly on some heterophilous graphs, while deep MPNNs can suffer from issues like over-smoothing or over-squashing. To mitigate such issues, existing works typically borrow normalization techniques from training neural networks on Euclidean data or modify the graph structures. Yet these approaches are not well-understood theoretically and could increase the overall computational complexity. In this work, we draw inspirations from spectral graph embedding and propose \texttt{PowerEmbed} --- a simple layer-wise normalization technique to boost MPNNs. We show \texttt{PowerEmbed} can provably express the top-$k$ leading eigenvectors of the graph operator, which prevents over-smoothing and is agnostic to the graph topology; meanwhile, it produces a list of representations ranging from local features to global signals, which avoids over-squashing. We apply \texttt{PowerEmbed} in a wide range of simulated and real graphs and demonstrate its competitive performance, particularly for heterophilous graphs. Link » Ningyuan Huang · Soledad Villar · Carey E Priebe · Da Zheng · Chengyue Huang · Lin Yang · Vladimir Braverman 🔗 - Provably expressive temporal graph networks (Poster)  link » Temporal graph networks (TGNs) have gained prominence as models for embedding dynamic interactions, but little is known about their theoretical underpinnings. We establish fundamental results about the representational power and limits of the two main categories of TGNs: those that aggregate temporal walks (WA-TGNs), and those that augment local message passing with recurrent memory modules (MP-TGNs). Specifically, novel constructions reveal the inadequacy of MP-TGNs and WA-TGNs, proving that neither category subsumes the other. We extend the 1-WL (Weisfeiler-Leman) test to temporal graphs, and show that the most powerful MP-TGNs should use injective updates, as in this case they become as expressive as the temporal WL. Also, we show that sufficiently deep MP-TGNs cannot benefit from memory, and MP/WA-TGNs fail to compute graph properties such as girth. These theoretical insights lead us to introduce PINT --- a novel architecture that leverages injective temporal message passing and relative positional features. Importantly, PINT is provably more expressive than both MP-TGNs and WA-TGNs.Our experiments demonstrate that PINT significantly outperforms existing TGNs on several real-world benchmarks. Link » Amauri Souza · Diego Mesquita · Samuel Kaski · Vikas Garg 🔗 - pyGSL: A Graph Structure Learning Toolkit (Poster)  link » We introduce pyGSL, a Python library that provides efficient implementations of state-of-the-art graph structure learning models along with diverse datasets to evaluate them on. The implementations are written in GPU-friendly ways, allowing one to scale to much larger network tasks. A common interface is introduced for algorithm unrolling methods, unifying implementations of recent state-of-the-art techniques and allowing new methods to be quickly developed by avoiding the need to rebuild the underlying unrolling infrastructure. Implementations of differentiable graph structure learning models are written in PyTorch, allowing us to leverage the rich software ecosystem that exists e.g., around logging, hyperparameter search, and GPU-communication. This also makes it easy to incorporate these models as components in larger gradient based learning systems where differentiable estimates of graph structure may be useful, e.g. in latent graph learning. Diverse datasets and performance metrics allow consistent comparisons across models in this fast growing field. The full code repository on GitHub will be included upon acceptance. Link » Max Wasserman · Gonzalo Mateos 🔗 - Improving Graph Neural Networks at Scale: Combining Approximate PageRank and CoreRank (Poster)  link » Graph Neural Networks (GNNs) have achieved great successes in many learning tasks performed on graph structures. Nonetheless, to propagate information GNNs rely on a message passing scheme which can become prohibitively expensive when working with industrial-scale graphs. Inspired by the PPRGo model, we propose the CorePPR model, a scalable solution that utilises a learnable convex combination of the approximate personalised PageRank and the CoreRank to diffuse multi-hop neighbourhood information in GNNs. Additionally, we incorporate a dynamic mechanism to select the most influential neighbours for a particular node which reduces training time while preserving the performance of the model. Overall, we demonstrate that CorePPR outperforms PPRGo, particularly on large graphs where selecting the most influential nodes is particularly relevant for scalability. Link » Ariel Ramos Vela · Johannes Lutzeyer · Anastasios Giovanidis · Michalis Vazirgiannis 🔗 - GraphWorld: Fake Graphs BringReal Insights for GNNs (Poster)  link » Despite advances in the field of Graph Neural Networks (GNNs), only a small number (~5) of datasets are currently used to evaluate new models. This continued reliance on a handful of datasets provides minimal insight into the performance differences between models, and is especially challenging for industrial practitioners who have datasets which are very different from academic benchmarks.In this work we introduce GraphWorld, a novel methodology and system for benchmarking GNN models on an arbitrarily-large population of synthetic graphs for any conceivable GNN task. GraphWorld allows a user to efficiently generate a \emph{world} with millions of statistically diverse datasets. It is accessible, scalable, and easy to use. GraphWorld can be run on a single machine without specialized hardware, or it can be easily scaled up to run on arbitrary clusters or cloud frameworks. Using GraphWorld, a user has fine-grained control over graph generator parameters, and can benchmark arbitrary GNN models.We present insights from GraphWorld experiments on the performance of thirteen GNN models and baselines over millions of benchmark datasets. We show that GraphWorld efficiently explores regions of benchmark dataset space uncovered by standard benchmarks, revealing comparisons between models that have not been historically obtainable. Using GraphWorld, we also are able to study in-detail the relationship between graph properties and task performance metrics, which is nearly impossible with the classic collection of real-world benchmarks. Link » John Palowitch · Anton Tsitsulin · Bryan Perozzi · Brandon Mayer 🔗 - GLINKX: A Unified Framework for Large-scale Homophilous and Heterophilous Graphs (Poster)  link » In graph learning, there have been two main inductive biases regarding graph-inspired architectures: On the one hand, higher-order interactions and message passing work well on homophilous graphs and are leveraged by GCNs and GATs. Such architectures, however, cannot easily scale to large real-world graphs. On the other hand, shallow (or node-level) models using ego features and adjacency embeddings work well in heterophilous graphs. In this work, we propose a novel scalable shallow method -- GLINKX -- that can work both on homophilous and heterophilous graphs. GLINKX leverages (i) novel monophilous label propagations (ii) ego/node features, (iii) knowledge graph embeddings as positional embeddings, (iv) node-level training, and (v) low-dimensional message passing, to achieve scaling in large graphs. We show the effectiveness of GLINKX on several homophilous and heterophilous datasets. Link » Marios Papachristou · Rishab Goel · Frank Portman · Matthew Miller · Rong Jin 🔗 - Modular Flows: Differential Molecular Generation (Poster)  link » Generating new molecules is fundamental to advancing critical applications such as drug discovery and material synthesis. Flows can generate molecules effectively by inverting the encoding process, however, existing flow models either require artifactual dequantization or specific node/edge orderings, lack desiderata such as permutation invariance or induce discrepancy between encoding and decoding steps that necessitates post hoc validity correction. We circumvent these issues with novel continuous normalizing E(3)-equivariant flows, based on a system of node ODEs coupled as a graph PDE, that repeatedly reconcile locally toward globally aligned densities. Our models can be cast as message passing temporal networks, and result in superlative performance on the tasks of density estimation and molecular generation. In particular, our generated samples achieve state of art on both the standard QM9 and ZINC250K benchmarks. Link » Yogesh Verma · Samuel Kaski · Markus Heinonen · Vikas Garg 🔗 - GIST: Distributed Training for Large-Scale Graph Convolutional Networks (Poster)  link » The graph convolutional network (GCN) is a go-to solution for machine learning on graphs, but its training is notoriously difficult to scale both in terms of graph size and the number of model parameters. Although some work has explored training on large-scale graphs, we pioneer efficient training of large-scale GCN models with the proposal of a novel, distributed training framework, called \texttt{GIST}. \texttt{GIST} disjointly partitions the parameters of a GCN model into several, smaller sub-GCNs that are trained independently and in parallel. Compatible with all GCN architectures and existing sampling techniques, \texttt{GIST} $i)$ improves model performance, $ii)$ scales to training on arbitrarily large graphs, $iii)$ decreases wall-clock training time, and $iv)$ enables the training of markedly overparameterized GCN models. Remarkably, with \texttt{GIST}, we train an astonishgly-wide $32,\!768$-dimensional GraphSAGE model, which exceeds the capacity of a single GPU by a factor of $8\times$, to SOTA performance on the Amazon2M dataset. Link » Cameron Wolfe · Jingkang Yang · Fangshuo Liao · Arindam Chowdhury · Chen Dun · Artun Bayer · Santiago Segarra · Anastasios Kyrillidis 🔗 - Neural Coarsening Process for Multi-level Graph Combinatorial Optimization (Poster)  link » Combinatorial optimization (CO) is applicable to various industrial fields, but solving CO problems is usually NP-hard. Thus, previous studies have focused on designing heuristics to solve CO within a reasonable time. Recent advances in deep learning show the potential to automate the designing process of CO solvers by leveraging the powerful representation capability of deep neural networks. Practically, solving CO is often cast as a multi-level process; the lower-level CO problems are solved repeatedly so as to solve the upper-level CO problem. In this case, the number of iterations within the lower-level process can dramatically impact the overall process. This paper proposes a new graph learning method, Neural Coarsening Process (NCP), to reduce the number of graph neural network inferences for lower-level CO problems. Experimental results show that NCP effectively reduces the number of inferences as compared to fully sequential decision-making. Furthermore, NCP outperforms competitive heuristics on CVRP-CapacityCut, a subproblem of the cutting plane method for the capacitated vehicle routing problem (CVRP). Link » Hyeonah Kim · Minsu Kim · Changhyun Kwon · Jinkyoo Park 🔗 - Graph Contrastive Learning with Cross-view Reconstruction (Poster)  link » Although different graph self-supervised learning strategies have been proposed to tackle the supervision shortage issue in graph learning tasks, Graph contrastive learning (GCL) has been the most prevalent approach to this problem. Despite the remarkable performances those GCL methods have achieved, existing GCL methods that heavily depend on various manually designed augmentation techniques still struggle to improve model robustness without risking losing task-relevant information. Consequently, the learned representation is either brittle or unilluminating. In light of this, we introduce the GraphCV, which follows the information bottleneck principle to learn minimal yet sufficient representations from graph data. Specifically, our proposed model elicits the predictive (useful for downstream instance discrimination) and other non-predictive features separately. Except for the conventional contrastive loss which guarantees the consistency and sufficiency of the representations across different augmentation views, we introduce a cross-view reconstruction mechanism to pursue the disentanglement of the two learned representations. Besides, an adversarial global view is added as the third view of contrastive loss to avoid the learned representation from being drafted too far away from the original distribution. We empirically demonstrate that our proposed model outperforms the state-of-the-art on graph classification task over multiple benchmark datasets. Link » Qianlong Wen · Zhongyu Ouyang · Chunhui Zhang · Yiyue Qian · Yanfang Ye · Chuxu Zhang 🔗 - Skeleton Clustering: Graph-Based Approach for Dimension-Free Density-Aided Clustering (Poster)  link » Density-based clustering can identify clusters with irregular shapes and has intuitive interpretations, but struggles with large-dimensional data due to the curse of dimensionality.We introduce a graph-based clustering framework called \textit{Skeleton Clustering} to adopt density-based clustering idea to multivariate and even high-dimensional data. The proposed framework constructs a graph representation of the data as a first step and combines prototype methods, density-based clustering, and hierarchical clustering.We propose surrogate density measures based on the skeleton graph that are less dependent on the dimension and have meaningful geometric interpretations. We show by empirical studies that the proposed skeleton clustering method leads to reliable clusters in multivariate and even high-dimensional data with irregular shapes. Link » Zeyu Wei · Yen-Chi Chen 🔗 - AutoTransfer: AutoML with Knowledge Transfer - An Application to Graph Neural Networks (Poster)  link » AutoML has demonstrated remarkable success in finding an effective neural architecture for a given machine learning task defined by a specific dataset and an evaluation metric. However, most present AutoML techniques consider each task independently from scratch, which leads to many explored architectures and high computational cost. Here we propose AutoTransfer, an AutoML solution that improves search efficiency by transferring the known architectural design knowledge to the novel task of interest. Our key insights are a task-model bank that captures the training performance over a diverse set of GNN architectures and tasks, and a computationally efficient task embedding that can accurately measure the similarity between different tasks. Based on the task-model bank and the task embeddings, we estimate the design priors of desirable models of the novel task, by aggregating a similarity-weighted sum of the top-K design distributions on tasks that are similar to the task of interest. The computed design priors can be used with any AutoML search algorithm. We evaluate AutoTransfer on six datasets in the graph machine learning domain. Experiments demonstrate that (i) our proposed task embedding can be computed efficiently, and that tasks with similar embeddings have similar best-performing architectures; (ii) AutoTransfer significantly improves search efficiency with the transferred design priors, reducing the number of explored architectures by an order of magnitude. Finally, we release GNN-Bank-101, a large-scale dataset of detailed GNN training information of 120,000 task-model combinations to facilitate and inspire future research. Link » Kaidi Cao · Jiaxuan You · Jiaju Liu · Jure Leskovec 🔗 - Individual Fairness in Dynamic Financial Networks (Poster)  link » In the financial world, a transaction graph is commonly used for modeling the ever-changing payee-payor relationships. Every online transaction corresponds to a directed edge from the paying party to the receiving party in this graph. Even though the superior learning capability of Graph Neural Networks (GNNs) has led to several successful financial applications like fraud detection and anti-money laundering, most of these existing works do not have fairness considerations. Apparently, the lack of fairness guarantees during the GNN-based decision-making process would cause increasingly serious societal concerns from both buyers and sellers. Furthermore, the time-varying property of the financial networks makes the fairness requirements more challenging, since current fairness measures on graph learning tasks and fairness-aware GNN models are all designed for static graphs only. In this work, we present a new generic definition of individual fairness for dynamic graphs and propose a regularization-based method to debias the GNN model in the temporal setting. We perform some preliminary experimental evaluations on two real-world datasets and demonstrate the potential efficacy of the proposed methods. Link » Zixing Song · Yueen Ma · Irwin King 🔗 - Complete the Missing Half: Augmenting Aggregation Filtering with Diversification for Graph Convolutional Networks (Poster)  link » The core operation of current Graph Neural Networks (GNNs) is the aggregation enabled by the graph Laplacian or message passing, which filters the neighborhood node information. Though effective for various tasks, in this paper, we show that they are potentially a problematic factor underlying all GNN methods for learning on certain datasets, as they force the node representations similar, making the nodes gradually lose their identity and become indistinguishable. Hence, we augment the aggregation operations with their dual, i.e. diversification operators that make the node more distinct and preserve the identity. Such augmentation replaces the aggregation with a two-channel filtering process that, in theory, is beneficial for enriching the node representations. In practice, the proposed two-channel filters can be easily patched on existing GNN methods with diverse training strategies, including spectral and spatial (message passing) methods. In the experiments, we observe desired characteristics of the models and significant performance boost upon the baselines on $9$ node classification tasks. Link » Sitao Luan · Mingde Zhao · Chenqing Hua · Xiao-Wen Chang · Doina Precup 🔗 - SPGP: Structure Prototype Guided Graph Pooling (Poster)  link » While graph neural networks (GNNs) have been successful for node classification tasks and link prediction tasks in graph, learning graph-level representations still remains a challenge. For the graph-level representation, it is important to learn both representation of neighboring nodes, i.e., aggregation, and graph structural information. A number of graph pooling methods have been developed for this goal. However, most of the existing pooling methods utilize k-hop neighborhood without considering explicit structural information in a graph. In this paper, we propose Structure Prototype Guided Pooling (SPGP) that utilizes prior graph structures to overcome the limitation. SPGP formulates graph structures as learnable prototype vectors and computes the affinity between nodes and prototype vectors. This leads to a novel node scoring scheme that prioritizes informative nodes while encapsulating the useful structures of the graph. Our experimental results show that SPGP outperforms state-of-the-art graph pooling methods on graph classification benchmark datasets in both accuracy and scalability. Link » Sangseon Lee · Dohoon Lee · Yinhua Piao · Sun Kim 🔗 - ACMP: Allen-Cahn Message Passing with Attractive and Repulsive Forces for Graph Neural Networks (Poster)  link » Neural message passing is a basic feature extraction unit for graph-structured data considering neighboring node features in network propagation from one layer to the next. We model such process by an interacting particle system with attractive and repulsive forces and Allen-Cahn force arising in the modeling of phase transition. The dynamics of the system is a reaction-diffusion process which can separate particles without blowing up. This induces an Allen-Cahn message passing (ACMP) for graph neural networks where the numerical iteration for the particle system solution constitutes the message passing propagation. ACMP which has a simple implementation with a neural ODE solver can propel the network depth up to one hundred of layers with theoretically proven strictly positive lower bound of the Dirichlet energy. It thus provides a deep model of GNNs circumventing the common GNN problem of oversmoothing. GNNs with ACMP achieve state of the art performance for real-world node classification tasks on both homophilic and heterophilic datasets. Link » Yuelin Wang · Kai Yi · Xinliang Liu · Yuguang Wang · Shi Jin 🔗 - Expander Graph Propagation (Poster)  link » Deploying graph neural networks (GNNs) on whole-graph classification or regression tasks is known to be challenging: it often requires computing node features that are mindful of both local interactions in their neighbourhood and the global context of the graph structure. GNN architectures that navigate this space need to avoid pathological behaviours, such as bottlenecks and oversquashing, while ideally having linear time and space complexity requirements. In this work, we propose an elegant approach based on propagating information over expander graphs. We provide an efficient method for constructing expander graphs of a given size, and use this insight to propose the EGP model. We show that EGP is able to address all of the above concerns, while requiring minimal effort to set up, and provide evidence of its empirical utility on relevant datasets and baselines in the Open Graph Benchmark. Importantly, using expander graphs as a template for message passing necessarily gives rise to negative curvature. While this appears to be counterintuitive in light of recent related work on oversquashing, we theoretically demonstrate that negatively curved edges are likely to be required to obtain scalable message passing without bottlenecks. To the best of our knowledge, this is a previously unstudied result in the context of graph representation learning, and we believe our analysis paves the way to a novel class of scalable methods to counter oversquashing in GNNs. Link » Andreea Deac · Marc Lackenby · Petar Veličković 🔗 - Modeling Hierarchical Topological Structure in Scientific Images with Graph Neural Networks (Poster)  link » Topological analysis reveals meaningful structure in data from a variety of domains. Tasks such as image segmentation can be effectively performed on the network structure of an image's topological complex using graph neural networks (GNNs). We propose two methods for using GNNs to learn from the hierarchical information captured by complexes at multiple levels of topological persistence: one modifies the training procedure of an existing GNN, while one extends the message passing across all levels of the complex. Experiments on real-world data from three different domains shows the performance benefits to GNNs from using hierarchical topological structure. Link » Samuel Leventhal · Attila Gyulassy · Valerio Pascucci · Mark Heimann 🔗 - Variational Graph Auto-Encoders for Heterogeneous Information Network (Poster)  link » Heterogeneous networks, where nodes and their attributes denote real-world entities and links encode relationships between entities, are ubiquitous in many applications, including social media, recommender systems, biological networks, molecular structures, and brain networks. Such applications typically involve node classification or link prediction based on the attributes of the nodes and their connections in the graph. Graph neural networks naturally model the topological structures of \emph{homogeneous} graphs by learning low-dimensional embeddings of nodes. However, heterogeneous graphs with various types of nodes and links pose significant challenges in learning efficient low-dimensional embeddings. Existing methods for learning such representation rely on computationally expensive sampling of meta-paths and heuristic adaptation of models developed for homogeneous networks. Therefore, the computational inefficiency and the generalization performance of such methods are frequently criticized and scrutinized, preventing their adoption in solving real-world problems. To mitigate such issues, we present three variants of graph variational autoencoder models for heterogeneous networks that avoid the computationally expensive sampling of meta-paths and maintain uncertainty estimates of node embeddings that help with better generalization. We report the results of experiments on link prediction using three different real-world benchmark heterogeneous network data sets that show that the proposed methods significantly outperform state-of-the-art baselines. Link » Abhishek Dalvi · Ayan Acharya · Jing Gao · Vasant Honavar 🔗 - Contrastive Graph Few-Shot Learning (Poster)  link » Prevailing deep graph learning models often suffer from label sparsity issue. Although many graph few-shot learning (GFL) methods have been developed to avoid the performance degradation in face of limited annotated data, they excessively rely on labeled data, where the distribution shift in the test phase might result in impaired generalization ability. Additionally, they lack a general purpose as their designs are coupled with task or data-specific characteristics. To this end, we propose a general and effective Contrastive-Graph Few-shot Learning framework (CGFL). CGFL leverages a self-distilled contrastive learning procedure to boost GFL. Specifically, our model firstly pre-trains a graph encoder with contrastive learning using unlabeled data. Later, the trained encoder is frozen as a teacher model to distill a student model with a contrastive loss. The distilled model is finally fed to GFL. CGFL learns data representation in a self-supervised manner, thus mitigating the distribution shift impact for better generalization and making model task and data-independent for a general graph mining purpose. Furthermore, we introduce an information-based method to quantitatively measure the capability of CGFL. Comprehensive experiments demonstrate that CGFL outperforms state-of-the-art baselines on several graph mining tasks in the few-shot scenario. We also provide quantitative measurement of CGFL's success. Link » Chunhui Zhang · Hongfu Liu · Jundong Li · Yanfang Ye · Chuxu Zhang 🔗 - Shift-Robust Node Classification via Graph Clustering Co-training (Poster)  link » It is widely known that machine learning models only achieve sub-optimal performance when testing data exhibit distribution shift against training \ie, $\Pr_\text{train}(X,Y) \neq \Pr_\text{test}(X,Y)$. Although Graph Neural Networks (GNNs) have become de facto models for semi-supervised learning tasks, they suffer even more from distribution shift because multiple types of shifts origin from not only node features but graph structures. Existing domain adaptation methods only work for specific type of shifts. In response, we propose Shift-Robust Node Classification (SRNC) - a unified domain adaptation framework for different kinds of distribution shifts on graph. Specifically, we co-train an unsupervised cluster GNN, which captures the data distribution by graph homophily on target graph. Then a shift-robust classifier is optimized on training graph and pseudo samples from target graph, which are provided by cluster GNN. Compared to the existing domain adaptation algorithms on graph, our approach works for both open-set and close-set shifts with convergence guarantees.In our experiments, the classification accuracy is improved at least $3\%$ against the second-best baseline under open-set shifts. On time-evolving graph with close-set shift, existing domain adaption algorithms can barely improve the generalization if not worse. SRNC is still able to mitigate the negative effect ($>2\%$ absolute improvements) of the shift across different testing-times. Link » Qi Zhu · Chao Zhang · Chanyoung Park · Carl Yang · Jiawei Han 🔗 - Diving into Unified Data-Model Sparsity for Class-Imbalanced Graph Representation Learning (Poster)  link » Even pruned by the state-of-the-art network compression methods, recent research shows that deep learning model training still suffers from the demand of massive data usage. In particular, Graph Neural Networks (GNNs) training upon such non-Euclidean graph data often encounters relatively higher time costs, due to its irregular and nasty density properties, compared with data in the regular Euclidean space (e.g., image or text). Another natural property concomitantly with graph is class-imbalance which cannot be alleviated by the massive graph data while hindering GNNs' generalization. To fully tackle these unpleasant properties, (i) theoretically, we introduce a hypothesis about what extent a subset of the training data can approximate the full dataset's learning effectiveness. The effectiveness is further guaranteed and proved by the gradients' distance between the subset and the full set; (ii) empirically, we discover that during the learning process of a GNN, some samples in the training dataset are informative for providing gradients to update model parameters. Moreover, the informative subset is not fixed during training process. Samples that are informative in the current training epoch may not be so in the next one. We refer to this observation as dynamic data sparsity. We also notice that sparse subnets pruned from a well-trained GNN sometimes forget the information provided by the informative subset, reflected in their poor performances upon the subset. Based on these findings, we develop a unified data-model dynamic sparsity framework named Graph Decantation (GraphDec) to address challenges brought by training upon a massive class-imbalanced graph data. The key idea of GraphDec is to identify the informative subset dynamically during the training process by adopting sparse graph contrastive learning. Extensive experiments on multiple benchmark datasets demonstrate that GraphDec outperforms state-of-the-art baselines for class-imbalanced graph classification and class-imbalanced node classification tasks, with respect to classification accuracy and data usage efficiency. Link » Chunhui Zhang · Chao Huang · Yijun Tian · Qianlong Wen · Zhongyu Ouyang · Youhuan Li · Yanfang Ye · Chuxu Zhang 🔗 - Invertible Neural Networks for Graph Prediction (Poster)  link » In this work, we address conditional generation using deep invertible neural networks. This is a type of problem where one aims to infer the most probable inputs $X$ given outcomes $Y$. We call our method \textit{invertible graph neural network} (iGNN) due to the primary focus on generating node features on graph data. A notable feature of our proposed methods is that during network training, we revise the typically-used loss objective in normalizing flow and consider Wasserstein-2 regularization to facilitate the training process. Algorithmic-wise, we adopt an end-to-end training approach since our objective is to address prediction and generation in the forward and backward processes at once through a single model. Theoretically, we study the expressiveness of iGNN in learning the mapping through utilizing the Fokker-Planck equation of an Ornstein-Uhlenbeck process. Experimentally, we verify the performance of iGNN on both simulated and real-data datasets. We demonstrate through extensive numerical experiments that iGNN shows clear improvement over competing conditional generation benchmarks on high-dimensional and/or non-convex data. Link » Chen Xu · Xiuyuan Cheng · Yao Xie 🔗 - Efficient Automatic Machine Learning via Design Graphs (Poster)  link » Despite the success of automated machine learning (AutoML), which aims to find the best design, including the architecture of deep networks and hyper-parameters, conventional AutoML methods are computationally expensive and hardly provide insights into the effect of various model design choices. To tackle the challenges, we propose FALCON, an efficient sample-based method to predict the performance of a model and search for the optimal model design. Our key insight is to model the design space of possible model designs as a design graph, where the nodes represent design choices, and the edges denote design similarities. FALCON features 1) a task-agnostic module, which performs message passing on the design graph via a Graph Neural Network (GNN), and 2) a task-specific module, which conducts label propagation of the known model performance information on the design graph. Both modules are combined to predict the performances of each design in the design space. We conduct extensive experiments on 27 tasks on graphs, including node and graph classifications in various application domains, and an image classification task on the CIFAR-10 dataset. We empirically show that FALCON can efficiently obtain the well-performing designs for each task using only 30 explored nodes. Specifically, FALCON has a comparable time cost with the one-shot approaches while achieving an average improvement of 3.3% on the graph classification datasets. Link » Shirley Wu · Jiaxuan You · Jure Leskovec · Rex Ying 🔗 - Equivariant Graph Hierarchy-based Neural Networks (Poster)  link » Equivariant Graph neural Networks (EGNs) are powerful in characterizing the dynamics of multi-body physical systems. Existing EGNs conduct flat message passing, which, yet, is unable to capture the spatial/dynamical hierarchy for complex systems particularly, limiting substructure discovery and global information fusion. In this paper, we propose Equivariant Hierarchy-based Graph Networks (EGHNs) which consist of the three key components: generalized Equivariant Matrix Message Passing (EMMP), E-Pool, and E-UnPool. In particular, EMMP is able to improve the expressivity of conventional equivariant message passing, E-Pool assigns the quantities of the low-level nodes into high-level clusters, while E-UnPool leverages the high-level information to update the dynamics of the low-level nodes. As their names imply, both E-Pool and E-UnPool are guaranteed to be equivariant to meet physic symmetry. Considerable experimental evaluations verify the effectiveness of our EGHN on several applications including multi-object dynamics simulation, motion capture, and protein dynamics modeling. Link » Jiaqi Han · Yu Rong · Tingyang Xu · Wenbing Huang 🔗 - NOSMOG: Learning Noise-robust and Structure-aware MLPs on Graphs (Poster)  link » While Graph Neural Networks (GNNs) have demonstrated their efficacy in dealing with non-Euclidean structural data, they are difficult to be deployed in real applications due to the scalability constraint imposed by multi-hop data dependency. Existing methods attempt to address this scalability issue by training multi-layer perceptrons (MLPs) exclusively on node content features using labels derived from trained GNNs. Even though the performance of MLPs can be significantly improved, two issues prevent MLPs from outperforming GNNs and being used in practice: the ignorance of graph structural information and the sensitivity to node feature noises. In this paper, we propose to learn NOise-robust Structure-aware MLPs On Graphs (NOSMOG) to overcome the challenges. Specifically, we first complement node content with position features to help MLPs capture graph structural information. We then design a novel representational similarity distillation strategy to inject structural node similarities into MLPs. Finally, we introduce the adversarial feature augmentation to ensure stable learning against feature noises and further improve performance. Extensive experiments demonstrate that NOSMOG outperforms GNNs and the state-of-the-art method in both transductive and inductive settings across 7 datasets, while maintaining a competitive inference efficiency. Link » Yijun Tian · Chuxu Zhang · Zhichun Guo · Xiangliang Zhang · Nitesh Chawla 🔗 - A New Graph Node Classification Benchmark: Learning Structure from Histology Cell Graphs (Poster)  link » We introduce a new benchmark dataset, Placenta, for node classification in an underexplored domain: predicting microanatomical tissue structures from cell graphs in placenta histology whole slide images. This problem is uniquely challenging for graph learning for a few reasons. Cell graphs are large (>1 million nodes per image), node features are varied (64-dimensions of 11 types of cells), class labels are imbalanced (9 classes ranging from 0.21% of the data to 40.0%), and cellular communities cluster into heterogeneously distributed tissues of widely varying sizes (from 11 nodes to 44,671 nodes for a single structure). Here, we release a dataset consisting of two cell graphs from two placenta histology images totalling 2,395,747 nodes, 799,745 of which have ground truth labels. We present inductive benchmark results for 7 scalable models and show how the unique qualities of cell graphs can help drive the development of novel graph neural network architectures. Link » Claudia Vanea · Jonathan Campbell · Omri Dodi · Liis Salumäe · Karen Meir · Drorith Hochner · Hagit Hochner · Triin Laisk · Linda Ernst · Cecilia Lindgren · Christoffer Nellaker 🔗 - GraphFramEx: Towards Systematic Evaluation of Explainability Methods for Graph Neural Networks (Poster)  link » As one of the most popular machine learning models today, graph neural networks (GNNs) have attracted intense interest recently, and so does their explainability. Users are increasingly interested in a better understanding of GNN models and their outcomes. Unfortunately, today's evaluation frameworks for GNN explainability often rely on few inadequate synthetic datasets, leading to conclusions of limited scope due to a lack of complexity in the problem instances. As GNN models are deployed to more mission-critical applications, we are in dire need for a common evaluation protocol of explainability methods of GNNs. In this paper, we propose, to our best knowledge, the first systematic evaluation framework for GNN explainability, considering explainability on three different "user needs". We propose a unique metric that combines the fidelity measures and classifies explanations based on their quality of being sufficient or necessary. We scope ourselves to node classification tasks and compare the most representative techniques in the field of input-level explainability for GNNs. For the inadequate but widely used synthetic benchmarks, surprisingly shallow techniques such as personalized PageRank have the best performance for a minimum computation time. But when the graph structure is more complex and nodes have meaningful features, gradient-based methods are the best according to our evaluation criteria. However, none dominates the others on all evaluation dimensions and there is always a trade-off. We further apply our evaluation protocol in a case study for frauds explanation on eBay transaction graphs to reflect the production environment. Link » Kenza Amara · Rex Ying · Zitao Zhang · Zhihao Han · Yinan Shan · Ulrik Brandes · Sebastian Schemm 🔗 - Learning Heterogeneous Interaction Strengths by Trajectory Prediction with Graph Neural Network (Poster)  link » Dynamical systems with interacting agents are universal in nature, commonly modeled by a graph of relationships between their constituents. Recently, various works have been presented to tackle the problem of inferring those relationships from the system trajectories via deep neural networks, but most of the studies assume binary or discrete types of interactions for simplicity. In the real world, the interaction kernels often involve continuous interaction strengths, which cannot be accurately approximated by discrete relations. In this work, we propose the relational attentive inference network (RAIN) to infer continuously weighted interaction graphs without any ground-truth interaction strengths. Our model employs a novel pairwise attention (PA) mechanism to refine the trajectory representations and a graph transformer to extract heterogeneous interaction weights for each pair of agents. We show that our RAIN model with the PA mechanism accurately infers continuous interaction strengths for simulated physical systems in an unsupervised manner. Further, RAIN with PA successfully predicts trajectories from motion capture data with an interpretable interaction graph, demonstrating the virtue of modeling unknown dynamics with continuous weights. Link » Seungwoong Ha · Hawoong Jeong 🔗 - A Graph Is More Than Its Nodes: Towards Structured Uncertainty-Aware Learning on Graphs (Poster)  link » Current graph neural networks (GNNs) that tackle node classification on graphs tend to only focus on nodewise scores and are solely evaluated by nodewise metrics. This limits uncertainty estimation on graphs since nodewise marginals do not fully characterize the joint distribution given the graph structure. In this work, we propose novel edgewise metrics, namely the edgewise expected calibration error (ECE) and the agree/disagree ECEs, which provide criteria for uncertainty estimation on graphs beyond the nodewise setting. Our experiments demonstrate that the proposed edgewise metrics can complement the nodewise results and yield additional insights. Moreover, we show that GNN models which consider the structured prediction problem on graphs tend to have better uncertainty estimations, which illustrates the benefit of going beyond the nodewise setting. Link » Hans Hao-Hsun Hsu · Yuesong Shen · Daniel Cremers 🔗 - Expectation Complete Graph Representations using Graph Homomorphisms (Poster)  link » We propose and study a practical graph embedding that in expectation is able to distinguish all non-isomorphic graphs and can be computed in polynomial time. The embedding is based on Lovász' characterization of graph isomorphism through an infinite dimensional vector of homomorphism counts. Recent work has studied the expressiveness of graph embeddings by comparing their ability to distinguish graphs to that of the Weisfeiler-Leman hierarchy. While previous methods have either limited expressiveness or are computationally impractical, we devise efficient sampling-based alternatives that are maximally expressive in expectation. We empirically evaluate our proposed embeddings and show competitive results on several benchmark graph learning tasks. Link » Maximilian Thiessen · Pascal Welke · Thomas Gärtner 🔗 - antGLasso: An Efficient Tensor Graphical Lasso Algorithm (Poster)  link » The class of bigraphical lasso algorithms (and, more broadly, 'tensor'-graphical lasso algorithms) has been used to estimate dependency structures within matrix and tensor data. However, all current methods to do so take prohibitively long on modestly sized datasets. We present a novel tensor-graphical lasso algorithm that directly estimates the dependency structure, unlike its iterative predecessors. This provides a speedup of multiple orders of magnitude, allowing this class of algorithms to be used on large, real-world datasets. Link » Bailey Andrew · David Westhead · Luisa Cutillo 🔗 - Certified Graph Unlearning (Poster)  link » Graph-structured data is ubiquitous in practice and often processed using graph neural networks (GNNs). With the adoption of recent laws ensuring the right to be forgotten'', the problem of graph data removal has become of significant importance. To address the problem, we introduce the first known framework for \emph{certified graph unlearning} of GNNs. In contrast to standard machine unlearning, new analytical and heuristic unlearning challenges arise when dealing with complex graph data. First, three different types of unlearning requests need to be considered, including node feature, edge and node unlearning. Second, to establish provable performance guarantees, one needs to address challenges associated with feature mixing during propagation. The underlying analysis is illustrated on the example of simple graph convolutions (SGC) and their generalized PageRank (GPR) extensions, thereby laying the theoretical foundation for certified unlearning of GNNs. Our empirical studies on six benchmark datasets demonstrate excellent performance-complexity trade-offs when compared to complete retraining methods and approaches that do not leverage graph information. For example, when unlearning $20\%$ of the nodes on the Cora dataset, our approach suffers only a $0.1\%$ loss in test accuracy while offering a $4$-fold speed-up compared to complete retraining. Our scheme also outperforms unlearning methods that do not leverage graph information with a $12\%$ increase in test accuracy for comparable time complexity. Link » Eli Chien · Chao Pan · Olgica Milenkovic 🔗 - New Frontiers in Graph Autoencoders: Joint Community Detection and Link Prediction (Poster)  link » Graph autoencoders (GAE) and variational graph autoencoders (VGAE) emerged as powerful methods for link prediction (LP). Their performances are less impressive on community detection (CD), where they are often outperformed by simpler alternatives such as the Louvain method. It is still unclear to what extent one can improve CD with GAE and VGAE, especially in the absence of node features. It is moreover uncertain whether one could do so while simultaneously preserving good performances on LP in a multi-task setting. In this workshop paper, summarizing results from our journal publication, we show that jointly addressing these two tasks with high accuracy is possible. For this purpose, we introduce a community-preserving message passing scheme, doping our GAE and VGAE encoders by considering both the initial graph and Louvain-based prior communities when computing embedding spaces. Inspired by modularity-based clustering, we further propose novel training and optimization strategies specifically designed for joint LP and CD. We demonstrate the empirical effectiveness of our approach, referred to as Modularity-Aware GAE and VGAE, on various real world graphs. Link » Guillaume SALHA · Johannes Lutzeyer · George Dasoulas · Romain Hennequin · Michalis Vazirgiannis 🔗 - A Simple Hypergraph Kernel Convolution based on Discounted Markov Diffusion Process (Poster)  link » Kernels on discrete structures evaluate pairwise similarities between objects which capture semantics and inherent topology information. Existing kernels on discrete structures are only developed by topology information(such as adjacency matrix of graphs), without considering original attributes of objects. This paper proposes a two-phase paradigm to aggregate comprehensive information on discrete structures leading to a Discount Markov Diffusion Learnable Kernel (DMDLK). Specifically, based on the underlying projection of DMDLK, we design a Simple Hypergraph Kernel Convolution (SHKC) for hidden representation of vertices. SHKC can adjust diffusion steps rather than stacking convolution layers to aggregate information from long-range neighborhoods which prevents over-smoothing issues of existing hypergraph convolutions. Moreover, we utilize the uniform stability bound theorem in transductive learning to analyze critical factors for the effectiveness and generalization ability of SHKC from a theoretical perspective. The experimental results on several benchmark datasets for node classification tasks verified the superior performance of SHKC over state-of-the-art methods. Link » Fuyang Li · Jiying Zhang · Xi Xiao · bin zhang · Dijun Luo 🔗 - GraphCG: Unsupervised Discovery of Steerable Factors in Graphs (Poster)  link » Deep generative models have been extensively explored recently, especially for the graph data such as molecular graphs and point clouds. Yet, much less investigation has been carried out on understanding the learned latent space of deep graph generative models. Such understandings can open up a unified perspective and provide guidelines for essential tasks like controllable generation. In this paper, we first examine the representation space of the recent deep generative model trained for graph data, observing that the learned representation space is not perfectly disentangled. Based on this observation, we then propose an unsupervised method called GraphCG, which is model-agnostic and task-agnostic for discovering steerable factors in graph data. Specifically, GraphCG learns the semantic-rich directions via maximizing the corresponding mutual information, where the edited graph along the same direction will possess certain steerable factors. We conduct experiments on two types of graph data, molecular graphs and point clouds. Both the quantitative and qualitative results show the effectiveness of GraphCG for discovering steerable factors. The code will be public in the near future. Link » Shengchao Liu · Chengpeng Wang · Weili Nie · Hanchen Wang · Jiarui Lu · Bolei Zhou · Jian Tang 🔗 - Differentially Private Graph Learning via Sensitivity-Bounded Personalized PageRank (Poster)  link » Personalized PageRank (PPR) is a fundamental tool in unsupervised learning of graph representations such as node ranking, labeling, and graph embedding. However, while data privacy is one of the most important recent concerns, existing PPR algorithms are not designed to protect user privacy. PPR is highly sensitive to the input graph edges: the difference of only one edge may cause a big change in the PPR vector, potentially leaking private user data.In this work, we propose an algorithm which outputs an approximate PPR and has provably bounded sensitivity to input edges. In addition, we prove that our algorithm achieves similar accuracy to non-private algorithms when the input graph has large degrees. Our sensitivity-bounded PPR directly implies private algorithms for several tools of graph learning, such as, differentially private (DP) PPR ranking, DP node classification, and DP node embedding. To complement our theoretical analysis, we also empirically verify the practical performances of our algorithms. Link » Alessandro Epasto · Vahab Mirrokni · Bryan Perozzi · Anton Tsitsulin · Peilin Zhong 🔗 - A deep learning approach to recover conditional independence graphs (Poster)  link » Probabilistic Graphical Models are generative models of complex systems. They rely on conditional independence assumptions between variables to learn sparse representations which can be visualized in a form of a graph. Such models are used for domain exploration and structure discovery in poorly understood domains. This work introduces a novel technique to perform sparse graph recovery by optimizing deep unrolled networks. Assuming that the input data $X\in\mathbb{R}^{M\times D}$ comes from an underlying multivariate Gaussian distribution, we apply a deep model on $X$ that outputs the precision matrix $\Theta$. Then, the partial correlation matrix $\rho$' is calculated which can also be interpreted as the conditional independence graph. Our model, \texttt{uGLAD}, builds upon and extends the state-of-the-art model \texttt{GLAD} to the unsupervised setting. The key benefits of our model are (1) \texttt{uGLAD} automatically optimizes sparsity-related regularization parameters leading to better performance than existing algorithms. (2) We introduce multi-task learning based consensus' strategy for robust handling of missing data in an unsupervised setting. We evaluate performance on synthetic Gaussian, non-Gaussian data generated from Gene Regulatory Networks, and present a case study in anaerobic digestion. Link » Harsh Shrivastava · Urszula Chajewska · Robin Abraham · Xinshi Chen 🔗 - On the Unreasonable Effectiveness of Feature Propagation in Learning on Graphs with Missing Node Features (Poster)  link » While Graph Neural Networks (GNNs) have recently become the de facto standard for modeling relational data, they impose a strong assumption on the availability of the node or edge features of the graph. In many real-world applications, however, features are only partially available; for example, in social networks, age and gender are available only for a small subset of users. We present a general approach for handling missing features in graph machine learning applications that is based on minimization of the Dirichlet energy and leads to a diffusion-type differential equation on the graph. The discretization of this equation produces a simple, fast and scalable algorithm which we call Feature Propagation. We experimentally show that the proposed approach outperforms previous methods on seven common node-classification benchmarks and can withstand surprisingly high rates of missing features: on average we observe only around 4% relative accuracy drop when 99% of the features are missing. Moreover, it takes only 10 seconds to run on a graph with ~2.5M nodes and ~23M edges on a single GPU. Link » Emanuele Rossi · Henry Kenlay · Maria Gorinova · Benjamin Chamberlain · Xiaowen Dong · Michael Bronstein 🔗 - Empowering Language Models with Knowledge Graph Reasoning for Question Answering (Poster)  link » Answering open-domain questions requires world knowledge about in-context entities. As pre-trained Language Models (LMs) lack the power to store required knowledge, external knowledge sources, such as knowledge graphs, are often used to augment LMs. In this work, we propose knOwledge REasOning empowered Language Model (OREO-LM), which consists of a novel Knowledge Interaction Layer that can be flexibly plugged into existing Transformer-based LMs to interact with a differentiable Knowledge Graph Reasoning module collaboratively. In this way, LM guides KG to walk towards the desired answer, while the retrieved knowledge improves LM.By adopting OREO-LM to RoBERTa and T5, we show significant performance gain, achieving state-of-art results in the Closed-Book setting. The performance enhancement is mainly from the KG reasoning's capacity to infer missing relational facts. In addition, OREO-LM provides reasoning paths as rationales to interpret the model's decision. Link » Ziniu Hu · Yichong Xu · Wenhao Yu · Shuohang Wang · Ziyi Yang · Chenguang Zhu · Kai-Wei Chang · Yizhou Sun 🔗 - COIN: Co-Cluster Infomax for Bipartite Graphs (Poster)  link » Graph self-supervised learning has attracted plenty of attention in recent years. However, most existing methods are designed for homogeneous graphs yet not tailored for bipartite graphs, and their objectives could induce cluster-level errors since they only consider instance-wise topological information. In this paper, we introduce a novel co-cluster infomax (COIN) framework to capture the cluster-level information by maximizing the mutual information of co-clusters. Different from previous infomax methods which estimate mutual information, COIN directly calculates mutual information. Besides, COIN is an end-to-end method which can be trained jointly with other objectives. Furthermore, we theoretically prove that COIN could effectively increase the mutual information of node embeddings and it is upper-bounded by the prior distributions of nodes. Experimental results show that COIN outperforms state-of-the-art methods on various downstream tasks. Link » Baoyu Jing · Yuchen Yan · Yada Zhu · Hanghang Tong 🔗

#### Author Information

##### Jiaxuan You (Stanford University)

I received my Ph.D. and M.S. degrees from Department of Computer Science, Stanford University, advised by Prof. Jure Leskovec. My research aims at developing data-driven methods to study our interconnected world. I investigate scientific and industrial problems through the lens of graph/relational data, and develop AI/ML solutions for these problems.