Poster

What Is Missing For Graph Homophily? Disentangling Graph Homophily For Graph Neural Networks

Yilun Zheng · Sitao Luan · Lihui Chen

East Exhibit Hall A-C #2911
Thu 12 Dec 11 a.m. PST — 2 p.m. PST

Abstract: Graph homophily refers to the phenomenon that connected nodes tend to share similar characteristics. Understanding this concept and its related metrics is crucial for designing effective Graph Neural Networks (GNNs). The most widely used homophily metrics, such as edge or node homophily, quantify this "similarity" as label consistency across the graph topology. These metrics are believed to reflect the performance of GNNs, especially on node-level tasks. However, many recent studies have empirically demonstrated that GNN performance does not always align with homophily metrics, and how homophily influences GNNs remains unclear and controversial. A crucial question arises from this controversy: should we completely discard the conventional definition of graph homophily? In this paper, our answer is no. We find that the original homophily is still useful, but provides only a partial understanding of GNN performance. To give a comprehensive view, we disentangle graph homophily into three aspects: label, structural, and feature homophily, which correspond to the three basic elements of graph data. Our proposed feature homophily disentangles the influence of node features from label and structural homophily. To investigate their synergy, we propose a Contextual Stochastic Block Model with three types of Homophily (CSBM-3H), where the topology and feature generation are controlled by these three types of homophily. Based on CSBM-3H, we propose a composite metric, named Tri-Hom, which combines all three homophily aspects. Our theoretical analysis and simulation results demonstrate that GNN performance is indeed strongly correlated with Tri-Hom. Tri-Hom provides a more comprehensive view of homophily and can explain the inconsistency between conventional homophily metrics and GNN performance. In addition, our experimental results on 31 real-world datasets verify that Tri-Hom aligns with GNN performance significantly better than 17 existing metrics that each focus on a single homophily aspect, confirming the superiority of disentangled homophily.
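
For readers unfamiliar with the conventional metrics the abstract refers to, the following is a minimal sketch of edge and node homophily as they are commonly defined in the literature (label agreement across edges, and per-node label agreement with neighbors). It is not the paper's code and does not implement Tri-Hom or CSBM-3H. The function names, the edge-list representation, and the choice to skip isolated nodes are assumptions made for illustration.

```python
from collections import defaultdict


def edge_homophily(edges, labels):
    """Fraction of undirected edges whose two endpoints share a label."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def node_homophily(edges, labels):
    """Mean over nodes of the fraction of neighbors sharing the node's label.

    Assumption: nodes without neighbors are skipped, since their
    neighbor ratio is undefined.
    """
    neighbors = defaultdict(list)
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    if not neighbors:
        raise ValueError("graph has no edges")
    ratios = [
        sum(labels[n] == labels[u] for n in nbrs) / len(nbrs)
        for u, nbrs in neighbors.items()
    ]
    return sum(ratios) / len(ratios)


# Toy path graph 0-1-2-3-4 with two classes (hypothetical data).
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
labels = [0, 0, 1, 1, 1]

print(edge_homophily(edges, labels))  # 3 of 4 edges join same-label nodes
print(node_homophily(edges, labels))
```

Both functions look only at labels and topology, which is the limitation the abstract highlights: neither takes node features into account.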
