Oral Presentations

Learning on Graphs Conference, 2022


Table of Contents

Not too little, not too much: a theoretical analysis of graph (over)smoothing

Authors: Nicolas Keriven

Date: December 9, 2022

Time slot: 17:00 - 17:20 (GMT)

Abstract: We analyze graph smoothing with \emph{mean aggregation}, where each node successively receives the average of the features of its neighbors. Indeed, it has been observed that Graph Neural Networks (GNNs), which generally follow some variant of Message-Passing (MP) with repeated aggregation, may be subject to the \emph{oversmoothing} phenomenon: by performing too many rounds of MP, the node features tend to converge to a non-informative limit. At the other end of the spectrum, it is intuitively obvious that \emph{some} MP rounds are necessary, but existing analyses do not exhibit both phenomena at once. In this paper, we consider simplified linear GNNs, and rigorously analyze two examples of random graphs for which a finite number of mean aggregation steps provably improves the learning performance, before oversmoothing kicks in. We identify two key phenomena: graph smoothing shrinks non-principal directions in the data faster than principal ones, which is useful for regression, and shrinks nodes within communities faster than they collapse together, which improves classification.
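
For intuition, a minimal sketch of the mean-aggregation operator described in the abstract (an illustration, not the paper's code), assuming an unweighted adjacency matrix A and a node-feature matrix X:

```python
import numpy as np

def mean_aggregation(A, X, steps):
    """Apply `steps` rounds of mean aggregation: each node's features are
    replaced by the average of its neighbors' features, X <- D^{-1} A X."""
    deg = A.sum(axis=1, keepdims=True)
    P = A / np.maximum(deg, 1.0)                 # row-stochastic transition matrix
    for _ in range(steps):
        X = P @ X
    return X

# Toy example: a 4-node graph (triangle plus a pendant node) with scalar features.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.array([[1.0], [2.0], [3.0], [4.0]])
print(mean_aggregation(A, X, steps=2))           # a few steps smooth the signal
print(mean_aggregation(A, X, steps=50))          # many steps: features collapse toward a common value
```

A handful of steps smooths the features; many steps drive all rows of X toward the same value, which is the non-informative limit of oversmoothing.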

OpenReview: https://openreview.net/forum?id=KQNsbAmJEug


Neighborhood-aware Scalable Temporal Network Representation Learning

Authors: Yuhong Luo, Pan Li

Date: December 9, 2022

Time slot: 17:20 - 17:40 (GMT)

Abstract: Temporal networks have been widely used to model real-world complex systems such as financial systems and e-commerce systems. In a temporal network, the joint neighborhood of a set of nodes often provides crucial structural information useful for predicting whether they may interact at a certain time. However, recent representation learning methods for temporal networks often fail to extract such information or depend on online construction of structural features, which is time-consuming. To address this issue, this work proposes the Neighborhood-Aware Temporal network model (NAT). For each node in the network, NAT abandons the commonly used single-vector node representation and instead adopts a novel dictionary-type neighborhood representation. Such a dictionary representation records a downsampled set of the neighboring nodes as keys, and allows fast construction of structural features for a joint neighborhood of multiple nodes. We also design a dedicated data structure termed N-cache to support parallel access and update of those dictionary representations on GPUs. NAT is evaluated over seven real-world large-scale temporal networks. NAT not only outperforms all cutting-edge baselines by 1.2% and 4.2% on average in transductive and inductive link prediction accuracy, respectively, but also remains scalable, achieving a speed-up of 4.1-76.7x over baselines that adopt joint structural features and a speed-up of 1.6-4.0x over baselines that cannot adopt those features. Code: https://github.com/Graph-COM/Neighborhood-Aware-Temporal-Network.
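
A toy sketch of the dictionary-type neighborhood idea (hypothetical names and a naive eviction policy, not the authors' N-cache data structure or its GPU implementation):

```python
import random
from collections import defaultdict

# Each node keeps a dictionary whose keys are a downsampled set of recent
# temporal neighbors and whose values are small state vectors; joint structural
# features for a node pair (u, v) can then be built from dictionary overlap.
CACHE_SIZE = 8
neighborhood = defaultdict(dict)                   # node -> {neighbor_key: state_vector}

def update(u, v, state_uv):
    """Record an interaction (u, v); evict a random key when the cache is full."""
    if v not in neighborhood[u] and len(neighborhood[u]) >= CACHE_SIZE:
        neighborhood[u].pop(random.choice(list(neighborhood[u])))
    neighborhood[u][v] = state_uv

def joint_feature(u, v):
    """Count common downsampled neighbors -- a cheap joint structural feature."""
    return len(neighborhood[u].keys() & neighborhood[v].keys())
```

The actual N-cache supports parallel access and updates on GPUs; the sketch only illustrates why dictionary overlaps make joint structural features cheap to construct.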

OpenReview: https://openreview.net/forum?id=EPUtNe7a9ta


A Generalist Neural Algorithmic Learner

Authors: Borja Ibarz, Vitaly Kurin, George Papamakarios, Kyriacos Nikiforou, Mehdi Bennani, Róbert Csordás, Andrew Joseph Dudzik, Matko Bošnjak, Alex Vitvitskyi, Yulia Rubanova, Andreea Deac, Beatrice Bevilacqua, Yaroslav Ganin, Charles Blundell, Petar Veličković

Date: December 9, 2022

Time slot: 17:40 - 18:00 (GMT)

Abstract: The cornerstone of neural algorithmic reasoning is the ability to solve algorithmic tasks, especially in a way that generalises out of distribution. While recent years have seen a surge in methodological improvements in this area, they mostly focused on building specialist models. Specialist models are capable of learning to neurally execute either only one algorithm or a collection of algorithms with identical control-flow backbone. Here, instead, we focus on constructing a generalist neural algorithmic learner—a single graph neural network processor capable of learning to execute a wide range of algorithms, such as sorting, searching, dynamic programming, path-finding and geometry. We leverage the CLRS benchmark to empirically show that, much like recent successes in the domain of perception, generalist algorithmic learners can be built by “incorporating” knowledge. That is, it is possible to effectively learn algorithms in a multi-task manner, so long as we can learn to execute them well in a single-task regime. Motivated by this, we present a series of improvements to the input representation, training regime and processor architecture over CLRS, improving average single-task performance by over 20% from prior art. We then conduct a thorough ablation of multi-task learners leveraging these improvements. Our results demonstrate a generalist learner that effectively incorporates knowledge captured by specialist models.

OpenReview: https://openreview.net/forum?id=FebadKZf6Gd


GARNET: Reduced-Rank Topology Learning for Robust and Scalable Graph Neural Networks

Authors: Chenhui Deng, Xiuyu Li, Zhuo Feng, Zhiru Zhang

Date: December 10, 2022

Time slot: 16:00 - 16:20 (GMT)

Abstract: Graph neural networks (GNNs) have been increasingly deployed in various applications that involve learning on non-Euclidean data. However, recent studies show that GNNs are vulnerable to graph adversarial attacks. Although there are several defense methods to improve GNN robustness by eliminating adversarial components, they may also impair the underlying clean graph structure that contributes to GNN training. In addition, few of those defense models can scale to large graphs due to their high computational complexity and memory usage. In this paper, we propose GARNET, a scalable spectral method to boost the adversarial robustness of GNN models. GARNET first leverages weighted spectral embedding to construct a base graph, which is not only resistant to adversarial attacks but also contains critical (clean) graph structure for GNN training. Next, GARNET further refines the base graph by pruning additional uncritical edges based on a probabilistic graphical model. GARNET has been evaluated on various datasets, including a large graph with millions of nodes. Our extensive experiment results show that GARNET achieves adversarial accuracy improvement and runtime speedup over state-of-the-art GNN (defense) models by up to 10.23% and 14.7×, respectively.
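
A minimal sketch of the reduced-rank intuition (an illustration under assumptions; GARNET's actual pipeline uses weighted spectral embedding and a probabilistic graphical model rather than this plain eigendecomposition):

```python
import numpy as np

def reduced_rank_base_graph(A, r=16, threshold=0.1):
    """Keep only the top-r spectral components of a (possibly attacked)
    symmetric adjacency matrix and re-sparsify the result, so that
    high-frequency adversarial edges are suppressed."""
    vals, vecs = np.linalg.eigh(A)                    # symmetric adjacency assumed
    idx = np.argsort(np.abs(vals))[::-1][:r]          # top-r eigenpairs by magnitude
    A_low = (vecs[:, idx] * vals[idx]) @ vecs[:, idx].T
    np.fill_diagonal(A_low, 0.0)
    return (A_low > threshold).astype(float)          # sparse base graph
```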

OpenReview: https://openreview.net/forum?id=kvwWjYQtmw


Transductive Linear Probing: A Novel Framework for Few-Shot Node Classification

Authors: Zhen Tan, Song Wang, Kaize Ding, Jundong Li, Huan Liu

Date: December 10, 2022

Time slot: 16:20 - 16:40 (GMT)

Abstract: Few-shot node classification is tasked to provide accurate predictions for nodes from novel classes with only a few representative labeled nodes. This problem has drawn tremendous attention for its relevance to prevailing real-world applications, such as product categorization for newly added commodity categories on an e-commerce platform with scarce records or diagnoses for rare diseases on a patient similarity graph. To tackle such challenging label scarcity issues in the non-Euclidean graph domain, meta-learning has become a successful and predominant paradigm. More recently, inspired by the development of graph self-supervised learning, transferring pretrained node embeddings for few-shot node classification could be a promising alternative to meta-learning, but this direction remains underexplored. In this work, we empirically demonstrate the potential of an alternative framework, \textit{Transductive Linear Probing}, that transfers pretrained node embeddings, which are learned from graph contrastive learning methods. We further extend the setting of few-shot node classification from standard fully supervised to a more realistic self-supervised setting, where meta-learning methods cannot be easily deployed due to the shortage of supervision from training classes. Surprisingly, even without any ground-truth labels, transductive linear probing with self-supervised graph contrastive pretraining can outperform the state-of-the-art fully supervised meta-learning based methods under the same protocol. We hope this work can shed new light on few-shot node classification problems and foster future research on learning from scarcely labeled instances on graphs.
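
A minimal sketch of a linear probe on frozen embeddings (placeholder data; the matrix `Z` stands in for node embeddings produced by a graph contrastive pretraining method, which is not shown here):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
Z = rng.normal(size=(1000, 64))            # placeholder for frozen pretrained node embeddings
y = rng.integers(0, 5, size=1000)          # placeholder labels for 5 novel classes

# Few-shot support set: roughly 3 labeled nodes per class, drawn at random.
support = rng.choice(1000, size=15, replace=False)
probe = LogisticRegression(max_iter=1000).fit(Z[support], y[support])

# Transductive evaluation on the remaining (query) nodes.
accuracy = probe.score(np.delete(Z, support, axis=0), np.delete(y, support))
print(f"query accuracy: {accuracy:.3f}")
```

On random placeholder data this is near chance; with real pretrained embeddings, this probe-and-evaluate loop is the few-shot evaluation the abstract refers to.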

OpenReview: https://openreview.net/forum?id=dK8vOIBENa3


Shortest Path Networks for Graph Property Prediction

Authors: Ralph Abboud, Radoslav Dimitrov, Ismail Ilkan Ceylan

Date: December 10, 2022

Time slot: 16:40 - 17:00 (GMT)

Abstract: Most graph neural network models rely on a particular message passing paradigm, where the idea is to iteratively propagate node representations of a graph to each node in the direct neighborhood. While very prominent, this paradigm leads to information propagation bottlenecks, as information is repeatedly compressed at intermediary node representations, which causes loss of information, making it practically impossible to gather meaningful signals from distant nodes. To address this, we propose shortest path message passing neural networks, where the node representations of a graph are propagated to each node in the shortest path neighborhoods. In this setting, nodes can communicate directly with each other even if they are not neighbors, breaking the information bottleneck and hence leading to more adequately learned representations. Our framework generalizes message passing neural networks, resulting in a class of more expressive models, including some recent state-of-the-art models. We verify the capacity of a basic model of this framework on dedicated synthetic experiments, and on real-world graph classification and regression benchmarks, and obtain state-of-the-art results.
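
A minimal sketch of aggregating over shortest-path neighborhoods on an unweighted graph (an illustration with per-hop mean pooling, not the authors' model):

```python
import numpy as np
from collections import deque

def sp_neighborhoods(adj_list, source, K):
    """BFS on an unweighted graph: bucket nodes by shortest-path distance 1..K."""
    dist = {source: 0}
    frontier = deque([source])
    buckets = {k: [] for k in range(1, K + 1)}
    while frontier:
        u = frontier.popleft()
        if dist[u] == K:
            continue
        for v in adj_list[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                buckets[dist[v]].append(v)
                frontier.append(v)
    return buckets

def sp_message_passing(adj_list, X, K):
    """For each node, concatenate its features with a mean over each distance-k neighborhood."""
    out = []
    for u in range(len(adj_list)):
        buckets = sp_neighborhoods(adj_list, u, K)
        per_hop = [X[b].mean(axis=0) if b else np.zeros(X.shape[1])
                   for b in buckets.values()]
        out.append(np.concatenate([X[u]] + per_hop))
    return np.stack(out)

adj_list = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}      # 4-node path graph
X = np.eye(4)
print(sp_message_passing(adj_list, X, K=2).shape)      # (4, 12): one block per distance
```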

OpenReview: https://openreview.net/forum?id=mWzWvMxuFg1


Taxonomy of Benchmarks in Graph Representation Learning

Authors: Renming Liu, Semih Cantürk, Frederik Wenkel, Sarah McGuire, Xinyi Wang, Anna Little, Leslie O’Bray, Michael Perlmutter, Bastian Rieck, Matthew Hirn, Guy Wolf, Ladislav Rampášek

Date: December 11, 2022

Time slot: 19:00 - 19:20 (GMT)

Abstract: We provide a systematic approach for categorization of graph learning datasets based on their empirical properties. Graph Neural Networks (GNNs) extend the success of neural networks to graph-structured data by accounting for their intrinsic geometry. While extensive research has been done on developing GNN models with superior performance according to a collection of graph representation learning benchmarks, it is currently not well understood what aspects of a given model are probed by them. For example, to what extent do they test the ability of a model to leverage graph structure vs. node features? Here, we develop a principled approach to taxonomize benchmarking datasets according to how much GNN performance changes due to a collection of graph perturbations. Our data-driven analysis provides a deeper understanding of which benchmarking data characteristics are leveraged by GNNs. Consequently, our taxonomy can aid in the selection and development of adequate graph benchmarks, and in better-informed evaluation of future GNN methods. Finally, our approach is designed to be extendable to multiple graph prediction task types and future datasets.

OpenReview: https://openreview.net/forum?id=EM-Z3QFj8n


An Analysis of Virtual Nodes in Graph Neural Networks for Link Prediction

Authors: EunJeong Hwang, Veronika Thost, Shib Sankar Dasgupta, Tengfei Ma

Date: December 11, 2022

Time slot: 19:20 - 19:40 (GMT)

Abstract: It is well known that the graph classification performance of graph neural networks often improves by adding an artificial virtual node to the graphs, which is connected to all graph nodes. Surprisingly, the advantage of using virtual nodes has never been theoretically investigated, and their impact on other problems is still an open research question. In this paper, we adapt the concept of virtual nodes to link prediction, where we usually have much larger, often very sparse or dense, and overall more heterogeneous graphs. In particular, we use multiple virtual nodes per graph and graph-based clustering to determine the connections to the graph nodes. We also provide a detailed theoretical analysis. We conduct experiments over different datasets of the Open Graph Benchmark, analyze the results in detail, and show that virtual nodes may yield rather stable performance increases and sometimes considerably boost performance.
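
A toy sketch of the construction described above, under assumptions (a generic k-means step stands in for whatever graph-based clustering the paper uses, and the node features here are arbitrary):

```python
import numpy as np
import networkx as nx
from sklearn.cluster import KMeans

def add_virtual_nodes(G, node_features, n_virtual=4):
    """Attach one virtual node per cluster, connected to all of that cluster's nodes."""
    nodes = list(G.nodes)
    labels = KMeans(n_clusters=n_virtual, n_init=10).fit_predict(node_features)
    for node, label in zip(nodes, labels):
        G.add_edge(node, f"virtual_{label}")       # add_edge also creates the virtual node
    return G

G = nx.karate_club_graph()
feats = np.array([[G.degree(n)] for n in G.nodes], dtype=float)   # toy 1-d "features"
G = add_virtual_nodes(G, feats, n_virtual=3)
print(G.number_of_nodes(), G.number_of_edges())                   # 34 original + 3 virtual nodes
```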

OpenReview: https://openreview.net/forum?id=dI6KBKNRp7


Graph Learning Indexer: A Contributor-Friendly and Metadata-Rich Platform for Graph Learning Benchmarks

Authors: Jiaqi Ma, Xingjian Zhang, Hezheng Fan, Jin Huang, Tianyue Li, Ting Wei Li, Yiwen Tu, Chenshu Zhu, Qiaozhu Mei

Date: December 11, 2022

Time slot: 19:40 - 20:00 (GMT)

Abstract: Establishing common benchmarks has been a critical driving force behind the success of modern machine learning techniques. As machine learning is being applied in broader domains and tasks, there is a need to establish more, and more diverse, benchmarks that better reflect the reality of the application scenarios. For graph learning, an emerging field of machine learning, the need to establish better benchmarks is particularly urgent. Towards this goal, we introduce Graph Learning Indexer (GLI), a benchmark curation platform for graph learning. In comparison to existing graph learning benchmark libraries, GLI highlights two novel design objectives. First, GLI is designed to incentivize \emph{dataset contributors}. In particular, we incorporate various measures to minimize the effort of contributing and maintaining a dataset, increase the usability of the contributed dataset, and give better credit to the dataset's different contributors. Second, GLI is designed to curate a knowledge base, instead of a collection, of benchmark datasets. For this purpose, we compile multiple sources of meta information about the benchmark datasets in order to better characterize them.

OpenReview: https://openreview.net/forum?id=ZBsxA6_gp3


You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets

Authors: Tianjin Huang, Tianlong Chen, Meng Fang, Vlado Menkovski, Jiaxu Zhao, Lu Yin, Yulong Pei, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy, Shiwei Liu

Date: December 12, 2022

Time slot: 18:00 - 18:20 (GMT)

Abstract: Recent works have impressively demonstrated that there exists a subnetwork in randomly initialized convolutional neural networks (CNNs) that, at initialization, can match the performance of fully trained dense networks, without any optimization of the weights (i.e., untrained networks). However, the presence of such untrained subnetworks in graph neural networks (GNNs) still remains mysterious. In this paper, we carry out the first-of-its-kind exploration of discovering matching untrained GNNs. With sparsity as the core tool, we can find untrained sparse subnetworks at initialization that can match the performance of fully trained dense GNNs. Besides this already encouraging finding of comparable performance, we show that the found untrained subnetworks can substantially mitigate the GNN over-smoothing problem, hence becoming a powerful tool to enable deeper GNNs without bells and whistles. We also observe that such sparse untrained subnetworks have appealing performance in out-of-distribution detection and robustness to input perturbations. We evaluate our method across widely-used GNN architectures on various popular datasets including the Open Graph Benchmark (OGB).
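
A toy sketch of the "untrained ticket" idea (an illustration only, not the authors' subnetwork search): the weights stay at their random initialization and only a binary mask selecting a sparse subnetwork is applied.

```python
import torch

torch.manual_seed(0)
in_dim, out_dim, sparsity = 16, 8, 0.8

weight = torch.randn(out_dim, in_dim)               # frozen random initialization, never trained
scores = torch.rand(out_dim, in_dim)                # stand-in saliency scores for which weights to keep
k = int(weight.numel() * (1 - sparsity))            # number of weights kept
threshold = scores.flatten().topk(k).values.min()   # k-th largest score
mask = (scores >= threshold).float()                # binary mask defining the sparse subnetwork

x = torch.randn(4, in_dim)
y = torch.nn.functional.linear(x, weight * mask)    # forward pass of the masked, untrained layer
print(mask.mean().item(), y.shape)                  # ~0.2 density, output shape (4, 8)
```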

OpenReview: https://openreview.net/forum?id=dF6aEW3_62O


Influence-Based Mini-Batching for Graph Neural Networks

Authors: Johannes Gasteiger, Chendi Qian, Stephan Günnemann

Date: December 12, 2022

Time slot: 18:20 - 18:40 (GMT)

Abstract: Using graph neural networks for large graphs is challenging since there is no clear way of constructing mini-batches. To solve this, previous methods have relied on sampling or graph clustering. While these approaches often lead to good training convergence, they introduce significant overhead due to expensive random data accesses and perform poorly during inference. In this work, we instead focus on model behavior during inference. We theoretically model batch construction via maximizing the influence score of nodes on the outputs. This formulation leads to optimal approximation of the output when we do not have knowledge of the trained model. We call the resulting method influence-based mini-batching (IBMB). IBMB accelerates inference by up to 130x compared to previous methods that reach similar accuracy. Remarkably, with adaptive optimization and the right training schedule, IBMB can also substantially accelerate training, thanks to precomputed batches and consecutive memory accesses. This results in up to 18x faster training per epoch and up to 17x faster convergence per runtime compared to previous methods.
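
A rough sketch of influence-based batch construction (personalized PageRank is used here as a stand-in influence score; this is an assumption for illustration, not necessarily the paper's exact score or selection procedure):

```python
import networkx as nx

def build_batch(G, output_nodes, budget=64, teleport=0.15):
    """Select the auxiliary nodes with the highest stand-in influence on the
    batch's output nodes, then return the induced subgraph as the mini-batch."""
    personalization = {n: 1.0 / len(output_nodes) for n in output_nodes}
    ppr = nx.pagerank(G, alpha=1 - teleport, personalization=personalization)
    top = sorted(ppr, key=ppr.get, reverse=True)[:budget]
    batch_nodes = set(top) | set(output_nodes)         # always keep the output nodes
    return G.subgraph(batch_nodes).copy()

G = nx.karate_club_graph()
batch = build_batch(G, output_nodes=[0, 1, 2], budget=16)
print(batch.number_of_nodes(), batch.number_of_edges())
```

Because the influence scores and batches can be precomputed once, data access at training and inference time becomes consecutive rather than random, which is the source of the speed-ups reported in the abstract.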

OpenReview: https://openreview.net/forum?id=b9g0vxzYa_


The Multi-Orbits Skew Spectrum: Boosting Permutation-Invariant Data Representations

Authors: Armando Bellante, Martin Plávala, Alessandro Luongo

Date: December 12, 2022

Time slot: 18:40 - 19:00 (GMT)

Abstract: We generalize the concept of the skew spectrum of a graph, a group-theoretical permutation-invariant feature mapping. The skew spectrum considers adjacency matrices as functions over the symmetric group and leverages the Fourier transform and group-theoretical tools to extract features that are invariant under the group action. The main shortcoming of this previous result is that the skew spectrum only works for unlabeled graphs. The reason is that these graphs can be represented using matrices whose main diagonal contains zeros, meaning that there is only one set of elements that can permute among themselves (i.e., one orbit). However, the representations of more complex graphs (e.g., labeled graphs, multigraphs, or hypergraphs) have different sets of elements that permute consistently on different orbits. In this work, we generalize the skew spectrum to the multiple-orbit case. Our multi-orbits skew spectrum produces features invariant to such permutations and possibly informative of non-consistent ones. We show that this method can improve the performance of models that learn on graphs, providing comparisons with single-orbit representations and eigenvalues. Moreover, the theory is general enough to handle invariance under the action of any finite group on multiple orbits and has applications beyond the graph domain.

OpenReview: https://openreview.net/forum?id=oqf2GRYTET