Tutorial Sessions

Learning on Graphs Conference, 2022

Day 1 – December 10, 2022

Complex Reasoning Over Relational Databases

Organizers: Hongyu Ren, Hanjun Dai, Jiani Huang, Ziyang Li, and Jure Leskovec

Date: December 10, 2022

Time slot: 1A

Length: 90 minutes

Abstract: Combining reasoning with deep learning techniques has received increasing attention in the community. Among recent works, graph-structured relational databases serve as a fundamental component in many reasoning tasks. However, designing effective neural methods for reasoning tasks can be challenging, as it typically involves two problems: learning representations and reasoning over them. One motivating example is understanding image content through scene graph representations. The challenges involved in the two stages are 1) learning the representation for objects to obtain scene graphs in a weakly supervised manner; and 2) handling the noisy links when executing symbolic queries on scene graphs. These two stages are complementary but also coupled to each other in a reasoning task. In this tutorial, we will cover reasoning over relational databases in these two stages through 1) learning representations with symbolic reasoning and 2) learning to reason over symbolic queries. For each of the two, we will present the corresponding preliminaries and recent research advances, and provide hands-on experience with the recently open-sourced toolkits Scallop and Smore, respectively. The main goal of this tutorial is to introduce the background and recent work on graph reasoning, provide demos of recent toolkits, and cover the challenges and possible future research directions.

Website: https://snap.stanford.edu/logtutorial/

Setup requirements: Please set up these two tools using the provided instructions: Scallop and Smore.

Graph Neural Networks in TensorFlow: A Practical Guide

Organizers: Bryan Perozzi, Sami Abu-el-Haija, Arno Eigenwillig, and Brandon Mayer

Date: December 10, 2022

Time slot: 2A

Length: 90 minutes

Abstract: Graphs are general data structures that can represent information from a variety of domains (social, biomedical, online transactions, and many more). Graph Neural Networks (GNNs) are quickly becoming the de facto machine learning models for learning from graph data and thereby inferring missing information, such as predicting labels of nodes or imputing missing edges. The main goal of this tutorial is to help practitioners and researchers implement GNNs in a TensorFlow setting. Specifically, the tutorial will be mostly hands-on and will walk the audience through the process of running existing GNNs on heterogeneous graph data, followed by a tour of how to implement new GNN models. The hands-on portion of the tutorial will be based on TF-GNN, a new framework that we open-sourced.

Setup requirements: None.

Scaling GNNs in Production: A Tale of Challenges and Opportunities

Organizers: Da Zheng, Vassilis N. Ioannidis, and Soji Adeshina

Date: December 10, 2022

Time slot: 1B

Length: 90 minutes

Abstract: Graph Neural Networks (GNNs) have seen a lot of academic interest in recent years and have shown a lot of promise for many real-world applications, from fraud and abuse detection to recommendations. Yet industry-wide adoption of GNN techniques for these problems has been lagging behind. As such, there is a strong need for tools and frameworks that help researchers develop GNNs for large-scale graph machine learning problems and help machine learning practitioners deploy these models for production use cases. The relatively slow adoption of GNNs in industry is a result of the unique set of challenges that need to be solved to scale GNNs for industrial applications. In this tutorial, we detail these challenges, including i) scaling GNNs to giant graphs, including distributed training on billion-node graphs; ii) scaling GNNs with rich and heterogeneous node-level features, including joint training for GNNs and large language models (LLMs); and iii) scaling GNNs within a business-driven machine learning (ML) workflow for real-time inference and batch predictions with graph databases. We discuss how we tackle these challenges at Amazon using frameworks like DGL, Dist-DGL and Neptune ML that take away the undifferentiated heavy lifting necessary for productionizing GNNs.

Setup requirements: None.

Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis

Organizers: Torsten Hoefler and Maciej Besta

Date: December 10, 2022

Time slot: 2B

Length: 90 minutes

Abstract: Graph neural networks (GNNs) are among the most powerful tools in deep learning. Accelerating and scaling GNN computations to much larger graph and model sizes are critical to advance the field. For example, while the largest graph covered in the Open Graph Benchmark’s Large-Scale challenge has fewer than 2 billion edges, modern graphs can have more than tens of trillions of edges. However, both inference and training of GNNs are complex, and they uniquely combine the features of irregular graph processing with dense and regular computations. Thus, it is very challenging to execute and scale GNNs efficiently on modern massively parallel architectures. To alleviate this, we first design a taxonomy of parallelism in GNNs, considering data, model, and pipeline parallelism. We use this taxonomy to investigate the amount of parallelism in numerous GNN models, GNN-driven machine learning tasks, software frameworks, or hardware accelerators. We use the work-depth model, and we also assess communication/synchronization. We specifically focus on the sparsity/density of the associated tensors to understand how to effectively apply techniques such as vectorization. We also formally analyze GNN pipelining, and we generalize the established Message-Passing class of GNNs to cover arbitrary pipeline depths, facilitating future optimizations. Finally, we investigate different forms of asynchronicity, navigating the path for future asynchronous parallel GNN pipelines. To conclude, we synthesize a set of insights that help to maximize GNN performance, and a comprehensive list of challenges/opportunities for further research into efficient GNN computations. Our work will help to advance the design of future GNNs.

Setup requirements: None.

Neural Algorithmic Reasoning

Organizers: Petar Velickovic, Andreea Deac, and Andrew Dudzik

Date: December 10, 2022

Time slot: 1C

Length: 180 minutes

Abstract: Neural networks that are able to reliably execute algorithmic computation may hold transformative potential for both machine learning and theoretical computer science. On the one hand, they could enable the kind of extrapolative generalisation scarcely seen with deep learning models. On the other, they may allow for running classical algorithms on inputs previously considered inaccessible to them. Both of these promises are shepherded by the neural algorithmic reasoning blueprint, which was recently proposed in a position paper by Petar Velickovic and Charles Blundell. On paper, this is a remarkably elegant pipeline for reasoning on natural inputs which carefully leverages the tried-and-tested power of deep neural networks as feature extractors. In practice, how far did we actually take it? In this tutorial, we aim to provide the foundations needed to answer three key questions of neural algorithmic reasoning: how to develop neural networks that execute algorithmic computation, how to deploy such neural networks in real-world problems, and how to deepen their theoretical links to classical algorithms. Our tutorial will be presented from the ground up, in a way that is accessible to anyone with a basic computer science background. Hands-on coding segments will also be provided, showing how attendees can directly develop their ideas in graph representation learning on relevant algorithmic reasoning datasets (such as CLRS), and then deploy them in downstream agents (e.g., in reinforcement learning).

Setup requirements: Please ensure that you are able to run a Google Colab, ideally with a GPU instance.

Day 2 – December 11, 2022

Self-Supervised Learning on Graphs and More Complex Structures

Organizers: Balaraman Ravindran, Anasua Mitra, and Snehil Sanyal

Date: December 11, 2022

Time slot: 1A

Length: 180 minutes

Abstract: Self-Supervised Learning (SSL), although popular in domains like computer vision and natural language processing, is less explored for graphs. There has been a surge in learning self-supervised network representations on graph data due to its advantages: learning from abundantly available pseudo-supervised data, less reliance on manual annotations, less costly training, better generalization, and more robustness. Our proposed tutorial, SSLoGCS, focuses on self-supervised learning on various complex graph structures. Complex graphs are characterized by a set of connected nodes that interact in non-trivial ways, and they have diverse applications in real-world scenarios. The tutorial focuses on representing real-world knowledge via ubiquitous graph representations and paves the way for future avenues to learn self-supervised representations on such graphs.

Setup requirements: None.

Graph Rewiring Tutorial: From Theory to Applications in Fairness

Organizers: Adrian Arnaiz-Rodriguez, Francisco Escolano, and Nuria Oliver

Date: December 11, 2022

Time slot: 1B

Length: 180 minutes

Abstract: Graph Neural Networks (GNNs) have been shown to achieve competitive results on graph-related tasks, such as node and graph classification, link prediction, and node and graph clustering, in a variety of domains. Most GNNs use a message passing framework and hence are called MPNNs. Despite their promising results, MPNNs have been reported to suffer from over-smoothing, over-squashing and under-reaching. Graph rewiring and graph pooling have been proposed in the literature as solutions to address these limitations. Many graph rewiring methods rely on edge sampling strategies: first, the edges are assigned new weights according to a relevance function, and then they are re-sampled according to the new weights to retain the most relevant edges (i.e., those with larger weights). Edge relevance might be computed in different ways, including randomly, based on similarity, or based on the edge's curvature. This tutorial provides an overview of the most relevant techniques proposed in the literature for graph rewiring based on diffusion, curvature or spectral concepts. It will explain their relationship and will present the most relevant state-of-the-art techniques and their application to different domains. The tutorial will outline open questions in this field, from both a theoretical and an ethical perspective. The tutorial will end with a panel that will give attendees the opportunity to engage in a discussion with a diverse set of scientists with different technical perspectives, levels of seniority, and institutional and geographic affiliations.
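The two-step edge sampling strategy described in the abstract (weight the edges with a relevance function, then re-sample according to those weights) can be illustrated with a minimal sketch. This is not the organizers' method or any specific published algorithm: the function names `cosine_similarity` and `rewire_by_edge_sampling`, the choice of clipped cosine similarity between node features as the relevance function, and the `num_keep` and `seed` parameters are all assumptions made for illustration.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rewire_by_edge_sampling(edges, features, num_keep, seed=0):
    """Hypothetical similarity-based rewiring by weighted edge sampling.

    edges:    list of (u, v) node pairs
    features: dict mapping node id -> feature vector (list of floats)
    num_keep: number of edges to retain
    """
    rng = random.Random(seed)

    # Step 1: assign each edge a relevance weight. Here (an assumption) the
    # weight is the cosine similarity of the endpoint features, clipped at
    # zero, plus a small constant so every weight stays strictly positive.
    weights = [
        max(cosine_similarity(features[u], features[v]), 0.0) + 1e-9
        for u, v in edges
    ]

    # Step 2: sample edges without replacement, with probability
    # proportional to the weights, so more relevant edges tend to survive.
    remaining = list(range(len(edges)))
    kept = []
    for _ in range(min(num_keep, len(edges))):
        pick = rng.choices(
            remaining, weights=[weights[i] for i in remaining], k=1
        )[0]
        kept.append(edges[pick])
        remaining.remove(pick)
    return kept


# Toy graph: nodes 0 and 1 have similar features, as do nodes 2 and 3.
toy_edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
toy_features = {
    0: [1.0, 0.0],
    1: [0.9, 0.1],
    2: [0.0, 1.0],
    3: [0.1, 0.9],
}
print(rewire_by_edge_sampling(toy_edges, toy_features, num_keep=2))
```

Swapping the relevance function (for example, a curvature-based or uniform random score) changes only step 1; the sampling in step 2 stays the same.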

Website: https://ellisalicante.org/tutorials/GraphRewiring

Setup requirements: None.

Exploring the practical and theoretical landscape of expressive graph neural networks

Organizers: Fabrizio Frasca, Beatrice Bevilacqua, and Haggai Maron

Date: December 11, 2022

Time slot: 1C

Length: 180 minutes

Abstract: In an effort to overcome the expressiveness limitations of Graph Neural Networks (GNNs), a multitude of novel architectures has been recently proposed, aiming to balance expressive power, computational complexity, and domain-specific empirical performance. Several directions and methods are involved in this recent surge, ranging from Graph Theory and Topology to Group Theory and theoretical Computer Science. As a result, researchers who wish to work on this critical topic are exposed to an unsystematic collection of seemingly independent approaches whose relations remain poorly understood. In an effort to address this issue, the proposed tutorial reviews the most prominent expressive GNNs, categorises them into different families, and draws interesting connections between them. This is accomplished through a series of practical coding sessions and an organic overview of the literature landscape. We aim to convey the importance of studying the expressive power of GNNs and make this field more accessible to our community, especially practitioners and newcomers.

Setup requirements: Please make sure you have access to Google Colab for our hands-on sessions.