Hands-On Graph Neural Networks Using Python

By: Maxime Labonne

Overview of this book

Graph neural networks are a highly effective tool for analyzing data that can be represented as a graph, such as social networks, chemical compounds, or transportation networks. The past few years have seen an explosion in the use of graph neural networks, with applications ranging from natural language processing and computer vision to recommendation systems and drug discovery. Hands-On Graph Neural Networks Using Python begins with the fundamentals of graph theory and shows you how to create graph datasets from tabular data. As you advance, you’ll explore major graph neural network architectures and learn essential concepts such as graph convolution, self-attention, link prediction, and heterogeneous graphs. Finally, the book proposes applications to solve real-life problems, enabling you to build a professional portfolio. The code is readily available online and can be easily adapted to other datasets and apps. By the end of this book, you’ll have learned to create graph datasets, implement graph neural networks using Python and PyTorch Geometric, and build and train models for node and graph classification, link prediction, and much more.
Table of Contents (25 chapters)
Part 1: Introduction to Graph Learning
Part 2: Fundamentals
Part 3: Advanced Techniques
Part 4: Applications
Chapter 18: Unlocking the Potential of Graph Neural Networks for Real-World Applications

Introducing graph datasets

The graph datasets we’re going to use in this chapter are richer than Zachary’s Karate Club: they have more nodes, more edges, and include node features. In this section, we will introduce them to build a good understanding of these graphs and of how to process them with PyTorch Geometric. Here are the two datasets we will use:

  • The Cora dataset
  • The Facebook Page-Page dataset

Let’s start with the smaller one: the popular Cora dataset.

The Cora dataset

Introduced by Sen et al. in 2008 [1], Cora (no license) is the most popular dataset for node classification in the scientific literature. It represents a network of 2,708 publications, where each connection represents a reference from one publication to another. Each publication is described by a binary vector over a vocabulary of 1,433 unique words, where 0 and 1 indicate the absence or presence of the corresponding word, respectively. This representation is also called a binary bag of words in natural language processing...
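
To make the binary bag-of-words idea concrete, here is a minimal sketch in plain Python. It is not how Cora itself was built: the three toy "publications" and the helper name `binary_bag_of_words` are hypothetical, and tokenization is assumed to be a simple lowercase split on whitespace. Each document becomes a vector with one entry per vocabulary word, set to 1 if the word appears and 0 otherwise.

```python
def binary_bag_of_words(documents):
    """Return (vocabulary, vectors); each vector marks word presence with 0/1.

    Hypothetical helper for illustration only; tokenization is a plain
    lowercase whitespace split, which is an assumption of this sketch.
    """
    # Sorted list of the unique words found across all documents.
    vocabulary = sorted({word for doc in documents for word in doc.lower().split()})
    vectors = []
    for doc in documents:
        present = set(doc.lower().split())
        # 1 if the vocabulary word occurs in this document, 0 otherwise.
        vectors.append([1 if word in present else 0 for word in vocabulary])
    return vocabulary, vectors


# Toy stand-ins for publications (made up for this example).
papers = [
    "graph neural networks",
    "neural networks for text",
    "graph theory",
]

vocabulary, features = binary_bag_of_words(papers)
print(vocabulary)
for row in features:
    print(row)
```

In this toy case the vocabulary has six words, so every publication is described by a six-dimensional binary vector. In Cora, the same kind of vector has 1,433 entries, one per unique word.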