Chapter 8: Building a GDS Pipeline for Node Classification Model Training

Graph Data Science with Neo4j

By: Scifo

Overview of this book

Neo4j, together with its Graph Data Science (GDS) library, is a complete solution for storing, querying, and analyzing graph data. As graph databases become more popular among developers, data scientists are increasingly likely to encounter them in their careers, which makes working with graph algorithms to extract context information and improve model prediction performance an indispensable skill. This practical guide to Neo4j and the GDS library lets data scientists working with Python put their knowledge to work: it offers step-by-step explanations of essential concepts and practical instructions for implementing data science techniques on graph data using Neo4j version 5 and its associated libraries. You’ll start by querying Neo4j with Cypher and learn how to characterize graph datasets. As you get the hang of running graph algorithms on graph data stored in Neo4j, you’ll explore the newer, more advanced capabilities of the GDS library that enable you to make predictions and write data science pipelines. Using the newly released GDS Python client, you’ll be able to integrate graph algorithms into your ML pipeline. By the end of this book, you’ll be able to take advantage of the relationships in your dataset to improve your current model and make other, more elaborate types of predictions.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Conventions used

Get in touch

Share Your Thoughts

Download a free PDF copy of this book

Part 1 – Creating Graph Data in Neo4j

Chapter 1: Introducing and Installing Neo4j

Technical requirements

What is a graph database?

Finding or creating a graph database

Neo4j in the graph databases landscape

Setting up Neo4j

Inserting data into Neo4j with Cypher, the Neo4j query language

Extracting data from Neo4j with Cypher pattern matching

Summary

Further reading

Exercises

Chapter 2: Importing Data into Neo4j to Build a Knowledge Graph

Technical requirements

Importing CSV data into Neo4j with Cypher

Introducing the APOC library to deal with JSON data

Discovering the Wikidata public knowledge graph

Enriching our graph with Wikidata information

Dealing with spatial data in Neo4j

Importing data in the cloud

Summary

Further reading

Exercises

Part 2 – Exploring and Characterizing Graph Data with Neo4j

Chapter 3: Characterizing a Graph Dataset

Technical requirements

Characterizing a graph from its node and edge properties

Computing the graph degree distribution

Installing and using the Neo4j Python driver

Learning about other characterizing metrics

Summary

Further reading

Exercises

Chapter 4: Using Graph Algorithms to Characterize a Graph Dataset

Technical requirements

Digging into the Neo4j GDS library

Projecting a graph for use by GDS

Computing a node’s degree with GDS

Understanding a graph’s structure by looking for communities

Summary

Further reading

Chapter 5: Visualizing Graph Data

Technical requirements

The complexity of graph data visualization

Visualizing a small graph with networkx and matplotlib

Discovering the Neo4j Bloom graph application

Visualizing large graphs with Gephi

Summary

Further reading

Exercises

Part 3 – Making Predictions on a Graph

Chapter 6: Building a Machine Learning Model with Graph Features

Technical requirements

Introducing the GDS Python client

Running GDS algorithms from Python and extracting data in a dataframe

Using features from graph algorithms in a scikit-learn pipeline

Summary

Further reading

Exercise

Chapter 7: Automatically Extracting Features with Graph Embeddings for Machine Learning

Technical requirements

Introducing graph embedding algorithms

Using a transductive graph embedding algorithm

Training an inductive embedding algorithm

Computing new node representations

Summary

Further reading

Exercises

Chapter 8: Building a GDS Pipeline for Node Classification Model Training

Technical requirements

The GDS pipelines

Building and training a pipeline

Making predictions

Using embedding features

Summary

Further reading

Exercise

Chapter 9: Predicting Future Edges

Technical requirements

Introducing the LP problem

LP features

Building an LP pipeline with the GDS

Summary

Further reading

Chapter 10: Writing Your Custom Graph Algorithms with the Pregel API in Java

Technical requirements

Introducing the Pregel API

Implementing the PageRank algorithm

Testing our code

Using our algorithm from Cypher

Summary

Further reading

Exercises

Index

Why subscribe?

Other Books You May Enjoy

Packt is searching for authors like you

Share Your Thoughts

Download a free PDF copy of this book
