Graph Data Modeling in Python

By: Gary Hutson, Matt Jackson

Overview of this book

Graphs have become increasingly integral to powering the products and services we use in our daily lives, driving social media, online shopping recommendations, and even fraud detection. With this book, you’ll see how a good graph data model can help enhance efficiency and unlock hidden insights through complex network analysis. Graph Data Modeling in Python will guide you through designing, implementing, and harnessing a variety of graph data models using the popular open source Python libraries NetworkX and igraph. Following practical use cases and examples, you’ll find out how to design optimal graph models capable of supporting a wide range of queries and features. Moreover, you’ll seamlessly transition from traditional relational databases and tabular data to the dynamic world of graph data structures that allow powerful, path-based analyses. As well as learning how to manage a persistent graph database using Neo4j, you’ll also get to grips with adapting your network model to evolving data requirements. By the end of this book, you’ll be able to transform tabular data into powerful graph data models. In essence, you’ll build your knowledge from beginner to advanced-level practitioner in no time.
Table of Contents (16 chapters)
Part 1: Getting Started with Graph Data Modeling
Part 2: Making the Graph Transition
Part 3: Storing and Productionizing Graphs
Part 4: Graphing Like a Pro

Optimizing travel with Python and Cypher

With our graph fully loaded into Neo4j, and our methods for querying it with Cypher and Python in place, we are ready to perform some more complex analysis. We will start by using Cypher to answer questions and return the results in Python. Later, we will sample graph data from Neo4j and work with that sample in igraph.

Let’s begin by exploring the structure of our graph and asking some questions of the data to understand it better. The following steps find some relationships in the data:

  1. The first query we run finds the highest population by city and returns the name of the city and its population. ORDER BY orders the results by the population of the matched nodes (n). For those SQL people out there, these commands will look very familiar, and you will find the transition to Cypher much easier than those who...
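As a rough illustration of the kind of query this step describes, here is a minimal sketch rather than the book's own code. The `City` label, the `name` and `population` property names, the descending sort, and the `LIMIT` clause are all assumptions, since the excerpt does not show them. The helper names `top_cities_query`, `run_query`, and `print_top_cities` are hypothetical, and the connection details are placeholders.

```python
# Assumed node label; the excerpt does not show the label used in the graph.
CITY_LABEL = "City"


def top_cities_query(limit=5):
    """Build a Cypher query that returns the most populous cities first.

    The property names `name` and `population` are assumptions.
    """
    return (
        f"MATCH (n:{CITY_LABEL}) "
        "RETURN n.name AS name, n.population AS population "
        "ORDER BY n.population DESC "
        f"LIMIT {int(limit)}"
    )


def run_query(session, query):
    """Run a query on a session-like object and return rows as dicts.

    `session` is expected to behave like a neo4j driver session:
    `run()` yields records, and each record exposes `.data()`.
    """
    return [record.data() for record in session.run(query)]


def print_top_cities(uri, user, password, limit=5):
    """Connect to Neo4j and print the top cities (placeholder credentials)."""
    # Imported lazily so the helpers above work without the driver installed.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            for row in run_query(session, top_cities_query(limit)):
                print(row["name"], row["population"])
    finally:
        driver.close()
```

With a local instance running, a call such as `print_top_cities("bolt://localhost:7687", "neo4j", "password")` would print one line per city. Keeping the query builder separate from the driver call makes the Cypher string easy to inspect or reuse.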