Book Image

Hands-On Graph Analytics with Neo4j

By : Estelle Scifo
Book Image

Hands-On Graph Analytics with Neo4j

By: Estelle Scifo

Overview of this book

Neo4j is a graph database that includes plugins to run complex graph algorithms. The book starts with an introduction to the basics of graph analytics, the Cypher query language, and graph architecture components, and helps you to understand why enterprises have started to adopt graph analytics within their organizations. You’ll find out how to implement Neo4j algorithms and techniques and explore various graph analytics methods to reveal complex relationships in your data. You’ll be able to implement graph analytics catering to different domains such as fraud detection, graph-based search, recommendation systems, social networking, and data management. You’ll also learn how to store data in graph databases and extract valuable insights from it. As you become well-versed with the techniques, you’ll discover graph machine learning in order to address simple to complex challenges using Neo4j. You will also understand how to use graph data in a machine learning model in order to make predictions based on your data. Finally, you’ll get to grips with structuring a web application for production using Neo4j. By the end of this book, you’ll not only be able to harness the power of graphs to handle a broad range of problem areas, but you’ll also have learned how to use Neo4j efficiently to identify complex relationships in your data.
Table of Contents (18 chapters)
1
Section 1: Graph Modeling with Neo4j
5
Section 2: Graph Algorithms
10
Section 3: Machine Learning on Graphs
14
Section 4: Neo4j for Production

Measuring the similarity between nodes

There are several techniques used to quantify the similarity between nodes. They can be divided into two categories:

  • Set-based measures: Compare the content of two sets globally. For instance, sets (A, B, C) and (C, D, B) have two common elements.
  • Vector-based measures: Compare vectors element-wise, meaning that the position of each element is important. Euclidean distance is an example of such measures.

Let's go into more detail about these metrics, starting from the set-based similarities.

Set-based similarities

The GDS 1.0 implements two variants of set-based similarities we'll cover here.

Overlapping

The overlapping similarity is a measure of the number of common elements between two sets, relative to the size of the smallest set.

Definition

This measure's mathematical definition is as follows:

O(A, B) = | A ∩ B | / min(|A|, |B|)

A ∩ B is the intersection between sets A and B (common elements) and |A| denotes the...