Hands-On Graph Analytics with Neo4j

By: Estelle Scifo

Overview of this book

Neo4j is a graph database that includes plugins to run complex graph algorithms. The book starts with an introduction to the basics of graph analytics, the Cypher query language, and graph architecture components, and helps you to understand why enterprises have started to adopt graph analytics within their organizations. You’ll find out how to implement Neo4j algorithms and techniques and explore various graph analytics methods to reveal complex relationships in your data. You’ll be able to implement graph analytics catering to different domains such as fraud detection, graph-based search, recommendation systems, social networking, and data management. You’ll also learn how to store data in graph databases and extract valuable insights from it. As you become well-versed with the techniques, you’ll discover graph machine learning in order to address simple to complex challenges using Neo4j. You will also understand how to use graph data in a machine learning model in order to make predictions based on your data. Finally, you’ll get to grips with structuring a web application for production using Neo4j. By the end of this book, you’ll not only be able to harness the power of graphs to handle a broad range of problem areas, but you’ll also have learned how to use Neo4j efficiently to identify complex relationships in your data.
Table of Contents (18 chapters)

Section 1: Graph Modeling with Neo4j
Section 2: Graph Algorithms
Section 3: Machine Learning on Graphs
Section 4: Neo4j for Production

Running the Label Propagation algorithm

Label Propagation is another example of a community detection algorithm. Proposed in 2007, its strength lies in the ability to set labels for some known nodes and derive the unknown labels from them in a semi-supervised way. It can also take both relationship and node weights into account. In this section, we are going to detail the algorithm with a simple implementation in Python.

Defining Label Propagation

Several variants of Label Propagation exist. The main idea is the following:

  1. Labels are initialized such that each node lies in its own community.
  2. Labels are iteratively updated based on the majority vote rule: each node receives the labels of its neighbors, and the most common label among them is assigned to the node. Conflicts appear when the most common label is not unique. In that case, a tie-breaking rule needs to be defined, which can be random or deterministic (as in the GDS).
  3. The iterative process is repeated until all nodes have fixed labels...
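
The three steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the GDS implementation: the function name `label_propagation`, the adjacency dictionary format, the `max_iterations` safeguard, and the tie-breaking rule (pick the largest label among the most common ones) are all assumptions made for this example. Relationship and node weights are ignored here.

```python
from collections import Counter


def label_propagation(adjacency, max_iterations=100):
    """Unweighted Label Propagation on an undirected graph.

    adjacency: dict mapping each node to a list of its neighbors
    (hypothetical input format chosen for this sketch).
    Returns a dict mapping each node to its community label.
    """
    # Step 1: each node starts in its own community
    labels = {node: node for node in adjacency}

    for _ in range(max_iterations):
        changed = False
        # Nodes are visited in sorted order and updated in place,
        # so later nodes already see the new labels of earlier ones
        for node in sorted(adjacency):
            neighbors = adjacency[node]
            if not neighbors:
                continue  # an isolated node keeps its own label
            # Step 2: majority vote among the neighbors' labels
            counts = Counter(labels[n] for n in neighbors)
            best = max(counts.values())
            # Assumed deterministic tie-break: largest label wins
            new_label = max(
                label for label, count in counts.items() if count == best
            )
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        # Step 3: stop once a full pass changes no label
        if not changed:
            break
    return labels


# Two triangles (0, 1, 2) and (3, 4, 5) joined by the relationship 2-3
graph = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4],
}
labels = label_propagation(graph)
```

On this small graph, each triangle ends up sharing a single label. Note that the outcome depends on the tie-breaking rule and on the order in which nodes are visited: with a different rule, the same graph can collapse into a single community.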