Apache Spark Graph Processing

Network datasets


In the previous chapter, we constructed a small social network as a toy example. From this chapter onwards, we are going to work with real-world datasets drawn from various applications. Graphs can represent a wide range of complex systems, because a graph describes the interactions between the components of a system. Despite the diversity in form, size, nature, and granularity of different systems, graph theory provides a common language and a set of tools for representing and analyzing them.

Note

In brief, a graph consists of a set of vertices connected by a set of edges. Each edge represents the relationship between the pair of vertices it connects. In this book, we will sometimes use the less technical terms network nodes to refer to vertices and links to refer to edges. Note that Spark supports multigraphs, that is, multiple edges are permitted between any pair of nodes.
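To make the definition concrete, here is a minimal sketch in plain Python, independent of Spark's own API, of a graph stored as a collection of vertices and a list of edges. The vertex ids, names, and edge labels are hypothetical values chosen only for illustration. Because the edges are kept in a list rather than a set of pairs, the same pair of vertices can be connected more than once, which is what makes the structure a multigraph.

```python
from collections import Counter

# Vertices: hypothetical ids mapped to an illustrative name attribute.
vertices = {1: "Alice", 2: "Bob", 3: "Carol"}

# Edges: (source id, destination id, label) triples. A list, rather than a
# set of pairs, lets the same pair of vertices appear more than once.
edges = [
    (1, 2, "emails"),
    (1, 2, "calls"),   # a second, parallel edge between the same pair
    (2, 3, "emails"),
]

# Count how many edges connect each ordered (source, destination) pair.
edge_counts = Counter((src, dst) for src, dst, _ in edges)

# Pairs linked by more than one edge are the parallel edges of a multigraph.
parallel_pairs = {pair: n for pair, n in edge_counts.items() if n > 1}

print(parallel_pairs)
```

In this sketch, only the pair (1, 2) is reported, since it is the only pair joined by two edges. A simple graph, by contrast, would allow at most one edge per pair.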

Let's get a preview of the networks that we are going to build in this chapter.

The...