Spark Cookbook

By: Rishi Yadav


Introduction


Graph analysis is far more common in everyday life than we tend to think. To take the most familiar example, when we ask a GPS to find the shortest route to a destination, it uses a graph-processing algorithm.

Let's start by understanding graphs. A graph is a representation of a set of vertices in which some pairs of vertices are connected by edges. When each edge has a direction, running from one vertex to another, the graph is called a directed graph, or digraph.
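To make the definition concrete, here is a minimal sketch in plain Python (not Spark or GraphX). The vertex names and the helper functions has_edge and out_neighbors are hypothetical, chosen only for illustration. Each edge is stored as an ordered (source, destination) pair, which is what makes the graph directed.

```python
# Hypothetical example data: these vertex names are illustrative only.
vertices = {"A", "B", "C"}

# Each edge is an ordered (source, destination) pair, so direction matters.
edges = {("A", "B"), ("B", "C")}


def has_edge(src, dst):
    """Return True if there is a directed edge from src to dst."""
    return (src, dst) in edges


def out_neighbors(vertex):
    """Return, in sorted order, the vertices one edge away from `vertex`."""
    return sorted(dst for src, dst in edges if src == vertex)


print(has_edge("A", "B"))  # True
print(has_edge("B", "A"))  # False: the edge runs only from A to B
print(out_neighbors("B"))  # ['C']
```

Because the pairs are ordered, an edge from A to B says nothing about an edge from B to A. An undirected graph could be modeled the same way by storing each connection in both directions.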

GraphX is the Spark API for graph processing. It provides a wrapper around RDDs called the resilient distributed property graph. The property graph is a directed multigraph, that is, a directed graph that may have several parallel edges between the same pair of vertices, with properties attached to each vertex and edge.
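The idea of a property graph can be sketched without Spark. The following is a plain Python illustration, not the GraphX API: the class names Vertex, Edge, and PropertyGraph, their fields, and the sample data are all assumptions made for this example. It shows the two features named above: properties on every vertex and edge, and parallel edges between the same pair of vertices.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Vertex:
    vid: int   # unique vertex identifier
    attr: str  # property attached to the vertex


@dataclass(frozen=True)
class Edge:
    src: int   # identifier of the source vertex
    dst: int   # identifier of the destination vertex
    attr: str  # property attached to the edge


@dataclass
class PropertyGraph:
    """Hypothetical in-memory stand-in for a directed property multigraph."""
    vertices: dict = field(default_factory=dict)  # vertex id -> Vertex
    edges: list = field(default_factory=list)     # a list keeps parallel edges

    def add_vertex(self, vid, attr):
        self.vertices[vid] = Vertex(vid, attr)

    def add_edge(self, src, dst, attr):
        # Reject edges whose endpoints have not been added yet.
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append(Edge(src, dst, attr))

    def edges_between(self, src, dst):
        """Return all edges that run from src to dst, in insertion order."""
        return [e for e in self.edges if e.src == src and e.dst == dst]


# Illustrative data only.
graph = PropertyGraph()
graph.add_vertex(1, "John")
graph.add_vertex(2, "David")
graph.add_edge(1, 2, "follows")
graph.add_edge(1, 2, "mentions")  # a second, parallel edge from 1 to 2

print([e.attr for e in graph.edges_between(1, 2)])  # ['follows', 'mentions']
print(graph.edges_between(2, 1))                    # []
```

Storing edges in a list rather than a set keyed on (src, dst) is what allows parallel edges; a real distributed implementation would partition this data across a cluster rather than hold it in one process.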

There are two types of graphs: directed graphs (digraphs) and undirected graphs. Directed graphs have edges that run in one direction, for example, from vertex A to vertex B. The Twitter follower relationship is a good example of a digraph: if John follows David on Twitter, it does not mean that David follows John. On the other hand, Facebook is...