Graph analysis is far more common in everyday life than we tend to think. To take a familiar example, when we ask a Global Positioning System (GPS) navigation device to find the shortest route to a destination, it uses a graph-processing algorithm.
Let's start by understanding graphs. A graph is a set of vertices in which some pairs of vertices are connected by edges. When each edge has a direction, running from one vertex to another, the graph is called a directed graph or digraph.
GraphX is the Spark API for graph processing. It provides a wrapper around RDDs called a resilient distributed property graph. The property graph is a directed multigraph, that is, a directed graph that may have several parallel edges between the same pair of vertices, with properties attached to each vertex and edge.
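To make the idea of a directed multigraph with properties concrete, here is a minimal in-memory sketch in plain Python. It is not the GraphX API and it is not distributed; the class name `PropertyGraph`, its methods, and the sample vertices and edge labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class PropertyGraph:
    """Toy property graph (hypothetical; not the GraphX API)."""
    # vertex id -> property attached to that vertex
    vertices: Dict[int, Any] = field(default_factory=dict)
    # each edge is (source id, destination id, property attached to the edge)
    edges: List[Tuple[int, int, Any]] = field(default_factory=list)

    def add_vertex(self, vid: int, prop: Any) -> None:
        self.vertices[vid] = prop

    def add_edge(self, src: int, dst: int, prop: Any) -> None:
        # Edges live in a list, so parallel edges between the same
        # pair of vertices can coexist: this is what makes it a multigraph.
        self.edges.append((src, dst, prop))

    def edges_between(self, src: int, dst: int) -> List[Any]:
        # Direction matters: only edges going from src to dst are returned.
        return [p for s, d, p in self.edges if s == src and d == dst]


g = PropertyGraph()
g.add_vertex(1, {"name": "A"})
g.add_vertex(2, {"name": "B"})
g.add_edge(1, 2, "follows")   # first edge from vertex 1 to vertex 2
g.add_edge(1, 2, "mentions")  # a parallel edge with a different property
```

Calling `g.edges_between(1, 2)` returns both edge properties, while `g.edges_between(2, 1)` returns an empty list, because no edge runs in the opposite direction.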
There are two types of graphs: directed graphs (digraphs) and regular, or undirected, graphs. Directed graphs have edges that run in one direction; for example, from vertex A to vertex B. Twitter followers are a good example of a digraph. If John is David's Twitter follower, it does not mean that David is John's follower...
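The follower example above can be sketched in a few lines of plain Python. This is only an illustration of edge direction, not GraphX code; the names `follows`, `is_follower`, and `connected_undirected` are hypothetical.

```python
# Directed edges stored as (follower, followed) pairs.
follows = {("John", "David")}


def is_follower(a: str, b: str) -> bool:
    # Directed lookup: the order of the pair matters.
    return (a, b) in follows


def connected_undirected(a: str, b: str) -> bool:
    # Undirected view of the same data: the order of the pair is ignored.
    return (a, b) in follows or (b, a) in follows
```

With this data, `is_follower("John", "David")` is true and `is_follower("David", "John")` is false, whereas `connected_undirected` is true in both orders. That asymmetry is the difference between a digraph and an undirected graph.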