Apache Spark Graph Processing

Graph builders


GraphX provides four functions for building a property graph. Each of them expects the data from which the graph is constructed to be structured in a specific way.

The Graph factory method

The first one is the Graph factory method, which we have already seen in the previous chapter. It is defined as the apply method of the Graph companion object, as follows:

def apply[VD, ED](
      vertices: RDD[(VertexId, VD)],
      edges: RDD[Edge[ED]],
      defaultVertexAttr: VD = null)
    : Graph[VD, ED]

As we have seen before, this function takes two RDD collections, RDD[(VertexId, VD)] and RDD[Edge[ED]], as the vertices and edges respectively, and constructs a Graph[VD, ED] from them. The defaultVertexAttr parameter supplies the default attribute for vertices that are present in the edge RDD but not in the vertex RDD. The Graph factory method is convenient when the RDD collections of edges and vertices are readily available...
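To make the role of defaultVertexAttr concrete, here is a minimal sketch of a call to the Graph factory method. It assumes Spark and GraphX are on the classpath and runs Spark in local mode. The object name GraphFactoryExample, the sample names, and the "unknown" default are illustrative choices, not taken from the book.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical example object; the data below is made up for illustration.
object GraphFactoryExample {

  // Builds a small graph in which vertex 4 appears only in the edge RDD.
  def build(sc: SparkContext): Graph[String, String] = {
    val vertices: RDD[(VertexId, String)] = sc.parallelize(Seq(
      (1L, "Alice"), (2L, "Bob"), (3L, "Carol")))

    val edges: RDD[Edge[String]] = sc.parallelize(Seq(
      Edge(1L, 2L, "follows"),
      Edge(2L, 3L, "follows"),
      Edge(3L, 4L, "follows"))) // 4L has no entry in `vertices`

    // The third argument is the default attribute given to vertices
    // that are referenced by an edge but missing from the vertex RDD.
    Graph(vertices, edges, "unknown")
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("GraphFactoryExample")
      .setMaster("local[*]") // assumption: local mode, for experimentation only
    val sc = new SparkContext(conf)
    try {
      val graph = build(sc)
      // Print every vertex with its attribute, ordered by vertex ID.
      graph.vertices.collect().sortBy(_._1).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

In this sketch the resulting graph has four vertices: the three listed in the vertex RDD keep their names, while vertex 4, which is referenced only by the last edge, receives the "unknown" attribute.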