In this recipe, we will learn how to create graphs and perform basic operations on them.
As a starting example, we will use three vertices, each representing the city center of one of three cities in California: Santa Clara, Fremont, and San Francisco. The following table lists the distances between these cities:
Source | Destination | Distance (miles) |
---|---|---|
Santa Clara, CA | Fremont, CA | 20 |
Fremont, CA | San Francisco, CA | 44 |
San Francisco, CA | Santa Clara, CA | 53 |
Import the GraphX-related classes:
scala> import org.apache.spark.graphx._
scala> import org.apache.spark.rdd.RDD
Load the vertex data in an array:
scala> val vertices = Array((1L, ("Santa Clara","CA")),(2L, ("Fremont","CA")),(3L, ("San Francisco","CA")))
Load the array of vertices into the RDD of vertices:
scala> val vrdd = sc.parallelize(vertices)
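The types involved in the vertex steps are easier to see when written out explicitly. The following is a minimal sketch, assuming it runs inside spark-shell, where `sc` (the SparkContext) is predefined. The explicit type annotation and the final print loop are additions for illustration and are not part of the recipe's steps.

```scala
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

// Assumption: running in spark-shell, so `sc` already exists.
val vertices = Array(
  (1L, ("Santa Clara", "CA")),
  (2L, ("Fremont", "CA")),
  (3L, ("San Francisco", "CA"))
)

// VertexId is GraphX's type alias for Long; the vertex attribute
// here is a (city, state) pair of strings.
val vrdd: RDD[(VertexId, (String, String))] = sc.parallelize(vertices)

// Bring the (small) RDD back to the driver and print each vertex
// to confirm the data loaded as expected.
vrdd.collect.foreach { case (id, (city, state)) =>
  println(s"$id: $city, $state")
}
```

Spelling out the type makes it clear that each vertex is a pair of a numeric ID and an attribute, and that the edges will refer to vertices by that ID.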
Load the edge data in an array:
scala> val edges = Array(Edge(1L,2L,20),Edge(2L,3L,44),Edge(3L,1L,53))
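Each `Edge` holds a source vertex ID, a destination vertex ID, and an attribute, which in this example is the distance in miles. The sketch below inspects the edge array on the driver before any RDD is involved. The print loop and the `totalMiles` name are illustrative additions, not part of the recipe.

```scala
import org.apache.spark.graphx.Edge

// Same edge data as in the step above: Edge(srcId, dstId, attr).
val edges = Array(Edge(1L, 2L, 20), Edge(2L, 3L, 44), Edge(3L, 1L, 53))

// Print each edge as "source -> destination: distance".
edges.foreach { e =>
  println(s"${e.srcId} -> ${e.dstId}: ${e.attr} miles")
}

// Sum of the edge attributes: 20 + 44 + 53 = 117.
val totalMiles = edges.map(_.attr).sum
```

The vertex IDs 1, 2, and 3 used here match the IDs assigned to Santa Clara, Fremont, and San Francisco in the vertex array, which is how edges and vertices are tied together.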