In this recipe, we will learn how to create graphs and do basic operations on them.
As a starting example, we will use three vertices, each representing the center of a city in California: Santa Clara, Fremont, and San Francisco. The following is a rough sketch of the geographic positions of the three cities (not to scale):
The following table lists the distances between these cities:
Source | Destination | Distance (miles) |
Santa Clara, CA | Fremont, CA | 20 |
Fremont, CA | San Francisco, CA | 44 |
San Francisco, CA | Santa Clara, CA | 53 |
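Before turning to GraphX, it may help to see the same data as plain Scala collections. The following is a minimal sketch that needs no Spark; the names `City`, `Road`, and `CityGraphSketch` are hypothetical and are not part of GraphX. The numeric IDs are arbitrary labels for the three cities, and each row of the table is treated as one directed edge weighted by its distance.

```scala
// Plain Scala sketch (no Spark required). City, Road, and CityGraphSketch
// are hypothetical names used only for illustration.
final case class City(id: Long, name: String, state: String)
final case class Road(srcId: Long, dstId: Long, miles: Int)

object CityGraphSketch {
  // Vertices: one per city, keyed by an arbitrary numeric ID.
  val cities: Seq[City] = Seq(
    City(1L, "Santa Clara", "CA"),
    City(2L, "Fremont", "CA"),
    City(3L, "San Francisco", "CA")
  )

  // Edges: one per table row, source -> destination, weighted by distance.
  val roads: Seq[Road] = Seq(
    Road(1L, 2L, 20),
    Road(2L, 3L, 44),
    Road(3L, 1L, 53)
  )

  // Lookup from vertex ID to city name.
  private val nameOf: Map[Long, String] =
    cities.map(c => c.id -> c.name).toMap

  // Render one edge as a readable line.
  def describe(r: Road): String =
    s"${nameOf(r.srcId)} -> ${nameOf(r.dstId)}: ${r.miles} miles"

  def main(args: Array[String]): Unit = {
    roads.map(describe).foreach(println)
    // 20 + 44 + 53 = 117
    println(s"Total loop distance: ${roads.map(_.miles).sum} miles")
  }
}
```

The vertex/edge split in this sketch mirrors the shape of the data the recipe builds: an ID paired with vertex attributes, and a source ID, destination ID, and attribute for each edge.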
- Import the GraphX-related classes:
scala> import org.apache.spark.graphx._
scala> import org.apache.spark.rdd.RDD
- Load the vertex data in an array:
scala> val vertices = Array((1L, ("Santa Clara", "CA")), (2L, ("Fremont", "CA")), (3L, ("San Francisco", "CA")))
- Load the array of vertices into the RDD of vertices:
scala> val vrdd = sc.parallelize(vertices)
- Load the edge data in an array:
scala> val edges = Array(Edge...