Prior to Spark 1.2, there was no aggregateMessages method in Graph. Instead, the now deprecated mapReduceTriplets was the primary aggregation operator. The API for mapReduceTriplets is:
class Graph[VD, ED] {
  def mapReduceTriplets[Msg](
      map: EdgeTriplet[VD, ED] => Iterator[(VertexId, Msg)],
      reduce: (Msg, Msg) => Msg)
    : VertexRDD[Msg]
}
Compared to mapReduceTriplets, the new aggregateMessages operator is more expressive: rather than returning an iterator of messages, as mapReduceTriplets does, the user-supplied function sends each message explicitly through a message-passing mechanism. In addition, aggregateMessages asks the user to state explicitly, through a TripletFields object, which parts of the triplet are actually needed, which improves performance as we explained previously. Beyond these API improvements, the implementation of aggregateMessages is itself optimized for performance.
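To make the difference in style concrete, here is a minimal sketch that counts incoming edges per vertex with aggregateMessages; the equivalent deprecated call appears only as a comment for comparison. The object name InDegreeSketch, the helper inDegrees, the toy edge list, and the local master setting are assumptions made for illustration and do not come from the text above.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, TripletFields, VertexId}

object InDegreeSketch {
  // Counts incoming edges per vertex with aggregateMessages.
  def inDegrees(sc: SparkContext): Seq[(VertexId, Int)] = {
    // Hypothetical toy graph: edges 1->2, 1->3, 2->3; attributes are unused.
    val edges = sc.parallelize(Seq(Edge(1L, 2L, 0), Edge(1L, 3L, 0), Edge(2L, 3L, 0)))
    val graph: Graph[Int, Int] = Graph.fromEdges(edges, 0)

    // Deprecated iterator style, shown for comparison only:
    //   graph.mapReduceTriplets[Int](t => Iterator((t.dstId, 1)), _ + _)

    graph.aggregateMessages[Int](
      ctx => ctx.sendToDst(1), // send the message 1 to the edge's destination
      _ + _,                   // merge messages arriving at the same vertex
      TripletFields.None       // no vertex or edge attributes are read
    ).collect().sortBy(_._1).toSeq
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("InDegreeSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try inDegrees(sc).foreach(println) // vertices with no incoming edge are absent
    finally sc.stop()
  }
}
```

Because the send function never reads a vertex or edge attribute, TripletFields.None is sufficient here. A function that read the source attribute would need TripletFields.Src or TripletFields.All instead.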
Since mapReduceTriplets is now deprecated, we will not discuss it further. If you have to use it with earlier versions of Spark, you can refer to the Spark programming guide.