A Trident aggregator performs an aggregation operation on an input batch, partition, or stream. For example, to count the number of tuples in each batch, a user can apply the count aggregator to the stream. The output of the Aggregator interface completely replaces the input tuples. There are three types of aggregators available in Trident:
The partition aggregate
The aggregate
The persistent aggregate
Let's understand each type of aggregator in detail.
As the name suggests, the partition aggregate works on each partition rather than on the entire batch. Its output completely replaces the input tuples and consists of a tuple with a single field. The following piece of code shows how to use the partitionAggregate method:
mystream.partitionAggregate(new Fields("x"), new Count(), new Fields("count...
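To make the per-partition behaviour concrete, the following is a minimal plain-Java sketch that does not use Storm at all. It only mimics what a count aggregator applied per partition would produce: one single-field result per partition, with the input tuples discarded. The class name PartitionCountSketch, the method countPerPartition, and the representation of a partition as a list of strings are assumptions made for illustration and are not part of the Trident API.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionCountSketch {

    // Hypothetical stand-in for a batch: each inner list is one partition,
    // and each string is the value of field "x" in one input tuple.
    // Returns one count per partition, mimicking a single-field output tuple.
    public static List<Long> countPerPartition(List<List<String>> partitions) {
        List<Long> counts = new ArrayList<>();
        for (List<String> partition : partitions) {
            long count = 0;
            for (String ignored : partition) {
                count++; // fold each input tuple into the running count
            }
            // The input tuples are dropped; only the count is emitted.
            counts.add(count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // One batch split into two partitions of two and one tuples.
        List<List<String>> batch = List.of(
                List.of("a", "b"),
                List.of("c"));
        System.out.println(countPerPartition(batch)); // prints [2, 1]
    }
}
```

Note that the result holds one count per partition, not one count for the whole batch. Summing the per-partition counts would give the batch-level count.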