Learning Cascading

Understanding operations

Operations form the basis for most of what can be done to data as it passes through a pipe. Each operation is attached to a Pipe instance of the appropriate type.

There are several classes of operations:

  • Filter: It discards unwanted records.

  • Function: It performs transformations.

  • Aggregator: It summarizes data across sets of records.

  • Buffer: It operates on a set of records.

  • Assertion: It checks that records meet asserted conditions and fails if they do not. Assertions come in two forms: those for single-tuple (that is, Each) pipes and those for grouped-tuple (that is, Every) pipes. As we shall see later, assertions are special and follow a slightly different set of rules than the four operations listed above.
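
To make the difference between these kinds of operations concrete, here is a minimal conceptual sketch in plain Java. It does not use Cascading's API: the class OperationKinds and its methods filter, function, countByValue, and assertEach are hypothetical names invented for illustration, and records are modeled as plain strings rather than tuples. Buffer is left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Conceptual model only: these helpers are hypothetical and are NOT Cascading classes.
final class OperationKinds {
    private OperationKinds() {}

    // Filter: discard every record for which 'remove' returns true.
    static List<String> filter(List<String> records, Predicate<String> remove) {
        List<String> kept = new ArrayList<>();
        for (String record : records) {
            if (!remove.test(record)) {
                kept.add(record);
            }
        }
        return kept;
    }

    // Function: transform each record into zero or more output records.
    static List<String> function(List<String> records,
                                 Function<String, List<String>> transform) {
        List<String> out = new ArrayList<>();
        for (String record : records) {
            out.addAll(transform.apply(record));
        }
        return out;
    }

    // Aggregator: summarize each group of equal records as a single count.
    static Map<String, Integer> countByValue(List<String> records) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String record : records) {
            counts.merge(record, 1, Integer::sum);
        }
        return counts;
    }

    // Assertion: fail as soon as one record does not meet the condition.
    static void assertEach(List<String> records, Predicate<String> condition) {
        for (String record : records) {
            if (!condition.test(record)) {
                throw new IllegalStateException("assertion failed for record: " + record);
            }
        }
    }
}
```

In this model, a filter can only shrink the stream, a function may change or multiply records, an aggregator collapses a group into a summary value, and an assertion passes records through untouched or stops processing with an error.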

Using operations is easy. An operation is just a class: you instantiate it as an object using new and then attach it to a pipe by passing it as a parameter when the pipe is created. The pipe then passes tuples to the attached operation, where they are processed. The following...