Learning Cascading

Writing custom operations


We have now seen the function signatures for each kind of custom operation. We also know that an operation is generically typed: its type parameter defines the Context class that it will use. Let's look in detail at how we actually write these extensions. First, though, there are several things that we must discuss.

As we have seen, operations are attached to pipes. As tuples of data flow through a pipe, its operations are called to process them. Remember that tuples are accessed only by position, so they are really vectors of data. However, we also know that tuple positions are described by Fields, which give those positions names.

How is this managed inside an operation? A tuple is "wrapped" in a TupleEntry object, whose methods allow positional elements in the tuple to be accessed by field name. It seems clear that, to do this, an internal map must be kept in which the textual field name can be looked up to obtain the ordinal position within the tuple. Because of this, tuples...
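
The wrapping idea described above can be illustrated with a minimal sketch in plain Java. This is not Cascading's TupleEntry, and it makes no claim about how Cascading implements the lookup internally. The class name `NamedTupleSketch` and its methods `getByPosition` and `getByName` are hypothetical, introduced only to show how a map from field name to ordinal position lets a positional vector be read by name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: NOT a Cascading class.
// It pairs a positional vector of values with a name -> position map.
class NamedTupleSketch {
    // Field name -> ordinal position within the tuple.
    private final Map<String, Integer> positions = new HashMap<>();
    // The "tuple": values that are addressable only by position.
    private final List<Object> values;

    NamedTupleSketch(List<String> fieldNames, List<?> tupleValues) {
        if (fieldNames.size() != tupleValues.size()) {
            throw new IllegalArgumentException("field count must match tuple size");
        }
        for (int i = 0; i < fieldNames.size(); i++) {
            // put() returns the previous mapping, so non-null means a duplicate name.
            if (positions.put(fieldNames.get(i), i) != null) {
                throw new IllegalArgumentException("duplicate field: " + fieldNames.get(i));
            }
        }
        this.values = new ArrayList<Object>(tupleValues);
    }

    // Plain positional access, as on a bare tuple.
    Object getByPosition(int position) {
        return values.get(position);
    }

    // Access by name: resolve the name to a position, then read the vector.
    Object getByName(String fieldName) {
        Integer position = positions.get(fieldName);
        if (position == null) {
            throw new IllegalArgumentException("unknown field: " + fieldName);
        }
        return values.get(position);
    }

    public static void main(String[] args) {
        NamedTupleSketch entry = new NamedTupleSketch(
                Arrays.asList("name", "age"),
                Arrays.asList("Ann", 42));
        System.out.println(entry.getByPosition(0)); // prints Ann
        System.out.println(entry.getByName("age")); // prints 42
    }
}
```

In this sketch both access paths end up reading the same underlying vector; the name is only an indirection that is resolved to a position first.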