Learning Cascading


Defining the project – the Cascading development methodology


Now, let's dive into the full development methodology. We will cover how to compose your team, how to analyze your data and problem set, and how to decompose the problem set so that developing a solution in Cascading is straightforward and understandable.

Project roles and responsibilities

Here are the project roles and their key responsibilities:

  • The process owner is the person who sees the "big picture". This person understands the ultimate purpose of the application, the inputs required, and the outputs that need to be produced. He or she is the subject matter expert (SME) in the underlying domain of the application. In effect, this person is more of an analyst than a technician, although some knowledge of the underlying operating system and HDFS is important. He or she runs an application on a given cluster either on a command line, using a prepackaged Java JAR file compiled against the Apache Hadoop and Cascading...