Now, let's dive into the full development methodology. We will talk about composing your team, analyzing your data and problem set, and decomposing the problem set in a way that makes developing a solution in Cascading straightforward and understandable.
Here are the roles on a project and their key responsibilities:
The process owner is the person who sees the "big picture". This person understands the ultimate purpose of the application, the inputs required, and the outputs that need to be produced. He or she is the subject matter expert (SME) in the underlying domain of the application. In effect, this person is more of an analyst than a technician, although some knowledge of the underlying operating system and HDFS is important. He or she runs an application on a given cluster either on a command line, using a prepackaged Java JAR file compiled against the Apache Hadoop and Cascading...