Programming MapReduce with Scalding

By: Antonios Chalkiopoulos

Black box testing


During test-driven development, we retain an internal perspective of the system. We identify all possible paths and exercise them with test case inputs to validate the expected output. However, testing with valid input alone is not sufficient, especially when implementing MapReduce applications that may execute against billions of lines of data. Because we cannot generate every possible case of invalid input, we look at techniques that increase the data coverage of tests.
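To make the point about invalid input concrete, here is a minimal sketch in plain Scala with no Scalding API involved. The `Purchase` record, the `PurchaseParser` object, and the tab-separated `userId<TAB>amount` layout are hypothetical names and formats chosen for illustration; they do not come from the book. The sketch assumes that malformed lines should be set aside rather than allowed to fail the job.

```scala
import scala.util.Try

// Hypothetical record type, for illustration only.
final case class Purchase(userId: String, amount: Double)

object PurchaseParser {

  // Parses one tab-separated line. Returns None for malformed input
  // (wrong field count, empty user, non-numeric or negative amount)
  // instead of throwing an exception.
  def parse(line: String): Option[Purchase] =
    line.split("\t", -1) match {
      case Array(user, amount) if user.nonEmpty =>
        Try(amount.trim.toDouble).toOption
          .filter(a => !a.isNaN && !a.isInfinity && a >= 0)
          .map(a => Purchase(user, a))
      case _ => None
    }

  // Splits a batch into parsed records and the raw lines that were rejected,
  // so a test can assert on both sides.
  def parseAll(lines: Seq[String]): (Seq[Purchase], Seq[String]) = {
    val parsed = lines.map(l => l -> parse(l))
    (parsed.collect { case (_, Some(p)) => p },
     parsed.collect { case (l, None) => l })
  }
}
```

A test can then feed a mix of valid and deliberately broken lines (empty lines, missing fields, extra fields, non-numeric amounts) and check both the accepted records and the number of rejects. Each new kind of bad line seen in real data can be added to that list, which grows the data coverage over time.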

Taking a step back, the development lifecycle begins with data exploration, followed by algorithm design. A data scientist performing these tasks in a non-scalable development language such as R or Python forms the basis of black box testing. Data scientists use multiple tools to extract meaning, insights, and ultimately value from data. These tools provide powerful capabilities and rich visualizations that enable them to arrive quickly at mathematical models. The drawback is that the resulting implementation...