Tuning a Cascading application is essential for fast, reliable execution. This chapter explains in depth how to optimize a Cascading application and how to configure the underlying framework (Hadoop) for maximum performance.
You will learn what to look for when assessing an application's performance characteristics. We will be:
Examining practices for improving the performance of a Cascading application
Discussing how to make performance changes to the underlying Hadoop system when Cascading is running on a specific platform
Showing how to use checkpoints effectively to reduce reprocessing time when failures occur
Surveying several open source and commercial tools that can help us diagnose the performance of a Cascading application