In this chapter, we learned about MapReduce Combiners and how they help reduce overall job execution time. We also covered why compression is important, especially when working with large data volumes.
Then we covered optimization on the Java code side: choosing appropriate Writable types and reusing instances of these types smartly. We also learned how to implement custom WritableComparator and RawComparator classes.
In the final section, we covered basic guidelines and rules for tuning your Hadoop configuration to improve its performance.
In the next chapter, we will learn more about MapReduce optimization best practices. Keep reading!