This chapter covers optimization and performance-tuning best practices for Spark.
The chapter is divided into the following recipes:
Optimizing memory
Using compression to improve performance
Using serialization to improve performance
Optimizing garbage collection
Optimizing the level of parallelism
Understanding the future of optimization – Project Tungsten