If you want specific software installed on the machines that execute your Hadoop jobs, or you want to tweak some of the default Hadoop configurations, EMR bootstrap actions will help you perform these tasks.
Amazon EMR provides a mechanism to customize the installation and configuration of Hadoop clusters using bootstrap actions. A bootstrap action is a script that runs on each cluster node before Hadoop starts and before the node begins processing data.
EMR provides certain predefined bootstrap actions, such as Hadoop configuration customization, which let you tune the default Hadoop parameters of your cluster. You can also create custom bootstrap actions based on your requirements.
Bootstrap action scripts must be stored in an S3 bucket, and a single cluster can have up to 16 bootstrap actions. They are executed in the order in which they are specified when the cluster is launched.
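To make the rules above concrete, here is a minimal Python sketch that assembles an ordered list of bootstrap actions and enforces the 16-action limit and the S3 location requirement. The helper name `build_bootstrap_actions`, the bucket `s3://my-bucket/...`, and the script names are hypothetical and exist only for illustration. The dictionary layout (`Name`, `ScriptBootstrapAction`, `Path`, `Args`) is meant to mirror the shape that the AWS SDK for Python accepts for bootstrap actions; treat that as an assumption and check it against the SDK version you use.

```python
MAX_BOOTSTRAP_ACTIONS = 16  # per-cluster limit described in the text


def build_bootstrap_actions(actions):
    """Build an ordered bootstrap action list.

    `actions` is a list of (name, s3_path, args) tuples. The order of the
    input is preserved, because bootstrap actions run in the order in
    which they are specified.
    """
    if len(actions) > MAX_BOOTSTRAP_ACTIONS:
        raise ValueError(
            f"a cluster can have at most {MAX_BOOTSTRAP_ACTIONS} "
            f"bootstrap actions, got {len(actions)}"
        )

    result = []
    for name, s3_path, args in actions:
        # The script must live in S3, so reject anything else early.
        if not s3_path.startswith("s3://"):
            raise ValueError(f"bootstrap script must be in S3: {s3_path}")
        result.append(
            {
                "Name": name,
                "ScriptBootstrapAction": {"Path": s3_path, "Args": list(args)},
            }
        )
    return result


if __name__ == "__main__":
    # Hypothetical scripts: install extra software first, then tune settings.
    bootstrap_actions = build_bootstrap_actions(
        [
            ("Install tools", "s3://my-bucket/bootstrap/install-tools.sh", []),
            ("Tune settings", "s3://my-bucket/bootstrap/tune.sh", ["--fast"]),
        ]
    )
    for position, action in enumerate(bootstrap_actions, start=1):
        print(position, action["Name"], action["ScriptBootstrapAction"]["Path"])
```

A list built this way could then be supplied as the `BootstrapActions` argument when launching a cluster with boto3's EMR `run_job_flow` call, alongside the other required cluster settings. Validating the count and the `s3://` prefix locally simply surfaces mistakes before any request is sent.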