In this chapter, we will see how to launch an EMR cluster from the AWS management console and then execute the solution that we created in the previous chapter on this cluster. Of the various ways to program a solution on EMR that we saw in Chapter 4, Amazon EMR – Hadoop on Amazon Web Services, we chose the custom JAR technique, so we will use the JAR we created in the previous chapter.
Before you launch your EMR cluster, make sure that the following two prerequisites are in place:
You need to have an EC2 key pair. If you do not have one, you can generate it from the AWS management console. You will need this key pair to SSH into the master node of the EMR cluster.
You need to upload the input files and the custom JAR we created in the previous chapter to Amazon S3. EMR fetches both the input and the program to be executed by the cluster (the JAR file) from S3.
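Both prerequisites can also be handled from a script instead of the console. The following is only a minimal sketch, assuming the boto3 library is installed and AWS credentials are already configured. The helper names (`object_key`, `create_key_pair`, `upload_files`) and any bucket, prefix, or key names you pass to them are hypothetical and not part of this book's solution.

```python
import os


def object_key(prefix, local_path):
    """Return the S3 object key for a local file placed under a prefix."""
    name = os.path.basename(local_path)
    prefix = prefix.strip("/")
    return f"{prefix}/{name}" if prefix else name


def create_key_pair(key_name, pem_path):
    """Create an EC2 key pair and save its private key for later SSH use."""
    import boto3  # imported lazily; assumes boto3 and credentials are set up

    ec2 = boto3.client("ec2")
    response = ec2.create_key_pair(KeyName=key_name)
    with open(pem_path, "w") as pem_file:
        pem_file.write(response["KeyMaterial"])
    # SSH rejects private key files that other users can read.
    os.chmod(pem_path, 0o400)


def upload_files(bucket, prefix, local_paths):
    """Upload local files under s3://bucket/prefix/ and return their URIs."""
    import boto3  # imported lazily; assumes boto3 and credentials are set up

    s3 = boto3.client("s3")
    uris = []
    for path in local_paths:
        key = object_key(prefix, path)
        s3.upload_file(path, bucket, key)
        uris.append(f"s3://{bucket}/{key}")
    return uris
```

For instance, you might call `create_key_pair` once with a key name of your choice, then call `upload_files` once for the input files and once for the JAR, using a separate prefix for each. The returned `s3://` URIs are the locations you would later point the cluster to.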