You will need to create an Amazon S3 bucket to hold the following four things:
Input file(s)
The custom JAR executable
Output file(s)
Hadoop job's logfiles generated by the EMR cluster
Perform the following steps to create an S3 bucket, upload the custom JAR we built, and upload the sample input file on which we ran the job locally in the previous chapter:
Log in to your AWS Management Console and open the S3 console by navigating to Services | All AWS Services | S3. The S3 console is global, so it does not require a region selection.
Click on Create Bucket and provide a suitable name for the bucket; let's say you named your bucket learning-bigdata. Keep in mind that S3 bucket names are globally unique, so your bucket name will be accepted only if no other bucket anywhere already uses that name. At this point, your browser screen will look as follows:
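Because bucket names are shared across every AWS account, it also pays to check that a candidate name satisfies S3's naming rules before you try to create it. The following sketch is our own helper, not part of any AWS SDK, and it cannot check global uniqueness, only the documented format rules:

```python
import re

def validate_bucket_name(name: str) -> bool:
    """Check a candidate S3 bucket name against the documented format rules.

    Rules: 3-63 characters; lowercase letters, digits, hyphens, and periods
    only; must begin and end with a letter or digit; must not be formatted
    like an IPv4 address. Global uniqueness can only be verified by AWS.
    """
    if not 3 <= len(name) <= 63:
        return False
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]*[a-z0-9]", name):
        return False
    # Names that look like IP addresses (e.g. 192.168.1.1) are rejected by S3.
    if re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", name):
        return False
    return True
```

If the name fails validation (or the Create Bucket call reports it is already taken), simply pick another name and repeat the step above.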
Create an appropriate folder structure inside the bucket. Click on Create Folder and create a folder named...
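If you prefer, the same setup can be scripted with the AWS CLI instead of the console. The folder names below (input, jar, output, logs) mirror the four items listed at the start of this section and are only a suggested layout; the file names are placeholders for your own input file and JAR:

```shell
# Create the bucket (the name must be globally unique; replace it with your own).
aws s3 mb s3://learning-bigdata

# Upload the job's input file and the custom JAR into the suggested layout.
# S3 has no real directories; these key prefixes appear as folders in the console.
aws s3 cp input.txt s3://learning-bigdata/input/
aws s3 cp job.jar s3://learning-bigdata/jar/

# The output/ and logs/ prefixes will be written by the EMR job itself,
# so nothing needs to be uploaded there up front.
```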