Amazon Elastic Compute Cloud (EC2) and Simple Storage Service (S3) are cloud computing web services provided by Amazon Web Services (AWS). EC2 offers infrastructure as a service (IaaS), with which we can start a theoretically unlimited number of servers in the cloud. S3 offers storage services in the cloud. More information about AWS, EC2, and S3 can be obtained from aws.amazon.com.
From the previous chapters of this book, we know that configuring a Hadoop cluster requires a substantial hardware investment. For example, setting up a Hadoop cluster requires a number of computing nodes and networking devices. In comparison, with the help of AWS cloud computing, especially EC2, we can set up a Hadoop cluster with minimal cost and much less effort.
In this chapter, we discuss how to configure a Hadoop cluster in the Amazon cloud. We will guide you through recipes for registering with AWS, creating an Amazon Machine Image (AMI), configuring a Hadoop...