The Self-Taught Cloud Computing Engineer

By: Dr. Logan Song

Overview of this book

The Self-Taught Cloud Computing Engineer is a comprehensive guide to mastering cloud computing concepts by building a broad and deep cloud knowledge base, developing hands-on cloud skills, and achieving professional cloud certifications. Even if you’re a beginner with a basic understanding of computer hardware and software, this book serves as the means to transition into a cloud computing career. Starting with the Amazon cloud, you’ll explore the fundamental AWS cloud services, then progress to advanced AWS cloud services in the domains of data, machine learning, and security. Next, you’ll build proficiency in Microsoft Azure Cloud and Google Cloud Platform (GCP) by examining the common attributes of the three clouds while distinguishing their unique features. You’ll further enhance your skills through practical experience on these platforms with real-life cloud project implementations. Finally, you’ll find expert guidance on cloud certifications and career development. By the end of this cloud computing book, you’ll have become a cloud-savvy professional well-versed in AWS, Azure, and GCP, ready to pursue cloud certifications to validate your skills.
Table of Contents (24 chapters)

Part 1: Learning about the Amazon Cloud
Part 2: Comprehending GCP Cloud Services
Part 3: Mastering Azure Cloud Services
Part 4: Developing a Successful Cloud Career

Amazon EMR

Amazon EMR is a platform for leveraging many big data tools for data processing. We will start by looking at the concepts of MapReduce and Hadoop.

MapReduce and Hadoop

MapReduce and Hadoop are two related concepts in the field of distributed computing and big data processing.

The idea of MapReduce is “divide and conquer”: decompose a big dataset into smaller ones that can be processed in parallel on distributed computers. It was originally developed by Google for its search engine, to handle the massive amounts of data generated by web crawling. The MapReduce programming model involves two functions: a map function that divides the dataset and processes the pieces in parallel, and a reduce function that aggregates the map outputs.
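To make the model concrete, here is a minimal single-process sketch of MapReduce applied to the classic word-count problem. This is illustrative only: a real Hadoop job would run the map and reduce phases across many machines, with the framework handling the shuffle step between them. The function names (`map_fn`, `reduce_fn`, `run_mapreduce`) are our own, not part of any Hadoop API.

```python
from collections import defaultdict
from itertools import chain

def map_fn(line):
    # Map phase: emit a (word, 1) pair for each word in one input chunk.
    return [(word, 1) for word in line.split()]

def reduce_fn(word, counts):
    # Reduce phase: aggregate all values emitted for the same key.
    return (word, sum(counts))

def run_mapreduce(lines):
    # Shuffle step: group map outputs by key before reducing.
    grouped = defaultdict(list)
    for key, value in chain.from_iterable(map_fn(line) for line in lines):
        grouped[key].append(value)
    return dict(reduce_fn(k, v) for k, v in grouped.items())

print(run_mapreduce(["big data is big", "data is everywhere"]))
# → {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

Because each map call touches only its own chunk, and each reduce call touches only one key's values, both phases parallelize naturally: that independence is what lets Hadoop scale the same logic across a cluster.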

Hadoop is an open source software framework that implements the MapReduce model. It consists of two core components: the Hadoop Distributed File System (HDFS) and MapReduce. HDFS is a distributed filesystem that can rapidly transfer data between...