Ultimate AWS Data Engineering Bootcamp - 15 Real-World Labs
Overview of this book
This intensive bootcamp builds hands-on AWS data engineering skills through 15 real-world labs that simulate complex business pipelines. You’ll start by setting up your environment using Docker and the AWS CLI, then dive into batch processing with Airflow and Redshift, distributed processing with PySpark and DynamoDB, and automated ETL using Glue and Step Functions. Each project is designed to reinforce data orchestration, workflow automation, and scalable processing.
As you progress, you’ll build data lakes with EMR and Athena, construct event-driven systems using ECS, Lambda, and Kinesis, and visualize streaming data with Streamlit dashboards. Real-time pipelines and CI/CD deployment with GitHub Actions add production-level skills to your toolkit. The course structure emphasizes practical applications, allowing you to architect, deploy, and manage end-to-end AWS data solutions.
By completing five advanced assignments, you’ll solidify these concepts and build a cloud-native portfolio. This course is built for professionals ready to apply cloud and Python skills to real engineering challenges.
Table of Contents (16 chapters)
Course Introduction
Lab - Batch data processing of music streams using Airflow & Redshift
Lab - Distributed processing of music streams using Airflow, Spark & DynamoDB
Lab - ETL for rental apartments using Step Functions, AWS Glue, and Redshift
Lab - Build a data lake for a rental vehicle store using EMR, S3, and Athena
Lab - Build event-driven pipelines for E-Commerce using ECS and Step Functions
Lab - Build a lakehouse for an E-Commerce store using PySpark Delta tables and S3
Lab - Event-driven data processing for taxi trips using Lambda and Kinesis
Lab - Process mobile network logs in real time using PySpark & Streamlit on ECS
Lab - CI/CD for AWS services using GitHub Actions
Lab - Real-time data ingestion of clickstreams using Kinesis Firehose and Redshift
Assignment 1 - Set up a MySQL database in AWS Aurora RDS
Assignment 2 - Build a lakehouse on S3 for a commercial flights dataset
Assignment 3 - Offer dynamic discounts to E-Commerce users using real-time events
Assignment 4 - Set up a real-time PySpark streaming job for Spotify song metrics
Assignment 5 - Automate deployment of Lambda functions using GitHub Actions