Modern Data Architecture on AWS

By: Behram Irani

Overview of this book

Many IT leaders and professionals are adept at extracting data from a particular type of database and deriving value from it. However, designing and implementing an enterprise-wide holistic data platform with purpose-built data services, all seamlessly working in tandem with the least amount of manual intervention, still poses a challenge. This book will help you explore end-to-end solutions to common data, analytics, and AI/ML use cases by leveraging AWS services. The chapters systematically take you through all the building blocks of a modern data platform, including data lakes, data warehouses, data ingestion patterns, data consumption patterns, data governance, and AI/ML patterns. Using real-world use cases, each chapter highlights the features and functionalities of numerous AWS services to enable you to create a scalable, flexible, performant, and cost-effective modern data platform. By the end of this book, you’ll be equipped with all the necessary architectural patterns and be able to apply this knowledge to efficiently build a modern data platform for your organization using AWS services.
Table of Contents (24 chapters)
Part 1: Foundational Data Lake
Part 2: Purpose-Built Services and Unified Data Access
Part 3: Govern, Scale, Optimize and Operationalize

What this book covers

Prologue, Data and Analytics Journey so far, provides historical context on what a data platform looks like in the on-prem world. In this prologue, we will discuss the traditional platform components and their benefits, then pivot to their shortcomings in meeting new business objectives. This will provide context for the need to build a modern data architecture.

Chapter 1, Modern Data Architecture on AWS, describes what it means to create a modern data architecture. We will also look at how AWS services help materialize this concept and why it is important to create this foundation for current and future business needs.

Chapter 2, Scalable Data Lakes, lays down the foundation of the modern data architecture by establishing a data lake on AWS. We will also look at different layers of the data lake and how each layer has a specific purpose.

Chapter 3, Batch Data Ingestion, provides options to move data in batches from multiple source systems into AWS. We will explore different AWS services that assist in migrating data in bulk from a variety of source systems.

Chapter 4, Streaming Data Ingestion, provides an overview of the need for a real-time streaming architecture pattern and how AWS services assist in solving use-cases that require streaming data to be ingested and consumed in the modern data platform.

Chapter 5, Data Processing, provides options to process and transform data so that it can eventually be consumed for analytics. We will look at some AWS services that help provide scalable, performant and cost-effective big data processing, especially for running Apache Spark-based workloads.

Chapter 6, Interactive Analytics, provides insights around ad-hoc analytics use-cases along with AWS services that help solve them.

Chapter 7, Data Warehousing, covers a wide range of use-cases that can be solved using a modern cloud data warehouse on AWS. We will look at multiple design patterns, including data ingestion, data transformation and data consumption using the data warehouse on AWS.

Chapter 8, Data Sharing, provides context around how data can be shared within a modern data platform without creating complete ETL pipelines and without duplicating data in multiple places.

Chapter 9, Data Federation, describes mechanisms for data federation and the types of use-cases that can be solved using federated queries.

Chapter 10, Predictive Analytics, covers a whole range of use-cases along with services, features and tools provided by AWS to solve AI, ML and deep learning-based business problems, with the common goal of achieving predictive analytics.

Chapter 11, Generative AI, provides a variety of use-cases across multiple industries that can be solved using GenAI and shows how AWS provides services and tools to help fast-track building GenAI-based applications.

Chapter 12, Operational Analytics, introduces the need for operational analytics, especially log analytics, and how AWS helps with this aspect of the data platform.

Chapter 13, Business Intelligence, provides context around the need for a modern business intelligence tool for creating business-friendly reports and dashboards that support rich visualizations. We will look at how AWS helps with such use-cases.

Chapter 14, Data Governance, lays the groundwork for unified data governance and covers many dimensions of data governance along with AWS services that assist in solving those use-cases.

Chapter 15, Data Mesh, introduces the concept of a data mesh along with its importance in the modern data platform. We will look at the pillars of data mesh and at the AWS services that help solve use-cases that require a data mesh pattern.

Chapter 16, Performant and Cost-Effective Data Platform, covers a wide range of options to ensure the data platform built using AWS services is cost-effective as well as performant.

Chapter 17, Automate, Operationalize and Monetize, wraps up the book with concepts around automating the data platform using DevOps, DataOps and MLOps mechanisms. Finally, we will look at options to monetize the modern data platform built on AWS.