Modern Data Architecture on AWS

By: Behram Irani

Overview of this book

Many IT leaders and professionals are adept at extracting data from a particular type of database and deriving value from it. However, designing and implementing an enterprise-wide holistic data platform with purpose-built data services, all seamlessly working in tandem with the least amount of manual intervention, still poses a challenge. This book will help you explore end-to-end solutions to common data, analytics, and AI/ML use cases by leveraging AWS services. The chapters systematically take you through all the building blocks of a modern data platform, including data lakes, data warehouses, data ingestion patterns, data consumption patterns, data governance, and AI/ML patterns. Using real-world use cases, each chapter highlights the features and functionalities of numerous AWS services to enable you to create a scalable, flexible, performant, and cost-effective modern data platform. By the end of this book, you’ll be equipped with all the necessary architectural patterns and be able to apply this knowledge to efficiently build a modern data platform for your organization using AWS services.
Table of Contents (24 chapters)
Part 1: Foundational Data Lake
Part 2: Purpose-Built Services And Unified Data Access
Part 3: Govern, Scale, Optimize And Operationalize

The need for a data warehouse

Before we dive deeper into data warehouses, let’s once again distinguish between a data lake and a data warehouse. The two systems address many overlapping use cases and can be used interchangeably for the most common ones. However, there are major differences between them. Essentially, a data lake is a schema-on-read centralized repository that’s flexible enough to store all kinds of structured, semi-structured, and unstructured data at any scale, and it allows all personas in an organization to derive value from this data easily and cost-effectively. A data warehouse, on the other hand, is a schema-on-write repository that stores structured and semi-structured data used for analytics and business intelligence (BI). It excels at data aggregations, slice-and-dice operations, roll-up and drill-down operations, data cubes, and other OLAP use cases. Both systems co-exist...
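To make the schema-on-read versus schema-on-write distinction concrete, here is a minimal sketch in Python. It does not use any AWS service: plain JSON lines stand in for raw files in a data lake, and an in-memory SQLite table stands in for a warehouse table. The sample records, the `orders` table, and the function names are all hypothetical and chosen only for illustration.

```python
import json
import sqlite3

# Hypothetical raw events, standing in for JSON files landed in a data lake.
RAW_EVENTS = [
    '{"order_id": 1, "region": "EU", "amount": 120.0}',
    '{"order_id": 2, "region": "US", "amount": 80.5, "coupon": "SPRING"}',
    '{"order_id": 3, "region": "EU", "amount": "n/a"}',
]


def read_with_schema(lines):
    """Schema-on-read: data is stored as-is; structure is applied at query time."""
    rows = []
    for line in lines:
        doc = json.loads(line)
        try:
            rows.append((int(doc["order_id"]), str(doc["region"]), float(doc["amount"])))
        except (KeyError, TypeError, ValueError):
            continue  # the reader decides what to do with records that do not fit
    return rows


def load_with_schema(lines):
    """Schema-on-write: the table definition is enforced when data is loaded."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " region TEXT NOT NULL,"
        " amount REAL NOT NULL CHECK (typeof(amount) = 'real'))"
    )
    rejected = []
    for line in lines:
        doc = json.loads(line)
        try:
            # Fields outside the schema (such as "coupon") are dropped at load time.
            conn.execute(
                "INSERT INTO orders VALUES (?, ?, ?)",
                (doc["order_id"], doc["region"], doc["amount"]),
            )
        except (KeyError, sqlite3.IntegrityError):
            rejected.append(doc)  # non-conforming rows never reach the table
    return conn, rejected


def rollup_by_region(conn):
    """A simple roll-up: aggregate order amounts from order level to region level."""
    query = "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(read_with_schema(RAW_EVENTS))
    conn, rejected = load_with_schema(RAW_EVENTS)
    print(rejected)
    print(rollup_by_region(conn))
```

In this sketch, the lake-style reader accepts every record into storage and only filters at read time, while the warehouse-style loader rejects the malformed third record up front, so the roll-up query can rely on clean, typed columns.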