Modern Data Architecture on AWS

By: Behram Irani

Overview of this book

Many IT leaders and professionals are adept at extracting data from a particular type of database and deriving value from it. However, designing and implementing an enterprise-wide holistic data platform with purpose-built data services, all seamlessly working in tandem with the least amount of manual intervention, still poses a challenge. This book will help you explore end-to-end solutions to common data, analytics, and AI/ML use cases by leveraging AWS services. The chapters systematically take you through all the building blocks of a modern data platform, including data lakes, data warehouses, data ingestion patterns, data consumption patterns, data governance, and AI/ML patterns. Using real-world use cases, each chapter highlights the features and functionalities of numerous AWS services to enable you to create a scalable, flexible, performant, and cost-effective modern data platform. By the end of this book, you’ll be equipped with all the necessary architectural patterns and be able to apply this knowledge to efficiently build a modern data platform for your organization using AWS services.
Table of Contents (24 chapters)
Part 1: Foundational Data Lake
Part 2: Purpose-Built Services And Unified Data Access
Part 3: Govern, Scale, Optimize And Operationalize

Fine-grained access control using AWS Lake Formation

One of the biggest challenges in setting up and operating data lakes at scale is making sure all the data is secure. The challenge arises because data in a data lake is scattered across multiple S3 buckets and is accessible through many cataloged tables. Setting up a unified permission model that defines who gets access to which portion of the data is not a trivial task. Imagine a very large data lake with thousands of databases and thousands of tables, with 10,000 users continuously trying to access the data; to complicate things further, new users are onboarded every day and new datasets are constantly added to the data lake. Unless there is a robust mechanism to control fine-grained data access across all the datasets, the data lake becomes a governance nightmare.
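To make the idea of a unified permission model concrete, here is a minimal sketch in plain Python. It is not the Lake Formation API: the `Grant` and `PermissionStore` classes, the principal names, and the `sales_db.orders` table are all hypothetical and exist only to illustrate column-level checks held in one central place.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Grant:
    """One rule in a hypothetical central permission store (not an AWS API)."""
    principal: str
    database: str
    table: str
    columns: frozenset  # columns the principal is allowed to read


@dataclass
class PermissionStore:
    """Single place that answers 'who may read which columns of which table'."""
    grants: set = field(default_factory=set)

    def grant(self, principal, database, table, columns):
        self.grants.add(Grant(principal, database, table, frozenset(columns)))

    def allowed_columns(self, principal, database, table):
        # Union of the columns from every grant matching this principal and table.
        allowed = set()
        for g in self.grants:
            if (g.principal, g.database, g.table) == (principal, database, table):
                allowed |= g.columns
        return allowed

    def can_read(self, principal, database, table, columns):
        # Deny by default: every requested column must be explicitly granted.
        requested = set(columns)
        return bool(requested) and requested <= self.allowed_columns(
            principal, database, table
        )


# Hypothetical example data.
store = PermissionStore()
store.grant("analyst", "sales_db", "orders", ["order_id", "amount"])
```

With this sketch, `store.can_read("analyst", "sales_db", "orders", ["amount"])` succeeds, while a request that also includes an ungranted column such as `customer_email` is denied. The difficulty described above comes from keeping such rules correct as thousands of tables and users are added, which is why a managed, centralized mechanism matters.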

AWS Lake Formation

In a few of the previous chapters, we touched upon AWS Lake Formation as a service that helps in multiple aspects...