Data Modeling for Azure Data Services

By: Peter ter Braake

Overview of this book

Data is at the heart of all applications and forms the foundation of modern data-driven businesses. With the multitude of data-related use cases and the availability of different data services, choosing the right service and implementing the right design is essential to a successful implementation. Data Modeling for Azure Data Services starts with an introduction to databases, entity analysis, and normalizing data. The book then shows you how to design a NoSQL database for optimal performance and scalability and covers how to provision and implement Azure SQL DB, Azure Cosmos DB, and Azure Synapse SQL Pool. As you progress through the chapters, you'll learn about data analytics, Azure Data Lake, and Azure SQL Data Warehouse, and explore dimensional modeling and data vault modeling, along with designing and implementing a data lake using Azure Storage. You'll also learn how to implement ETL with Azure Data Factory. By the end of this book, you'll have a solid understanding of which Azure data services are the best fit for your model and how to implement the best design for your solution.
Table of Contents (16 chapters)

  • Section 1 – Operational/OLTP Databases
  • Section 2 – Analytics with a Data Lake and Data Warehouse
  • Section 3 – ETL with Azure Data Factory

The normalization steps

The process of normalizing data is defined by a number of steps that you need to perform in order to get from a report or screenshot to a normalized table structure. Let's examine these steps in detail, starting with what is called Step zero.

Step zero

Data is normalized in a number of formal steps, starting from information requirements such as reports and application screens. We will explain these steps using the report shown in Figure 3.1. This report shows the hours worked by employees on different projects as of January 1, 2021. There are two active projects, P1 and P2. Two people, Peter and Janneke, are working on project P1. Three people, Peter, Jari, and Mats, are working on project P2:

Figure 3.1 – Project overview
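
To make the shape of this report concrete, here is a minimal Python sketch of the project assignments described above. It is an illustration only, not the book's method: the field names project and employee are hypothetical, and the hours worked are left out because the text does not state them. The sketch flattens the nested report into one row per project and employee, then derives a candidate list of columns from those rows.

```python
# Project assignments as described for the report in Figure 3.1.
# Hours worked are not given in the text, so they are omitted here.
assignments = {
    "P1": ["Peter", "Janneke"],
    "P2": ["Peter", "Jari", "Mats"],
}

# Flatten the nested report into one row per (project, employee) pair.
# The keys "project" and "employee" are hypothetical column names.
rows = [
    {"project": project, "employee": employee}
    for project, employees in assignments.items()
    for employee in employees
]

# Candidate column list: every distinct field that appears in the rows.
columns = sorted({name for row in rows for name in row})

# Employees listed under more than one project, a first hint of repeated data.
all_employees = {row["employee"] for row in rows}
employees_on_multiple_projects = sorted(
    name
    for name in all_employees
    if sum(row["employee"] == name for row in rows) > 1
)
```

In this sketch, Peter is the only employee who ends up in more than one row, which is the kind of repetition the normalization steps are meant to deal with.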

The first step is to define a list of columns to be stored in the database. You have to consider three main points in this preliminary step:

  • Can the report be broken down into separate independent reports?
  • ...