Data Modeling for Azure Data Services

By: Peter ter Braake

Overview of this book

Data is at the heart of all applications and forms the foundation of modern data-driven businesses. With the multitude of data-related use cases and the availability of different data services, choosing the right service and implementing the right design become paramount to a successful implementation. Data Modeling for Azure Data Services starts with an introduction to databases, entity analysis, and normalizing data. The book then shows you how to design a NoSQL database for optimal performance and scalability and covers how to provision and implement Azure SQL DB, Azure Cosmos DB, and Azure Synapse SQL Pool. As you progress through the chapters, you'll learn about data analytics, Azure Data Lake, and Azure SQL Data Warehouse, and explore dimensional modeling and data vault modeling, along with designing and implementing a data lake using Azure Storage. You'll also learn how to implement ETL with Azure Data Factory. By the end of this book, you'll have a solid understanding of which Azure data services are the best fit for your model and how to implement the best design for your solution.
Table of Contents (16 chapters)

Section 1 – Operational/OLTP Databases
Section 2 – Analytics with a Data Lake and Data Warehouse
Section 3 – ETL with Azure Data Factory

Summary

Designing a NoSQL database follows less strict rules than designing a relational database. This follows logically from the fact that data in a relational database is structured, and we use that structure when designing the database.

NoSQL databases such as Cosmos DB should allow for more flexibility and scalability. We achieve this by letting go of strict rules and optimizing the data for its usage. The way you use the data should be the primary factor in deciding what you store together and what you store in separate documents.
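To make the choice between storing data together and storing it in separate documents concrete, here is a minimal sketch in Python using plain dictionaries. The order and order-line documents, their field names, and the helper functions are all hypothetical illustrations, not taken from the book and not a Cosmos DB API.

```python
# Hypothetical example data: one customer order with two order lines.

# Option A: embed the lines in the order document.
# A reasonable fit when the application always reads the lines with the order.
order_embedded = {
    "id": "order-1001",
    "customerId": "cust-42",
    "lines": [
        {"product": "keyboard", "quantity": 1, "price": 49.0},
        {"product": "mouse", "quantity": 2, "price": 19.5},
    ],
}

# Option B: store the lines as separate documents that reference the order.
# A reasonable fit when lines are read or updated on their own,
# or when the number of lines per order can grow very large.
order_header = {"id": "order-1001", "type": "order", "customerId": "cust-42"}
order_lines = [
    {"id": "line-1", "type": "orderLine", "orderId": "order-1001",
     "product": "keyboard", "quantity": 1, "price": 49.0},
    {"id": "line-2", "type": "orderLine", "orderId": "order-1001",
     "product": "mouse", "quantity": 2, "price": 19.5},
]


def total_embedded(order):
    """Compute the order total from a single document read."""
    return sum(line["quantity"] * line["price"] for line in order["lines"])


def total_referenced(header, lines):
    """Compute the order total by combining the header with its line documents."""
    return sum(
        line["quantity"] * line["price"]
        for line in lines
        if line["orderId"] == header["id"]
    )
```

Both functions return the same total, but the embedded design needs one document to answer the question, while the referenced design needs the header plus every matching line document. Which one is preferable depends on how the data is actually used.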

Cosmos DB also runs on cluster technology, which means you have to define a partitioning strategy. You base that strategy on the characteristics of your data and, again, on the way you expect to search for data.
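The idea behind a partitioning strategy can be sketched as follows. This is a simplified illustration of hash-based distribution in Python, assuming a fixed number of partitions and a SHA-256 hash; it is not the hashing scheme Cosmos DB uses internally, and the names `partition_for`, `distribute`, `find_by_key`, and `PARTITION_COUNT` are hypothetical.

```python
import hashlib
from collections import defaultdict

PARTITION_COUNT = 4  # assumed, fixed number of partitions for this sketch


def partition_for(key, partition_count=PARTITION_COUNT):
    """Map a partition key value to a partition number with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def distribute(items, key_field, partition_count=PARTITION_COUNT):
    """Place every item in the partition chosen by its partition key value."""
    partitions = defaultdict(list)
    for item in items:
        partitions[partition_for(item[key_field], partition_count)].append(item)
    return partitions


def find_by_key(partitions, key_field, key, partition_count=PARTITION_COUNT):
    """Search only the single partition that can contain the given key."""
    target = partition_for(key, partition_count)
    return [item for item in partitions.get(target, []) if item[key_field] == key]


# Hypothetical documents, partitioned on customerId because we expect
# most searches to ask for the orders of one customer.
orders = [
    {"id": "order-1", "customerId": "cust-1"},
    {"id": "order-2", "customerId": "cust-2"},
    {"id": "order-3", "customerId": "cust-1"},
    {"id": "order-4", "customerId": "cust-3"},
]
partitions = distribute(orders, "customerId")
orders_for_cust_1 = find_by_key(partitions, "customerId", "cust-1")
```

Because all documents with the same key value land in the same partition, a search that supplies the key touches one partition only. A search on any other field would have to visit every partition, which is why the expected search pattern should drive the choice of key.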

In this chapter, you learned what big data is and when to use NoSQL databases. You learned different ways of distributing data over cluster nodes. You also learned how to choose a proper distribution strategy.

Now that you know how to design...