Book Image

Cloud Scale Analytics with Azure Data Services

By : Patrik Borosch
Book Image

Cloud Scale Analytics with Azure Data Services

By: Patrik Borosch

Overview of this book

Azure Data Lake, the modern data warehouse architecture, and related data services on Azure enable organizations to build their own customized analytical platform to fit any analytical requirements in terms of volume, speed, and quality. This book is your guide to learning all the features and capabilities of Azure data services for storing, processing, and analyzing data (structured, unstructured, and semi-structured) of any size. You will explore key techniques for ingesting and storing data and perform batch, streaming, and interactive analytics. The book also shows you how to overcome various challenges and complexities relating to productivity and scaling. Next, you will be able to develop and run massive data workloads to perform different actions. Using a cloud-based big data-modern data warehouse-analytics setup, you will also be able to build secure, scalable data estates for enterprises. Finally, you will not only learn how to develop a data warehouse but also understand how to create enterprise-grade security and auditing big data programs. By the end of this Azure book, you will have learned how to develop a powerful and efficient analytical platform to meet enterprise needs.
Table of Contents (20 chapters)
1
Section 1: Data Warehousing and Considerations Regarding Cloud Computing
4
Section 2: The Storage Layer
7
Section 3: Cloud-Scale Data Integration and Data Transformation
14
Section 4: Data Presentation, Dashboarding, and Distribution

Summary

In this chapter, we examined the differences between Data Warehouses and Data Lakes and the advantages of both approaches. We have the structured model of the Data Warehouse with its security, tuning possibilities, and accessibility for Self-Service BI on the one hand, and the capabilities of Data Lake systems to process vast amounts of data in high performance to support machine learning on the other.

Both concepts, when implemented in isolation, can help solve certain problems. However, even with the growing data, the disparate source systems, their various formats, and the required speed of delivery, as well as the requirements for security and usability, neither can succeed on their own: there is life in the old Data Warehouse yet!

The combination of the two concepts, together with the extended offerings of the Hyperscaler cloud vendors such as virtualization, container offerings, and serverless functions can open new opportunities in terms of the flexibility, agility, and speed of implementation. We are getting the best of both worlds.

In the next chapter, we will discuss a generic architecture sketch. You will learn about the different building blocks of a Modern Data Warehouse approach and how to ask the right questions during your requirements engineering process. In the second part of the next chapter, we will examine Azure Data Services and PaaS components. We'll explore alternative components for different sizes and map the Azure Services to the Modern Data Warehouse architecture.