Cloud Scale Analytics with Azure Data Services

By: Patrik Borosch

Overview of this book

Azure Data Lake, the modern data warehouse architecture, and related data services on Azure enable organizations to build a customized analytical platform that fits their analytical requirements in terms of volume, speed, and quality. This book is your guide to the features and capabilities of Azure data services for storing, processing, and analyzing structured, unstructured, and semi-structured data of any size. You will explore key techniques for ingesting and storing data and for performing batch, streaming, and interactive analytics. The book also shows you how to overcome various challenges and complexities relating to productivity and scaling. Next, you will learn to develop and run massive data workloads. Using a cloud-based setup that combines big data, the modern data warehouse, and analytics, you will also be able to build secure, scalable data estates for enterprises. Finally, you will learn not only how to develop a data warehouse but also how to apply enterprise-grade security and auditing to big data programs. By the end of this book, you will have learned how to develop a powerful and efficient analytical platform to meet enterprise needs.

Table of Contents (20 chapters)

  • Section 1: Data Warehousing and Considerations Regarding Cloud Computing
  • Section 2: The Storage Layer
  • Section 3: Cloud-Scale Data Integration and Data Transformation
  • Section 4: Data Presentation, Dashboarding, and Distribution

Chapter 1: Balancing the Benefits of Data Lakes Over Data Warehouses

Is the Data Warehouse dead with the advent of Data Lakes? There is widespread disagreement about the need for Data Warehousing in a modern data estate. With the rise of Data Lakes and Big Data technology, many people now use newer technologies instead of databases for their analytical efforts. Establishing a data-driven company seems to be possible without all those narrow definitions and planned structures, the ETL/ELT, and all the indexing for performance. But when we examine the technology carefully, and when we compare, free of prejudice, the requirements formulated in analytical projects with the functionality that the chosen services or software packages can deliver, we often find gaps on both ends. This chapter discusses the capabilities of Data Warehousing and Data Lakes and introduces the concept of the Modern Data Warehouse.

With all the innovations of the last few years, such as faster hardware, new technologies, and new dogmas such as the Data Lake, older concepts and methods are being questioned and challenged. In this chapter, I would like to explore the evolution of the analytical world and try to answer the question: is the Data Warehouse really obsolete?

We'll find out by covering the following topics:

  • Distinguishing between Data Warehouses and Data Lakes
  • Understanding the opportunities of modern cloud computing
  • Exploring the benefits of AI and ML
  • Answering the question