Learn Azure Synapse Data Explorer

By: Pericles (Peri) Rocha

Overview of this book

Large volumes of data are generated daily from applications, websites, IoT devices, and other free-text, semi-structured data sources. Azure Synapse Data Explorer helps you collect, store, and analyze such data, and work with other analytical engines, such as Apache Spark, to develop advanced data science projects and maximize the value you extract from data. This book offers a comprehensive view of Azure Synapse Data Explorer, exploring not only the core scenarios of Data Explorer but also how it integrates within Azure Synapse. From data ingestion to data visualization and advanced analytics, you’ll learn to take an end-to-end approach to maximize the value of unstructured data and drive powerful insights using data science capabilities. With real-world usage scenarios, you’ll discover how to identify key projects where Azure Synapse Data Explorer can help you achieve your business goals. Throughout the chapters, you'll also find out how to manage big data as part of a software as a service (SaaS) platform, as well as tune, secure, and serve data to end users. By the end of this book, you’ll have mastered the big data life cycle and you'll be able to implement advanced analytical scenarios from raw telemetry and log data.
Table of Contents (19 chapters)
Part 1 Introduction to Azure Synapse Data Explorer
Part 2 Working with Data
Part 3 Managing Azure Synapse Data Explorer

Introducing Azure Synapse Data Explorer

Every day, applications and devices connected to the internet generate massive amounts of data. To give some perspective, we expect 50 billion connected devices to be generating data by 2030, and up to 175 zettabytes (ZB) of data to be generated by 2025 (from every possible source). As more connected devices reach the market every year, and as companies make greater use of unstructured data from application logs, the amount of data generated daily will become difficult to measure. In fact, some companies keep certain types of data, such as telemetry and application logs, only for a limited period (such as 90 to 120 days) because, even though storage has never been cheaper, storing and managing large volumes of data can quickly become cost-prohibitive.

Being able to store, manage, and quickly analyze unstructured data has become a critical business need for most companies. From application logs, you can predict the behavior of users and respond quickly to user demand. By analyzing device telemetry, you can anticipate hardware failures, reduce downtime in factories, predict the weather, and detect patterns that help optimize your operation. Most importantly, the ability to correlate application and device data, apply machine learning (ML) algorithms, and visualize data in real time allows you to respond quickly to operational challenges, as well as customer and market demands.

Azure Synapse Data Explorer complements the Synapse Structured Query Language (Synapse SQL) engine and Apache Spark engine already present in Azure Synapse to offer a big data service that helps acquire, store, and manage big data to unlock insights from device telemetry and application logs. It works just like the Azure Data Explorer standalone service, but with the benefit of tightly integrating with the other services offered by Azure Synapse, allowing you to build end-to-end (E2E) advanced analytics projects from data ingestion to rich visualizations using Power BI.

By the end of this chapter, you should have a thorough understanding of where Azure Synapse Data Explorer fits in the data lifecycle, how to describe the service and differentiate it from the standalone service, and when to use Data Explorer pools in Azure Synapse.

In this chapter, we will go through the following topics:

  • Understanding the lifecycle of data
  • Introducing the Team Data Science Process
  • The need for a fast and highly scalable data exploration service
  • What is Azure Synapse?
  • What is Azure Synapse Data Explorer?
  • Integrating Data Explorer pools with other Azure Synapse services
  • Exploring the Data Explorer pool infrastructure and scalability
  • What makes Azure Synapse Data Explorer unique?
  • When to use Azure Synapse Data Explorer