Learn Azure Synapse Data Explorer

By: Pericles (Peri) Rocha

Overview of this book

Large volumes of data are generated daily from applications, websites, IoT devices, and other free-text, semi-structured data sources. Azure Synapse Data Explorer helps you collect, store, and analyze such data, and work with other analytical engines, such as Apache Spark, to develop advanced data science projects and maximize the value you extract from data. This book offers a comprehensive view of Azure Synapse Data Explorer, exploring not only the core scenarios of Data Explorer but also how it integrates within Azure Synapse. From data ingestion to data visualization and advanced analytics, you’ll learn to take an end-to-end approach to maximize the value of unstructured data and drive powerful insights using data science capabilities. With real-world usage scenarios, you’ll discover how to identify key projects where Azure Synapse Data Explorer can help you achieve your business goals. Throughout the chapters, you'll also find out how to manage big data as part of a software as a service (SaaS) platform, as well as tune, secure, and serve data to end users. By the end of this book, you’ll have mastered the big data life cycle and you'll be able to implement advanced analytical scenarios from raw telemetry and log data.

Integrating Data Explorer pools with other Azure Synapse services

As mentioned previously, before Azure Synapse, data science and advanced analytics projects required engineers to put together several pieces of a puzzle to deliver an E2E solution to users. By bringing Azure Data Explorer natively to Azure Synapse through Data Explorer pools, you no longer need to maintain external connectors and manage services separately. Furthermore, you benefit from the productivity gains of Azure Synapse workspaces, building everything you need in Azure Synapse Studio.

Data Explorer pools on Synapse workspaces offer several benefits, as detailed next.

Query experience integrated into Azure Synapse Studio’s query editor

You can query Data Explorer pools using the same tools, and with the same look and feel, that you use with dedicated or serverless SQL pools. Additionally, you can go back and forth between a KQL query on a Data Explorer pool and a T-SQL query on a dedicated SQL pool to get the full context of your data, without switching browser tabs or applications, which lets you correlate data across all your data sources. Finally, all your KQL scripts can be saved with your SQL scripts and Synapse notebooks in your workspace for future use (or committed to the Git source control system of your choice). In Figure 1.14, you can see the Develop hub bringing together all your scripts, notebooks, data flows, and Power BI reports:

Figure 1.14 – Integrated authoring experience for all your Azure Synapse assets, with source control

Note

Azure Synapse exposes an endpoint for Data Explorer pools the same way as the standalone service Azure Data Explorer. You can still use Azure Data Explorer query tools such as Kusto.Explorer, the Azure Data Explorer web UI, and even the Kusto command-line interface (CLI) to perform queries if you wish to use them.
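To make the note above concrete, here is a minimal Python sketch of querying a Data Explorer pool through its endpoint from outside Synapse Studio. It assumes the `azure-kusto-data` package is installed and that you are signed in with the Azure CLI. The endpoint pattern in `pool_query_uri`, the helper names, and the pool, workspace, and database names are illustrative assumptions, so confirm the real URI on your pool's overview page before relying on it.

```python
def pool_query_uri(pool_name: str, workspace_name: str) -> str:
    """Build the query endpoint for a Data Explorer pool.

    Assumption: the endpoint follows the pattern
    https://<pool>.<workspace>.kusto.azuresynapse.net. Check the pool's
    overview page in the Azure portal for the actual value.
    """
    return f"https://{pool_name}.{workspace_name}.kusto.azuresynapse.net"


def run_kql(cluster_uri: str, database: str, query: str):
    """Run a KQL query and return the primary result table.

    The SDK is imported lazily so that pool_query_uri() can be used
    without azure-kusto-data installed.
    """
    from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

    # Reuse the identity from `az login`; other authentication modes exist.
    kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(cluster_uri)
    client = KustoClient(kcsb)
    response = client.execute(database, query)
    return response.primary_results[0]


# Hypothetical usage (names are placeholders, not from the book):
#   uri = pool_query_uri("mydxpool", "myworkspace")
#   for row in run_kql(uri, "mydatabase", "MyTable | take 10"):
#       print(row)
```

Because the pool exposes the same kind of endpoint as a standalone Azure Data Explorer cluster, the client code is the same in both cases; only the URI differs.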

Exploring, preparing, and modeling data with Apache Spark

As discussed previously, you can simply right-click a table on a Data Explorer pool and quickly start a new Synapse notebook, where you can use your programming language of choice to explore and prepare data and to train (and consume!) ML models with Apache Spark. This also gives you access to other benefits of Apache Spark in Synapse, such as the Azure Machine Learning integration and services such as AutoML.
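The notebook that Synapse starts for you reads the table into a Spark DataFrame through the Kusto connector for Synapse. The sketch below shows roughly what such a read looks like in PySpark. It assumes it runs inside a Synapse notebook, where a `spark` session is already available, and the helper names plus the linked service, database, and table names are hypothetical placeholders rather than values from the book.

```python
# Data source name of the Kusto connector as used in Synapse notebooks.
KUSTO_SYNAPSE_FORMAT = "com.microsoft.kusto.spark.synapse.datasource"


def kusto_read_options(linked_service: str, database: str, query: str) -> dict:
    """Collect the reader options for a Data Explorer pool query."""
    return {
        "spark.synapse.linkedService": linked_service,  # assumed connection name
        "kustoDatabase": database,
        "kustoQuery": query,  # any KQL query; a bare table name also works
    }


def read_kusto(spark, linked_service: str, database: str, query: str):
    """Return a Spark DataFrame with the result of a KQL query.

    `spark` is the SparkSession that a Synapse notebook provides.
    """
    options = kusto_read_options(linked_service, database, query)
    return spark.read.format(KUSTO_SYNAPSE_FORMAT).options(**options).load()


# Hypothetical usage inside a Synapse notebook:
#   df = read_kusto(spark, "mydxpool", "mydatabase", "MyTable | take 1000")
#   df.printSchema()
```

From there, the DataFrame behaves like any other in Spark, so you can hand it to your data preparation code or to an ML training library.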

Data ingestion made easy with pipelines

There are several ways to load data into Data Explorer pools and, as you would expect, Synapse pipelines offer full, native support for the service. If you have existing pipelines and data flows, incorporating Data Explorer pools into your workflows is a simple task.

Unified management experience

Having a single pane of glass (SPOG) to manage and monitor your services is a huge productivity gain. From Azure Synapse Studio, you can create, delete, pause, resume, and scale Data Explorer pools up or down. You can also monitor the health of pools. Finally, you can control security and access-control rules. When managing settings for your Synapse workspace in the Azure portal, you will also find a central location under Analytics pools to create, pause, or delete your Data Explorer pools the same way you do for SQL and Apache Spark pools. This is illustrated in Figure 1.15.
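The same pause and resume operations can also be scripted. As a rough sketch, the Python helper below builds an Azure CLI call from the `az synapse kusto pool` command group, assuming that the Azure CLI is installed and signed in, and that `stop` and `start` correspond to pausing and resuming a pool. The function names and the pool, workspace, and resource group names are hypothetical, and you should check `az synapse kusto pool --help` for the exact flags in your CLI version.

```python
import subprocess

# Actions this sketch supports; "stop" pauses a pool and "start" resumes it.
SUPPORTED_ACTIONS = ("start", "stop", "show")


def pool_command(action: str, pool_name: str, workspace_name: str,
                 resource_group: str) -> list:
    """Build the argument list for an `az synapse kusto pool` call."""
    if action not in SUPPORTED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return [
        "az", "synapse", "kusto", "pool", action,
        "--name", pool_name,
        "--workspace-name", workspace_name,
        "--resource-group", resource_group,
    ]


def run_pool_command(action: str, pool_name: str, workspace_name: str,
                     resource_group: str) -> str:
    """Run the command and return the CLI output (JSON text by default)."""
    args = pool_command(action, pool_name, workspace_name, resource_group)
    result = subprocess.run(args, check=True, capture_output=True, text=True)
    return result.stdout


# Hypothetical usage: pause a pool overnight to save costs.
#   run_pool_command("stop", "mydxpool", "myworkspace", "my-resource-group")
```

Keeping the command construction separate from its execution makes it easy to log or review what will run before you touch a live pool.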

Figure 1.15 – Seamless experience across all analytics pools in the Azure portal

As you can see, Data Explorer is a native service in Azure Synapse and benefits from all the aspects mentioned. It's different from Power BI and Purview in the sense that you don't need to configure it as an external service. Instead, Data Explorer pools are like natural cousins of SQL pools and Apache Spark pools, and they share the same experience.
