Scalable Data Analytics with Azure Data Explorer

By: Jason Myerscough

Overview of this book

Azure Data Explorer (ADX) enables developers and data scientists to make data-driven business decisions. This book will help you rapidly explore and query your data at scale and secure your ADX clusters. The book begins by introducing you to ADX, its architecture, core features, and benefits. You'll learn how to securely deploy ADX instances, navigate the ADX Web UI, ingest data, and query and visualize your data using the powerful Kusto Query Language (KQL). Next, you'll get to grips with KQL operators and functions to efficiently query and explore your data, as well as perform time series analysis and search for anomalies and trends in your data. As you progress through the chapters, you'll explore advanced ADX topics, including deploying your ADX instances using Infrastructure as Code (IaC). The book also shows you how to manage your cluster performance and monthly ADX costs by handling cluster scaling and data retention periods. Finally, you'll understand how to secure your ADX environment by restricting access, and you'll learn best practices for improving your KQL query performance. By the end of this Azure book, you'll be able to securely deploy your own ADX instance, ingest data from multiple sources, rapidly query your data, and produce reports with KQL and Power BI.

Understanding data ingestion

Before learning how data ingestion works with ADX, let's revisit the different types of data:

Structured data: When we think of structured data, we think of relational databases that are made up of tables consisting of rows and columns. Each column has a data type, such as integer or string, and may carry additional constraints, such as a fixed length or a specific format (a postcode, for example).
Semi-structured data: When we think of semi-structured data, we think of formats such as JSON and XML. They define structure with tags or keys, but the format is typically less rigid than that of a relational database.
Unstructured data: Unstructured data has no predefined structure or constraints. Examples include SMS messages, text files, emails, and social media content such as status posts, messages, and images.
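
As a rough illustration of the three categories, the following Python sketch handles one small sample of each. It is not ADX code and does not use any ADX API; the column names, field names, and sample values are invented for the example.

```python
import csv
import io
import json

# Structured: every row has the same columns, and each column has a known type.
# (Hypothetical sample data, not taken from the book.)
structured_csv = "event_id,state,damage_usd\n1,TEXAS,2500\n2,OHIO,0\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))
total_damage = sum(int(row["damage_usd"]) for row in rows)

# Semi-structured: fields are self-describing, but records can differ in shape.
semi_structured = [
    json.loads('{"event_id": 1, "state": "TEXAS", "details": {"source": "radar"}}'),
    json.loads('{"event_id": 2, "state": "OHIO"}'),
]
# The second record has no "details" field, so we fall back to None.
sources = [rec.get("details", {}).get("source") for rec in semi_structured]

# Unstructured: free text with no schema; any structure has to be inferred.
unstructured = "Heavy hail reported near Austin around 5pm, some roof damage."
mentions_hail = "hail" in unstructured.lower()

print(total_damage)   # 2500
print(sources)        # ['radar', None]
print(mentions_hail)  # True
```

The structured sample can be summed directly because every row is guaranteed to have the same columns, the semi-structured sample needs a fallback for the missing field, and the unstructured sample can only be searched as text.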

As shown in Figure 4.1, ADX supports four categories of services that enable data ingestion:

Figure 4.1 – Data analysis pipeline
