Learn Azure Synapse Data Explorer

By: Pericles (Peri) Rocha

Overview of this book

Large volumes of data are generated daily from applications, websites, IoT devices, and other free-text, semi-structured data sources. Azure Synapse Data Explorer helps you collect, store, and analyze such data, and work with other analytical engines, such as Apache Spark, to develop advanced data science projects and maximize the value you extract from data. This book offers a comprehensive view of Azure Synapse Data Explorer, exploring not only the core scenarios of Data Explorer but also how it integrates within Azure Synapse. From data ingestion to data visualization and advanced analytics, you’ll learn to take an end-to-end approach to maximize the value of unstructured data and drive powerful insights using data science capabilities. With real-world usage scenarios, you’ll discover how to identify key projects where Azure Synapse Data Explorer can help you achieve your business goals. Throughout the chapters, you'll also find out how to manage big data as part of a software as a service (SaaS) platform, as well as tune, secure, and serve data to end users. By the end of this book, you’ll have mastered the big data life cycle and you'll be able to implement advanced analytical scenarios from raw telemetry and log data.
Table of Contents (19 chapters)
Part 1: Introduction to Azure Synapse Data Explorer
Part 2: Working with Data
Part 3: Managing Azure Synapse Data Explorer

What makes Azure Synapse Data Explorer unique?

Although Data Explorer pools in Azure Synapse run on the same underlying service as Azure Data Explorer, some capabilities are available exclusively in Azure Synapse, and a few options differ between the two offerings. The differences are as follows:

  • Firewall: Azure Synapse workspaces include a firewall and allow you to configure IP firewall rules to grant or deny access to a workspace. This is not available in the standalone service.
  • Availability Zones: Enabled by default for Azure Synapse workspaces where Availability Zones are available. This can optionally be enabled when using Azure Data Explorer alone.
  • VM sizes for compute: Azure Data Explorer offers more than 20 different VM configurations to choose from. For Azure Synapse Data Explorer, a simplified subset of the VM configurations is offered, ranging from extra small (two cores) to large (16 cores).
  • Code control: As previously mentioned, in Azure Synapse you can connect your workspace to a Git repository hosted in Azure DevOps or GitHub. This option is not available when using the standalone service.
  • Pricing: For an Azure Synapse workspace, Data Explorer pool pricing is simplified to two meters: vCore and Storage. When using Azure Data Explorer as a standalone service, you may be charged on multiple meters, such as Compute, Storage, Networking, and the Azure Data Explorer IP markup, which applies when you use its fast data ingestion, caching, querying, and management capabilities. Additionally, Reserved Instances, which offer discounted prices when you commit to using an Azure service for a certain period (typically 1 or 3 years), are available only for the standalone Azure Data Explorer service, not for Azure Synapse Data Explorer.
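
To make the two-meter pricing model concrete, here is a minimal Python sketch of how a monthly estimate might be put together. Everything in it is an assumption for illustration: the PoolUsage type, the estimate_monthly_cost function, the billing units (per vCore-hour and per GB-month), and the rates are placeholders, not published Azure prices or any Azure API. Check the Azure pricing page for real figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolUsage:
    """Hypothetical monthly usage for a single Data Explorer pool."""
    vcores: int            # total vCores across the pool's instances
    hours_running: float   # hours the pool was running during the month
    storage_gb: float      # average amount of data stored, in GB


def estimate_monthly_cost(usage: PoolUsage,
                          vcore_hour_rate: float,
                          storage_gb_month_rate: float) -> dict:
    """Return the charge for each of the two meters, plus the total.

    Both rates are caller-supplied placeholders; the billing units
    (vCore-hour, GB-month) are assumptions made for this sketch.
    """
    vcore_charge = usage.vcores * usage.hours_running * vcore_hour_rate
    storage_charge = usage.storage_gb * storage_gb_month_rate
    return {
        "vcore": round(vcore_charge, 2),
        "storage": round(storage_charge, 2),
        "total": round(vcore_charge + storage_charge, 2),
    }


if __name__ == "__main__":
    # A large (16-core) pool running 200 hours with 500 GB stored.
    usage = PoolUsage(vcores=16, hours_running=200, storage_gb=500)
    # Made-up rates, chosen only to keep the arithmetic easy to follow.
    print(estimate_monthly_cost(usage,
                                vcore_hour_rate=0.25,
                                storage_gb_month_rate=0.02))
```

With only two meters, the estimate reduces to one multiplication per meter. An equivalent estimate for the standalone service would need additional terms for networking and the markup meter described above.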

As the preceding points show, you lose no significant functionality by choosing Azure Synapse Data Explorer over the standalone Azure Data Explorer service. Azure Synapse Data Explorer includes the benefits of the standalone service and adds the enterprise features offered by Synapse workspaces. However, is Azure Synapse Data Explorer the solution to every analytical problem? In the next section, you will find out how to decide whether you need Data Explorer pools.