Learn Azure Synapse Data Explorer

By: Pericles (Peri) Rocha

Overview of this book

Large volumes of data are generated daily from applications, websites, IoT devices, and other free-text, semi-structured data sources. Azure Synapse Data Explorer helps you collect, store, and analyze such data, and work with other analytical engines, such as Apache Spark, to develop advanced data science projects and maximize the value you extract from data. This book offers a comprehensive view of Azure Synapse Data Explorer, exploring not only the core scenarios of Data Explorer but also how it integrates within Azure Synapse. From data ingestion to data visualization and advanced analytics, you’ll learn to take an end-to-end approach to maximize the value of unstructured data and drive powerful insights using data science capabilities. With real-world usage scenarios, you’ll discover how to identify key projects where Azure Synapse Data Explorer can help you achieve your business goals. Throughout the chapters, you’ll also find out how to manage big data as part of a software as a service (SaaS) platform, as well as tune, secure, and serve data to end users. By the end of this book, you’ll have mastered the big data life cycle and you’ll be able to implement advanced analytical scenarios from raw telemetry and log data.
Table of Contents (19 chapters)

  • Part 1: Introduction to Azure Synapse Data Explorer
  • Part 2: Working with Data
  • Part 3: Managing Azure Synapse Data Explorer

Technical requirements

To build your own environment and experiment with the tools shown in this chapter (and throughout the book), you will need an Azure account and a subscription. If you don’t have an Azure account, you can create one for free at https://azure.microsoft.com/free/. Microsoft offers $200 in Azure credit for 30 days, as well as some popular services for free for 1 year. Azure Synapse is not one of the free services, but you should be able to use your free credit to run most examples in this book as long as you adhere to the following practices:

  • Use the smallest pool sizes: Azure Synapse Data Explorer offers pool sizes ranging from extra small (2 cores per instance) to large (16 cores per instance). Picking the smallest pool size saves money, and it is still enough to learn about Azure Synapse Data Explorer.
  • Keep your scale to a minimum: As with pool sizes, you don’t need several instances running on your cluster to learn about Azure Synapse Data Explorer. Avoid using autoscale (discussed in Chapter 2), and keep your instance count at the minimum of two.
  • Manage your storage: Azure Synapse Data Explorer also charges you for storage usage, so if you’re trying to save costs in your learning journey, keep only the data you need for your testing.
  • Stop your pools when not in use: You are charged for the time your cluster is running, even if you are not using it. Stop your Data Explorer pools when you are done with your experiments so that you are not charged for idle running time. You can resume your pools the next time you need them.
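To get a feel for how these practices interact, the arithmetic can be sketched in a few lines of Python. This is only a rough illustration, not an official pricing calculator: the `PoolConfig` class, the `estimate_compute_cost` function, and in particular the price per core-hour are hypothetical placeholders, and the sketch assumes compute is billed in proportion to total cores and running hours. Check the Azure pricing page for real rates in your region.

```python
from dataclasses import dataclass


@dataclass
class PoolConfig:
    """Hypothetical description of a Data Explorer pool, for estimation only."""
    cores_per_instance: int  # 2 for extra small, 16 for large (from the text above)
    instance_count: int      # the text recommends staying at the minimum of two
    hours_running: float     # hours the pool is running; stopped time is excluded


def total_cores(pool: PoolConfig) -> int:
    """Total cores across all instances in the pool."""
    return pool.cores_per_instance * pool.instance_count


def estimate_compute_cost(pool: PoolConfig, price_per_core_hour: float) -> float:
    """Compute cost, ASSUMING billing proportional to cores and running hours."""
    return total_cores(pool) * pool.hours_running * price_per_core_hour


# Placeholder rate chosen only to make the arithmetic easy to follow.
# This is NOT a real Azure price.
HYPOTHETICAL_PRICE_PER_CORE_HOUR = 0.25

extra_small = PoolConfig(cores_per_instance=2, instance_count=2, hours_running=10)
large = PoolConfig(cores_per_instance=16, instance_count=2, hours_running=10)

for name, pool in [("extra small", extra_small), ("large", large)]:
    cost = estimate_compute_cost(pool, HYPOTHETICAL_PRICE_PER_CORE_HOUR)
    print(f"{name}: {total_cores(pool)} cores, estimated compute cost {cost:.2f}")
```

Under these assumptions, a large pool uses eight times the cores of an extra small pool for the same running time, which is why pool size, instance count, and stopping idle pools are the main levers for staying within the free credit. Storage charges are separate and are not modeled here.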

One or more examples in this chapter make use of the New York Yellow Taxi open dataset available at https://docs.microsoft.com/en-us/azure/open-datasets/dataset-taxi-yellow?tabs=azureml-opendatasets.

Note

The Azure free account offer may not be available in your country. Please check the conditions before you apply.