Introduction to jobs, stages, and tasks
In this recipe, you will learn how Spark breaks down an application into jobs, stages, and tasks. You will also learn how to view directed acyclic graphs (DAGs) and how pipelining works in Spark query execution.
By the end of this recipe, you will know how to inspect the DAG that Spark generates for a query you execute and examine the jobs, stages, and tasks associated with it.
Getting ready
You can follow along by running the steps in the 3-1.Introduction to Jobs, Stages, and Tasks notebook, which can be found in the Chapter03 folder of your locally cloned repository (https://github.com/PacktPublishing/Azure-Databricks-Cookbook/tree/main/Chapter03). Follow these steps before running the notebook:
- Mount your ADLS Gen-2 account by following the steps mentioned in the Mounting Azure Data Lake Storage (ADLS) Gen-2 and Azure Blob Storage to the Azure Databricks filesystem recipe of Chapter 2, Reading and...