Book Image

Azure Data Engineering Cookbook

By : Ahmad Osama
Book Image

Azure Data Engineering Cookbook

By: Ahmad Osama

Overview of this book

Data engineering is one of the faster growing job areas as Data Engineers are the ones who ensure that the data is extracted, provisioned and the data is of the highest quality for data analysis. This book uses various Azure services to implement and maintain infrastructure to extract data from multiple sources, and then transform and load it for data analysis. It takes you through different techniques for performing big data engineering using Microsoft Azure Data services. It begins by showing you how Azure Blob storage can be used for storing large amounts of unstructured data and how to use it for orchestrating a data workflow. You'll then work with different Cosmos DB APIs and Azure SQL Database. Moving on, you'll discover how to provision an Azure Synapse database and find out how to ingest and analyze data in Azure Synapse. As you advance, you'll cover the design and implementation of batch processing solutions using Azure Data Factory, and understand how to manage, maintain, and secure Azure Data Factory pipelines. You’ll also design and implement batch processing solutions using Azure Databricks and then manage and secure Azure Databricks clusters and jobs. In the concluding chapters, you'll learn how to process streaming data using Azure Stream Analytics and Data Explorer. By the end of this Azure book, you'll have gained the knowledge you need to be able to orchestrate batch and real-time ETL workflows in Microsoft Azure.
Table of Contents (11 chapters)

Implementing vertical scaling for an Azure SQL database using PowerShell

An Azure SQL database has multiple purchase model and service tiers for different workloads. There are two purchasing models: DTU-based and vCore-based. There are multiple service tiers within the purchasing models.

Having multiple service tiers gives the flexibility to scale up or scale down based on the workload or activity in an Azure SQL database.

In this recipe, we'll learn how to automatically scale up an Azure SQL database whenever the CPU percentage is above 40%.

Getting ready

In a new PowerShell window, execute the Connect-AzAccount command and follow the steps to log in to your Azure account.

You will need an existing Azure SQL database for this recipe. If you don't have one, create an Azure SQL database by following the steps mentioned in the Provisioning and connecting to an Azure SQL database using PowerShell recipe.

How to do it…

The steps for this recipe are as follows:

  1. Execute the following PowerShell command to create an Azure Automation account:
    #Create an Azure automation account
    $automation = New-AzAutomationAccount -ResourceGroupName packtade -Name adeautomate -Location centralus -Plan Basic
  2. Execute the following command to create an Automation runbook of the PowerShell workflow type:
    #Create a new automation runbook of type PowerShell workflow
    $runbook = New-AzAutomationRunbook -Name rnscalesql -Description "Scale up sql azure when CPU is 40%" -Type PowerShellWorkflow -ResourceGroupName packtade -AutomationAccountName $automation.AutomationAccountName
  3. Execute the following command to create Automation credentials. The credentials are passed as a parameter to the runbook and are used to connect to the Azure SQL database from the runbook:
    #Create automation credentials.
    $sqladminpassword = ConvertTo-SecureString 'Sql@Server@1234' -AsPlainText -Force
    $sqladmincredential = New-Object System.Management.Automation.PSCredential ('sqladmin', $sqladminpassword)
    $creds = New-AzAutomationCredential -Name sqlcred -Description "sql azure creds" -ResourceGroupName packtade -AutomationAccountName $automation.AutomationAccountName -Value $sqladmincredential
  4. The next step is to edit the runbook and PowerShell to modify the service tier of an Azure SQL database. To do that, open https://portal.azure.com and log in to your Azure account. Under All resources, search for and open the adeautomate automation account:
    Figure 2.20 – Opening the Azure Automation account

    Figure 2.20 – Opening the Azure Automation account

  5. On the Azure Automation page, locate and select Runbooks:
    Figure 2.21 – Opening the runbook in Azure Automation

    Figure 2.21 – Opening the runbook in Azure Automation

  6. Select the rnscalesql runbook to open the runbook page. On the runbook page, select Edit:
    Figure 2.22 – Editing the runbook in your Azure Automation account

    Figure 2.22 – Editing the runbook in your Azure Automation account

  7. On the Edit PowerShell Workflow Runbook page, copy and paste the following PowerShell code onto the canvas:
    workflow rnscalesql
    { 
        param 
        ( 
            # Name of the Azure SQL Database server (Ex: bzb98er9bp) 
            [parameter(Mandatory=$true)]  
            [string] $SqlServerName, 
            # Target Azure SQL Database name  
            [parameter(Mandatory=$true)]  
            [string] $DatabaseName,  
            # When using in the Azure Automation UI, please enter the name of the credential asset for the "Credential" parameter 
            [parameter(Mandatory=$true)]  
            [PSCredential] $Credential 
        ) 
        inlinescript 
        { 
            $ServerName = $Using:SqlServerName + ".database.windows.net"
            $db = $Using:DatabaseName
            $UserId = $Using:Credential.UserName 
            $Password = ($Using:Credential).GetNetworkCredential().Password 
            $ServerName
            $db
            $UserId
            $Password
            $MasterDatabaseConnection = New-Object System.Data.SqlClient.SqlConnection 
            $MasterDatabaseConnection.ConnectionString = "Server = $ServerName; Database = Master; User ID = $UserId; Password = $Password;" 
            $MasterDatabaseConnection.Open(); 
            $MasterDatabaseCommand = New-Object System.Data.SqlClient.SqlCommand 
            $MasterDatabaseCommand.Connection = $MasterDatabaseConnection 
            $MasterDatabaseCommand.CommandText =  
                " 
                    ALTER DATABASE $db MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S0');
                    
                "        
            $MasterDbResult = $MasterDatabaseCommand.ExecuteNonQuery();
        } 
    }

    The preceding code modifies the service tier of the given Azure SQL database to Standard S0.

  8. Click Save, and then click Publish to publish the runbook:
    Figure 2.23 – Saving and publishing the runbook

    Figure 2.23 – Saving and publishing the runbook

  9. The next step is to create a webhook to trigger the runbook. Execute the following command to create the webhook:
    # define the runbook parameters
    $Params = @{"SQLSERVERNAME"="azadesqlserver";"DATABASENAME"="azadesqldb";"CREDENTIAL"="sqlcred"}
    # Create a webhook
    $expiry = (Get-Date).AddDays(1)  
    New-AzAutomationWebhook -Name whscaleazure -RunbookName $runbook.Name -Parameters $Params -ResourceGroupName packtade -AutomationAccountName $automation.AutomationAccountName -IsEnabled $true -ExpiryTime $expiry

    Note

    When defining $Params, you may want to change the default values mentioned here if you have a different Azure SQL server, database, and cred values.

    You should get an output as shown in the following screenshot:

    Figure 2.24 – Creating a webhook

    Figure 2.24 – Creating a webhook

    Copy and save the WebhookURI value for later use.

  10. The next step is to create an alert for an Azure SQL database that when triggered will call the webhook URI. Execute the following query to create an alert action group receiver:
    #Create action group reciever
    $whr = New-AzActionGroupReceiver -Name agrscalesql -WebhookReceiver -ServiceUri "https://s25events.azure-automation.net/webhooks?token=NfL30nj%2fkuSo8TTT7CqDwRI WEdeXR1lklkK%2fzgELCiY%3d"

    Note

    Replace the value of the ServiceUri parameter with your webhook URI from the previous step.

  11. Execute the following query to create an action group with an action receiver as defined by the preceding command:
    #Create a new action group.
    $ag = Set-AzActionGroup -ResourceGroupName packtade -Name ScaleSQLAzure -ShortName scaleazure -Receiver $whr
  12. Execute the following query to create an alert condition to trigger the alert:
    #define the alert trigger condition
    $condition = New-AzMetricAlertRuleV2Criteria  -MetricName "cpu_percent" -TimeAggregation maximum -Operator greaterthan -Threshold 40 -MetricNamespace "Microsoft.Sql/servers/databases"

    The condition defines that the alert should trigger when the metric CPU percentage is greater than 40%.

  13. Execute the following query to create an alert on the Azure SQL database:
    #Create the alert with the condition and action defined in previous steps.
    $rid = (Get-AzSqlDatabase -ServerName azadesqlserver -ResourceGroupName packtade -DatabaseName azadesqldb).Resourceid
    Add-AzMetricAlertRuleV2 -Name monitorcpu -ResourceGroupName packtade -WindowSize 00:01:00 -Frequency 00:01:00 -TargetResourceId $rid -Condition $condition  -Severity 1 -ActionGroupId $ag.id

    You should get an output as shown in the following screenshot:

    Figure 2.25 – Creating the alert

    Figure 2.25 – Creating the alert

    The preceding command creates an Azure SQL database alert. The alert is triggered when the cpu_percent metric is greater than 40% for more than 1 minute. When the alert is triggered, as defined in the action group, the webhook is called. The webhook in turn runs the runbook. The runbook modifies the service tier of the database to Standard S0.

  14. To see the alert in action, connect to the Azure SQL database and execute the following query to simulate high CPU usage:
    --query to simulate high CPU usage
    While(1=1)
    Begin
    Select cast(a.object_id as nvarchar(max)) from sys.objects a, sys.objects b,sys.objects c, sys.objects d
    End

    As soon as the alert condition is triggered, the webhook is called and the database service tier is modified to Standard S0.

How it works…

To configure automatic scaling for an Azure SQL database, we create an Azure Automation runbook. The runbook specifies the PowerShell code to modify the service tier of an Azure SQL database.

We create a webhook to trigger the runbook. We create an Azure SQL database alert and define the alert condition to trigger when the cpu_percent metric is greater than 40% for at least 1 minute. In the alert action, we call the webhook defined earlier.

When the alert condition is reached, the webhook is called, which in turn executes the runbook, resulting in the Azure SQL database service tier change.