Book Image

Azure Data Engineering Cookbook

By : Ahmad Osama
Book Image

Azure Data Engineering Cookbook

By: Ahmad Osama

Overview of this book

Data engineering is one of the faster growing job areas as Data Engineers are the ones who ensure that the data is extracted, provisioned and the data is of the highest quality for data analysis. This book uses various Azure services to implement and maintain infrastructure to extract data from multiple sources, and then transform and load it for data analysis. It takes you through different techniques for performing big data engineering using Microsoft Azure Data services. It begins by showing you how Azure Blob storage can be used for storing large amounts of unstructured data and how to use it for orchestrating a data workflow. You'll then work with different Cosmos DB APIs and Azure SQL Database. Moving on, you'll discover how to provision an Azure Synapse database and find out how to ingest and analyze data in Azure Synapse. As you advance, you'll cover the design and implementation of batch processing solutions using Azure Data Factory, and understand how to manage, maintain, and secure Azure Data Factory pipelines. You’ll also design and implement batch processing solutions using Azure Databricks and then manage and secure Azure Databricks clusters and jobs. In the concluding chapters, you'll learn how to process streaming data using Azure Stream Analytics and Data Explorer. By the end of this Azure book, you'll have gained the knowledge you need to be able to orchestrate batch and real-time ETL workflows in Microsoft Azure.
Table of Contents (11 chapters)

Implementing active geo-replication for an Azure SQL database using PowerShell

The active geo-replication feature allows you to create up to four readable secondaries of a primary Azure SQL database. Active geo-replication uses SQL Server AlwaysOn to asynchronously replicate transactions to the secondary databases. The secondary database can be in the same or a different region than the primary database.

Active geo-replication can be used for the following cases:

  • To provide business continuity by failing over to the secondary database in case of a disaster. The failover is manual.
  • To offload reads to the readable secondary.
  • To migrate a database to a different server in another region.

In this recipe, we'll configure active geo-replication for an Azure SQL database and perform a manual failover.

Getting ready

In a new PowerShell window, execute the Connect-AzAccount command and follow the steps to log in to your Azure account.

You need an existing Azure SQL database for this recipe. If you don't have one, create an Azure SQL database by following the steps mentioned in the Provisioning and connecting to an Azure SQL database using PowerShell recipe.

How to do it…

First, let's create a readable secondary.

Creating a readable secondary

The steps are as follows:

  1. Execute the following command to provision a new Azure SQL server to host the secondary replica:
    #create credential object for the Azure SQL Server admin credential
    $sqladminpassword = ConvertTo-SecureString 'Sql@Server@1234' -AsPlainText -Force
    $sqladmincredential = New-Object System.Management.Automation.PSCredential ('sqladmin', $sqladminpassword)
    New-AzSQLServer -ServerName azadesqlsecondary -SqlAdministratorCredentials $sqladmincredential -Location westus -ResourceGroupName packtade
  2. Execute the following command to configure the geo-replication from the primary server to the secondary server:
    $primarydb = Get-AzSqlDatabase -DatabaseName azadesqldb -ServerName azadesqlserver -ResourceGroupName packtade
    $primarydb | New-AzSqlDatabaseSecondary -PartnerResourceGroupName packtade -PartnerServerName azadesqlsecondary -AllowConnections "All"

    You should get an output as shown in the following screenshot:

Figure 2.9 – Configuring geo-replication

Figure 2.9 – Configuring geo-replication

Moreover, we can also check this on the Azure portal, as shown in the following screenshot:

Figure 2.10 – Verifying geo-replication in the Azure portal

Figure 2.10 – Verifying geo-replication in the Azure portal

Performing manual failover to the secondary

The steps are as follows:

  1. Execute the following command to manually failover to the secondary database:
    $secondarydb = Get-AzSqlDatabase -DatabaseName azadesqldb -ServerName azadesqlsecondary -ResourceGroupName packtade
    $secondarydb | Set-AzSqlDatabaseSecondary -PartnerResourceGroupName packtade -Failover

    The preceding command performs a planned failover without data loss. To perform a manual failover with data loss, use the Allowdataloss switch.

    If we check the Azure portal, we'll see that azadesqlsecondary/azadesqldb in West US is the primary database:

    Figure 2.11 – Failing over to the secondary server

    Figure 2.11 – Failing over to the secondary server

  2. We can also get the active geo-replication information by executing the following command:
    Get-AzSqlDatabaseReplicationLink -DatabaseName azadesqldb -PartnerResourceGroupName packtade -PartnerServerName azadesqlsecondary -ServerName azadesqlserver -ResourceGroupName packtade

    You should get an output as shown in the following screenshot:

Figure 2.12 – Getting the geo-replication status

Figure 2.12 – Getting the geo-replication status

Removing active geo-replication

Execute the following command to remove the active geo-replication link between the primary and the secondary databases:

$primarydb = Get-AzSqlDatabase -DatabaseName azadesqldb -ServerName azadesqlserver -ResourceGroupName packtade
$primarydb | Remove-AzSqlDatabaseSecondary -PartnerResourceGroupName packtade -PartnerServerName azadesqlsecondary

You should get an output as shown in the following screenshot:

Figure 2.13 – Removing active geo-replication

Figure 2.13 – Removing active geo-replication

How it works…

To configure active geo-replication, we use the New-AzSqlDatabaseSecondary command. This command expects the primary database name, server name, and resource group name, and the secondary resource group name, server name, and the Allow connections parameter. If we want a readable secondary, then we set Allow connections to All; otherwise, we set it to No.

The active geo-replication provides manual failover with and without data loss. To perform a manual failover, we use the Set-AzSqlDatabaseSecondary command. This command expects the secondary server name, database name, resource group name, a failover switch, and the Allowdataloss switch in case of failover with data loss.

To remove active geo-replication, we use the Remove-AzSqlDatabaseSecondary command. This command expects the secondary server name, secondary database name, and resource name to remove the replication link between the primary and the secondary database.

Removing active geo-replication doesn't remove the secondary database.