The Azure SQL portfolio
Microsoft Azure SQL is a modern SQL portfolio of offerings for storing relational data as a service. It is powered by the industry-leading SQL Server engine, which has evolved greatly over the years, retaining backward compatibility at the code level and continuing to provide monumental leaps in performance and storage capacity version over version. Some of the Azure SQL offerings are evergreen, meaning the offering is always up to date with the latest updates and patches. Because each offering is essentially based on the same SQL Server code, database administrators (DBAs) and developers can often use the tools and resources they are already familiar with from their past SQL Server experience, including graphical, command-line, and T-SQL–based tools, for much of the work they need to do.
For many organizations, having to build, house, and manage hardware and software is a large burden for a variety of reasons, but in most cases, cost and security are the most important. Beyond the easily quantified cost of purchasing hardware and software, there are costs in finding qualified persons to manage the hardware, operating systems (OSes), and database platform, all before considering the day-to-day operations such as tuning queries and executing backups.
Using the Azure platform, upgrades to VM type and size can be done by a simple UI operation rather than buying new hardware, configuring it, and migrating all of your data onto it (that process may still occur, but your experience is often checking a box or sliding a slider on a web page and letting the automation do the work for you.)
The second reason is the most important: security. What all the data breaches in recent history have demonstrated is that most databases are accessible from the internet in some manner. Having the management and security of your infrastructure in the hands of a company such as Microsoft pushes the technology burdens of a very large part of securing your data onto them. You can take comfort in the fact that the entire Azure business model rests on the security of all its customers' data, including yours. It will still be your responsibility to build proper security principals with adequate passwords and two-factor authentication, as no security will stop a user with proper credentials from accessing your online resources.
Azure has multiple other database management systems in the Data Platform portfolio for different types of database needs, including Cosmos DB1, and Synapse Analytics2, to name a few. Here's a full list of current products in the Azure family of services3. Note that Synapse Analytics uses relational tables, but it is focused on large-scale, specialized analytics. This chapter focuses solely on the relational SQL Server–based offerings (Azure also has relational database offerings such as MySQL4, PostgreSQL5, and MariaDB6).
The Azure SQL portfolio provides a consistent and unified management experience spanning three SQL Server offerings in Azure, each with its own targeted use cases. Almost any of the offerings will be perfectly acceptable to provide support for transaction processing (commonly referred to as online transaction processing (OLTP)) as well as most analytics (reporting) scenarios. Each offering is positioned to provide different levels of service, as we will discuss. The three offerings we will discuss specifically are:
- SQL Server on Azure Virtual Machines.
- Azure SQL Managed Instance.
- Azure SQL Database.
There is one additional method of deployment using containers that we will not be specifically covering in this book. The container method is very similar in functionality to the VM deployment, except the VM is replaced by a lightweight, GUI-free container running Linux or Windows using Kubernetes or Docker.
In the following sections, we will introduce each of these offerings to provide you with an overview of their strengths and ideal usages. Each of the offerings provides you with the same SQL Server relational engine internals for storing and querying data using T-SQL. Each will have the same data manipulation language (DML), with only minor differences in data definition language (DDL) due to physical implementation differences. While some of the management tools and methods supported by each platform are different, the primary difference is based on how managed the service is.
SQL Server on Azure Virtual Machines
SQL Server on Azure Virtual Machines7 (or Azure SQL VMs for short) and indeed any of the Azure Virtual Machines offerings are considered IaaS. This is because Microsoft manages the hardware infrastructure, but you manage the software. As the DBA managing the server, it is generally no different than managing SQL Server on a computer that resides in your own server room.
When you create an Azure SQL VM, you are given the opportunity to use a pre-built VM image that has a supported version of SQL Server pre-installed, or you can choose to bring your own media to install from. There are licensing differences and benefits to both models, but we will not even begin to try to cover licensing in this book. Here are some more insights offered by Microsoft on Azure VM licensing8.
Whether you use a pre-built image or bring your own software, the VM can take advantage of some automation by using the SQL Server IaaS Agent Extension (not currently available on an Azure VM running Linux at the time of this book's publishing), which provides automated backup and patching capabilities, as well as configuration assistance with Azure Key Vault integration to store encryption keys outside of SQL Server. You are also fully able to use any method you wish for these tasks, including SQL Server Maintenance Plans or even third-party backup scripts and tools. Some additional tooling may be necessary in any case, because backups are just part of the regular upkeep needed for a healthy database that is even lightly used. The Agent Extension (along with several other features) is enabled automatically when using a pre-built image, or by registering9 your VM with the SQL VM resource provider.
It is important to catch the distinction between automation and a managed service. Automation provides tools that you can use to make managing your server easier. With the PaaS model of the next two Azure offerings, you don't need to monitor to see whether backups have failed, nor do you even need to do anything to ensure that your server is backed up. The Azure platform management system backs up your server based on the settings you choose (and you can't even accidentally choose not to back up at all either). With IaaS, only the hardware is truly managed by Microsoft. It is your responsibility to back up your databases and make sure those backups can be restored, even when using the SQL Server IaaS Agent Extension.
Managing and supporting are two different concepts. A supported service means the host will help you if the software is not working properly. A managed service will have the host in charge of making sure things work properly based on your configuration.
An SQL Server VM gives you a highly compatible method to lift and shift many workload types to the cloud. This includes transactional workloads capturing customer orders or business intelligence workloads using analytics features such as Machine Learning Services, Reporting Services, Analysis Services, and so on. This is because the Azure VM presents itself as very much the same as your on-premises hardware, the only real difference being how you configure networking and security over the internet to work with your local security infrastructure. For SQL servers that use SQL authentication, the application will require little, if any, change, but using Active Directory will require some configuration. (Using SQL-based authentication is not considered as good as using Active Directory integration for many reasons. Chapter 3 will cover security, including integrating with your existing Active Directory.)
One major choice you have, beginning with SQL Server 2017 and continuing into SQL Server 2019, is which OS to choose. Beyond Windows Server, SQL Server will run on Red Hat Enterprise Linux (RHEL), SUSE Linux Enterprise Server (SLES), and Ubuntu. This allows SQL Server to be run on any OS used by an organization that's heavily invested in open-source software (OSS), while still providing virtually all the database features of the Windows version.
Before we dig deeper into the topic surrounding OS choice for virtual machines, there are still two configurations of SQL in Azure that we want to cover, because they offer specific benefits that the IaaS platform may not.
Azure SQL Managed Instance
The Azure SQL Managed Instance deployment option is a PaaS database offering that sits right in the middle of the Azure SQL offerings in terms of management and is one of two Azure SQL PaaS offerings. It targets scenarios where the customer needs much of the rich functionality of the full SQL Server product but desires the value of the platform-managed services model as well.
One of the biggest differences between the managed instance model and the Azure SQL VM model is that a managed instance will always be on the most up-to-date version of SQL Server (a database can be set to an earlier compatibility level if needed). This means, at the time of this book's writing in early 2020, managed instance–hosted databases provide the user with at least everything that SQL Server 2019 has to offer, up to the latest public cumulative update (CU), including all the new and improved features in SQL Server 2019 (we will provide an overview later in this chapter). Soon after the next version of SQL Server is released, managed instances will be upgraded as part of the offering.
While it is extremely rare that Microsoft removes features from SQL Server (in fact, backward compatibility is an important part of their story), there are frequent changes in how the engine optimizes queries, so be sure to keep up with performance changes in your application. This process of determining and mitigating performance issues after releases is made all the easier by the Query Store feature10.
The managed instance model will automatically maintain backups11 of your data, restorable to a point in time, based on the level of retention you configure. It safeguards your backups from disaster by using read-access geo-redundant storage (RA-GRS).
While backups are part of the managed instance package, geo-replication (that is, maintaining active copies of your data in multiple regions in case of catastrophic failure) is not. The managed instance option provides you with Auto-Failover Groups12 you can configure like in SQL Server 2019. This lets you configure your server to fail over to a different Azure datacenter if desired.
In many ways, a managed instance looks and behaves just like a typical SQL Server, particularly when dealing with DML code, instance security, and tools such as SQL Agent. (SQL Agent is limited in the types of jobs it can schedule, due to a lack of OS and file system access.) There are important considerations, such as the inability to restore a database from a managed instance to a local or VM SQL Server instance or to use SQL Server Integration Services (SSIS) in the same way as you do with a typical instance. (SSIS projects will work using Azure Data Factory13, but the infrastructure of the SSIS DB does not exist natively in the managed instance itself. For more details, you should refer to Microsoft's guide to migrating SSIS packages to Azure SQL Managed Instance14. Any code that needs access to the file system, such as for processing files for import or export, will need to be modified to work with Azure Blob storage15 instead.
Managed instances provide a lot of parity with an up-to-date version of SQL Server, providing a way to lift and shift many workloads to the cloud with a reasonable amount of work. However, there are limitations, such as always being on the most recent version of SQL Server, a lack of access to the file system, and not being able to restore to a local SQL Server instance that may make this untenable for many scenarios, leading to using the IaaS VM offering.
In the next section, we will look at one more Azure SQL offering that goes deeper into the managed aspect of PaaS offerings, even beyond what you have in a managed instance. It may require more adjustment from an existing on-premises environment but offers conveniences in management that leave you to mostly design and develop tables and code.
Azure SQL Database
Azure SQL Database16 (or SQL Database for short) is the fully managed version of SQL Server, falling deeper into the category of PaaS than the managed instance model. Using SQL Database, you are provided with a database container (or containers, in more complex configurations) in which to create tables and coded objects. The T-SQL features of SQL Database may actually be more advanced than what you get with the other offerings, as they often release features to SQL Database first.
There are many configurations to choose from in terms of performance and size, starting small and scaling up to very large database sizes. In fact, using Hyperscale17 for SQL Database, this platform's offerings can currently support up to 100 TB of data, so the amount of space and computing power is not generally a limiting factor.
The SQL Database offering is tailored to cloud applications with the least amount of legacy management dependencies. This does not mean that your configuration must be one simple database, however, and at the time of writing this book, there are three options for how you can set up your SQL Database environment:
- Single database: A single database that can be used to store data. This will feel like an SQL Server instance to the user/administrator, with access to a TempDB and master database. If you have connected to a contained database in SQL Server, it is conceptually similar except the boundaries are far more firm. Compute resources are available via pre-provisioned or serverless options18, ensuring sufficient resources for both consistent and highly unpredictable workloads.
- Elastic Pool: Multiple independent single-database configurations that share the same set of computing resources. This allows you to have low-use and high-use pattern databases in the same pool, where both databases can use up to the maximum resources when needed, but not pay for large amounts of resources dedicated to one database when it is only rarely needed. This is particularly effective when the databases see heavy utilization at different times of the day.
- Database server: A group of single databases and elastic pools, banded together for administrative purposes, for things such as networking, security principals (logins), policies, and so on.
As the consumer, you are getting the SQL Server database as a platform to store data.
Some aspects of the table structures you create on Azure SQL Database may need to be different from the other offerings. One key difference is that every table will require a clustered index to enable the use of replication of your data to redundant storage. (Using clustered indexes on all tables is considered a best practice in most cases anyhow.) You have no control over where your data is located (outside of the region of the world), or the computer the data is located on. This is all done for you.
Just like managed instances, backups are part of the managed package, but additionally, high availability via geo-replication is also supported as part of the management of your database, with very little configuration. Microsoft's guide offers more insights on active geo-replication19.
A feature that is specific to SQL Database that is particularly useful is automated indexing20. Using the automatic index management feature, SQL Database can apply a
DROP INDEX automatically based on what the optimizer has recognized as a given need, then monitor that change to see whether it has helped or harmed performance and adjust accordingly.
The SQL Database offering provides the most management for you but may not exactly match the needs of an organization with a well-established working system in an SQL Server instance.
Now that we have covered these three offerings, in the next section, we will outline some of the important differences between them.
SQL Server in Azure comparisons
All three of the Azure SQL offerings that we have introduced in this section are based on the same SQL Server 2019 database engine, but there are key differences between them. In the following table, we list some of the key differences to remember when considering which option to choose:
Figure 1.1: Azure SQL offerings
In the next section, we will highlight the key new features that were included in SQL Server 2019 that make it a worthwhile upgrade from earlier versions (including from the most recent version before 2019: SQL Server 2017).