Book Image

SQL Server on Azure Virtual Machines

By : Joey D'Antoni, Louis Davidson, Allan Hirt, John Martin, Anthony Nocentino, Tim Radney, Randolph West
Book Image

SQL Server on Azure Virtual Machines

By: Joey D'Antoni, Louis Davidson, Allan Hirt, John Martin, Anthony Nocentino, Tim Radney, Randolph West

Overview of this book

Deploying SQL Server on Azure virtual machines allows you to work on full versions of SQL Server in the cloud without having to maintain on-premises hardware. The book begins by introducing you to the SQL portfolio in Azure and takes you through SQL Server IaaS scenarios, before explaining the factors that you need to consider while choosing an OS for SQL Server in Azure VMs. As you progress through the book, you'll explore different VM options and deployment choices for IaaS and understand platform availability, migration tools, and best practices in Azure. In later chapters, you'll learn how to configure storage to achieve optimized performance. Finally, you'll get to grips with the concept of Azure Hybrid Benefit and find out how you can use it to maximize the value of your existing on-premises SQL Server. By the end of this book, you'll be proficient in administering SQL Server on Microsoft Azure and leveraging the tools required for its deployment.
Table of Contents (10 chapters)

SQL Server 2019 highlights

SQL Server 2019 has a lot of new and enhanced features that make it not only a world-class relational database engine, but a world-class data platform. In this section, we will be taking a brief look at what SQL Server 2019 adds on from previous versions, as well as the features that have been added to running SQL Server on a Linux-based platform in SQL Server 2019.

You can read more about the new features available in SQL Server 2019 on Microsoft documentation21 or the Microsoft SQL Server 2019 Technical white paper22.

Intelligence over all of your data

Intelligence over all of your data is a phrase that you can find on multiple Microsoft websites describing an important SQL Server feature for integrating your disparate data sources: PolyBase. PolyBase23 is a feature that can be used to virtualize data from external sources including Azure Cosmos DB, Azure Blob Storage, and starting in SQL Server 2019, data in SQL Server, Oracle, Teradata, and MongoDB. Once connected and configured, you can query this data just like any normal relational table (as well as joined with your local relational data).

Additionally, SQL Server 2019 introduces Big Data Clusters24, providing scale-out capabilities with clusters of SQL Server, Apache Spark, and Hadoop Distributed File System (HDFS) data, allowing reading and writing of large quantities of data stored in SQL Server or big data sources.

Both features allow you to bring relational data and big data together to provide a uniform platform for data processing. This allows you to use tools such as Machine Learning Language Extensions, AI with Machine Learning Services, Reporting Services, and even SQL Server's primary language T-SQL with data from very differently formatted data structures, without doing any copying or transformation of data.

Enhancements in developer experience

SQL Server 2019 includes several important improvements to the developer experience that help to make developing data-based solutions easier:

  • UTF-8 support: Allows you to store Unicode data in the char and varchar data types via new collation support. This allows you to store data in the very popular Unicode encoding format natively, not requiring translation into the UTF-16 standard that is used in nchar and nvarchar columns. For more information, refer to the Microsoft documentation25 that covers collation and Unicode support in SQL Server. For an in-depth commentary on UTF-8 in SQL Server, refer to the enlightening blog26 by Pedro Lopez.
  • Machine Learning Services: Machine Learning Services27 enables R and Python support, allowing T-SQL to employ machine learning models where your data lives. It allows processing data at the partition level, rather than only at the object level, enabling parallel processing. Additionally, Machine Learning Services can now be used with Failover Clusters.
  • SQL Server Language Extensions and the Extensibility Framework: Improvements made to allow additional languages to be run in a very similar manner to R and Python, which are used by Machine Learning Services. This gives developers more choices for running established code right in the SQL Server engine. Currently, only Java is supported, but more languages will follow. For more details, check out Microsoft's overview of language extensions28.
  • Lightweight query profiling: Improvements to live query plan gathering29 to make it cheaper to get statistics and progress on currently executing SQL statements.
  • SQL Graph enhancements: SQL Server graph database capabilities30 allow for the creation of nodes and edges (many-to-many relationships), which are often needed for applications where a more traditional relational schema is too rigid and complex to query. Improvements now allow edge constraints (foreign keys on edges) and query improvements to query for nodes that are multiple edges away from one another.

Performance enhancements

SQL Server has been a leader in performance for years, with great backward compatibility at the code level, and tremendous leaps in performance in every new version. SQL Server 2019 brings with it even more improvements to performance, many built on enhancements from recent releases. For example, memory-optimized tables can bring tremendous performance improvements when storing data, and in SQL Server 2019 that feature has been leveraged to enhance the metadata for TempDB.

In this section, we will briefly look at some of the performance enhancements in SQL Server 2019:

  • Enhancements to Intelligent Query Processing (IQP): IQP31 is a family of loosely connected technologies designed to improve your SQL Server, many without any changes to your code. Several new methods of improving performance, such as expanding batch mode to include rowstore structures and inlining scalar functions, have been added.
  • Accelerated Database Recovery (ADR): ADR32 dramatically reduces the time required to return control to the user in a rollback of a large data change/recovery on restart, doing much of the work asynchronously.
  • Hybrid Buffer Pool: Hybrid Buffer Pool33 provides support for persistent memory modules (PMEMs), which allows the engine to use PMEMs in the classic roles typically played by RAM and the data file, eliminating the need to checkpoint data from memory, greatly enhancing performance for scenarios where very high performance is required.
  • Memory-Optimized TempDB Metadata: The metadata for the TempDB34 database can be altered to use in-memory data structures, removing bottlenecks that occur when rapidly and concurrently creating a large number of temporary tables.

Security improvements

Security is (or at least should be) one of the most important concerns for any data engineer/administrator. SQL Server 2019 introduces several important security improvements, building on improvements in recent versions of SQL Server, such as Always Encrypted35, row-level security36, dynamic data masking37, transparent data encryption38, and far more39 than we will cover in this book.

Changes in SQL Server 2019 to security include:

  • Always Encrypted with secure enclaves: An enhanced form of Always Encrypted40 that allows computations/searches to occur on the server side on encrypted string data, but still never shares the plaintext with the user or administrator without the required key.
  • Data classification and auditing: This starts with an SQL Server Management Studio (SSMS) tool to help locate, classify, and tag sensitive data that may need to be handled specially. Next, to know when sensitive data is being accessed, SQL Server Audit41 includes a new field in its output that indicates the data sensitivity of data that is included in the audit output. For more details, refer to this Microsoft guide42 on data discovery and classification.
  • Simplified certificate management: Certificate management43 is integrated with SQL Server Configuration Manager.

High Availability/Disaster Recovery (HADR)

As your demand for around-the-clock access to data increases, the following features are designed to make usage of SQL Server even more possible during maintenance and during a failover to a different server:

  • Index maintenance enhancements: SQL Server 2019 adds to the types of indexes that can be rebuilt online to include clustered columnstore indexes, as well as allowing you to pause and resume rowstore index rebuilds. For more details, check the information44 provided by Microsoft.
  • Availability group enhancements: Availability groups45 are an HADR feature that was first introduced in SQL Server 2012. They allow you to maintain copies of your database in a different location to fail over to when there is a failure in your primary database/server (as well as other uses). SQL Server 2019 increases the number of synchronous secondary replicas from three to five. Automatic client redirection46 has been added, so clients can fail over without changing the connection string. Additionally, there have been licensing improvements47 for Software Assurance customers pertaining to HADR scenarios.

Platform of choice

SQL Server has been around for over 20 years on Windows, but in SQL Server 2017, the platform choices grew to include Linux. SQL Server 2017 on Linux included most of the primary features of a relational engine that customers needed, but not all of them. SQL Server 2019 adds most of the features that were missing in SQL Server 2017.

Features added for SQL Server 2019 on Linux include:

  • Replication: Data is allowed to be copied automatically between databases on the same or different instance (including Windows instances). Transaction and snapshot replication is now in Linux.
  • Distributed Transactions: Enables transactions that extend beyond the confines of the instance.
  • Change Data Capture: Maintains a history of changes to data in a database, commonly for processes duplicating data where replication doesn't make sense.
  • Extended Active Directory Support: Adds support for third-party Active Directory integrations.
  • Machine Learning/Language Extensions: Adds the ability to run R, Python, and Java inside the SQL Server engine.
  • PolyBase: Ability to query external data and leverage data virtualization using T-SQL, as described earlier.

For a complete list of improvements to the Linux version of SQL Server with links to more details, check out the information at the Microsoft documentation48.

Beyond Linux on a VM or on-premises server, improvements have been made when installing SQL Server on a container. Azure Container Registry49 provides a location to manage containers for Docker and Open Container Initiative images.

SQL Server is also now available50 on Red Hat Enterprise Linux 8, as well as using Red Hat Universal Image Containers.

Finally, when using containers, SQL Server 2019 does not need to be started as a root container51 in Linux, providing a more secure experience.