Book Image

Introducing Microsoft SQL Server 2019

By : Kellyn Gorman, Allan Hirt, Dave Noderer, Mitchell Pearson, James Rowland-Jones, Dustin Ryan, Arun Sirpal, Buck Woody
Book Image

Introducing Microsoft SQL Server 2019

By: Kellyn Gorman, Allan Hirt, Dave Noderer, Mitchell Pearson, James Rowland-Jones, Dustin Ryan, Arun Sirpal, Buck Woody

Overview of this book

Microsoft SQL Server comes equipped with industry-leading features and the best online transaction processing capabilities. If you are looking to work with data processing and management, getting up to speed with Microsoft Server 2019 is key. Introducing SQL Server 2019 takes you through the latest features in SQL Server 2019 and their importance. You will learn to unlock faster querying speeds and understand how to leverage the new and improved security features to build robust data management solutions. Further chapters will assist you with integrating, managing, and analyzing all data, including relational, NoSQL, and unstructured big data using SQL Server 2019. Dedicated sections in the book will also demonstrate how you can use SQL Server 2019 to leverage data processing platforms, such as Apache Hadoop and Spark, and containerization technologies like Docker and Kubernetes to control your data and efficiently monitor it. By the end of this book, you'll be well versed with all the features of Microsoft SQL Server 2019 and understand how to use them confidently to build robust data management solutions.
Table of Contents (15 chapters)

Clustered Columnstore Indexes

Clustered Columnstore indexes can make a dramatic difference and are the technology used to optimize real-time analytics. They can achieve an order of magnitude performance gain over a normal row table, a dramatic compression of the data, and minimize interference with real-time transaction processing.

A columnstore has rows and columns, but the data is stored in a column format.

A rowgroup is a set of rows that are compressed into a columnstore format — a maximum of a million rows (1,048,576).

There are an optimum number of rows in a rowgroup that are stored column-wise, and this represents a trade-off between large overhead, if there are too few rows, and an inability to perform in-memory operations if the rows are too big.

Each row consists of column segments, each of which represents a column from the compressed row.

Columnstore is illustrated in Figure 1.1, showing how to load data into a non-clustered columnstore index:

Figure 1.1: Loading data into a non-clustered columnstore index
Figure 1.1: Loading data into a non-clustered columnstore index

A clustered columnstore index is how the columnstore table segments are stored in physical media. For performance reasons, and to avoid fragmenting the data, the columnstore index may store some data in a deltastore and a list of the IDs of deleted rows. All deltastore operations are handled by the system and not visible directly to the user. Deltastore and columnstore data is combined when queried.

A delta rowgroup is used to store columnstore indexes until there are enough to store in the columnstore. Once the maximum number of rows is reached, the delta rowgroup is closed, and a background process detects, compresses, and writes the delta rowgroup into the columnstore.

There may be more than one delta rowgroup. All delta rowgroups are described as the deltastore. While loading data, anything less than 102,400 rows will be kept in the deltastore until they group to the maximum size and are written to the columnstore.

Batch mode execution is used during a query to process multiple rows at once.

Loading a clustered columnstore index and the deltastore are shown in Figure 1.2.

Figure 1.2: Loading a clustered columnstore index
Figure 1.2: Loading a clustered columnstore index

Further information can be found here: https://docs.microsoft.com/en-us/sql/relational-databases/indexes/get-started-with-columnstore-for-real-time-operational-analytics?view=sql-server-2017.

Adding Clustered Columnstore Indexes to memory-optimized tables

When using a memory-optimized table, add a non-clustered columnstore index. A clustered columnstore index is especially useful for running analytics on a transactional table.

A clustered columnstore index can be added to an existing memory-optimized table, as shown in the following code snippet:

-- Add a clustered columnstore index to a memory-optimized table
ALTER TABLE MyMemOpttable 
ADD INDEX MyMemOpt_ColIndex clustered columnstore