-
Book Overview & Buying
-
Table Of Contents
Introducing Microsoft SQL Server 2019
By :
The Data Discovery and Classification feature enables you to identify, classify, and label data held across your SQL Server estate. The sheer volume of data now held within databases makes this a challenging process, coupled with the fact that regulatory mandates such as GDPR, SOX, and PCI demand that businesses protect sensitive data. So you can see how this feature will help. Before you can develop a security strategy for your SQL Server databases, it makes logical sense to know what data you hold, and from this you can then classify and label the more sensitive data and implement the relevant security controls, therefore minimizing potential sensitive data leaks.
Key components for this feature include two metadata attributes, labels and information types. Labels are used to define the sensitivity of data. Information types are used to provide additional granularity into the types of data stored in a column. As you can see in Figure 2.1, email addresses and phone numbers have been classified as contact information under the GDPR label.
To start the classification process, you will need to right-click on the database and find the Data Discovery and Classification option (Figure 2.2).
While you are connected to the database via SSMS (SQL Server Management Studio), you can issue the following query to get a really good summary of the classification that has just taken place:
SELECT schema_name(O.schema_id) AS schema_name, O.NAME AS table_name, C.NAME AS column_name, information_type, sensitivity_label FROM ( SELECT IT.major_id, IT.minor_id, IT.information_type, L.sensitivity_label FROM ( SELECT major_id, minor_id, value AS information_type FROM sys.extended_properties WHERE NAME = 'sys_information_type_name' ) IT FULL OUTER JOIN ( SELECT major_id, minor_id, value AS sensitivity_label FROM sys.extended_properties WHERE NAME = 'sys_sensitivity_label_name' ) L ON IT.major_id = L.major_id AND IT.minor_id = L.minor_id ) EP JOIN sys.objects O ON EP.major_id = O.object_id JOIN sys.columns C ON EP.major_id = C.object_id AND EP.minor_id = C.column_id
You can delegate this to SQL Server and let it carry out a review of the data and an automatic implementation of the classification process.
With SQL Server 2019, is it not possible to use T-SQL to add metadata about the sensitivity classification, such as the following:
ADD SENSITIVITY CLASSIFICATION TO
<object_name> [, ...n ]
WITH ( <sensitivity_label_option> [, ...n ]
This is only possible with Azure SQL Database.
Another advantage of this feature is the visibility of the classification states in the form of a report, which you can then export to different formats as required. This will benefit you regarding compliance and auditing. The following screenshot shows a copy of a report in Excel format:
Once you understand your data via the classification processes, you can then leverage different features from SQL Server 2019, such as Always Encrypted or Data Masking, to protect these sensitive columns.