Data Science with SQL Server Quick Start Guide

By: Dejan Sarka

Overview of this book

SQL Server only started to fully support data science with its two most recent editions. If you are a professional from both worlds, SQL Server and data science, and are interested in using SQL Server and Machine Learning (ML) Services for your projects, then this is the ideal book for you. It is an introduction to data science with Microsoft SQL Server and In-Database ML Services. It covers all stages of a data science project, from business and data understanding, through data overview, data preparation, modeling and using algorithms, and model evaluation, to deployment. You will learn to use the engines and languages that come with SQL Server, including ML Services with the R and Python languages, and Transact-SQL. You will also learn how to choose which algorithm to use for which task, and how each algorithm works.

Creating dummies


Some algorithms, for example, regression analysis algorithms, need numerical input variables. If you want to use a categorical variable in the analysis, you need to convert it to a numerical one. If the variable is ordinal, this is not a problem; you just assign appropriate integers to the naturally ordered values of the variable. From a nominal variable, you can create a set of indicators. There is one indicator for each possible value, showing whether the value it represents is taken for a case. If a specific value is taken for a case, you assign 1 to the indicator for this value, and 0 otherwise. Such new variables, the indicators, are also called dummy variables, or dummies. In T-SQL, you can use the IIF() function to generate dummies, as the following code shows:

SELECT TOP 3 MaritalStatus,
 IIF(MaritalStatus = 'S', 1, 0)
 AS [TM_S],
 IIF(MaritalStatus = 'M', 1, 0)
 AS [TM_M]
FROM dbo.vTargetMail;
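
The same two conversions can also be sketched outside the database. The following is a minimal, plain-Python illustration, not the book's own code: the helper names make_dummies and encode_ordinal, the sample values, and the ordinal levels low, medium, and high are all hypothetical and are not taken from dbo.vTargetMail. It shows one 0/1 indicator per distinct value of a nominal variable, and integer codes for an ordinal variable.

```python
def make_dummies(values, prefix):
    """Build one 0/1 indicator list per distinct value (hypothetical helper)."""
    categories = sorted(set(values))
    return {
        f"{prefix}_{cat}": [1 if v == cat else 0 for v in values]
        for cat in categories
    }


def encode_ordinal(values, ordered_levels):
    """Map each value to its 1-based position in ordered_levels (hypothetical helper)."""
    rank = {level: i for i, level in enumerate(ordered_levels, start=1)}
    return [rank[v] for v in values]


# Made-up sample data for illustration only.
marital_status = ["M", "S", "S", "M"]
dummies = make_dummies(marital_status, "TM")
print(dummies)  # {'TM_M': [1, 0, 0, 1], 'TM_S': [0, 1, 1, 0]}

# A made-up ordinal variable with an assumed natural order.
levels = ["low", "medium", "high"]
codes = encode_ordinal(["low", "high", "medium"], levels)
print(codes)  # [1, 3, 2]
```

For every case, exactly one indicator in the set is 1, which mirrors the pair of IIF() expressions in the T-SQL query.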

Here are the results of the T-SQL query:

MaritalStatus TM_S        TM_M
--...