Data Science with SQL Server Quick Start Guide

By: Dejan Sarka

Overview of this book

SQL Server only started to fully support data science with its two most recent editions. If you are a professional from both worlds, SQL Server and data science, and are interested in using SQL Server and Machine Learning (ML) Services for your projects, then this is the ideal book for you. It is an introduction to data science with Microsoft SQL Server and In-Database ML Services. It covers all stages of a data science project, from business and data understanding, through data overview, data preparation, modeling and using algorithms, and model evaluation, to deployment. You will learn to use the engines and languages that come with SQL Server, including ML Services with the R and Python languages, and Transact-SQL. You will also learn how to choose which algorithm to use for which task, and how each algorithm works.

Ways to measure data values


I already introduced statistical terminology, which is also used in data science: you analyze cases using their variables. In a relational database (RDBMS), a case is represented as a row in a table and a variable as a column in the same table. In Python and R, you analyze data frames, which are similar to tables, but also allow positional access.
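The mapping between cases, variables, rows, and columns can be illustrated with a minimal sketch. It uses only the Python standard library rather than a real data frame library, and the column names and values are hypothetical, invented purely for illustration:

```python
# Hypothetical customer data: each tuple is one case (a row),
# and each position in the tuple is one variable (a column).
columns = ("customer_id", "age", "yearly_income")
cases = [
    (1, 34, 52000.0),
    (2, 45, 61000.0),
    (3, 29, 38000.0),
]

# Access by name, similar to referencing a column in a table.
age_idx = columns.index("age")
ages = [case[age_idx] for case in cases]

# Positional access, which data frames also allow:
# the second case (index 1), third variable (index 2).
second_case_income = cases[1][2]

print(ages)
print(second_case_income)
```

In a table you would normally refer to a column by name only, while a data frame typically lets you use either the name or the position.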

The first thing you need to decide is what case you want to analyze. Sometimes it is not easy to define your case exactly. For example, if you're performing a credit risk analysis, you might define a family as a case rather than a single customer.
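As a rough sketch of what redefining the case can mean in practice, the following Python snippet rolls customer rows up to one row per family. The `family_id` key, the `debt` variable, and the `to_family_cases` helper are all assumptions made for this illustration, not part of any real schema or library:

```python
from collections import defaultdict

# Hypothetical data: one row per customer. The family_id key is assumed.
customers = [
    {"customer_id": 1, "family_id": "F1", "debt": 1200.0},
    {"customer_id": 2, "family_id": "F1", "debt": 300.0},
    {"customer_id": 3, "family_id": "F2", "debt": 0.0},
]

def to_family_cases(rows):
    """Aggregate customer rows so that each family becomes one case."""
    totals = defaultdict(lambda: {"members": 0, "total_debt": 0.0})
    for row in rows:
        family = totals[row["family_id"]]
        family["members"] += 1
        family["total_debt"] += row["debt"]
    # Return one dictionary per family, ordered by family_id.
    return [{"family_id": fid, **vals} for fid, vals in sorted(totals.items())]

family_cases = to_family_cases(customers)
for case in family_cases:
    print(case)
```

After the aggregation, the variables describe the family (number of members, total debt) instead of the individual customer, so the choice of case also changes which variables are available for analysis.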

The next thing you have to understand is how data values are measured in your data set. A typical data science team should include a subject matter expert who can explain the meaning of the variable values. There are several different types of variables:

  • Continuous variables have an infinite range of values. There are also a couple of different types of continuous variables:
    • True numeric...