Data Science with SQL Server Quick Start Guide

By: Dejan Sarka

Overview of this book

SQL Server only started to fully support data science with its two most recent editions. If you are a professional who works in both worlds, SQL Server and data science, and you are interested in using SQL Server and Machine Learning (ML) Services for your projects, then this is the ideal book for you. It is an introduction to data science with Microsoft SQL Server and In-Database ML Services. It covers all stages of a data science project, from business and data understanding, through data overview, data preparation, modeling and using algorithms, and model evaluation, to deployment. You will learn to use the engines and languages that come with SQL Server, including ML Services with the R and Python languages, and Transact-SQL. You will also learn how to choose which algorithm to use for which task, and how each algorithm works.
Table of Contents (15 chapters)
Title Page
Copyright and Credits
Packt Upsell
Contributors
Preface
Index

Exploring associations between continuous variables


For a single continuous variable, I introduced a couple of measures for the spread. One of the basic measures for the spread is the variance. If you have two continuous variables, each one has its own variance. However, you can ask yourself whether these two variables vary together. For example, if the value of the first variable for some case is quite high, well above the mean of the first variable, the value of the second variable for the same case could also be above its own mean. This would be a positive association. If the value of the second variable tends to be lower when the value of the first one is higher, then you have a negative association. If there is no connection between the positive and negative deviations from the mean for the two variables, then you cannot reject the null hypothesis that there is no association between these two variables. The formula for the covariance, which is the first measure for the association I am introducing...
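To make the idea of two variables "varying together" concrete, here is a minimal Python sketch of the covariance computed from deviations from the mean. It is an illustration under stated assumptions, not necessarily the exact form the book goes on to use: the function name `covariance`, the `sample` flag, and the small data lists are all hypothetical. By default the sketch divides by the number of cases (the population form); with `sample=True` it divides by the number of cases minus one.

```python
from statistics import mean


def covariance(xs, ys, sample=False):
    """Average product of paired deviations from the two means.

    Hypothetical helper for illustration only. Divides by n by default
    (population form) or by n - 1 when sample=True.
    """
    if len(xs) != len(ys):
        raise ValueError("both variables must have the same number of cases")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two cases are required")
    mean_x, mean_y = mean(xs), mean(ys)
    # A product is positive when both values lie on the same side of
    # their means, and negative when they lie on opposite sides.
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - 1 if sample else n)


# Made-up illustrative data, not taken from the book.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]      # rises when x rises
y_down = [10, 8, 6, 4, 2]    # falls when x rises

print(covariance(x, y_up))    # positive association
print(covariance(x, y_down))  # negative association
```

A positive result corresponds to the positive association described above, and a negative result to the negative association. A value close to zero suggests that the deviations of the two variables are not connected. Note that the magnitude depends on the units of both variables, so the covariance alone does not tell you how strong the association is.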