Data Science with SQL Server Quick Start Guide

By: Dejan Sarka

Overview of this book

SQL Server only started to fully support data science with its two most recent editions. If you are a professional from both worlds, SQL Server and data science, and are interested in using SQL Server and Machine Learning (ML) Services for your projects, then this is the ideal book for you. It introduces data science with Microsoft SQL Server and In-Database ML Services, and covers all stages of a data science project, from business and data understanding, through data overview, data preparation, modeling and using algorithms, and model evaluation, to deployment. You will learn to use the engines and languages that come with SQL Server, including ML Services with the R and Python languages, and Transact-SQL. You will also learn how to choose which algorithm to use for which task, and how each algorithm works.

Expressing dependencies with a linear regression formula


The simplest linear regression formula for two continuous variables is as follows:

$$Y = a + bX$$

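The slope for this linear function is denoted with b and the intercept with a. When calculating these values, you try to find the line that fits the data points best, that is, the line from which the deviations of the data points are the smallest. The formula for the slope, with $\bar{X}$ and $\bar{Y}$ denoting the means of the two variables, is as follows:

$$b = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$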

Once you have the slope, it is easy to calculate the intercept from the means of the two variables, as shown here:

$$a = \bar{Y} - b\bar{X}$$
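As a rough illustration of the slope and intercept calculation, here is a minimal Python sketch. It assumes the usual least-squares definitions: the slope is the sum of the products of the deviations from the means, divided by the sum of the squared deviations of the independent variable, and the intercept is the mean of Y minus the slope times the mean of X. The function name `fit_line` and the sample data are hypothetical and not taken from the book.

```python
from statistics import mean

def fit_line(x, y):
    """Return (a, b) for the least-squares line y = a + b * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of products of deviations from the means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Sum of squared deviations of the independent variable
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance; the slope is undefined")
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept
    return a, b

# Hypothetical sample data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
a, b = fit_line(x, y)
print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
```

For this sample, the deviations give a slope of 6 / 10 = 0.6 and an intercept of 4 - 0.6 * 3 = 2.2.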

The decision regarding which variable is dependent and which is independent is up to you. Of course, this also depends on the problem you are trying to solve, and on common sense. For example, you would probably not model gender as dependent on income, but would rather model income as dependent on gender. The formulas don't tell you that. You can actually calculate two formulas, called the first regression line and the second regression line, with the two variables swapping roles between the equations.
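To make the idea of two regression lines concrete, the following Python sketch fits both lines on the same data by swapping the roles of the variables. It is an illustration under the usual least-squares definitions, not the author's calculation; the helper name `slope_intercept` and the sample data are hypothetical.

```python
from statistics import mean

def slope_intercept(dep, indep):
    """Least-squares (intercept, slope) for dep = a + b * indep."""
    d_bar, i_bar = mean(dep), mean(indep)
    # Sum of products of deviations from the means
    cov = sum((i - i_bar) * (d - d_bar) for i, d in zip(indep, dep))
    # Sum of squared deviations of the independent variable
    var = sum((i - i_bar) ** 2 for i in indep)
    b = cov / var
    return d_bar - b * i_bar, b

# Hypothetical sample data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

a1, b1 = slope_intercept(dep=y, indep=x)  # first line:  Y = a1 + b1 * X
a2, b2 = slope_intercept(dep=x, indep=y)  # second line: X = a2 + b2 * Y

print(f"Y = {a1:.2f} + {b1:.2f} * X")
print(f"X = {a2:.2f} + {b2:.2f} * Y")
```

Note that the second slope is not simply the reciprocal of the first: both share the same numerator, but each divides by the squared deviations of its own independent variable. The two lines coincide only when the variables are perfectly linearly related.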

Here is the calculation of both slopes...