Data Science with SQL Server Quick Start Guide

By: Dejan Sarka

Overview of this book

SQL Server only started to fully support data science with its two most recent editions. If you are a professional in both worlds, SQL Server and data science, and are interested in using SQL Server and Machine Learning (ML) Services for your projects, this book is an ideal introduction to data science with Microsoft SQL Server and In-Database ML Services. It covers all stages of a data science project, from business and data understanding, through data overview, data preparation, modeling and using algorithms, model evaluation, and deployment. You will learn to use the engines and languages that come with SQL Server, including ML Services with the R and Python languages and Transact-SQL. You will also learn how to choose an algorithm for a given task and how each algorithm works.

Introducing descriptive statistics for continuous variables


A continuous variable can take any value from an interval, or from all real numbers; there is no limit on the number of distinct values. Of course, you could try to discretize a continuous variable and then treat it as a discrete one. I will discuss different discretization options in the next chapter. In this section, I am going to introduce some statistical measures that describe the distribution of a continuous variable, measures that are called descriptive statistics.

You want to understand the distribution of a continuous variable. You can create graphs for all continuous variables. However, comparing tens or even hundreds of graphs visually does not tell you much. You can also describe the distribution numerically, with descriptive statistics. Comparing numbers is much faster and easier than comparing graphs. The descriptive statistics you use are the moments of a distribution.

The well-known moments are the center, spread, skewness, and tailedness of a distribution...
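
To make the four moments concrete, here is a minimal sketch in Python, one of the languages available through ML Services. It is an illustration only, not the method used later in the book: the function name `describe_moments` and the sample values are hypothetical, and the formulas are the simple population versions (dividing by n), whereas statistical packages often apply sample corrections. The center is measured by the mean, the spread by the standard deviation, skewness by the standardized third central moment, and tailedness by the excess kurtosis (the standardized fourth central moment minus 3).

```python
import math


def describe_moments(values):
    """Return the mean, standard deviation, skewness, and excess kurtosis.

    Uses population formulas (division by n); this is an assumption made
    for simplicity, not necessarily what a given statistical package does.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")

    # Center: the arithmetic mean.
    mean = sum(values) / n

    # Central moments of order 2, 3, and 4.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("all values are equal; skewness and kurtosis are undefined")

    return {
        "mean": mean,                      # center
        "stdev": math.sqrt(m2),            # spread
        "skewness": m3 / m2 ** 1.5,        # asymmetry of the distribution
        "kurtosis": m4 / m2 ** 2 - 3.0,    # tailedness relative to a normal distribution
    }


if __name__ == "__main__":
    # Hypothetical sample: mostly small values with one large value on the right.
    sample = [1, 1, 1, 1, 10]
    for name, value in describe_moments(sample).items():
        print(f"{name}: {value:.4f}")
```

A positive skewness, as in the hypothetical sample above, indicates a longer tail on the right side of the distribution; a symmetric sample gives a skewness of zero.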