Unsupervised Machine Learning | Data Science with SQL Server Quick Start Guide

Data Science with SQL Server Quick Start Guide

By : Dejan Sarka

Overview of this book

SQL Server only started to fully support data science with its two most recent editions. If you are a professional from both worlds, SQL Server and data science, and are interested in using SQL Server and Machine Learning (ML) Services for your projects, then this book is for you. It is an introduction to data science with Microsoft SQL Server and In-Database ML Services. It covers all stages of a data science project, from business and data understanding, through data overview, data preparation, modeling and using algorithms, and model evaluation, to deployment. You will learn to use the engines and languages that come with SQL Server, including ML Services with the R and Python languages, and Transact-SQL. You will also learn how to choose which algorithm to use for which task, and how each algorithm works.

Finding clusters of similar cases

With cluster analysis, you try to find specific groups of cases, based on the similarity of the input variables. These groups, or clusters, help you understand your cases, for example, your customers or your employees. The clustering process groups the data based on the values of the variables, so that the cases within a cluster are highly similar to each other and very dissimilar to cases in other clusters. Similarity can be quantified in different ways; geometric distance is one example. You define an n-dimensional hyperspace, where each input variable defines one dimension, or one axis. The values of the variables define points in this hyperspace; these points are the cases. Now you can measure the geometric distance of each case from all other cases.
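The distance idea above can be sketched in a few lines of Python. This is only an illustration, assuming Euclidean distance as the geometric measure; the `cases` values and the function names `euclidean_distance` and `distance_matrix` are hypothetical and do not come from the book.

```python
import math

def euclidean_distance(a, b):
    """Geometric (Euclidean) distance between two cases with the same variables."""
    if len(a) != len(b):
        raise ValueError("cases must have the same number of variables")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_matrix(cases):
    """Distance of each case from all other cases, as a square matrix."""
    return [[euclidean_distance(p, q) for q in cases] for p in cases]

# Hypothetical cases: each tuple is one point in a 2-dimensional space,
# one dimension per input variable.
cases = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
matrix = distance_matrix(cases)

for row in matrix:
    print(row)
```

Each row of `matrix` holds the distances from one case to every case, so the diagonal is zero and the matrix is symmetric. In practice, variables measured on different scales are usually standardized first so that no single axis dominates the distance.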

There are many different clustering algorithms. The most popular one is the K-means algorithm. With this algorithm, you define the number of clusters, K, in advance...
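To make the role of K concrete, here is a minimal sketch of the standard textbook K-means loop (assign each case to its nearest centroid, then move each centroid to the mean of its cases). It is not the book's code: the function name `kmeans`, the sample `points`, and the choice to use the first K points as initial centroids are assumptions made to keep the example short and deterministic.

```python
import math

def kmeans(points, k, max_iter=100):
    """Plain K-means on a list of equal-length numeric tuples."""
    # Assumption: take the first k points as initial centroids (deterministic).
    centroids = [tuple(p) for p in points[:k]]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no case changed cluster, so the result is stable
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignments, centroids

# Hypothetical data: two visually separate groups of cases.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
assignments, centroids = kmeans(points, k=2)
print(assignments)
print(centroids)
```

The number of clusters is passed in as `k` and never changes during the run, which is what defining K in advance means here. Production implementations typically use a smarter initialization and several restarts, because the result depends on the starting centroids.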
