Data Science with SQL Server Quick Start Guide

By: Dejan Sarka

Overview of this book

SQL Server only started to fully support data science with its two most recent editions. If you are a professional from both worlds, SQL Server and data science, and are interested in using SQL Server and Machine Learning (ML) Services for your projects, then this book is for you. It is an introduction to data science with Microsoft SQL Server and In-Database ML Services. It covers all stages of a data science project, from business and data understanding, through data overview, data preparation, modeling and using algorithms, and model evaluation, to deployment. You will learn to use the engines and languages that come with SQL Server, including ML Services with the R and Python languages, and Transact-SQL. You will also learn how to choose which algorithm to use for which task, and how each algorithm works.

Principal components and factor analyses


Principal component analysis (PCA) is a well-known undirected method for reducing the number of variables used in further analyses. This is also called dimensionality reduction. In some projects, you could have hundreds, maybe even thousands, of input variables. Using all of them as input to a clustering algorithm could make training the model take an enormous amount of time. However, many of those input variables might vary together; that is, they might be associated with each other, so a smaller set of variables could carry most of the same information.

PCA starts again with the hyperspace, where each input variable defines one axis. PCA searches for a set of new axes, that is, a set of new variables, called the principal components, which should be linearly uncorrelated. The principal components are calculated in such a way that the first one captures the largest possible share of the variability of the whole input variable set, the second one the second largest, and so on. The calculation of the principal components is derived from linear algebra. The principal...
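The idea of ordered, uncorrelated components can be illustrated with a minimal sketch in plain Python. This is not the book's own code and does not use SQL Server ML Services; it assumes a small synthetic dataset and finds the principal components as eigenvectors of the sample covariance matrix, using power iteration with deflation. The function names (`covariance_matrix`, `leading_eigenpair`, `pca`) and the data are hypothetical, chosen only for illustration.

```python
import math
import random


def covariance_matrix(rows):
    """Return the sample covariance matrix and the mean-centered rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centered


def leading_eigenpair(matrix, iterations=500, seed=0):
    """Approximate the largest eigenvalue and its unit eigenvector
    of a symmetric matrix by power iteration."""
    d = len(matrix)
    rng = random.Random(seed)
    v = [rng.uniform(0.5, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # nothing left to explain
            break
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(v[i] * mv[i] for i in range(d))  # Rayleigh quotient
    return eigenvalue, v


def pca(rows, n_components):
    """Return component directions, their variances, and the projected rows."""
    cov, centered = covariance_matrix(rows)
    d = len(cov)
    components, variances = [], []
    for _ in range(n_components):
        value, vector = leading_eigenpair(cov)
        components.append(vector)
        variances.append(value)
        # Deflation: subtract the variance already captured by this component,
        # so the next iteration finds the next largest direction.
        cov = [[cov[i][j] - value * vector[i] * vector[j] for j in range(d)]
               for i in range(d)]
    scores = [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
              for c in centered]
    return components, variances, scores


# Synthetic data (an assumption for this sketch): x2 varies together with x1,
# while x3 is independent noise.
rng = random.Random(42)
rows = []
for _ in range(200):
    x1 = rng.gauss(0.0, 1.0)
    x2 = 2.0 * x1 + rng.gauss(0.0, 0.1)
    x3 = rng.gauss(0.0, 0.5)
    rows.append([x1, x2, x3])

components, variances, scores = pca(rows, n_components=3)
total = sum(variances)
for k, var in enumerate(variances, start=1):
    print(f"PC{k}: share of total variance = {var / total:.3f}")
```

Because `x1` and `x2` are strongly associated in this synthetic data, the first component should absorb most of the total variance, which is the situation in which keeping only the first few components loses little information. In practice, a library routine would normally replace the hand-written eigenvector search.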