Index
A
- activation function / Predicting with logistic regression
- advanced data preparation
  - about / Advanced data preparation topics
  - grouping / Efficient grouping and aggregating in T-SQL
  - aggregating / Efficient grouping and aggregating in T-SQL
  - Microsoft scalable libraries, leveraging in Python / Leveraging Microsoft scalable libraries in Python
  - dplyr package, using in R / Using the dplyr package in R
- advanced SELECT techniques
  - about / Advanced SELECT techniques
  - subqueries / Introducing subqueries
  - window functions / Window functions
  - table expressions / Common table expressions
  - top n rows, searching / Finding top n rows and using the APPLY operator
  - APPLY operator, using / Finding top n rows and using the APPLY operator
- analysis of variance (ANOVA) / Discovering associations between continuous and discrete variables
- arithmetic mean / Calculating centers of a distribution
- array / Using R data structures
- assignment operator / Learning the basics of the R language
- association rules algorithm / Performing market-basket analysis
- associations
  - displaying graphically / Showing associations graphically
  - discovering, between continuous and discrete variables / Discovering associations between continuous and discrete variables
- attributes / Using R data structures
B
- bar chart / Using frequency tables to understand discrete variables
- bell curve / Higher population moments
- Binary Exchange Language Server / Integrating SQL Server and ML
- branches / Using functions, branches, and loops
- BxlServer / Integrating SQL Server and ML
C
- cases / Using R data structures
- center
  - calculating, of distribution / Calculating centers of a distribution
- centroids / Finding clusters of similar cases
- chaining / Using the dplyr package in R
- chi-squared critical points / Measuring dependencies between discrete variables
- chi-squared value / Measuring dependencies between discrete variables
- classification matrix / Evaluating predictive models
- cluster analysis / Finding clusters of similar cases
- clusters
  - similar cases, searching / Finding clusters of similar cases
- coefficient of determination / Exploring associations between continuous variables
- coefficient of variation (CV) / Measuring the spread
- Comprehensive R Archive Network (CRAN)
  - URL / Obtaining R
- confusion matrix / Evaluating predictive models
- contingency tables / Measuring dependencies between discrete variables
- continuous variables
  - discretizing / Discretizing continuous variables
  - equal width discretization / Equal width discretization
  - equal height discretization / Equal height discretization
  - custom discretization / Custom discretization
  - associations, exploring / Exploring associations between continuous variables
- core T-SQL SELECT statement
  - about / Core T-SQL SELECT statement elements, The simplest form of the SELECT statement
  - multiple tables, joining / Joining multiple tables
  - data, grouping / Grouping and aggregating data
  - data, aggregating / Grouping and aggregating data
- correlation coefficient / Exploring associations between continuous variables
- covariance / Exploring associations between continuous variables
- Cross Industry Standard Process for Data Mining (CRISP) model
D
- data
  - organizing / Organizing the data
- data frame / Using R data structures
- data science project life cycle / Getting familiar with a data science project life cycle
- dataset / Using R data structures
- data source name (DSN) / Learning the basics of the R language
- data structures
  - using, in R / Using R data structures
- data values
  - measuring / Ways to measure data values
- decision trees algorithm / Trees, forests, and more trees
- degrees of freedom / Measuring the spread
- descriptive statistics
  - for continuous variables / Introducing descriptive statistics for continuous variables
- dichotomous variables / Ways to measure data values
- dimensionality reduction / Principal components and factor analyses
- discrete variable
  - about / Using R data structures
  - entropy / The entropy of a discrete variable
  - dependencies, measuring / Measuring dependencies between discrete variables
- dplyr package
  - using, in R / Using the dplyr package in R
- dummies
  - creating / Creating dummies
E
- entropy / The entropy of a discrete variable
- equal height discretization / Equal height discretization
- equal width discretization / Equal width discretization
- exploratory factor analysis (EFA) / Principal components and factor analyses
F
- factor analysis (FA) / Principal components and factor analyses
- factors / Using R data structures
- false negatives / Evaluating predictive models
- false positives / Evaluating predictive models
- frequency tables
  - using, for discrete variables / Using frequency tables to understand discrete variables
- functions / Using functions, branches, and loops
G
- Gaussian curve / Higher population moments
- Gaussian distribution / Higher population moments
- Gaussian mixture models (GMM) algorithm / Principal components and factor analyses
- gradient-boosting trees / Trees, forests, and more trees
H
- histogram / Using frequency tables to understand discrete variables
- hyperparameters / Expressing dependencies with a linear regression formula
I
- indicators / Creating dummies
- inter-quartile range (IQR) / Measuring the spread
- intervals / Ways to measure data values
K
- K-means algorithm / Finding clusters of similar cases
- key–value pairs / Using functions, branches, and loops
- kurtosis / Higher population moments
L
- lift chart / Evaluating predictive models
- linear regression formula
  - used, for expressing dependencies / Expressing dependencies with a linear regression formula
- lists / Using R data structures
- logical operators / Learning the basics of the R language
- logistic function / Predicting with logistic regression
- logistic regression algorithm
  - predicting with / Predicting with logistic regression
- loops / Using functions, branches, and loops
M
- market-basket analysis
  - performing / Performing market-basket analysis
- matrix / Using R data structures
- median / Calculating centers of a distribution
- Microsoft ML
  - integrating, with SQL Server / Integrating SQL Server and ML
  - ML Services (In-Database) / Integrating SQL Server and ML
- Microsoft ML Server / Integrating SQL Server and ML
- Microsoft R Application Network (MRAN)
  - URL / Obtaining R
- missing values
  - handling / Handling missing values
- ML services (In-Database) packages
  - installing / Installing ML services (In-Database) packages
- mode / Learning the basics of the R language
- monotonic variable / Ways to measure data values
N
- Naive Bayes algorithm
  - using / Using the Naive Bayes algorithm
- native predictions / Predicting with T-SQL
O
- object / Learning the basics of the R language
- oblique / Principal components and factor analyses
- observations / Using R data structures
- one-way ANOVA / Discovering associations between continuous and discrete variables
- ordinal variable / Ways to measure data values
- orthogonal / Principal components and factor analyses
P
- permissions / Learning the basics of the R language
- piecemeal linear regression / Trees, forests, and more trees
- polynomial regression / Expressing dependencies with a linear regression formula
- predictive models
  - training set / Evaluating predictive models
  - evaluating / Evaluating predictive models
  - test set / Evaluating predictive models
- principal component analysis (PCA)
  - about / Principal components and factor analyses
  - factor analyses / Principal components and factor analyses
- promax / Principal components and factor analyses
- Python
  - Microsoft scalable libraries, leveraging / Leveraging Microsoft scalable libraries in Python
- Python code
  - writing / Writing your first Python code
- Python environment
  - selecting / Selecting the Python environment
Q
- quantiles / Measuring the spread
- quartiles / Measuring the spread
R
- R
  - obtaining / Obtaining R
  - coding / Your first line of code in R
  - learning / Learning the basics of the R language
  - data structures, using / Using R data structures
  - dplyr package, using / Using the dplyr package in R
- random forests algorithm / Trees, forests, and more trees
- range / Measuring the spread
- rank / Ways to measure data values
- real-time scoring / Predicting with T-SQL
- receiver operating characteristic (ROC) / Evaluating predictive models
- recursive partitioning / Trees, forests, and more trees
- regression trees / Trees, forests, and more trees
- relative entropy / The entropy of a discrete variable
- rotation / Principal components and factor analyses
- R Tools for Visual Studio (RTVS) / Obtaining R
S
- scatterplot / Showing associations graphically
- sequences / Learning the basics of the R language
- sigmoid function / Predicting with logistic regression
- skewness / Higher population moments
- spread
  - measuring / Measuring the spread
- SQL Server
  - installing / Before starting – installing SQL Server
  - setting up / SQL Server setup
  - integrating, with ML / Integrating SQL Server and ML
- SQL Server Management Studio (SSMS)
  - URL / SQL Server setup
  - about / Integrating SQL Server and ML
- SQL Server Reporting Services (SSRS) reports / Integrating SQL Server and ML
- standard deviation (σ) / Measuring the spread
- standard normal distribution / Higher population moments
- subqueries / Introducing subqueries
- support / Performing market-basket analysis
T
- tailedness / Introducing descriptive statistics for continuous variables
- Transact-SQL (T-SQL)
  - predicting with / Predicting with T-SQL
- true negatives / Evaluating predictive models
- true positives / Evaluating predictive models
V
- variables / Learning the basics of the R language, Using R data structures
- variables, types
  - about / Ways to measure data values
  - continuous variables / Ways to measure data values
  - discrete variables / Ways to measure data values
- variance / Measuring the spread
- variance of a sample / Measuring the spread
- variance of the population / Measuring the spread
- varimax / Principal components and factor analyses
W
- weighted sum / Predicting with logistic regression
- window functions / Window functions
- workspace / Your first line of code in R
Z
- Z distribution / Higher population moments