Index
A
- Akaike's Information Criterion (AIC) / Modeling and evaluation
- about / Granger causality
- algorithm flowchart
- about / Algorithm flowchart
- American Diabetes Association (ADA)
- URL / Business understanding
- apriori algorithms
- Area Under the Curve (AUC)
- about / Model selection
- Artificial Neural Networks (ANNs)
- about / Neural network
- reference link / Neural network
- arules: Mining Association Rules and Frequent Itemsets
- Augmented Dickey-Fuller (ADF) test
- Autocorrelation Function (ACF)
- about / Univariate time series analysis
- Autoregressive Integrated Moving Average (ARIMA) models
- about / Univariate time series analysis
B
- Back Propagation
- about / Neural network
- backward stepwise regression / Modeling and evaluation
- bagging
- about / Random forest
- Bayesian Information Criterion (BIC) / Modeling and evaluation
- bias-variance trade-off
- about / Discriminant analysis overview
- bivariate regression
- for univariate time series / Bivariate regression
- bootstrap aggregation
- about / Random forest
- Breusch-Pagan (BP) / Modeling and evaluation
- business case, regularization
- about / Business case
- business understanding / Business understanding
- data, understanding / Data understanding and preparation
- data, preparing / Data understanding and preparation
- business understanding, CRISP-DM process
- about / Business understanding
- business objective, identifying / Identify the business objective
- situation, assessing / Assess the situation
- analytical goals, determining / Determine the analytical goals
- project plan, producing / Produce a project plan
C
- Carbon Dioxide Information Analysis Center (CDIAC)
- URL / Business understanding
- caret package
- about / Elastic net
- URL / Elastic net
- classification methods
- classification models
- selecting / Model selection
- classification trees
- overview / Classification trees
- business case / Business case
- evaluation / Classification tree
- modeling / Classification tree
- cluster analysis
- business understanding / Business understanding
- data, understanding / Data understanding and preparation
- data, preparing / Data understanding and preparation
- with mixed data / Clustering with mixed data
- Cohen's Kappa statistic
- about / KNN modeling
- collaborative filtering
- about / An overview of a recommendation engine
- user-based collaborative filtering (UBCF) / User-based collaborative filtering
- item-based collaborative filtering (IBCF) / Item-based collaborative filtering
- singular value decomposition (SVD) / Singular value decomposition and principal components analysis
- principal components analysis (PCA) / Singular value decomposition and principal components analysis
- Cook's distance / Business understanding
- Corpus
- Cosine Similarity
- CRISP-DM process
- about / The process
- URL / The process
- business understanding / Business understanding
- data, understanding / Data understanding
- data preparation / Data preparation
- modeling / Modeling
- evaluation / Evaluation
- deployment / Deployment
- algorithm flowchart / Algorithm flowchart
- Cross-Entropy
- about / Neural network
- cross-validation
- for logistic regression / Logistic regression with cross-validation
- Cross Correlation Function (CCF)
- CRUTEM4 surface air temperature
- URL / Business understanding
D
- data frame
- creating / Data frames and matrices
- data preparation process
- about / Data preparation
- data understanding process
- about / Data understanding
- deep learning
- overview / Deep learning, a not-so-deep overview
- reference link / Deep learning, a not-so-deep overview, Modeling
- example / An example of deep learning
- H2O / H2O background
- deployment process / Deployment
- Dirichlet distribution
- about / Topic models
- Discriminant Analysis (DA)
- overview / Discriminant analysis overview
- Linear Discriminant Analysis (LDA) / Discriminant analysis overview
- Quadratic Discriminant Analysis (QDA) / Discriminant analysis overview
- application / Discriminant analysis application
- Document-Term Matrix (DTM)
- dynamic topic modelling
- about / Topic models
E
- ECLAT algorithms
- eigenvalues
- eigenvectors
- elastic net
- about / Elastic net
- using / Elastic net
- equimax
- about / Rotation
- Euclidean Distance
- about / K-Nearest Neighbors
- evaluation process
- about / Evaluation
- exponential smoothing models
- about / Univariate time series analysis
- Extract, Transform, and Load (ETL)
- about / Data understanding
F
- F-Measure
- about / Other quantitative analyses
- False Positive Rate (FPR)
- about / Model selection
- Feed Forward network
- about / Neural network
- Final Prediction Error (FPE)
- about / Granger causality
- Fine Needle Aspiration (FNA)
- about / Business understanding
- first principal component
- forward stepwise selection / Modeling and evaluation
G
- Gedeon Method
- about / Modeling
- glmnet package
- used, for performing cross-validation for regularization / Cross-validation with glmnet
- Gower
- gradient boosted trees
- about / Introduction
- gradient boosting
- overview / Gradient boosting
- reference link / Gradient boosting
- business case / Business case
- model selection / Model selection
- gradient boosting classification
- modeling / Gradient boosting classification
- evaluation / Gradient boosting classification
- gradient boosting regression
- evaluation / Gradient boosting regression
- modeling / Gradient boosting regression
- Granger causality
- about / Granger causality
- Graphical User Interface (GUI)
- about / Getting R up and running
H
- H2O
- about / H2O background
- URL / H2O background
- data, preparing / Data preparation and uploading it to H2O
- data, uploading / Data preparation and uploading it to H2O
- train dataset, creating / Create train and test datasets
- test dataset, creating / Create train and test datasets
- modeling / Modeling
- HadCRUT4 annual time series
- URL / Business understanding
- HadSST3 sea-surface datasets
- URL / Business understanding
- Hannan-Quinn Criterion (HQ)
- about / Examining the causality
- Hat Matrix / Modeling and evaluation
- heatmaps / Data understanding and preparation
- heteroscedasticity / Business understanding
- hierarchical clustering
- about / Hierarchical clustering
- distance calculations / Distance calculations
- modeling / Hierarchical clustering
- evaluation / Hierarchical clustering
- Holt-Winters Method
- about / Univariate time series analysis
I
- Integrated Development Environment (IDE)
- about / Getting R up and running
- interquartile range
- about / Hierarchical clustering
- item-based collaborative filtering (IBCF)
K
- K-fold cross-validation
- k-means clustering
- about / K-means clustering
- modeling / K-means clustering
- evaluation / K-means clustering
- K-Nearest Neighbors (KNN)
- about / K-Nearest Neighbors
- case study / Business case
- business understanding / Business understanding
- data, understanding / Data understanding and preparation
- data, preparing / Data understanding and preparation
- modeling / KNN modeling
- K-sets
- kernel trick
- about / Support Vector Machines
- KNN modeling
- versus SVM modeling / Model selection
L
- L1-norm
- about / LASSO
- L2-norm
- about / Ridge regression
- LASSO
- Latent Dirichlet Allocation (LDA)
- about / Topic models
- lazy learning
- about / K-Nearest Neighbors
- Leave-One-Out Cross-Validation (LOOCV)
- Leave-One-Out Cross-Validation (LOOCV) / Modeling and evaluation
- Linear Discriminant Analysis (LDA)
- about / Discriminant analysis overview
- linear model considerations
- about / Other linear model considerations
- qualitative feature / Qualitative feature
- interaction term / Interaction term
- linear regression
- linear regression model
- linearity / Business understanding
- non-correlation of errors / Business understanding
- homoscedasticity / Business understanding
- no collinearity / Business understanding
- presence of outliers / Business understanding
- logistic regression
- about / Logistic regression
- business understanding / Business understanding
- data, understanding / Data understanding and preparation
- data, preparing / Data understanding and preparation
- modeling / Modeling and evaluation
- evaluation / Modeling and evaluation
- with cross-validation / Logistic regression with cross-validation
- Discriminant Analysis (DA) / Discriminant analysis overview
- logistic regression model
- about / The logistic regression model
- loss function
- about / Gradient boosting
M
- Mallows's Cp (Cp) / Modeling and evaluation
- margin
- about / Support Vector Machines
- market basket analysis
- about / An overview of a market basket analysis
- business understanding / Business understanding
- data, understanding / Data understanding and preparation
- data, preparing / Data understanding and preparation
- modeling / Modeling and evaluation
- evaluation / Modeling and evaluation
- matrices
- creating / Data frames and matrices
- mean squared error (MSE)
- medoid
- about / PAM
- modeling process
- about / Modeling
- multivariate linear regression
- about / Multivariate linear regression
- business understanding / Business understanding
- data, understanding / Data understanding and preparation
- data, preparing / Data understanding and preparation
- modeling / Modeling and evaluation
- evaluation / Modeling and evaluation
N
- neural network
- about / Neural network
- business understanding / Business understanding
- reference link / Business understanding
- data, understanding / Data understanding and preparation
- data, preparing / Data understanding and preparation
- modeling / Modeling and evaluation
- evaluation / Modeling and evaluation
- Normal Q-Q plot / Business understanding
O
- OPRC / Modeling and evaluation
- OPSLAKE / Modeling and evaluation
- out-of-bag (oob)
- about / Random forest
P
- 2^p models
- about / Regularization in a nutshell
- Partial Autocorrelation Function (PACF)
- about / Univariate time series analysis
- Partitioning Around Medoids (PAM)
- about / Gower and partitioning around medoids, PAM
- Pearson Correlation Coefficient
- Polarity
- about / Other quantitative analyses
- Porter stemming algorithm
- Prediction Error Sum of Squares (PRESS) / Modeling and evaluation
- principal components
- overview / An overview of the principal components
- rotation / Rotation
- principal components analysis (PCA)
- Principal Components Analysis (PCA)
- about / Gower and partitioning around medoids
- business understanding / Business understanding
- data, understanding / Data understanding and preparation
- data, preparing / Data understanding and preparation
- modeling / Modeling and evaluation
- evaluation / Modeling and evaluation
- component, extraction / Component extraction
- orthogonal rotation / Orthogonal rotation and interpretation
- interpretation / Orthogonal rotation and interpretation
- factor scores, creating from components / Creating factor scores from the components
- regression analysis / Regression analysis
Q
- Quadratic Discriminant Analysis (QDA)
- about / Discriminant analysis overview
- Quantile-Quantile (Q-Q) / Business understanding
- quartimax
- about / Rotation
R
- R
- installing / Getting R up and running
- running / Getting R up and running
- URL / Getting R up and running
- using / Using R
- radical
- random forest
- about / Introduction
- overview / Random forest
- business case / Business case
- model selection / Model selection
- random forest classification
- modeling / Random forest classification
- evaluation / Random forest classification
- random forest regression
- evaluation / Random forest regression
- modeling / Random forest regression
- Receiver Operating Characteristic (ROC)
- about / Model selection
- reference link / Model selection
- Receiver Operating Characteristic Curves (ROC)
- recommendation engine
- overview / An overview of a recommendation engine
- collaborative filtering / An overview of a recommendation engine
- business understanding / Business understanding and recommendations
- data, understanding / Data understanding, preparation, and recommendations
- data, preparing / Data understanding, preparation, and recommendations
- modeling / Modeling, evaluation, and recommendations
- evaluation / Modeling, evaluation, and recommendations
- recommendations / Modeling, evaluation, and recommendations
- recommenderlab library
- regression trees
- overview / Regression trees
- business case / Business case
- modeling / Regression tree
- evaluation / Regression tree
- regularization
- about / Regularization in a nutshell
- ridge regression / Ridge regression
- LASSO / LASSO
- elastic net / Elastic net
- business case / Business case
- modeling / Modeling and evaluation
- evaluation / Modeling and evaluation
- cross-validation, performing with glmnet package / Cross-validation with glmnet
- model selection / Model selection
- regularization, modeling
- best subsets, creating / Best subsets
- ridge regression / Ridge regression
- LASSO, running / LASSO
- elastic net, using / Elastic net
- Residual Sum of Squares (RSS) / Univariate linear regression
- Residuals vs Leverage plot / Business understanding
- Restricted Boltzmann Machine
- ridge regression
- about / Ridge regression
- executing / Ridge regression
- Root Mean Square Error (RMSE)
- about / Elastic net
- R packages
- installing / Installing and loading the R packages
- loading / Installing and loading the R packages
- RStudio
- URL / Getting R up and running
S
- Schwarz-Bayes Criterion (SC)
- about / Examining the causality
- second principal component
- shrinkage penalty
- about / Regularization in a nutshell
- singular value decomposition (SVD)
- slack variables
- about / Support Vector Machines
- Sparse Coding Model
- summary stats
- displaying / Summary stats
- Sum of Squared Error
- about / Neural network
- sum of squared error (SSE)
- about / Univariate time series analysis
- Support Vector Machines (SVM)
- about / Support Vector Machines
- case study / Business case
- business understanding / Business understanding
- data, understanding / Data understanding and preparation
- data, preparing / Data understanding and preparation
- modeling / SVM modeling
- feature selection / Feature selection for SVMs
- suspected outliers
- about / Hierarchical clustering
- SVM modeling
- versus KNN modeling / Model selection
T
- Term-Document Matrix (TDM)
- text mining
- methods / Text mining framework and methods
- topic models / Topic models
- other quantitative analyses / Other quantitative analyses
- business understanding / Business understanding
- data, understanding / Data understanding and preparation
- data, preparing / Data understanding and preparation
- modeling / Modeling and evaluation
- evaluation / Modeling and evaluation
- word frequency, exploring / Word frequency and topic models
- topic models, building / Word frequency and topic models
- quantitative analysis, with qdap package / Additional quantitative analysis
- topic models
- about / Topic models
- tree-based learning
- about / Gradient boosting
- True Positive Rate (TPR)
- about / Model selection
U
- univariate linear regression
- about / Univariate linear regression
- business understanding / Business understanding
- univariate time series
- analyzing / Univariate time series analysis
- with bivariate regression / Bivariate regression
- analyzing, with Granger causality / Granger causality
- business understanding / Business understanding
- data, understanding / Data understanding and preparation
- data, preparing / Data understanding and preparation
- modeling / Modeling and evaluation
- evaluation / Modeling and evaluation
- forecasting / Univariate time series forecasting
- examining, with regression / Time series regression
- Granger causality, examining / Examining the causality
- user-based collaborative filtering (UBCF)
V
- valence shifters
- about / Other quantitative analyses
- Variance Inflation Factor (VIF) / Modeling and evaluation
- varimax
- about / Rotation
- Vector Autoregression (VAR)
- about / Granger causality
W
- whiskers
- about / Hierarchical clustering