Index
A
- ACF function
- about / Time series summary functions
- activators
- about / The biological neuron
- acyclic graph
- about / A little graph theory
- AdaBoost / AdaBoost
- AdaBoost, for binary classification
- adaptive boosting
- about / AdaBoost
- additive smoothing
- Akaike Information Criterion (AIC)
- algorithms
- building, to train decision trees / Algorithms for training decision trees
- analysis of variance
- ARCH models / Autoregressive conditional heteroscedasticity models
- ARIMA models / Autoregressive integrated moving average models
- ARMA model / Autoregressive moving average models
- artificial neural networks (ANNs)
- about / The biological neuron
- artificial neuron
- about / The artificial neuron
- atmospheric gamma ray radiation
- predicting / Predicting atmospheric gamma ray radiation
- Augmented Dickey-Fuller (ADF) test
- authenticity, of banknotes
- predicting / Predicting the authenticity of banknotes
- author-topic model
- about / LDA extensions
- autocorrelation function
- about / Time series summary functions
- autocovariance function
- about / Time series summary functions
- autoregressive models (AR) / Autoregressive models
- axon
- about / The biological neuron
- axon terminals
- about / The biological neuron
B
- backpropagation algorithm
- backward elimination
- about / Feature selection
- backward selection
- about / Feature selection
- bagging
- about / Bagging
- margin / Margins and out-of-bag observations
- out-of-bag observations / Margins and out-of-bag observations
- complex skill learning, predicting with / Predicting complex skill learning with bagging
- heart disease, predicting with / Predicting heart disease with bagging
- limitations / Limitations of bagging
- bagging, for binary classification
- Banknote Authentication data set
- batch machine learning model / Real-time and batch machine learning models
- Baum-Welch algorithm
- about / Hidden Markov models
- Bayesian Information Criterion (BIC)
- Bayesian networks
- defining / Bayesian networks
- Bayesian probability
- about / Learning from data
- Bayes' Theorem
- defining / Bayes' Theorem
- bias
- about / The biological neuron
- Big Data
- handling, in R / R and Big Data
- about / R and Big Data
- binary classification models
- assessing / Assessing binary classification models
- biological neuron
- about / The biological neuron
- boosting
- about / Boosting, Limitations of boosting
- AdaBoost / AdaBoost
- limitations / Limitations of boosting
- bootstrapped samples
- bootstrapping
- about / Bagging
- bootstrap resampling
- about / Bagging
- bootstrap sampling
- about / Bagging
- Box-Cox transformation / Feature transformations
- Brownian Motion
- about / Random walk
C
- C5.0 algorithm
- about / C5.0
- caret package
- CART classification trees
- about / CART classification trees
- CART methodology
- about / Classification and regression trees
- CART regression trees / CART regression trees
- tree pruning / Tree pruning
- missing data / Missing data
- CART regression trees / CART regression trees
- categorical features
- encoding / Encoding categorical features
- characteristic polynomial
- about / Moving average models
- chemical biodegradation
- predicting / Predicting chemical biodegradation
- chi-squared
- about / Model deviance
- classification metrics
- about / Classification metrics
- classification model / Regression and classification models
- classification models
- assessing / Assessing classification models
- binary classification models, assessing / Assessing binary classification models
- class membership
- predicting, on synthetic 2D data / Predicting class membership on synthetic 2D data
- clustering
- coefficients
- interpreting, in logistic regression / Interpreting coefficients in logistic regression
- collaborative filtering
- about / Collaborative filtering
- user-based collaborative filtering / User-based collaborative filtering
- item-based collaborative filtering / Item-based collaborative filtering
- complex skill learning
- predicting, with boosting / Predicting complex skill learning with boosting
- complex skills
- predicting / Predicting complex skill learning
- conditional independence
- defining / Conditional independence
- confidence interval
- confusion matrix
- about / Assessing classification models
- Correlated Topic Model (CTM)
- correlogram
- cost-complexity tuning
- about / Tree pruning
- cost function
- about / Stochastic gradient descent
- CPU performance
- predicting / Predicting CPU performance
- credit scores
- predicting / Predicting credit scores
- cross-validation
- about / Feature transformations, Cross-validation
- cycle
- about / A little graph theory
D
- data, pre-processing
- about / Preprocessing the data
- exploratory data analysis / Exploratory data analysis
- feature transformations / Feature transformations
- categorical features, encoding / Encoding categorical features
- data, missing value / Missing data
- outliers / Outliers
- problematic features, removing / Removing problematic features
- data, recommendation systems
- loading / Loading and preprocessing the data
- preprocessing / Loading and preprocessing the data
- exploring / Exploring the data
- binary top-N recommendations, evaluating / Evaluating binary top-N recommendations
- non-binary top-N recommendations, evaluating / Evaluating non-binary top-N recommendations
- individual predictions, evaluating / Evaluating individual predictions
- data columns
- about / Predicting CPU performance
- data set
- LDA_VEM / Modeling the topics of online news stories
- LDA_VEM_? / Modeling the topics of online news stories
- LDA_GIB / Modeling the topics of online news stories
- CTM_VEM / Modeling the topics of online news stories
- data sets
- decision tree models
- about / The intuition for tree models
- decision trees
- training, via building algorithms / Algorithms for training decision trees
- dendrites
- about / The biological neuron
- deviance
- about / Model deviance
- dimensionality reduction / Feature engineering and dimensionality reduction
- directed acyclic graph (DAG)
- about / A little graph theory
- directed graph
- about / A little graph theory
- Dirichlet distribution / The Dirichlet distribution
- Discrete AdaBoost
- about / AdaBoost
- discrete white noise
- about / White noise
- document term matrix
- dynamic programming
- about / Hidden Markov models
E
- Ecotect / Predicting the energy efficiency of buildings
- emitted symbol
- about / Hidden Markov models
- energy efficiency, of buildings
- predicting / Predicting the energy efficiency of buildings
- Energy Efficiency data set
- entropy
- about / C5.0
- expectation
- Expectation Maximization (EM) algorithm
- about / Fitting an LDA model
- exploratory data analysis / Exploratory data analysis
- exponential smoothing
- about / Other time series models
- extensions, binary logistic classifier
- about / Extensions of the binary logistic classifier
- multinomial logistic regression / Multinomial logistic regression
- ordinal logistic regression / Ordinal logistic regression
F
- false negative
- false positive
- false positive rate (FPR) / Evaluating binary top-N recommendations
- feature engineering / Feature engineering and dimensionality reduction
- features
- selecting / Feature selection
- features, data set
- features, used cars
- Price / Predicting the price of used cars
- Mileage / Predicting the price of used cars
- Cylinder / Predicting the price of used cars
- Doors / Predicting the price of used cars
- Cruise / Predicting the price of used cars
- Sound / Predicting the price of used cars
- Leather / Predicting the price of used cars
- Buick / Predicting the price of used cars
- Cadillac / Predicting the price of used cars
- Chevy / Predicting the price of used cars
- Pontiac / Predicting the price of used cars
- Saab / Predicting the price of used cars
- Saturn / Predicting the price of used cars
- convertible / Predicting the price of used cars
- coupe / Predicting the price of used cars
- hatchback / Predicting the price of used cars
- sedan / Predicting the price of used cars
- wagon / Predicting the price of used cars
- feature transformations / Feature transformations
- feedforward neural network
- about / Multilayer perceptron networks
- foreign exchange rates
- predicting / Predicting foreign exchange rates
- forward selection
- about / Feature selection
- frequency domain methods
- about / Other time series models
- Frequentist probability
- about / Learning from data
- F statistic
G
- gamma function
- about / The Dirichlet distribution
- GARCH models / Generalized autoregressive heteroscedasticity models
- generalized linear models (GLMs) / Generalized linear models
- generative models
- generative process
- about / The generative process
- working / The generative process
- German Credit Dataset
- about / Predicting credit scores
- URL / Predicting credit scores
- Gini index
- calculating / CART classification trees
- glass type revisited
- predicting / Predicting glass type revisited
- gradient descent
- about / Stochastic gradient descent
- graphical models
- about / A little graph theory
- graphs
- about / A little graph theory
- graph theory
- about / A little graph theory
H
- handwritten digits
- predicting / Predicting handwritten digits
- URL / Predicting handwritten digits
- ROC curve / Receiver operating characteristic curves
- HMM
- defining / Hidden Markov models
- hyperplanes
- about / Maximal margin classification
- property / Maximal margin classification
I
- ID3
- about / C5.0
- Independence of Irrelevant Alternatives (IIA)
- about / Multinomial logistic regression
- independent and identically distributed (iid)
- information statistic
- about / C5.0
- inhibitors
- about / The biological neuron
- inner products / Inner products
- intense earthquakes
- predicting / Predicting intense earthquakes
- intercept
- interquartile range
- about / Residual analysis
- invertible
- about / Moving average models
- item-based collaborative filtering
J
- J48
- about / C5.0
K
- k-fold cross-validation
- about / Cross-validation
- k-nearest neighbors / Our first model: k-nearest neighbors
- Kappa statistic
- defining / Assessing classification models
- kernel functions
- kernels
- kNN
- Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test
L
- Laplacian smoothing
- LDA
- defining / Latent Dirichlet Allocation
- Dirichlet distribution / The Dirichlet distribution
- generative process / The generative process
- model, fitting / Fitting an LDA model
- LDA extensions / LDA extensions
- LDA model
- fitting / Fitting an LDA model
- training / Modeling the topics of online news stories
- least absolute shrinkage / Least absolute shrinkage and selection operator (lasso)
- letter patterns
- predicting, in English words / Predicting letter patterns in English words
- Likert scale
- about / Ordinal logistic regression
- linear kernel
- linear regression
- about / Introduction to linear regression
- assumptions / Assumptions of linear regression
- classifying with / Classifying with linear regression
- linear regression models
- assessing / Assessing linear regression models
- residual analysis / Residual analysis
- tests, used for / Significance tests for linear regression
- performance metrics / Performance metrics for linear regression
- comparing / Comparing different regression models
- outliers / Outliers
- link function
- about / Generalized linear models
- local kernel
- log-odds
- about / Generalized linear models
- logistic function
- logistic neuron / The logistic neuron
- logistic regression
- about / Introduction to logistic regression
- generalized linear models (GLMs) / Generalized linear models
- coefficients, interpreting / Interpreting coefficients in logistic regression
- assumptions / Assumptions of logistic regression
- maximum likelihood estimation / Maximum likelihood estimation
- logistic regression models
- assessing / Assessing logistic regression models
- model deviance / Model deviance
- test set performance / Test set performance
- logit function
- about / Generalized linear models
- lynx trappings
- predicting / Predicting lynx trappings
M
- M5
- about / Regression model trees
- magic
- MAGIC Gamma Telescope data set
- margin
- about / Maximal margin classification
- Markov Chain Monte Carlo (MCMC)
- about / Fitting an LDA model
- matrix, recommendation systems
- rating / Rating matrix
- user similarity, measuring / Measuring user similarity
- Matrix Market format
- maximal margin classification
- about / Maximal margin classification
- maximal margin hyperplane
- about / Maximal margin classification
- maximum likelihood estimation / Maximum likelihood estimation
- McCulloch-Pitts model of a neuron
- about / The artificial neuron
- mean
- mean absolute error (MAE) / Evaluating individual predictions
- mean function
- mean squared error (MSE) / Evaluating individual predictions
- Mean Square Error (MSE)
- about / Assessing regression models
- mean square error (MSE)
- median
- about / Residual analysis
- Missing At Random (MAR)
- about / Missing data
- Missing Completely At Random (MCAR)
- about / Missing data
- missing data / Missing data
- Missing Not At Random (MNAR)
- about / Missing data
- missing values
- handling / Missing data
- mixed selection
- about / Feature selection
- MLP network
- about / Multilayer perceptron networks
- characteristic / Multilayer perceptron networks
- advantages / Multilayer perceptron networks
- training / Training multilayer perceptron networks
- evaluating, for regression / Evaluating multilayer perceptrons for regression
- model
- task requirements / Picking a model
- selecting / Repeating with different models and final model selection
- model, deploying
- guidelines / Deploying the model
- model deviance / Model deviance
- model parameters
- tuning, in CART trees / Tuning model parameters in CART trees
- variable importance, in tree models / Variable importance in tree models
- regression model trees / Regression model trees in action
- models
- about / Models
- data, defining / Learning from data
- components / The core components of a model
- k-nearest neighbors / Our first model: k-nearest neighbors
- model types
- about / Types of models
- supervised model / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- unsupervised model / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- semi-supervised model / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- reinforcement learning model / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- parametric model / Parametric and nonparametric models
- nonparametric model / Parametric and nonparametric models
- regression model / Regression and classification models
- classification model / Regression and classification models
- real-time machine learning model / Real-time and batch machine learning models
- batch machine learning model / Real-time and batch machine learning models
- moving average (MA)
- about / Moving average models
- moving average models / Moving average models
- multi-class classification, with support vector machines
- multinom() function
- about / Predicting glass type
- multinomial logistic regression
- about / Multinomial logistic regression
- glass type, predicting / Predicting glass type
- multiple linear regression
- about / Introduction to linear regression, Multiple linear regression
- CPU performance, predicting / Predicting CPU performance
- price of used cars, predicting / Predicting the price of used cars
N
- Naïve Bayes Classifier
- about / The Naïve Bayes classifier
- movie reviews, predicting / Predicting the sentiment of movie reviews
- nodes
- non-stationary time series models
- about / Non-stationary time series models
- ARIMA models / Autoregressive integrated moving average models
- ARCH models / Autoregressive conditional heteroscedasticity models
- GARCH models / Generalized autoregressive heteroscedasticity models
- nonparametric model / Parametric and nonparametric models
- nucleotides
- nucleus
- about / The biological neuron
- null deviance
- about / Model deviance
- null model
O
- odds ratio
- about / Generalized linear models
- options, feature scaling
- about / Feature transformations
- Z-score normalization / Feature transformations
- unit interval / Feature transformations
- Box-Cox transformation / Feature transformations
- ordered factor
- about / Predicting glass type
- ordinal logistic regression
- about / Predicting glass type, Ordinal logistic regression
- wine quality, predicting / Predicting wine quality
- out-of-bag (OOB)
- out-of-bag observations / Margins and out-of-bag observations
- outliers / Outliers, Outliers
- Overfitting
- about / Training and assessing the model
- limitations / Random forests
P
- p-value
- parametric model / Parametric and nonparametric models
- partial autocorrelation function (PACF)
- about / Autoregressive models
- Perception Action Cycles (PACs)
- perceptron algorithm
- performance metrics
- about / Performance metrics
- regression models, assessing / Assessing regression models
- classification models, assessing / Assessing classification models
- performance metrics, for linear regression / Performance metrics for linear regression
- pocket perceptron algorithm
- about / The perceptron algorithm
- input / The perceptron algorithm
- output / The perceptron algorithm
- method / The perceptron algorithm
- polynomial kernel
- Porter Stemmer
- post-pruning
- about / Tree pruning
- Precision
- predictive modeling
- process / The process of predictive modeling
- objective, defining / Defining the model's objective
- data, collecting / Collecting the data
- model, selecting / Picking a model, Repeating with different models and final model selection
- data, pre-processing / Preprocessing the data
- feature engineering / Feature engineering and dimensionality reduction
- dimensionality reduction / Feature engineering and dimensionality reduction
- model, training / Training and assessing the model
- model, assessing / Training and assessing the model
- model, deploying / Deploying the model
- price, of used cars
- predicting / Predicting the price of used cars
- Principal Component Analysis (PCA)
- Principles and Practice of Knowledge Discovery in Databases
- probabilistic graphical models
- about / A little graph theory
- probability
- about / Learning from data
- promoter gene sequences
- predicting / Predicting promoter gene sequences
- proportional odds
- about / Ordinal logistic regression
- pruning
- about / Tree pruning
Q
- Q-Q plots
- about / Residual analysis
- QSAR biodegradation
- Quantile-Quantile plot (Q-Q plot)
- about / Residual analysis
R
- R
- regularization, implementing / Implementing regularization in R
- radial basis function kernel
- radial kernel
- random forest
- about / Random forests
- variables, defining / The importance of variables in random forests
- random walk
- about / Random walk
- fitting / Fitting a random walk
- real-time machine learning model / Real-time and batch machine learning models
- real-time strategy (RTS)
- Recall
- recommendation systems
- Big Data, handling, in R / R and Big Data
- building, for movies and jokes / Predicting recommendations for movies and jokes
- other approaches / Other approaches to recommendation systems
- recursive partitioning
- about / CART regression trees
- regression coefficients
- about / Introduction to linear regression
- estimating / Estimating the regression coefficients
- regression model / Regression and classification models
- regression models
- assessing / Assessing regression models
- regression model trees
- about / Regression model trees
- drawbacks / Regression model trees
- regularization
- about / Regularization
- ridge regression / Ridge regression
- least absolute shrinkage / Least absolute shrinkage and selection operator (lasso)
- selection operator (lasso) / Least absolute shrinkage and selection operator (lasso)
- implementing, in R / Implementing regularization in R
- with lasso / Regularization with the lasso
- reinforcement learning model / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- residual analysis / Residual analysis
- residual sum of squares (RSS)
- ridge regression / Ridge regression
- ROC Area Under the Curve (ROC AUC) / Receiver operating characteristic curves
- ROC curve / Receiver operating characteristic curves
- root mean square error (RMSE) / Evaluating individual predictions
- Root Mean Square Error (RMSE)
- about / Assessing regression models
- root node
- about / The intuition for tree models
S
- selection operator (lasso) / Least absolute shrinkage and selection operator (lasso)
- semi-supervised model / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- sensitivity
- about / Classification metrics
- sentiment analysis
- Sentiment node
- simple linear regression model
- about / Introduction to linear regression, Simple linear regression
- advantages / Simple linear regression
- regression coefficients, estimating / Estimating the regression coefficients
- singular value decomposition (SVD)
- about / Singular value decomposition
- Singular Value Decomposition (SVD)
- about / Removing problematic features
- SkillCraft
- SkillCraft1 Master Table data set
- slack variables
- about / Support vector classification
- softmax function
- about / Multinomial logistic regression
- spectral methods
- about / Other time series models
- splines
- stationarity
- about / Stationarity
- stationary time series models
- about / Stationary time series models
- moving average models / Moving average models
- autoregressive models (AR) / Autoregressive models
- ARMA model / Autoregressive moving average models
- Statlog (Heart) data set
- working with / Predicting heart disease
- step function
- about / The artificial neuron
- stepwise regression
- about / Feature selection
- stochastic gradient boosting
- stochastic gradient descent
- defining / Stochastic gradient descent
- about / Stochastic gradient descent
- gradient descent / Gradient descent and local minima
- local minima / Gradient descent and local minima
- perceptron algorithm / The perceptron algorithm, Linear separation
- logistic neuron / The logistic neuron
- stochastic model
- about / Learning from data
- example / Learning from data
- stochastic process
- stump
- about / Boosting
- Stuttgart Neural Network Simulator (SNNS) / Predicting handwritten digits
- sum of squared error (SSE) / Predicting the energy efficiency of buildings
- about / CART regression trees
- Sum of Squared Error (SSE)
- about / Assessing regression models
- sum of squared errors (SSE)
- supervised model / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- support vector classification
- about / Support vector classification
- inner products / Inner products
- support vector machines
- Symmetric Dirichlet Distribution
- about / The Dirichlet distribution
- synaptic neurotransmitters
- about / The biological neuron
- synthetic 2D data
- class membership, predicting on / Predicting class membership on synthetic 2D data
T
- tests, for linear regression
- test set
- about / Training and assessing the model
- time-domain methods
- about / Other time series models
- time series
- defining / Fundamental concepts of time series
- summary functions / Time series summary functions
- examples / Some fundamental time series
- white noise / White noise
- random walk / Random walk
- time series models
- defining / Other time series models
- topic modeling
- about / An overview of topic modeling
- topics, of online news stories
- modeling / Modeling the topics of online news stories
- model stability / Model stability
- number of topics, finding / Finding the number of topics
- topic distribution / Topic distributions
- word distributions / Word distributions
- LDA extensions / LDA extensions
- total sum of squares (TSS)
- about / Model deviance
- training set
- about / Training and assessing the model
- tree models
- defining / The intuition for tree models
- tree pruning / Tree pruning
- trend
- about / Other time series models
- true negatives
- true positives
- Type I error
- Type II error
U
- UCI Machine Learning Repository
- URL / Predicting heart disease
- unit interval / Feature transformations
- unit root tests
- unsupervised model / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- user-based collaborative filtering
V
- Variational Expectation Maximization (VEM)
- Viterbi algorithm
- about / Hidden Markov models
W
- wavelet transform
- weight
- about / The biological neuron
- white noise time series
- about / White noise
- fitting / Fitting a white noise time series
- wine quality
- URL / Predicting wine quality
- word cloud
- about / Word distributions
- word distributions / Word distributions
Z
- Z-score normalization / Feature transformations