Index
A
- activation function / The biological neuron
- activators / The biological neuron
- acyclic graph / A little graph theory
- AdaBoost / AdaBoost
- for binary classification / AdaBoost for binary classification
- adaptive boosting / AdaBoost
- additive smoothing / Predicting the sentiment of movie reviews
- adjacency matrix / A little graph theory
- adjusted R2 / Comparing different regression models
- Akaike Information Criterion (AIC) / Comparing different regression models
- alternative approach
- about / Alternatives
- chunking / Chunking
- alternative language integrations / Alternative language integrations, Summary
- analysis of variance / Significance tests for linear regression
- artificial neural networks (ANNs) / Improvements to the M5 model, The biological neuron, Artificial neural networks
- reference link / Artificial neural networks
- artificial neuron / The artificial neuron
- association / Correlated data analyses
- atmospheric gamma ray radiation
- prediction, example / Predicting atmospheric gamma ray radiation
- attributes / Learning from data
- author-topic model / LDA extensions
- axon / The biological neuron
- axon terminals / The biological neuron
B
- backpropagation algorithm / The back propagation algorithm, Training multilayer perceptron networks
- backward selection / Feature selection
- bagging
- about / Bagging
- out-of-bag observations (OOB) / Margins and out-of-bag observations
- margin / Margins and out-of-bag observations
- complex skill learning, prediction example / Predicting complex skill learning with bagging
- heart disease, prediction example / Predicting heart disease with bagging
- limitations / Limitations of bagging
- banknote authentication dataset
- reference link / Predicting the authenticity of banknotes
- banknotes
- authenticity, predicting / Predicting the authenticity of banknotes
- batch learning / Real-time and batch machine learning models
- Baum-Welch algorithm / Predicting the sentiment of movie reviews
- Bayes' theorem / Bayes' theorem
- Bayesian Information Criterion (BIC) / Comparing different regression models
- Bayesian networks / Bayesian networks
- Bayesian probability / Learning from data
- better prediction / Experience
- data of scale / Data of scale – big data
- bias / The biological neuron
- bias-variance tradeoff / Training and assessing the model
- bidirectional elimination / Feature selection
- big data / Data of scale – big data
- characteristics / Characteristics of big data
- volume / Volume
- varieties / Varieties
- sources and spans / Sources and spans
- structure / Structure
- statistical noise / Statistical noise
- binary logistic classifier
- extensions / Extensions of the binary logistic classifier
- multinomial logistic regression / Multinomial logistic regression
- binary threshold neuron / The logistic neuron
- biological neuron / The biological neuron
- boosting
- about / Boosting
- AdaBoost / AdaBoost
- limitations / Limitations of boosting
- bootstrapped samples / Margins and out-of-bag observations
- Box-Cox transformation / Feature transformations
C
- C4.5 algorithm / C5.0
- C5.0 algorithm / C5.0
- CART classification trees / CART classification trees
- chemical biodegradation
- prediction, example / Predicting chemical biodegradation
- child / Bayesian networks
- classification and regression tree (CART) / Classification and regression trees
- CART regression trees / CART regression trees
- tree pruning / Tree pruning
- data, missing / Missing data
- classification metrics / Classification metrics
- classification models / Regression and classification models
- accessing / Assessing classification models
- binary classification models, assessing / Assessing binary classification models
- class membership
- predicting, on synthetic 2D data / Predicting class membership on synthetic 2D data
- clustering / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- cold start problem / Other approaches to recommendation systems
- collaborative filtering / Collaborative filtering
- user-based collaborative filtering / User-based collaborative filtering
- item-based collaborative filtering / Item-based collaborative filtering
- collinearity / Multicollinearity
- complex skill learning
- predicting / Predicting complex skill learning
- model parameters, tuning in CART trees / Tuning model parameters in CART trees
- variable importance, in tree models / Variable importance in tree models
- regression model trees, in action / Regression model trees in action
- predicting, with boosting / Predicting complex skill learning with boosting
- conditional independence / Conditional independence
- conditionally independent / Conditional independence
- conditional probability / Bayes' theorem
- confirmatory factor analysis (CFA) / Explore and confirm
- content-based recommendation systems / Other approaches to recommendation systems
- convex function / Gradient descent and local minima
- Cook's distance / Outliers
- corpus / Predicting the sentiment of movie reviews
- correlated data / Correlated data analyses
- Correlated Topic Model (CTM) / The generative process, Modeling the topics of online news stories
- correlation / Performance metrics for linear regression
- correlation coefficient / Scatterplots
- correlation, for quantifiable data / Scatterplots
- cosine distance / Measuring user similarity
- cost-complexity tuning / Tree pruning
- cost function / Stochastic gradient descent
- covariance / Estimating the regression coefficients
- credit scores
- prediction, example / Predicting credit scores
- cross-validation / Cross-validation
- curse of dimensionality / Feature engineering and dimensionality reduction
- cycle / A little graph theory
D
- 3D Data Management
- reference link / Data of scale – big data
- data
- pre-processing / Pre-processing the data, Loading and pre-processing the data
- exploratory data analysis / Exploratory data analysis
- feature transformations / Feature transformations
- categorical features, encoding / Encoding categorical features
- missing / Missing data
- outliers / Outliers
- problematic features, removing / Removing problematic features
- loading / Loading and pre-processing the data
- exploring / Exploring the data
- binary top-N recommendations, evaluating / Evaluating binary top-N recommendations
- non-binary top-N recommendations, evaluating / Evaluating non-binary top-N recommendations
- individual predictions, evaluating / Evaluating individual predictions
- data profiling / Principal component analysis
- data quality
- categorizing / Categorizing data quality
- steps / The first step, The next step, The final step
- data quality assurance (DQA) / Categorizing data quality
- dataset
- URL, for downloading / Predicting heart disease
- datasets
- reference link / Modeling the topics of online news stories
- data sparsity / Feature engineering and dimensionality reduction
- decision boundaries / Training and assessing the model
- decision tree / The intuition for tree models
- regression model trees / Regression model trees
- CART classification trees / CART classification trees
- decision trees
- algorithms, for training / Algorithms for training decision trees
- classification and regression tree (CART) / Classification and regression trees
- deep learning
- about / Machine learning or deep learning, What is deep learning?
- alternative, to manual instruction / An alternative to manual instruction
- importance / Growing importance
- deeper data / Deeper data?
- for IOT / Deep learning for IoT
- use cases / Use cases
- implementations / Implementations
- architectures / Deep learning architectures
- unsupervised pre-trained neural networks / Deep learning architectures
- convolutional neural networks / Deep learning architectures
- recursive neural networks / Deep learning architectures
- deep structured learning
- about / What is deep learning?
- delta update rule / Training multilayer perceptron networks
- dendrites / The biological neuron
- dependence / Correlated data analyses
- dependent variable / Learning from data
- deploying to production / Deploying the model
- descendant / Bayesian networks
- deterministic model / Learning from data
- deviance / Model deviance
- deviance residual / Model deviance
- dimensionality reduction (DR)
- about / Defining DR
- defining / Defining DR
- correlated data analyses / Correlated data analyses
- scatterplots / Scatterplots
- causation / Causation
- degree of correlation / The degree of correlation
- correlation, reporting / Reporting on correlation
- principal component analysis (PCA) / Principal component analysis
- R, used for PCA / Using R to understand PCA
- independent component analysis (ICA) / Independent component analysis
- independence, defining / Defining independence
- ICA pre-processing / ICA pre-processing
- factor analysis / Factor analysis
- explore / Explore and confirm
- confirm / Explore and confirm
- R programming language, used for factor analysis / Using R for factor analysis
- output / The output
- Non-negative matrix factorization (NMF, NNMF) / NNMF
- dimensionality reduction / Feature engineering and dimensionality reduction
- dimensions / Learning from data
- direct descendant / Bayesian networks
- directed acyclic graph (DAG) / A little graph theory
- directed graph / A little graph theory
- Dirichlet distribution / Latent Dirichlet Allocation
- Discrete AdaBoost / AdaBoost
- document term matrix / Predicting the sentiment of movie reviews
- dynamic programming / Predicting the sentiment of movie reviews
E
- edges / A little graph theory
- emission probability matrix / Predicting the sentiment of movie reviews
- emitted symbol / Predicting the sentiment of movie reviews
- energy efficiency dataset
- reference link / Predicting the energy efficiency of buildings
- multilayer perceptrons, evaluating for regression / Evaluating multilayer perceptrons for regression
- energy efficiency of buildings
- predicting / Predicting the energy efficiency of buildings
- entropy / Predicting glass type revisited, C5.0
- epoch / The perceptron algorithm
- Euclidean distance / Our first model – k-nearest neighbors, Measuring user similarity
- expectation / Estimating the regression coefficients
- Expectation Maximization (EM) algorithm / Fitting an LDA model
- experimental design / Assumptions of linear regression
- explicit feedback / Other approaches to recommendation systems
- exploratory data analysis / Exploratory data analysis
- exploratory factor analysis (EFA) / Explore and confirm
- Extreme Gradient Boosting (XGBoost) / XGBoost
F
- factor analysis / Principal component analysis, Factor analysis
- false positive rate (FPR) / Evaluating binary top-N recommendations
- features / Learning from data
- feature selection / Feature engineering and dimensionality reduction, Feature selection
- feedforward neural network / Multilayer perceptron networks
- forward propagation / Multilayer perceptron networks
- forward selection / Feature selection
- Frequentist probability / Learning from data
- F statistic / Significance tests for linear regression
- functional form / Parametric and nonparametric models
G
- Games Won and Merchandise Sold / Scatterplots
- gamma function / The Dirichlet distribution
- generalized linear models (GLMs) / Generalized linear models
- generative models / Predicting letter patterns in English words
- generative process / The generative process
- gene transcription / Predicting promoter gene sequences
- German Credit Dataset
- reference link / Predicting credit scores
- Gini index / CART classification trees
- glass type revisited
- predicting / Predicting glass type revisited
- gradient descent / Stochastic gradient descent
- graphical models / A little graph theory
- graphs / A little graph theory
- graph theory
- about / A little graph theory
- reference link / A little graph theory
- greedy algorithm / Feature selection
H
- handwritten digits
- predicting / Predicting handwritten digits
- receiver operating characteristic (ROC) / Receiver operating characteristic curves
- head vertex / A little graph theory
- heart disease
- prediction, example / Predicting heart disease
- heteroskedastic errors / Assumptions of linear regression
- hidden layer / Radial basis function networks
- hidden layers / The logistic neuron
- Hidden Markov model (HMM) / Predicting the sentiment of movie reviews
- hidden states / Predicting the sentiment of movie reviews
- hierarchical learning
- about / What is deep learning?
- high influence / Outliers
- high leverage points / Outliers
- homoscedasticity / Assumptions of linear regression
I
- ID3 / C5.0
- implementations / Implementations
- implicit feedback / Other approaches to recommendation systems
- independence
- Minimization of mutual information (MMI) / Defining independence
- Non-Gaussianity Maximization (NGM) / Defining independence
- Independence of Irrelevant Alternatives (IIA) / Multinomial logistic regression
- independent and identically distributed (iid) / Margins and out-of-bag observations
- independent variables / Learning from data
- inflection / Predicting the sentiment of movie reviews
- information gain / C5.0
- information gain ratio / C5.0
- information statistic / C5.0
- inhibitors / The biological neuron
- inner node / The intuition for tree models
- inner products / Inner products
- input layer / The logistic neuron, Radial basis function networks
- intercept / Introduction to linear regression
- internet of things (IoT) / Deep learning for IoT
- interquartile range / Residual analysis
- invariance property of maximum likelihood / Assumptions of logistic regression
- irreducible error / Learning from data
- item-based collaborative filtering / Collaborative filtering
J
- J48 / C5.0
- Jaccard similarity / Measuring user similarity
- Jester system
- references / Predicting recommendations for movies and jokes
K
- k-nearest neighbors / Our first model – k-nearest neighbors
- kernel function / Kernels and support vector machines
- kernels / Kernels and support vector machines
- k neighbors / Our first model – k-nearest neighbors
- knowledge-based recommendation systems / Other approaches to recommendation systems
L
- Laplacian smoothing / Predicting the sentiment of movie reviews
- lasso / Least absolute shrinkage and selection operator (lasso)
- Latent Dirichlet Allocation (LDA) / Latent Dirichlet Allocation
- Dirichlet distribution / The Dirichlet distribution
- generative process / The generative process
- fitting / Fitting an LDA model
- latent states / Predicting the sentiment of movie reviews
- lazy learner / Our first model – k-nearest neighbors
- leaf nodes / The intuition for tree models
- learning curves / Collecting the data, Learning curves
- ping / Plot and ping
- plot / Plot and ping
- learning rate / Stochastic gradient descent
- left child / The intuition for tree models
- left subtree / The intuition for tree models
- lemma / Predicting the sentiment of movie reviews
- likelihood / Maximum likelihood estimation
- Likert scale / Ordinal logistic regression
- linear combinations / Removing problematic features
- linear kernel / Kernels and support vector machines
- linearly separable / Linear separation
- linear neuron / The logistic neuron
- linear regression
- about / Introduction to linear regression
- assumptions / Assumptions of linear regression
- issues / Problems with linear regression
- multicollinearity / Multicollinearity
- outlier / Outliers
- classifying / Classifying with linear regression
- linear regression models
- assessing / Assessing linear regression models
- residual analysis / Residual analysis
- significance tests / Significance tests for linear regression
- confidence interval / Significance tests for linear regression
- performance metrics / Performance metrics for linear regression
- comparing / Comparing different regression models
- test set performance / Test set performance
- link function / Generalized linear models
- loadings / Feature engineering and dimensionality reduction
- local kernel / Kernels and support vector machines
- local Markov property / Bayesian networks
- log-odds / Generalized linear models
- logistic function / Introduction to logistic regression
- logistic neuron / The logistic neuron
- logistic regression
- about / Introduction to logistic regression
- generalized linear models (GLMs) / Generalized linear models
- coefficients, interpreting / Interpreting coefficients in logistic regression
- assumptions / Assumptions of logistic regression
- likelihood estimation, maximizing / Maximum likelihood estimation
- logistic regression models
- assessing / Assessing logistic regression models
- model deviance / Model deviance
- test set performance / Test set performance
- logit function / Generalized linear models
- log likelihood / Maximum likelihood estimation
M
- M5 algorithm / Regression model trees
- M5 model
- improvements / Improvements to the M5 model
- Global optimization / Improvements to the M5 model
- Greedy searching / Improvements to the M5 model
- M5opt / Improvements to the M5 model
- machine learning / Machine learning or deep learning
- Major Atmospheric Gamma Imaging Cherenkov (MAGIC) / Limitations of boosting
- Mallows' Cp / Comparing different regression models
- margin / Maximal margin classification, Margins and out-of-bag observations
- Markov Chain Monte Carlo (MCMC) / Fitting an LDA model
- matrix energy / Singular value decomposition
- Matrix Market format / Modeling the topics of online news stories
- maximal margin classification / Maximal margin classification
- maximal margin hyperplane / Maximal margin classification
- McCulloch-Pitts model / The artificial neuron
- mean / Estimating the regression coefficients
- mean average error (MAE) / Evaluating individual predictions
- mean function / Generalized linear models
- mean squared error (MSE) / Evaluating individual predictions, Assessing regression models
- median / Residual analysis
- memory-based collaborative filtering / Collaborative filtering
- merely states / Predicting the sentiment of movie reviews
- meta parameter / Ridge regression
- Missing At Random (MAR) / Missing data
- Missing Completely At Random (MCAR) / Missing data
- Missing Not At Random (MNAR) / Missing data
- mixed selection / Feature selection
- MNIST database
- reference link / Predicting handwritten digits
- model-based collaborative filtering / Collaborative filtering
- model overfitting / Training and assessing the model
- models
- about / Models
- data, learning / Learning from data
- core components / The core components of a model
- k-nearest neighbors / Our first model – k-nearest neighbors
- types / Types of model
- supervised models / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- unsupervised models / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- semi-supervised models / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- reinforcement learning models / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- parametric models / Parametric and nonparametric models
- nonparametric models / Parametric and nonparametric models
- classification models / Regression and classification models
- regression models / Regression and classification models
- real-time machine learning models / Real-time and batch machine learning models
- batch machine learning models / Real-time and batch machine learning models
- training, at scale / Training models at scale
- pain, by phase / Pain by phase
- specific challenges / Specific challenges
- heterogeneity / Heterogeneity
- scale / Scale
- location / Location
- timeliness / Timeliness
- privacy / Privacy
- collaborations / Collaborations
- reproducibility / Reproducibility
- path forward / A path forward
- opportunities / Opportunities
- bigger data / Bigger data, bigger hardware
- bigger hardware / Bigger data, bigger hardware
- breaking up / Breaking up
- sampling / Sampling
- aggregation / Aggregation
- dimensional reduction / Dimensional reduction
- molecular biology
- reference link / Predicting promoter gene sequences
- morphological lemma / Predicting the sentiment of movie reviews
- multiclass classification
- with support vector machine / Multiclass classification with support vector machines
- multicollinearity / Multicollinearity
- multilayer perceptron (MLP) / Multilayer perceptron networks
- multilayer perceptron networks / Multilayer perceptron networks
- training / Training multilayer perceptron networks
- multinomial distribution / The Dirichlet distribution
- multinomial logistic regression / Multinomial logistic regression
- about / Multinomial logistic regression
- glass type, predicting / Predicting glass type
- multiple linear regression / Introduction to linear regression
- about / Multiple linear regression
- CPU performance, predicting / Predicting CPU performance
- cars price, predicting / Predicting the price of used cars
N
- named entity recognition / Predicting the sentiment of movie reviews
- Natural language processing (NLP) / Word embedding
- Naïve Bayes classifier
- about / The Naïve Bayes classifier
- sentiment of reviews, prediction example / Predicting the sentiment of movie reviews
- promoter gene sequences, prediction example / Predicting promoter gene sequences
- letter patterns, prediction example / Predicting letter patterns in English words
- Negative Binomial regression / Negative Binomial regression
- Netflix learns / Netflix learns
- Neural Network Simulator (SNNS) / Predicting handwritten digits
- nodes / The intuition for tree models, A little graph theory
- Non-negative matrix factorization (NMF, NNMF) / NNMF
- nondeterministic component / Learning from data
- nonparametric models / Parametric and nonparametric models
- norm / Least absolute shrinkage and selection operator (lasso)
- nucleotides / Predicting promoter gene sequences
- nucleus / The biological neuron
- null deviance / Model deviance
- null model / Significance tests for linear regression
- numerical representations
- of contextual similarities / Numerical representations of contextual similarities
O
- observation / Learning from data
- observations / Predicting the sentiment of movie reviews
- observed states / Predicting the sentiment of movie reviews
- odds ratio / Generalized linear models
- one versus all approach / Multiclass classification with support vector machines
- one versus one approach / Multiclass classification with support vector machines
- ordered factor / Predicting glass type
- ordinal logistic regression / Predicting glass type, Ordinal logistic regression
- wine quality, prediction example / Predicting wine quality
- out-of-bag observations (OOB) / Margins and out-of-bag observations
- outlier / Outliers
- output / Learning from data
- output layer / The logistic neuron, Radial basis function networks
P
- p-norms / Least absolute shrinkage and selection operator (lasso)
- p-value / Significance tests for linear regression
- parent / Bayesian networks
- part of speech tagging / Predicting the sentiment of movie reviews
- path / A little graph theory
- Perception Action Cycles (PACs) / Predicting complex skill learning
- perceptron convergence theorem / Linear separation
- perfect collinearity / Multicollinearity
- performance metrics
- about / Performance metrics
- regression models, accessing / Assessing regression models
- classification models, accessing / Assessing classification models
- pocket perceptron algorithm / The perceptron algorithm
- Poisson regression / Poisson regression
- polynomial kernel / Kernels and support vector machines
- polynomial regression / Polynomial regression
- population regression line / Simple linear regression
- Porter Stemmer / Predicting the sentiment of movie reviews
- post-pruning / Tree pruning
- posterior probability / Bayes' theorem
- pre-pruning / Tree pruning
- predictive analytics project
- beginning / Starting the project
- data definition / Data definition
- experience / Experience
- Excel, used to gauge data / Using Excel to gauge your data
- predictive modelling
- process / The process of predictive modeling
- model's objective, defining / Defining the model's objective
- data, collecting / Collecting the data
- model, picking / Picking a model
- data, pre-processing / Pre-processing the data
- dimensionality reduction / Feature engineering and dimensionality reduction
- feature engineering / Feature engineering and dimensionality reduction
- model, training / Training and assessing the model
- model, assessing / Training and assessing the model
- different models, repeating / Repeating with different models and final model selection
- final model selection / Repeating with different models and final model selection
- model, deploying / Deploying the model
- predictors / Learning from data
- principal component analysis (PCA) / Principal component analysis, Removing problematic features, Feature engineering and dimensionality reduction
- principal components / Feature engineering and dimensionality reduction, Principal component analysis
- prior distribution / The Dirichlet distribution
- prior probability / Bayes' theorem
- probabilistic graphical models / A little graph theory
- promoter sequences / Predicting promoter gene sequences
- proportional odds / Ordinal logistic regression
- pruning / Tree pruning
- pseudo R2 value / Model deviance
Q
- QSAR biodegradation dataset
- reference link / Predicting chemical biodegradation
- Quantile-Quantile plot (Q-Q plot) / Residual analysis
- quantiles / Residual analysis
R
- R2 statistic / Performance metrics for linear regression
- Radial basis function (RBF) / Radial basis function networks
- radial basis function kernel / Kernels and support vector machines
- radial basis function network / Radial basis function networks, Predicting chemical biodegradation
- radial kernel / Kernels and support vector machines
- random forest
- about / Random forests
- variables, importance / The importance of variables in random forests
- Extreme Gradient Boosting (XGBoost) / XGBoost
- random variable / Learning from data
- rating matrix / Rating matrix
- user similarity, measuring / Measuring user similarity
- real-time machine learning / Real-time and batch machine learning models
- real-time strategy (RTS) / Predicting complex skill learning
- Real AdaBoost / AdaBoost
- receiver operating characteristic (ROC) / Receiver operating characteristic curves
- recommendations
- prediction, example / Predicting recommendations for movies and jokes
- recommendation system / Rating matrix
- recommendation systems
- approaches / Other approaches to recommendation systems
- recurrent neural network / Multilayer perceptron networks
- recurrent neural networks (RNNs)
- about / Artificial neural networks, Recurrent neural networks
- reference link / Recurrent neural networks
- recursive partitioning / CART regression trees
- reducible error / Learning from data
- regression coefficients / Introduction to linear regression
- estimating / Estimating the regression coefficients
- regression models / Regression and classification models
- accessing / Assessing regression models
- regression model trees / Regression model trees
- regularization
- about / Regularization
- ridge regression / Ridge regression
- least absolute shrinkage and selection operator (lasso) / Least absolute shrinkage and selection operator (lasso)
- implementing, in R / Implementing regularization in R
- with lasso / Regularization with the lasso
- reinforcement learning / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- residual / Assumptions of linear regression
- residual plot / Residual analysis
- Residual Standard Error (RSE) / Performance metrics for linear regression
- Residual Sum of Squares (RSS) / Performance metrics for linear regression, Model deviance
- reusability / Getting started
- ridge regression / Ridge regression
- right child / The intuition for tree models
- right subtree / The intuition for tree models
- ROC Area Under the Curve (ROC AUC) / Receiver operating characteristic curves
- Root Mean Square Error (RMSE) / Assessing regression models, Evaluating individual predictions
- root node / The intuition for tree models
- Rosenblatt perceptron / The artificial neuron
S
- saturation / Predicting the energy efficiency of buildings
- scree plot / Feature engineering and dimensionality reduction
- selection bias / C5.0
- semi-supervised / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- sensitivity / Classification metrics
- sentiment analysis / Predicting the sentiment of movie reviews
- shrinkage methods / Regularization
- simple linear regression / Introduction to linear regression
- about / Simple linear regression
- regression coefficients, estimating / Estimating the regression coefficients
- singular value decomposition (SVD) / Singular value decomposition, Removing problematic features
- singular values / Singular value decomposition
- SkillCraft / Predicting complex skill learning
- SkillCraft1 Master Table dataset
- URL, for downloading / Predicting complex skill learning
- slack variables / Support vector classification
- soft margin / Support vector classification
- softmax function / Multinomial logistic regression
- soma / The biological neuron
- source / A little graph theory
- specificity / Classification metrics
- splines / Parametric and nonparametric models
- split information value / C5.0
- stable / Model stability
- standard error / Significance tests for linear regression
- starting probability vector / Predicting the sentiment of movie reviews
- steady state calculation / Predicting letter patterns in English words
- stemming / Predicting the sentiment of movie reviews
- step function / The artificial neuron
- stepwise regression / Feature selection
- stochastic / Learning from data
- stochastic gradient boosting / Predicting complex skill learning with boosting
- stochastic gradient descent / Stochastic gradient descent
- gradient descent / Gradient descent and local minima
- local minima / Gradient descent and local minima
- perceptron algorithm / The perceptron algorithm
- linear separation / Linear separation
- logistic neuron / The logistic neuron
- stop words / Predicting the sentiment of movie reviews
- stump / Boosting
- subject matter experts (SMEs) / Experience, An alternative to manual instruction
- sum of squared error (SSE) / Assessing regression models, CART regression trees, Performance metrics for linear regression
- supervised LDA model / LDA extensions
- supervised learning / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- support vector classification / Support vector classification, Kernels and support vector machines
- inner products / Inner products
- support vector machine / Kernels and support vector machines
- support vectors / Maximal margin classification
- surrogate splits / Missing data
- symmetric Dirichlet distribution / The Dirichlet distribution
- synaptic neurotransmitters / The biological neuron
T
- t-value / Significance tests for linear regression
- tail vertex / A little graph theory
- target / Learning from data, A little graph theory
- target function / Learning from data
- test set / Training and assessing the model
- tidying data / Getting started, Tidying data
- topic models / An overview of topic modeling
- topics
- modeling, of online news stories / Modeling the topics of online news stories
- model stability / Model stability
- finding / Finding the number of topics
- distributions / Topic distributions
- word distributions / Word distributions
- LDA extensions / LDA extensions
- modeling, of tweets / Modeling tweet topics
- word clouding / Word clouding
- Total Sum of Squares (TSS) / Performance metrics for linear regression
- training set / Training and assessing the model
- transition probability matrix / Predicting the sentiment of movie reviews
- tree models
- intuition / The intuition for tree models
- Trigram HMM / Predicting promoter gene sequences
- true positive rate (TPR) / Evaluating binary top-N recommendations
- True Sum of Squares (TSS) / Model deviance
U
- UCI Machine Learning Repository
- reference link / Predicting CPU performance, Predicting glass type
- undirected / A little graph theory
- unsupervised learning / Supervised, unsupervised, semi-supervised, and reinforcement learning models
- use cases
- about / Use cases
- word embedding / Word embedding
- word prediction / Word prediction
- word vectors / Word vectors
- numerical representations, of contextual similarities / Numerical representations of contextual similarities
- Netflix learns / Netflix learns
- user's neighborhood / User-based collaborative filtering
- user-based collaborative filtering / Collaborative filtering
V
- validation set / Training and assessing the model
- variable importance / Variable importance in tree models
- variable sport / Negative Binomial regression
- variance / Estimating the regression coefficients
- variance inflation factor (VIF) / Multicollinearity
- Variational Expectation Maximization (VEM) / Fitting an LDA model, Modeling the topics of online news stories
- Viterbi algorithm / Predicting the sentiment of movie reviews
W
- wavelet transform / Predicting the authenticity of banknotes
- weak learners / Boosting
- weight decay / Predicting glass type revisited
- Wine Quality Data Set
- reference link / Predicting wine quality
- Word2vec
- about / Implementations
- reference link / Implementations
- word cloud / Word distributions
- word embedding
- about / Word embedding
- Dimensionality Reduction / Word embedding
- Contextual Similarity / Word embedding
- word prediction / Word prediction
- word vectors
- about / Word vectors
- reference link / Word vectors
Z
- Z-score normalization / Feature transformations
- z-statistic / Assessing logistic regression models