Index
A
- adaptive boosting (AdaBoost) / Boosting
- agent / Comparison of RL with other ML algorithms
- agglomerative nesting (AGNES) algorithm
- about / Categories of clustering algorithms
- customer segmentation, identifying in wholesale customers data / Identifying the customer segments in the wholesale customers data using AGNES
- Akaike's Information Criterion (AIC) / Understanding Granger causality
- Alexa / ML versus software engineering
- Amazon reviews dataset
- reference / Getting started
- about / Understanding the Amazon reviews dataset
- using / Understanding the Amazon reviews dataset
- Apache Incubator
- reference / Introduction to the MXNet framework
- Apache MXNet
- using / Technical requirements
- Apriori algorithm
- about / The Apriori algorithm
- item / The Apriori algorithm
- itemset / The Apriori algorithm
- support count / The Apriori algorithm
- support / The Apriori algorithm
- frequent itemset / The Apriori algorithm
- itemset support, in context / The Apriori algorithm
- area under the curve (AUC) / Cross-validation and logistic regression, Classification tree, LASSO model
- area under the curve of the receiver operating characteristic (AUROC) / Class imbalance problem
- Artificial Intelligence (AI) / ML versus software engineering, Machine learning in credit card fraud detection
- artificial neural networks (ANNs)
- about / Introduction to neural networks, Deep learning, Achieving computer vision with deep learning, Autoencoders explained
- reference / Introduction to neural networks
- aspect based sentiment analysis / The sentiment analysis problem
- association-rule mining
- about / Unsupervised learning, Building a recommendation system based on an association-rule mining technique
- recommendation system, building with / Building a recommendation system based on an association-rule mining technique
- Apriori algorithm / The Apriori algorithm
- association analysis
- about / An overview of association analysis
- transactional data, creating / Creating transactional data
- data / Data understanding
- data, preparing / Data preparation
- modeling process / Modeling and evaluation
- evaluation process / Modeling and evaluation
- attrition
- attrition prediction model
- implementing, with random forests / Implementing an attrition prediction model with random forests
- implementing, with gradient boosting machine (GBM) / The GBM implementation
- building, with extreme gradient boosting (XGBoost) / Building attrition prediction model with XGBoost
- building, with stacking / Building attrition prediction model with stacking
- Augmented Dickey-Fuller (ADF) / Data exploration, Vector autoregression
- Autocorrelation Function (ACF) / Univariate time series analysis
- autoencoders (AEs)
- about / Deep learning, Autoencoders explained
- architecture / Autoencoders explained
- applications / Applications of AEs
- building, with H2O library in R / Building AEs with the H2O library in R
- code implementation, for credit card fraud detection / Autoencoder code implementation for credit card fraud detection
- autoencoders (AEs), based on hidden layers
- undercomplete autoencoders / Types of AEs based on hidden layers
- overcomplete autoencoders / Types of AEs based on hidden layers
- architecture / Types of AEs based on hidden layers
- autoencoders (AEs), based on restrictions
- about / Types of AEs based on restrictions
- plain vanilla autoencoders / Types of AEs based on restrictions
- sparse autoencoders / Types of AEs based on restrictions
- denoising autoencoders / Types of AEs based on restrictions
- convolutional autoencoders / Types of AEs based on restrictions
- stacked autoencoders / Types of AEs based on restrictions
- variational autoencoder (VAE) / Types of AEs based on restrictions
- automated prose generator
- building, with RNN / Building an automated prose generator with an RNN
- project, implementing / Implementing the project
- Automated Readability Index (ARI) / Additional quantitative analysis
- Autoregressive Integrated Moving Average (ARIMA) model / Univariate time series analysis
B
- backpropagation / Introduction to neural networks
- backpropagation through time (BPTT) / Comparison of feedforward neural networks and RNNs, Backpropagation through time
- bandit / The multi-arm bandit problem
- batch mode / Model deployment
- Bayesian optimization-based hyperparameter tuning / Hyperparameter tuning
- Bernoulli / The multi-arm bandit problem
- big data / Big data
- Boltzmann exploration / Boltzmann or softmax exploration
- boosting / Boosting
- bootstrap aggregation (bagging)
- about / Bagging, Random forest
- bagged classification and regression trees (treeBag), implementing / Bagged classification and regression trees (treeBag) implementation
- support vector machine bagging (SVMBag), implementing / Support vector machine bagging (SVMBag) implementation
- naive Bayes (nbBag) bagging, implementing / Naive Bayes (nbBag) bagging implementation
- bootstrapping with replacement / Bagging
- BoW approach
- text sentiment classifier, building with / Building a text sentiment classifier with the BoW approach
- pros / Pros and cons of the BoW approach
- cons / Pros and cons of the BoW approach
C
- Carbon Dioxide Information Analysis Center (CDIAC) / Time series data
- categorical variables
- exploring / Exploring categorical variables
- central processing unit (CPU) / Achieving computer vision with deep learning
- centroid / Identifying the customer segments in wholesale customer data using k-means clustering
- classification / Supervised learning
- classification and regression trees (CART) / Supervised learning, Bagging
- classification methods / Classification methods and linear regression
- classification model
- creating / Classifying text
- feature, creating with text2vec package / Data preparation
- feature, modeling with text2vec package / Data preparation
- creating, with LASSO / LASSO model
- classification tree
- about / Classification trees
- building, with simulated data / Classification tree
- class imbalance problem / Class imbalance problem, The credit card fraud dataset
- cluster analysis / Understanding customer segmentation
- clustering / Unsupervised learning, Understanding customer segmentation
- clustering algorithms
- categories / Categories of clustering algorithms
- flat or partitioning algorithms / Categories of clustering algorithms
- hierarchical algorithms / Categories of clustering algorithms
- clustering tendency / Understanding the wholesale customer dataset and the segmentation problem
- cluster quality / Working mechanics of the k-means algorithm
- Cohen's Kappa statistic / Cross-validation and logistic regression, KNN modeling
- collaborative filtering
- about / Collaborative filtering
- memory-based / Collaborative filtering
- model-based / Collaborative filtering
- computer vision
- about / Computer vision, Understanding computer vision
- applications / Understanding computer vision
- with deep learning / Achieving computer vision with deep learning
- implementing, with pretrained model / Implementing computer vision with pretrained models
- confusion matrix / Confusion matrix
- consolidated learning / Philosophy behind ensembling
- content-based filtering / Content-based filtering
- content-based recommendation engine
- about / Content-based recommendation engine
- versus item-based collaborative filtering (ITCF) / Differentiating between ITCF and content-based recommendations
- Continuous Bag of Words (CBOW) / Understanding word embedding
- convolutional autoencoders / Types of AEs based on restrictions
- Convolutional Neural Network (CNN) / Deep learning, Convolutional Neural Networks, Deep learning resources and advanced methods
- Convolutional Neural Network (CNN), layers
- about / Layers of CNNs
- convolution / Layers of CNNs
- Rectified Linear Unit (ReLU) / Layers of CNNs
- max pooling / Layers of CNNs
- fully-connected layer / Layers of CNNs
- softmax / Layers of CNNs
- sigmoid / Layers of CNNs
- Cook's D / Reviewing model assumptions
- correlation / Correlation and linearity
- cost function / Cost function
- credit card fraud dataset
- about / The credit card fraud dataset
- reference / The credit card fraud dataset
- characteristics / The credit card fraud dataset
- credit card fraud detection
- with machine learning / Machine learning in credit card fraud detection
- autoencoder code, implementation / Autoencoder code implementation for credit card fraud detection
- Cross-Correlation Function (CCF) / Data exploration
- cross-entropy / Introduction to neural networks
- cross-industry standard process for data mining (CRISP-DM) / ML project pipeline
- CRUTEM4 / Time series data
- customer segmentation
- about / Understanding customer segmentation
- objectives / Understanding customer segmentation
- problem / Understanding the wholesale customer dataset and the segmentation problem
- wholesale customer dataset / Understanding the wholesale customer dataset and the segmentation problem
- identifying, in wholesale customer data with k-means clustering / Identifying the customer segments in wholesale customer data using k-means clustering
- identifying, in wholesale customer data with DIANA / Identifying the customer segments in the wholesale customer data using DIANA
- identifying, in wholesale customers data with AGNES / Identifying the customer segments in the wholesale customers data using AGNES
- customer segmentation, data points
- demographics / Understanding customer segmentation
- psychographics / Understanding customer segmentation
- behavioral / Understanding customer segmentation
D
- data
- reading / Reading the data
- treating / Treating the data
- manipulating / Manipulating data
- creating / Dataset creation
- preparing / Data preparation, Data understanding and preparation
- about / Data understanding, Data understanding and preparation
- overview / Data overview
- data analytics / ML versus software engineering
- data cleansing
- techniques / Preparing the data
- data frame
- creating / Data frame creation
- data mining / ML versus software engineering
- data preprocessing / Data preprocessing
- data science / ML versus software engineering
- data science professional, skills / ML versus software engineering
- Hacking Skills / ML versus software engineering
- Substantive Expertise / ML versus software engineering
- dataset
- creating / Data creation
- about / Dataset background
- datasets / Datasets
- decayed epsilon greedy / Decayed epsilon greedy
- decoder / Autoencoders explained
- deep belief networks (DBNs) / Deep learning
- deep deterministic policy gradient (DDPG) / Reinforcement learning
- deep learning
- about / Deep learning – a not-so-deep overview, Deep learning, Achieving computer vision with deep learning, Comparison of RL with other ML algorithms
- resources / Deep learning resources and advanced methods
- methods / Deep learning resources and advanced methods
- used, for computer vision / Achieving computer vision with deep learning
- Convolutional Neural Network (CNN) / Convolutional Neural Networks
- deep learning model
- building / An example of deep learning
- data, loading / Loading the data
- functions, creating / Creating the model function
- learning / Model training
- deep learning network
- implementing, for handwritten digit recognition / Implementing a deep learning network for handwritten digit recognition
- dropout, implementing to avoid overfitting / Implementing dropout to avoid overfitting
- LeNet architecture, implementing with MXNet library / Implementing the LeNet architecture with the MXNet library
- deep Q network (DQN) / Reinforcement learning
- dense layer / Layers of CNNs
- descriptive statistics / Descriptive statistics
- dimensionality reduction / Dimensionality reduction
- Dirichlet distribution / Topic models
- distance calculations / Distance calculations
- DIvisive ANAlysis (DIANA)
- about / Categories of clustering algorithms, Identifying the customer segments in the wholesale customer data using DIANA
- customer segmentation, identifying in wholesale customer data / Identifying the customer segments in the wholesale customer data using DIANA
- divisive hierarchical clustering / Identifying the customer segments in the wholesale customer data using DIANA
- document-term matrix (DTM) / Text mining framework and methods, Building a text sentiment classifier with the BoW approach
- dropout / Deep learning – a not-so-deep overview, Underfitting and overfitting, Implementing a deep learning network for handwritten digit recognition
- Duan's Smearing Estimator / Modeling and evaluation – MARS
- duplicate observations
- handling / Handling duplicate observations
- dynamic topic modeling / Topic models
- dynamic treatment regime (DTR) / Reinforcement learning
E
- eigenvalue / An overview of the principal components
- eigenvectors / An overview of the principal components
- elastic net
- about / Elastic net
- modeling process / Elastic net
- evaluation process / Elastic net
- Elbow method / Categories of clustering algorithms
- encoder / Autoencoders explained
- ensemble / Philosophy behind ensembling
- ensembles
- about / Ensembles
- creating / Creating an ensemble
- ensembling
- philosophy / Philosophy behind ensembling
- environment / Comparison of RL with other ML algorithms
- epsilon-greedy algorithm / The epsilon-greedy algorithm
- equimax / Rotation
- equivalence class transformation (ECLAT) / Unsupervised learning, Building a recommendation system based on an association-rule mining technique
- Euclidean distance / K-nearest neighbors, Hierarchical clustering
- exploding gradients
- about / Problems and solutions to gradients in RNN, Exploding gradients
- problem, handling / Exploding gradients
- exploitation / The multi-arm bandit problem
- exploration
- about / The multi-arm bandit problem
- selecting, over exploitation / Solving the MABP with UCB and Thompson sampling algorithms
- exploratory data analysis (EDA) / ML versus software engineering, Preparing the data , Understanding the Jokes recommendation problem and the dataset, Understanding customer segmentation
- exponential smoothing / Univariate time series analysis
- extreme gradient boosting (XGBoost)
- about / Boosting, Gradient boosting
- reference / Gradient boosting
- working / Extreme gradient boosting – classification
- attrition prediction model, building / Building attrition prediction model with XGBoost
F
- F-Measure / Other quantitative analysis
- False Positive Rate (FPR) / Model comparison
- fastText
- text sentiment classifier, building / Building a text sentiment classifier with fastText
- feature engineering / Feature engineering
- feedforward network / Introduction to neural networks
- feedforward neural network
- versus recurrent neural networks (RNNs) / Comparison of feedforward neural networks and RNNs
- Final Prediction Error (FPE) / Understanding Granger causality
- frequent pattern growth (FPG) / Unsupervised learning, Building a recommendation system based on an association-rule mining technique
- fully-connected layer / Layers of CNNs
G
- gated recurrent units (GRUs) / Exploding gradients
- Generalized Cross Validation (GCV) / Modeling and evaluation – MARS, Multivariate adaptive regression splines
- Generalized Linear Model (GLM) / Cross-validation and logistic regression
- generalized weights / Modeling and evaluation
- generative adversarial networks (GANs) / Semi-supervised learning, Deep learning
- Gini index / Classification trees
- GloVe word embedding
- text sentiment classifier, building with / Building a text sentiment classifier with GloVe word embedding
- Google Allo / ML versus software engineering
- Google Home / ML versus software engineering
- Google Now / ML versus software engineering
- Gower dissimilarity coefficient / Gower and PAM, Gower
- Gradient boosting
- about / Gradient boosting
- reference / Gradient boosting
- gradient boosting machine (GBM) / Boosting
- gradient clipping / Exploding gradients
- gradients
- about / Problems and solutions to gradients in RNN
- exploding gradients / Problems and solutions to gradients in RNN
- vanishing gradients / Problems and solutions to gradients in RNN
- problems and solutions / Problems and solutions to gradients in RNN
- Granger causality / Understanding Granger causality
- graphical processing unit (GPU) / Achieving computer vision with deep learning, Implementing the project
- grid search / Hyperparameter tuning
- Gutenberg
H
- H2O library
- autoencoders, building in R / Building AEs with the H2O library in R
- h2o package
- reference / Building AEs with the H2O library in R
- HadCRUT4 / Time series data
- HadSST3 / Time series data
- handwritten digit recognition
- deep learning network, implementing / Implementing a deep learning network for handwritten digit recognition
- Hannan-Quinn Criterion (HQ) / Vector autoregression
- Hartigan-Wong algorithm / Identifying the customer segments in wholesale customer data using k-means clustering
- heavy-tailed / Modeling and evaluation – stepwise regression
- heteroscedasticity / Reviewing model assumptions
- hierarchical algorithms
- about / Categories of clustering algorithms
- divisive type / Categories of clustering algorithms
- agglomerative type / Categories of clustering algorithms
- hierarchical clustering
- about / Hierarchical clustering, Hierarchical clustering
- distance calculations / Distance calculations
- high-quality clusters / Working mechanics of the k-means algorithm
- holdout sample / Holdout sample
- Holt-Winters method / Univariate time series analysis
- hybrid filtering / Hybrid filtering
- hybrid recommendation engine
- combination strategies / Building a hybrid recommendation system for Jokes recommendations
- hybrid recommendation system
- building, for Jokes recommendation / Building a hybrid recommendation system for Jokes recommendations
- hyperparameter tuning / Hyperparameter tuning
I
- identity function / Types of AEs based on restrictions
- immediate reward / The multi-arm bandit problem
- independent variables / Predictor variables
- Information Value (IV) / Training a logistic regression algorithm, Weight of evidence and information value
- item-based collaborative filtering (ITCF)
- about / Building a recommendation system with an item-based collaborative filtering technique
- recommendation system, building with / Building a recommendation system with an item-based collaborative filtering technique
- item-based similarities, computing through distance measure / Building a recommendation system with an item-based collaborative filtering technique
- targeted item rating for specific user, predicting / Building a recommendation system with an item-based collaborative filtering technique
- top N items, recommending / Building a recommendation system with an item-based collaborative filtering technique
- versus content-based recommendation engine / Differentiating between ITCF and content-based recommendations
- itemset / The Apriori algorithm
J
- Jokes recommendation
- problem / Understanding the Jokes recommendation problem and the dataset
- dataset / Understanding the Jokes recommendation problem and the dataset
- DataFrame, converting / Converting the DataFrame
- DataFrame, dividing / Dividing the DataFrame
- hybrid recommendation system, building for / Building a hybrid recommendation system for Jokes recommendations
K
- k-fold cross validation / Underfitting and overfitting
- k-means clustering / K-means clustering, K-means clustering
- customer segmentation, identifying in wholesale customer data / Identifying the customer segments in wholesale customer data using k-means clustering
- execution / Working mechanics of the k-means algorithm
- k-nearest neighbors (KNN) / K-nearest neighbors
- k-value, determining
- with direct methods / Identifying the customer segments in wholesale customer data using k-means clustering
- with testing methods / Identifying the customer segments in wholesale customer data using k-means clustering
- Keras
- reference / Keras and TensorFlow background
- about / Keras and TensorFlow background
- kernel trick / Support vector machines
- KNN modeling / KNN modeling
L
- L1-norm / LASSO
- L2-norm / Ridge regression
- language models
- about / Understanding language models
- machine translation / Understanding language models
- spelling correction / Understanding language models
- LASSO
- about / LASSO
- modeling process / LASSO
- evaluation process / LASSO
- used, for creating classification model / LASSO model
- Latent Dirichlet Allocation (LDA) / Topic models
- latent space representation / Autoencoders explained
- lazy learning / K-nearest neighbors
- learning by coding / Learning paradigm
- learning paradigm / Learning paradigm
- LeNet architecture
- about / Implementing the LeNet architecture with the MXNet library
- implementing, with MXNet library / Implementing the LeNet architecture with the MXNet library
- lexicon method / The sentiment analysis problem
- light gradient boosting machine (LightGBM) / Boosting
- likelihood / Logistic regression
- Lincoln's word frequency / Lincoln's word frequency
- linear combination / An overview of the principal components
- linearity / Correlation and linearity
- linear regression / Classification methods and linear regression
- Lloyd and Forgy algorithm / Identifying the customer segments in wholesale customer data using k-means clustering
- locally-optimal solution / Working mechanics of the k-means algorithm
- logistic regression / Logistic regression
- logistic regression algorithm
- training / Training a logistic regression algorithm
- Weight Of Evidence (WOE) / Weight of evidence and information value
- Information Value (IV) / Weight of evidence and information value
- Information Value (IV), used for selecting feature / Feature selection
- cross-validation, used for building model / Cross-validation and logistic regression
- logits / Ridge regression
- long-term memory / Vanishing gradients
- long short-term memory (LSTM) / Deep learning resources and advanced methods, Exploding gradients, Vanishing gradients
- loss function / Gradient boosting
M
- machine learning (ML)
- about / ML versus software engineering, Understanding a regression tree
- versus software engineering / ML versus software engineering
- in credit card fraud detection / Machine learning in credit card fraud detection
- MacQueen algorithm / Identifying the customer segments in wholesale customer data using k-means clustering
- MapReduce / Big data
- margin / Support vector machines
- market-basket analysis / Building a recommendation system based on an association-rule mining technique
- Markov Assumption / Understanding language models
- Markov chain methods / Semi-supervised learning
- Markov Property / Understanding language models
- max pooling / Layers of CNNs
- mean squared error (MSE) / Creating the model function, Performance metrics, Autoencoders explained
- medoid / PAM
- Mining Association Rules and Frequent Itemsets / An overview of association analysis
- missing values
- handling / Handling missing values
- ML methods, types
- about / Types of ML methods
- supervised learning / Supervised learning
- unsupervised learning / Unsupervised learning
- semi-supervised learning / Semi-supervised learning
- reinforcement learning / Reinforcement learning
- transfer learning / Transfer learning
- ML project, pipeline
- about / ML project pipeline
- business understanding / Business understanding
- data, understanding / Understanding and sourcing the data
- data, sourcing / Understanding and sourcing the data
- data, preparing / Preparing the data
- model, building / Model building and evaluation
- model evaluation / Model building and evaluation
- model deployment / Model deployment
- ML terminology
- reviewing / ML terminology – a quick review
- deep learning / Deep learning
- big data / Big data
- natural language processing (NLP) / Natural language processing
- computer vision / Computer vision
- cost function / Cost function
- model accuracy / Model accuracy
- confusion matrix / Confusion matrix
- predictor variables / Predictor variables
- response variable / Response variable
- dimensionality reduction / Dimensionality reduction
- class imbalance problem / Class imbalance problem
- model variance / Model bias and variance
- model bias / Model bias and variance
- overfitting / Underfitting and overfitting
- underfitting / Underfitting and overfitting
- data preprocessing / Data preprocessing
- holdout sample / Holdout sample
- hyperparameter tuning / Hyperparameter tuning
- performance metrics / Performance metrics
- feature engineering / Feature engineering
- model, interpretability / Model interpretability
- MNIST dataset
- using / Understanding the MNIST dataset
- reference / Understanding the MNIST dataset
- model / Supervised learning
- model accuracy / Model accuracy
- model assumptions, univariate linear regression
- linearity / Reviewing model assumptions
- non-correlation of errors / Reviewing model assumptions
- homoscedasticity / Reviewing model assumptions
- no collinearity / Reviewing model assumptions
- presence of outliers / Reviewing model assumptions
- model bias / Model bias and variance
- models, text mining framework
- topic models / Topic models
- model variance / Model bias and variance
- modes, model deployment
- batch mode / Model deployment
- real-time mode / Model deployment
- monetary units (m.u.) / Understanding the wholesale customer dataset and the segmentation problem
- multi-arm bandit problem (MABP)
- about / The multi-arm bandit problem
- solving, with strategies / Strategies for solving MABP
- epsilon-greedy algorithm / The epsilon-greedy algorithm
- Boltzmann exploration / Boltzmann or softmax exploration
- decayed epsilon greedy / Decayed epsilon greedy
- upper confidence bound (UCB) algorithm / The upper confidence bound algorithm
- Thompson sampling / Thompson sampling
- real-world use cases / Multi-arm bandit – real-world use cases
- solving, with UCB and Thompson sampling algorithms / Solving the MABP with UCB and Thompson sampling algorithms
- multicollinearity / Dimensionality reduction, Understanding the attrition problem and the dataset
- multilayered neural networks / Achieving computer vision with deep learning
- Multivariate Adaptive Regression Splines (MARS)
- about / Multivariate linear regression
- training / Multivariate adaptive regression splines
- evolution / Multivariate adaptive regression splines
- multivariate linear regression
- about / Multivariate linear regression
- data, loading / Loading and preparing the data
- data, preparing / Loading and preparing the data
- stepwise regression, modeling process / Modeling and evaluation – stepwise regression
- stepwise regression, evaluation process / Modeling and evaluation – stepwise regression
- MARS, modeling process / Modeling and evaluation – MARS
- MARS, evaluation process / Modeling and evaluation – MARS
- reverse transformation, of natural log predictions / Reverse transformation of natural log predictions
- multivariate normality / An overview of the principal components
- MXNet framework / Introduction to the MXNet framework
- MXNet library
- LeNet architecture, implementing / Implementing the LeNet architecture with the MXNet library
N
- n-grams / N-grams
- Naive Bayes / Building a text sentiment classifier with the BoW approach
- naive Bayes (nbBag) bagging
- implementation / Naive Bayes (nbBag) bagging implementation
- Natick Soldier Research, Development and Engineering Center (NSRDEC) / Data
- natural language processing (NLP) / Transfer learning, Natural language processing
- natural log predictions
- reverse transformation / Reverse transformation of natural log predictions
- near-zero variance / Zero and near-zero variance features
- network architecture / Implementing a deep learning network for handwritten digit recognition
- neural networks
- about / Introduction to neural networks
- creating / Creating a simple neural network
- data / Data understanding and preparation
- data preparation / Data understanding and preparation
- model, building / Modeling and evaluation
- model, evaluation / Modeling and evaluation
- non-globally-optimal solution / Working mechanics of the k-means algorithm
- normal-gamma prior / Solving the MABP with UCB and Thompson sampling algorithms
- null function / Types of AEs based on restrictions
O
- object-oriented programming (OOP) / Transfer learning
- one-class classification algorithm / Machine learning in credit card fraud detection
- out-of-bag (oob) / Random forest
- outlier detection
- drawbacks / Machine learning in credit card fraud detection
- overcomplete autoencoders / Types of AEs based on hidden layers
- overfitting
- about / Underfitting and overfitting, Bagging, Implementing a deep learning network for handwritten digit recognition
- avoiding, with dropout / Implementing dropout to avoid overfitting
P
- PAM clustering algorithm
- about / Gower and PAM, PAM
- Gower dissimilarity coefficient / Gower and PAM
- random forest / Random forest and PAM
- Partial Autocorrelation Function (PACF) / Univariate time series analysis
- partial dependency plot (PDP) / Model interpretability
- PCA modeling
- about / PCA modeling
- component extraction / Component extraction
- orthogonal rotation / Orthogonal rotation and interpretation
- interpretation / Orthogonal rotation and interpretation
- scores, creating from components / Creating scores from the components
- regression, with MARS / Regression with MARS
- test data evaluation / Test data evaluation
- penalty / Understanding RL
- performance metrics / Performance metrics
- personalized content recommendation
- achieving / Fundamental aspects of recommendation engines
- Phillips-Perron (PP) / Vector autoregression
- polarity / Other quantitative analysis
- policy gradients method / Understanding RL
- predictor variables / Predictor variables
- pretrained models
- inception-V3 model / Transfer learning
- MobileNet / Transfer learning
- VGG Face / Transfer learning
- VGG16 / Transfer learning
- Google's Word2Vec model / Transfer learning
- Stanford's GloVe model / Transfer learning
- computer vision, implementing / Implementing computer vision with pretrained models
- reference / Implementing computer vision with pretrained models
- pretrained Word2vec word embedding
- text sentiment classifier, building with / Building a text sentiment classifier with pretrained word2vec word embedding based on Reuters news corpus
- principal component analysis (PCA)
- about / Dimensionality reduction, Gower and PAM, Identifying the customer segments in wholesale customer data using k-means clustering
- data, preparing / Data
- data, loading / Data loading and review
- data, reviewing / Data loading and review
- training datasets / Training and testing datasets
- testing datasets / Training and testing datasets
- principal components
- overview / An overview of the principal components
- rotation / Rotation
Q
- Quantile-Quantile (Q-Q) / Reviewing model assumptions
- quantitative analysis / Other quantitative analysis, Additional quantitative analysis
- quartimax / Rotation
R
- R
- autoencoders, building with H2O library / Building AEs with the H2O library in R
- radical / Text mining framework and methods
- random forest
- about / Random forest, Random forest
- used, for feature selection method / Random forest, Feature selection with random forests
- random forest model
- modeling process / Random forest model
- evaluation process / Random forest model
- random forests
- for randomization / Randomization with random forests
- attrition prediction model, implementing / Implementing an attrition prediction model with random forests
- real-time mode / Model deployment
- Receiver Operating Characteristic (ROC) chart
- used, for model comparison / Model comparison
- recommendation engine
- fundamental aspects / Fundamental aspects of recommendation engines
- recommendation engine, categories
- about / Recommendation engine categories
- content-based filtering / Content-based filtering
- collaborative filtering / Collaborative filtering
- hybrid filtering / Hybrid filtering
- recommendation system
- building, with item-based collaborative filtering (ITCF) / Building a recommendation system with an item-based collaborative filtering technique
- building, with user-based collaborative filtering (UBCF) / Building a recommendation system with a user-based collaborative filtering technique
- building, with association-rule mining / Building a recommendation system based on an association-rule mining technique
- Recommender function, parameter
- Rectified Linear Unit (ReLU) / Layers of CNNs
- recurrent neural networks (RNNs)
- about / Deep learning, Deep learning resources and advanced methods, Exploring recurrent neural networks
- exploring / Exploring recurrent neural networks
- versus feedforward neural network / Comparison of feedforward neural networks and RNNs
- gradients, problems and solutions / Problems and solutions to gradients in RNN
- automated prose generator, building / Building an automated prose generator with an RNN
- text, generating / Implementing the project
- Recursive Feature Elimination (RFE) / Modeling and evaluation
- recursive partitioning / Understanding a regression tree
- regression / Supervised learning
- regression tree / Understanding a regression tree
- regret / The multi-arm bandit problem
- regularization
- overview / Regularization overview
- about / Deep learning – a not-so-deep overview
- reinforcement learning (RL)
- about / Reinforcement learning, Understanding RL
- use cases / Reinforcement learning
- comparing, with other ML algorithms / Comparison of RL with other ML algorithms
- multi-arm bandit problem (MABP) / The multi-arm bandit problem
- reinforcement learning (RL), terminology
- agent / Terminology of RL
- environment / Terminology of RL
- state / Terminology of RL
- policy / Terminology of RL
- reward / Terminology of RL
- penalty / Terminology of RL
- action / Terminology of RL
- return / Terminology of RL
- about / Terminology of RL
- Residual Sum of Squares (RSS) / Univariate linear regression
- response variable / Response variable
- restricted Boltzmann machine
- about / Deep learning – a not-so-deep overview
- reference / Deep learning – a not-so-deep overview
- reward / Understanding RL
- ridge regression
- about / Ridge regression
- modeling process / Ridge regression
- evaluation process / Ridge regression
- root mean square error (RMSE) / Performance metrics, Modeling and evaluation – stepwise regression
S
- Schwarz-Bayes Criterion (SC) / Vector autoregression
- semi-supervised support vector machines (S3VMs) / Semi-supervised learning
- sentiment analysis / Sentiment analysis
- about / The sentiment analysis problem
- problem / The sentiment analysis problem
- serial correlation / Examining the causality
- Shapley additive explanations (SHAP) / Model interpretability
- short-term memory / Vanishing gradients
- shrinkage penalty / Regularization overview
- sigmoid / Layers of CNNs
- Silhouette index / Working mechanics of the k-means algorithm
- Siri / ML versus software engineering
- Skip-Gram / Understanding word embedding
- softmax / Layers of CNNs
- softmax exploration / Boltzmann or softmax exploration
- software engineering
- versus machine learning (ML) / ML versus software engineering
- sparse coding model
- about / Deep learning – a not-so-deep overview
- reference / Deep learning – a not-so-deep overview
- stacking
- about / Stacking
- attrition prediction model, building with / Building attrition prediction model with stacking
- state / Comparison of RL with other ML algorithms
- state-action-reward-state-action (SARSA) / Reinforcement learning
- stratified holdout sample / Holdout sample
- Subject Matter Experts (SMEs) / Overview
- sum of squared error (SSE) / Introduction to neural networks, Univariate time series analysis
- supervised learning / Supervised learning
- support vector / Support vector machines
- support vector machine (SVM) / Supervised learning
- support vector machine bagging (SVMBag)
- implementation / Support vector machine bagging (SVMBag) implementation
- support vector machines (SVMs)
- about / Support vector machines
- modeling process / Support vector machine
- evaluation process / Support vector machine
- synergy / Philosophy behind ensembling
- synthetic minority over-sampling technique (SMOTE) / Class imbalance problem
T
- TensorFlow / Keras and TensorFlow background
- term-document matrix (TDM) / Text mining framework and methods
- term frequency-inverse document frequency (TF-IDF) / Classifying text, Building a text sentiment classifier with the BoW approach
- test data / Supervised learning
- text2vec package
- used, for feature creation and modeling for a classification model / Data preparation
- text classification / Classifying text
- text mining framework
- about / Text mining framework and methods
- methods / Text mining framework and methods
- quantitative analysis / Other quantitative analysis
- text sentiment classifier
- building, with BoW approach / Building a text sentiment classifier with the BoW approach
- building, with pretrained Word2vec word embedding based on Reuters news corpus / Building a text sentiment classifier with pretrained word2vec word embedding based on Reuters news corpus
- building, with GloVe word embedding / Building a text sentiment classifier with GloVe word embedding
- building, with fastText / Building a text sentiment classifier with fastText
- Thompson sampling
- about / Thompson sampling
- reference / Thompson sampling
- MABP, solving with UCB / Solving the MABP with UCB and Thompson sampling algorithms
- tidyverse
- reference / Reading the data
- time series data
- about / Time series data
- data exploration / Data exploration
- topic models
- about / Topic models
- building / Topic models
- training / Supervised learning
- training data / Supervised learning
- transfer learning / Transfer learning, Implementing computer vision with pretrained models
- tree-based learning / Gradient boosting
- trials / The multi-arm bandit problem
- True Negative Rate (TNR) / Cross-validation and logistic regression
- True Positive Rate (TPR) / Cross-validation and logistic regression, Model comparison
- truncated backpropagation through time (TBPTT) / Exploding gradients
U
- UCI Machine Learning Repository
- undercomplete autoencoders / Types of AEs based on hidden layers
- underfitting / Underfitting and overfitting
- univariate linear regression
- about / Univariate linear regression
- model, building / Building a univariate model
- model assumptions, reviewing / Reviewing model assumptions
- univariate time series
- modeling process / Modeling and evaluation
- evaluation process / Modeling and evaluation
- forecasting / Univariate time series forecasting
- causality, examining / Examining the causality
- linear regression / Linear regression
- vector autoregression / Vector autoregression
- univariate time series analysis
- about / Univariate time series analysis
- Granger causality / Understanding Granger causality
- unsupervised learning / Unsupervised learning
- unsupervised learning models
- about / Modeling
- hierarchical clustering / Hierarchical clustering
- k-means clustering / K-means clustering
- upper confidence bound (UCB) algorithm
- about / The upper confidence bound algorithm
- MABP, solving with Thompson sampling / Solving the MABP with UCB and Thompson sampling algorithms
- user-based collaborative filtering (UBCF)
- about / Building a recommendation system with a user-based collaborative filtering technique
- recommendation system, building / Building a recommendation system with a user-based collaborative filtering technique
V
- valence shifters / Other quantitative analysis
- vanishing gradients
- about / Problems and solutions to gradients in RNN, Vanishing gradients
- problem, handling / Vanishing gradients
- Variance Inflation Factor (VIF) / Modeling and evaluation – stepwise regression
- variational autoencoder (VAE) / Types of AEs based on restrictions
- varimax / Rotation, Orthogonal rotation and interpretation
- vector autoregression (VAR) / Understanding Granger causality
- V Elbow method / Identifying the customer segments in wholesale customer data using k-means clustering
- virtual personal assistants (VPAs) / ML versus software engineering
- V Silhouette method / Identifying the customer segments in wholesale customer data using k-means clustering
W
- Weight of Evidence (WOE) / Training a logistic regression algorithm, Weight of evidence and information value
- wholesale customer dataset
- Word2vec
- word embedding / Understanding word embedding
- word frequency
- about / Word frequency
- in addresses / Word frequency in all addresses
X
- x-values / Predictor variables
Z
- zero variance / Zero and near-zero variance features