Index
A
- action-value / The policy and action-value
- activities of daily living (ADL) / Human activity recognition using the LSTM model
- ADAM
- used, for large-scale genomics data processing / ADAM for large-scale genomics data processing
- Akka / Scala Play web service
- Akka actors
- concurrency / Concurrency through Akka actors
- Allele Count (AC) / 1000 Genomes Projects dataset description
- Allele Frequency (AF) / 1000 Genomes Projects dataset description
- Allele Number (AN) / 1000 Genomes Projects dataset description
- alt-coins / Bitcoin, cryptocurrency, and online trading
- Alternating Least Squares (ALS) algorithm / Model-based collaborative filtering, Model-based recommendation with Spark
- anomaly detection / Outlier and anomaly detection
- Anorm / Scala Play web service
- Apache Zeppelin
- installing / Installing and getting started with Apache Zeppelin
- using / Installing and getting started with Apache Zeppelin
- URL / Installing and getting started with Apache Zeppelin, Creating notebooks
- sources, building / Building from the source
- starting / Starting and stopping Apache Zeppelin
- stopping / Starting and stopping Apache Zeppelin
- notebooks, creating / Creating notebooks
- area under the curve (AUC) / Step 10 - Model evaluation on the highly-imbalanced data
- Area Under the Precision-Recall Curve (AUPRC) / Problem description
- autoencoder
- using, for unsupervised learning / Autoencoders and unsupervised learning
- working principles / Working principles of an autoencoder
- data representation / Efficient data representation with autoencoders
B
- base pairs (bp) / 1000 Genomes Projects dataset description
- best fit line / LR for predicting insurance severity claims
- Bitcoin
- about / Bitcoin, cryptocurrency, and online trading
- state-of-the-art automated trading / State-of-the-art automated trading of Bitcoin
- training / Training
- prediction / Prediction
- black box / Assumptions and design choices
- Boltzmann machines / Efficient data representation with autoencoders
C
- churn analytics pipeline
- developing / Developing a churn analytics pipeline
- dataset, description / Description of the dataset
- exploratory analysis / Exploratory analysis and feature engineering
- feature engineering / Exploratory analysis and feature engineering
- churn prediction
- with LR / LR for churn prediction
- with SVM / SVM for churn prediction
- with DTs / DTs for churn prediction
- with RF / Random Forest for churn prediction
- client subscription assessment
- through telemarketing / Client subscription assessment through telemarketing
- dataset description / Dataset description
- Apache Zeppelin, installing / Installing and getting started with Apache Zeppelin
- Apache Zeppelin, using / Installing and getting started with Apache Zeppelin
- exploratory analysis, of dataset / Exploratory analysis of the dataset
- numeric features, statistics / Statistics of numeric features
- implementing / Implementing a client subscription assessment model
- hyperparameter tuning / Hyperparameter tuning and feature selection
- feature selection / Hyperparameter tuning and feature selection
- client subscription assessment, hyperparameter tuning
- hidden layers / Number of hidden layers
- number of neurons, per hidden layer / Number of neurons per hidden layer
- activation functions / Activation functions
- weight, initialization / Weight and bias initialization
- bias, initialization / Weight and bias initialization
- regularization / Regularization
- cluster
- cluster managers
- standalone / Spark-based model deployment for large-scale dataset
- Apache Mesos / Spark-based model deployment for large-scale dataset
- Hadoop YARN / Spark-based model deployment for large-scale dataset
- Kubernetes / Spark-based model deployment for large-scale dataset
- CNN architecture
- about / CNN architecture
- convolutional operations / Convolutional operations
- pooling layer / Pooling layer and padding operations
- padding operations / Pooling layer and padding operations
- operations, subsampling / Subsampling operations
- convolutional operations, in DL4j / Convolutional and subsampling operations in DL4j
- subsampling operations, in DL4j / Convolutional and subsampling operations in DL4j
- DL4j, configuring / Configuring DL4j, ND4s, and ND4j
- ND4s, configuring / Configuring DL4j, ND4s, and ND4j
- ND4j, configuring / Configuring DL4j, ND4s, and ND4j
- using, for image classification / Large-scale image classification using CNN
- CNN hyperparameters
- tuning / Tuning and optimizing CNN hyperparameters
- optimizing / Tuning and optimizing CNN hyperparameters
- dropout / Tuning and optimizing CNN hyperparameters
- clipping / Tuning and optimizing CNN hyperparameters
- sparsity / Tuning and optimizing CNN hyperparameters
- regularization / Tuning and optimizing CNN hyperparameters
- weight transforms / Tuning and optimizing CNN hyperparameters
- probability distribution manipulation / Tuning and optimizing CNN hyperparameters
- gradient normalization / Tuning and optimizing CNN hyperparameters
- collaborative filtering approaches
- about / Collaborative filtering approaches
- content-based filtering approaches / Content-based filtering approaches
- hybrid recommender systems / Hybrid recommender systems
- model-based collaborative filtering / Model-based collaborative filtering
- collaborative filtering approaches, problems
- cold start / Collaborative filtering approaches
- scalability / Collaborative filtering approaches
- sparsity / Collaborative filtering approaches
- comparative analysis
- Computational Intelligence and Data Mining (CIDM) / Description of the dataset and using linear models
- concurrency
- through Akka actors / Concurrency through Akka actors
- convolutional operations / Convolutional operations
- Coordinated Universal Time (UTC) / Data exploration
- cross-validation
- about / Hyperparameter tuning and cross-validation
- exhaustive cross-validation / Hyperparameter tuning and cross-validation
- non-exhaustive cross-validation / Hyperparameter tuning and cross-validation
- Cryptocompare API
- URL / Real-time data through the Cryptocompare API
- real-time data, collecting / Real-time data through the Cryptocompare API
- cryptocurrency / Bitcoin, cryptocurrency, and online trading
- curse of dimensionality / Efficient data representation with autoencoders
- customer attrition / Why do we perform churn analysis, and how do we do it?
- customer churn
D
- Databricks
- data denoising / Working principles of an autoencoder
- data pre-processing / Data pre-processing and feature engineering
- decision trees (DTs)
- about / SVM for churn prediction
- using, for churn prediction / DTs for churn prediction
- deep learning (DL) / Machine learning and learning workflow
- Deep Neural Networks (DNNs)
- about / Machine learning and learning workflow
- drawbacks / Image classification and drawbacks of DNNs
- demo prediction
- Scala play framework, using / Demo prediction using Scala Play framework
- dimensionality reduction / Working principles of an autoencoder
- DNNs
- using, for geographic ethnicity prediction / DNNs for geographic ethnicity prediction
- dynamic programming (DP) / A simple Q-learning implementation
E
- ethnicity prediction
- H2O, using / Using H2O for ethnicity prediction
- random forest, using / Using random forest for ethnicity prediction
- exchange / Bitcoin, cryptocurrency, and online trading
- expectation-maximization (EM) / How does LDA algorithm work?
- exploratory analysis / Exploratory analysis and feature engineering
- exploratory analysis of dataset, Apache Zeppelin
- about / Exploratory analysis of the dataset
- label distribution / Label distribution
- job distribution / Job distribution
- marital distribution / Marital distribution
- default distribution / Default distribution
- housing distribution / Housing distribution
- loan distribution / Loan distribution
- contact distribution / Contact distribution
- month distribution / Month distribution
- day distribution / Day distribution
- previous outcome distribution / Previous outcome distribution
- age feature / Age feature
- duration distribution / Duration distribution
- campaign distribution / Campaign distribution
- pdays distribution / Pdays distribution
- previous distribution / Previous distribution
- emp_var_rate distributions / emp_var_rate distributions
- cons_price_idx features / cons_price_idx features
- cons_conf_idx distribution / cons_conf_idx distribution
- euribor3m distribution / Euribor3m distribution
F
- feature engineering / Exploratory analysis and feature engineering, Data pre-processing and feature engineering
- feature maps / CNN architecture
- feature selection / Hyperparameter tuning and feature selection
- feature vectors / Developing a churn analytics pipeline
- feed-forward / DNNs for geographic ethnicity prediction
- FNR (false negative rate) / Predicting prices and evaluating the model
- FPR (false positive rate) / Predicting prices and evaluating the model
- fraud analytics model
- developing / Developing a fraud analytics model
- dataset, description / Description of the dataset and using linear models
- linear models, using / Description of the dataset and using linear models
- problem description / Problem description
- programming environment, preparing / Preparing programming environment
- packages, loading / Step 1 - Loading required packages and libraries
- libraries, loading / Step 1 - Loading required packages and libraries
- Spark session, creating / Step 2 - Creating a Spark session and importing implicits
- implicits, importing / Step 2 - Creating a Spark session and importing implicits
- input data, loading / Step 3 - Loading and parsing input data
- input data, parsing / Step 3 - Loading and parsing input data
- input data, exploratory analysis / Step 4 - Exploratory analysis of the input data
- H2O DataFrame, preparing / Step 5 - Preparing the H2O DataFrame
- unsupervised pre-training, with autoencoders / Step 6 - Unsupervised pre-training using autoencoder
- dimensionality reduction, with hidden layers / Step 7 - Dimensionality reduction with hidden layers
- anomaly detection / Step 8 - Anomaly detection
- pre-trained supervised model / Step 9 - Pre-trained supervised model
- model evaluation, on highly-imbalanced data / Step 10 - Model evaluation on the highly-imbalanced data
- Spark session, stopping / Step 11 - Stopping the Spark session and H2O context
- H2O context / Step 11 - Stopping the Spark session and H2O context
- auxiliary classes / Auxiliary classes and methods
- auxiliary methods / Auxiliary classes and methods
G
- 1000 Genomes Projects dataset
- description / 1000 Genomes Projects dataset description
- Gated Recurrent Unit (GRU) / Tuning LSTM hyperparameters and GRU
- GBT regressor
- used, for predicting insurance severity claims / GBT regressor for predicting insurance severity claims
- Generalized linear models (GLM) / H2O and Sparkling water
- geographic ethnicity / Population scale clustering and geographic ethnicity
- geographic ethnicity prediction
- DNNs, using / DNNs for geographic ethnicity prediction
- Gradient boosting machine (GBM) / H2O and Sparkling water
H
- H2 / Scala Play web service
- H2O
- about / H2O and Sparkling water
- using, for ethnicity prediction / Using H2O for ethnicity prediction
- Hadoop Distributed File System (HDFS) / Spark-based model deployment for large-scale dataset
- hailstone sequence / Efficient data representation with autoencoders
- hidden layers / DNNs for geographic ethnicity prediction
- Hierarchical Dirichlet Process (HDP) algorithms / Other topic models versus the scalability of LDA
- high-level data pipeline
- of prototype / High-level data pipeline of the prototype
- HistoMinute
- historical data collection
- about / Historical data collection
- URL / Historical data collection
- transforming, into time series / Transformation of historical data into a time series
- assumptions / Assumptions and design choices
- design / Assumptions and design choices
- data preprocessing / Data preprocessing
- Human Activity Recognition (HAR)
- LSTM model, using / Human activity recognition using the LSTM model
- dataset description / Dataset description
- MXNet, setting for Scala / Setting and configuring MXNet for Scala
- MXNet, configuring for Scala / Setting and configuring MXNet for Scala
- LSTM model, implementing / Implementing an LSTM model for HAR
- hyperparameters / Developing insurance severity claims predictive model using LR
- hyperparameter tuning / Hyperparameter tuning and cross-validation, Model training and hyperparameter tuning, Hyperparameter tuning and feature selection
I
- image classification
- about / Image classification and drawbacks of DNNs
- CNN, using / Large-scale image classification using CNN
- problem description / Problem description
- dataset, description / Description of the image dataset
- workflow / Workflow of the overall project
- CNNs, implementing / Implementing CNNs for image classification
- image processing / Image processing
- image metadata, extracting / Extracting image metadata
- image feature, extraction / Image feature extraction
- ND4j dataset, preparing / Preparing the ND4j dataset
- CNNs, training / Training the CNNs and saving the trained models
- trained models, saving / Training the CNNs and saving the trained models
- model, evaluating / Evaluating the model
- main() method, executing / Wrapping up by executing the main() method
- IMDb database
- URL / Data exploration
- insurance severity claims
- analyzing / Analyzing and predicting insurance severity claims
- predicting / Analyzing and predicting insurance severity claims
- motivation / Motivation
- dataset, description / Description of the dataset
- exploratory analysis, of dataset / Exploratory analysis of the dataset
- data preprocessing / Data preprocessing
- predicting, with LR / LR for predicting insurance severity claims, Developing insurance severity claims predictive model using LR
- predicting, with GBT regressor / GBT regressor for predicting insurance severity claims
- performance, boosting with RF regressor / Boosting the performance using random forest regressor
- comparative analysis, in production / Comparative analysis and model deployment
- model deployment, in production / Comparative analysis and model deployment
- Spark-based model deployment, for large-scale dataset / Spark-based model deployment for large-scale dataset
- Inter-Quartile Range (IQR) / Outlier and anomaly detection
- Transactions on Interactive Intelligent Systems (TiiS) / Item-based collaborative filtering for movie similarity
- item-based collaborative filtering, for movie similarity
- libraries, importing / Step 1 - Importing necessary libraries and creating a Spark session
- Spark session, creating / Step 1 - Importing necessary libraries and creating a Spark session
- dataset, reading / Step 2 - Reading and parsing the dataset
- dataset, parsing / Step 2 - Reading and parsing the dataset
- similarity, computing / Step 3 - Computing similarity
- model, testing / Step 4 - Testing the model
K
- K-means
- working / How does K-means work?
- Kaggle / Description of the image dataset
- KYC (Know Your Customer) / Bitcoin, cryptocurrency, and online trading
L
- Latent Dirichlet Allocation (LDA)
- working / How does LDA algorithm work?
- latent factors (LFs) / Model-based collaborative filtering
- likelihood measurement
- linear regression (LR)
- used, for predicting insurance severity claims / LR for predicting insurance severity claims, Developing insurance severity claims predictive model using LR
- used, for predicting severity claims / Developing insurance severity claims predictive model using LR
- using, for churn prediction / LR for churn prediction
- used, for churn prediction / LR for churn prediction
- linear threshold units (LTUs) / DNNs for geographic ethnicity prediction
- live-price data collection
- through Cryptocompare API / Real-time data through the Cryptocompare API
- logistic regression (LR) / Exploratory analysis and feature engineering
- Long Short-Term Memory cells (LSTMs)
- implementing, for HAR / Implementing an LSTM model for HAR
- LSTM hyperparameters
- tuning / Tuning LSTM hyperparameters and GRU
- LSTM model, implementing for HAR
- packages, importing / Step 1 - Importing necessary libraries and packages
- libraries, importing / Step 1 - Importing necessary libraries and packages
- MXNet context, creating / Step 2 - Creating MXNet context
- test set, parsing / Step 3 - Loading and parsing the training and test set
- test set, loading / Step 3 - Loading and parsing the training and test set
- training set, loading / Step 3 - Loading and parsing the training and test set
- training set, parsing / Step 3 - Loading and parsing the training and test set
- exploratory analysis, of dataset / Step 4 - Exploratory analysis of the dataset
- internal RNN structure, defining / Step 5 - Defining internal RNN structure and LSTM hyperparameters
- LSTM hyperparameters / Step 5 - Defining internal RNN structure and LSTM hyperparameters
- LSTM network construction / Step 6 - LSTM network construction
- optimizer, setting up / Step 7 - Setting up an optimizer
- LSTM network, training / Step 8 - Training the LSTM network
- model, evaluating / Step 9 - Evaluating the model
- LSTM networks / LSTM networks
M
- machine learning (ML)
- about / Machine learning and learning workflow, State-of-the-art automated trading of Bitcoin
- workflow / Typical machine learning workflow
- for genetic variants / Machine learning for genetic variants
- mean square error (MSE) / Outlier and anomaly detection
- ML-based ALS models
- model
- model-based recommendation
- with Spark / Model-based recommendation with Spark
- data exploration / Data exploration
- movie recommendation, with ALS / Movie recommendation using ALS
- packages, importing / Step 1 - Import packages, load, parse, and explore the movie and rating dataset
- movie, parsing / Step 1 - Import packages, load, parse, and explore the movie and rating dataset
- movie, loading / Step 1 - Import packages, load, parse, and explore the movie and rating dataset
- movie, exploring / Step 1 - Import packages, load, parse, and explore the movie and rating dataset
- rating dataset / Step 1 - Import packages, load, parse, and explore the movie and rating dataset
- DataFrames, registering as temp tables / Step 2 - Register both DataFrames as temp tables to make querying easier
- statistics, exploring / Step 3 - Explore and query for related statistics
- statistics, querying / Step 3 - Explore and query for related statistics
- training data, preparing / Step 4 - Prepare training and test rating data and check the counts
- test rating data, preparing / Step 4 - Prepare training and test rating data and check the counts
- counts, checking / Step 4 - Prepare training and test rating data and check the counts
- data preparation, for building recommendation model with ALS / Step 5 - Prepare the data for building the recommendation model using ALS
- ALS user product matrix, building / Step 6 - Build an ALS user product matrix
- predictions, creating / Step 7 - Making predictions
- model, evaluating / Step 8 - Evaluating the model
- model deployment
- in production / Comparative analysis and model deployment, Spark-based model deployment for large-scale dataset
- about / Selecting the best model for deployment
- model training
- using, for prediction / Model training for prediction
- about / Model training and hyperparameter tuning
- movie database
- URL / Data exploration
- MovieLens 100k rating dataset
- Multilayer Perceptron (MLP) / DNNs for geographic ethnicity prediction, Working principles of an autoencoder
- MXNet
- setting, for Scala / Setting and configuring MXNet for Scala
- configuring, for Scala / Setting and configuring MXNet for Scala
N
- Next-generation genome sequencing (NGS) / Population scale clustering and geographic ethnicity
O
- online trading / Bitcoin, cryptocurrency, and online trading
- open-high-low-close (OHLC) / Real-time data through the Cryptocompare API
- optimal clusters
- quantity, determining / Determining the number of optimal clusters
- options trading web app
- developing, with Q-learning / Developing an options trading web app using Q-learning
- problem description / Problem description
- implementing / Implementing an options trading web application, Putting it all together
- option property, creating / Creating an option property
- option model, creating / Creating an option model
- model, evaluating / Evaluating the model
- wrapping up, as Scala web app / Wrapping up the options trading app as a Scala web app
- backend / The backend
- frontend / The frontend
- instructions, executing / Running and Deployment Instructions
- instructions, deploying / Running and Deployment Instructions
- model deployment / Model deployment
- outlier detection / Outlier and anomaly detection
- output layer / DNNs for geographic ethnicity prediction
P
- Pachinko Allocation Model (PAM) / Other topic models versus the scalability of LDA
- padding operations / Pooling layer and padding operations
- parameters / Developing insurance severity claims predictive model using LR
- partial differential equations (PDE) / Problem description
- persistence of memory / Contextual information and the architecture of RNNs
- Personal Genome Project (PGP) / Machine learning for genetic variants
- POJO (Plain Old Java Object) / SchedulerActor
- policy / Policy
- policy gradients / Policy
- pooling layers / Pooling layer and padding operations
- population-scale clustering
- about / Population scale clustering and geographic ethnicity
- Spark-based K-means, using / Spark-based K-means for population-scale clustering
- prices
- predicting / Predicting prices and evaluating the model
- Principal Component Analysis (PCA) / Efficient data representation with autoencoders
- Probabilistic Latent Semantic Analysis (pLSA) / Other topic models versus the scalability of LDA
- programming environment
- configuring / Configuring programming environment
- prototype
- high-level data pipeline / High-level data pipeline of the prototype
Q
- Q-learning
- implementation / A simple Q-learning implementation
- components / Components of the Q-learning algorithm
- Q-learning class / Components of the Q-learning algorithm
- QLConfig / Components of the Q-learning algorithm
- QLAction / Components of the Q-learning algorithm
- QLPolicy / Components of the Q-learning algorithm
- QLSpace / Components of the Q-learning algorithm
- QLState / Components of the Q-learning algorithm
- QLIndexedState / Components of the Q-learning algorithm
- QLModel / Components of the Q-learning algorithm
- states / States and actions in QLearning
- actions / States and actions in QLearning
- search space / The search space
- policy / The policy and action-value
- action-value / The policy and action-value
- model, creating / QLearning model creation and training
- model, training / QLearning model creation and training
- model, validating / QLearning model validation
- prediction, creating with trained model / Making predictions using the trained model
- used, for developing options trading web app / Developing an options trading web app using Q-learning
R
- random forest (RF)
- using, for churn prediction / Random Forest for churn prediction
- using, for ethnicity prediction / Using random forest for ethnicity prediction
- random policy / Policy
- receiver operating characteristic (ROC) / LR for churn prediction
- receptive field / CNN architecture
- recommendation system
- about / Recommendation system
- collaborative filtering approaches / Collaborative filtering approaches
- utility matrix / The utility matrix
- rectified linear unit (ReLU) / CNN architecture
- recurrent neural network (RNN)
- working with / Working with RNNs
- contextual information / Contextual information and the architecture of RNNs
- architecture / Contextual information and the architecture of RNNs
- long-term dependency problem / RNN and the long-term dependency problem
- LSTM networks / LSTM networks
- region of interest (ROI) / Image processing
- regression error
- about / LR for predicting insurance severity claims
- Mean Squared Error (MSE) / LR for predicting insurance severity claims
- Root Mean Squared Error (RMSE) / LR for predicting insurance severity claims
- R-squared / LR for predicting insurance severity claims
- Mean Absolute Error (MAE) / LR for predicting insurance severity claims
- explained variance / LR for predicting insurance severity claims
- reinforcement learning (RL)
- versus supervised learning / Reinforcement versus supervised and unsupervised learning
- versus unsupervised learning / Reinforcement versus supervised and unsupervised learning
- using / Using RL
- notation / Notation, policy, and utility in RL
- policy / Notation, policy, and utility in RL, Policy
- utility / Notation, policy, and utility in RL, Utility
- environment / Notation, policy, and utility in RL
- agent / Notation, policy, and utility in RL
- state / Notation, policy, and utility in RL
- goal / Notation, policy, and utility in RL
- action / Notation, policy, and utility in RL
- reward / Notation, policy, and utility in RL
- episode / Notation, policy, and utility in RL
- RESTful architecture / Why RESTful architecture?
- RF regressor
- used, for boosting performance / Boosting the performance using random forest regressor
- used, for classification / Random Forest for classification and regression
- used, for regression / Random Forest for classification and regression
- Root Mean Squared Error (RMSE) / Step 8 - Evaluating the model
- rotation estimation / Hyperparameter tuning and cross-validation
S
- Scala
- MXNet, setting / Setting and configuring MXNet for Scala
- MXNet, configuring / Setting and configuring MXNet for Scala
- Scala play framework
- using, for demo prediction / Demo prediction using Scala Play framework
- RESTful architecture / Why RESTful architecture?
- project structure / Project structure
- web app, executing / Running the Scala Play web app
- Scala web service
- about / Scala Play web service
- concurrency, through Akka actors / Concurrency through Akka actors
- web service workflow / Web service workflow
- single nucleotide polymorphisms (SNPs) / Population scale clustering and geographic ethnicity, 1000 Genomes Projects dataset description
- Singular Value Decomposition (SVD) / Hybrid recommender systems
- Spark-based K-means
- used, for population-scale clustering / Spark-based K-means for population-scale clustering
- Spark-based movie recommendation systems
- about / Spark-based movie recommendation systems
- Item-based collaborative filtering, for movie similarity / Item-based collaborative filtering for movie similarity
- model-based recommendation, with Spark / Model-based recommendation with Spark
- Sparkling water / H2O and Sparkling water
- Spark ML / Scala Play web service
- Spark MLlib
- using, in TM / Topic modeling with Spark MLlib and Stanford NLP
- Spark ML pipelines
- DataFrame / Developing insurance severity claims predictive model using LR
- transformer / Developing insurance severity claims predictive model using LR
- estimator / Developing insurance severity claims predictive model using LR
- pipeline / Developing insurance severity claims predictive model using LR
- parameter / Developing insurance severity claims predictive model using LR
- Stochastic Gradient Descent (SGD) / Tuning and optimizing CNN hyperparameters
- Support Vector Machines (SVMs)
- used, for churn prediction / SVM for churn prediction
T
- telemarketing
- client subscription assessment / Client subscription assessment through telemarketing
- text clustering / Topic modeling and text clustering
- The Cancer Genome Atlas (TCGA) / Machine learning for genetic variants
- TNR (true negative rate) / Predicting prices and evaluating the model
- topic modeling (TM)
- about / Topic modeling and text clustering, Step 7 - Topic modelling
- LDA, working / How does LDA algorithm work?
- with Spark MLlib / Topic modeling with Spark MLlib and Stanford NLP
- Spark session, creating / Step 1 - Creating a Spark session
- vocabulary, creating to train LDA after text pre-processing / Step 2 - Creating vocabulary and tokens count to train the LDA after text pre-processing
- tokens count, creating / Step 2 - Creating vocabulary and tokens count to train the LDA after text pre-processing
- LDA model, instantiating / Step 4 - Set the NLP optimizer
- NLP optimizer, setting / Step 4 - Set the NLP optimizer
- LDA model, training / Step 5 - Training the LDA model
- topics of interest, preparing / Step 6 - Prepare the topics of interest
- likelihood of documents, measuring / Step 8 - Measuring the likelihood of two documents
- models, versus scalability of LDA / Other topic models versus the scalability of LDA
- TPR (true positive rate) / Predicting prices and evaluating the model
- trials / Notation, policy, and utility in RL
- trained LDA model
- deploying / Deploying the trained LDA model
U
- unsupervised machine learning
- about / Unsupervised machine learning
- population genomics / Population genomics and clustering
- clustering / Population genomics and clustering
- autoencoders, using / Autoencoders and unsupervised learning
- utility / Utility
- utility function / Utility
- utility matrix / The utility matrix
V
- validation dataset / Typical machine learning workflow
- Variant Call Format (VCF) / 1000 Genomes Projects dataset description
W
- web service workflow
- about / Web service workflow
- JobModule / JobModule
- scheduler / Scheduler
- SchedulerActor / SchedulerActor
- PredictionActor / PredictionActor and the prediction step
- prediction / PredictionActor and the prediction step
- TraderActor / TraderActor
- within-cluster sum of squares (WCSS) / How does K-means work?
- Within-Set Sum of Squared Errors (WSSSE) / Spark-based K-means for population-scale clustering
Y
- Yelp