Index
A
- A/B tests
- URL / Importance of evaluation
- activation function
- about / Perceptron
- activity recognition
- about / Introducing activity recognition
- mobile phone sensors / Mobile phone sensors
- activity-recognition pipeline / Activity recognition pipeline
- plan / The plan
- AdaBoost M1 method
- advanced modelling
- with ensembles / Advanced modeling with ensembles
- ensembleLibrary package, using / Before we start
- data, pre-processing / Data pre-processing
- attribute selection / Attribute selection
- model selection / Model selection
- performance, evaluation / Performance evaluation
- affinity analysis
- about / Affinity analysis
- cross-industry applications / Other applications in various areas
- agglomerative clustering
- about / Clustering
- Amazon Machine Learning / Machine learning as a service
- analysis types
- about / Analysis types
- pattern analysis / Pattern analysis
- transaction analysis / Transaction analysis
- Android Device Monitor
- about / Collecting training data
- Android Studio
- installing / Installing Android Studio
- URL / Installing Android Studio
- anomalous behaviour detection
- about / Suspicious and anomalous behavior detection
- unknown-unknowns / Unknown-unknowns
- anomalous pattern detection
- about / Anomalous pattern detection
- analysis types / Analysis types
- plan recognition / Plan recognition
- anomaly detection, in time series data
- about / Anomaly detection in time series data
- histogram-based anomaly detection / Histogram-based anomaly detection
- data, loading / Loading the data
- histograms, creating / Creating histograms
- density based k-nearest neighbours / Density based k-nearest neighbors
- anomaly detection, in website traffic
- about / Anomaly detection in website traffic
- dataset, using / Dataset
- Apache Mahout
- about / Apache Mahout
- configuring / Getting Apache Mahout
- configuring, in Eclipse with Maven plugin / Configuring Mahout in Eclipse with the Maven plugin
- Apache Spark
- about / Apache Spark
- URL / Apache Spark
- Application Portfolio Management (APM)
- about / IT Operations Analytics
- Applied Machine Learning
- workflow / Applied machine learning workflow
- Apriori
- about / Weka
- Apriori algorithm
- about / Apriori algorithm
- used, for discovering shopping patterns / Apriori
- artificial neural networks
- about / Artificial neural networks
- association rule learning
- about / Association rule learning
- Apriori algorithm / Apriori algorithm
- FP-growth algorithm / FP-growth algorithm
- association rule learning, basic concepts
- database, of transactions / Database of transactions
- itemset / Itemset and rule
- rule / Itemset and rule
- support / Support
- confidence / Confidence
- autoencoder
- about / Autoencoder
B
- bag-of-words (BoW)
- about / Working with text data
- basic modelling
- about / Basic modeling
- models, evaluating / Evaluating models
- naive Bayes baseline, implementing / Implementing naive Bayes baseline
- basic naive Bayes classifier baseline
- about / Basic naive Bayes classifier baseline
- data, obtaining / Getting the data
- data, loading / Loading the data
- BBC dataset
- URL / BBC dataset
- big data
- dealing with / Dealing with big data
- volume / Dealing with big data
- velocity / Dealing with big data
- variety / Dealing with big data
- big data application
- architecture / Big data application architecture
- BigML / Machine learning as a service
- Book-Crossing dataset
- URL / Book ratings dataset
- BX-Users file / Book ratings dataset
- BX-Books file / Book ratings dataset
- BX-Book-Ratings file / Book ratings dataset
- book-recommendation engine
- building / Building a recommendation engine
- book ratings dataset, using / Book ratings dataset
- data, loading / Loading the data
- data, loading from file / Loading data from file
- data, loading from database / Loading data from database
- in-memory database, creating / In-memory database
- collaborative filtering, implementing / Collaborative filtering
- custom rules, adding / Adding custom rules to recommendations
- evaluation / Evaluation
- online learning engine / Online learning engine
- content-based filtering, implementing / Content-based filtering
C
- Canova library
- URL / Loading the data
- Cassandra
- cc.mallet.pipe package
- Input2CharSequence pipeline / Pre-processing text data
- CharSequenceRemoveHTML pipeline / Pre-processing text data
- MakeAmpersandXMLFriendly pipeline / Pre-processing text data
- TokenSequenceLowercase pipeline / Pre-processing text data
- TokenSequence2FeatureSequence pipeline / Pre-processing text data
- TokenSequenceNGrams pipeline / Pre-processing text data
- Chebyshev distance
- about / Java machine learning
- classification
- about / Classification, Classification
- decision trees learning / Decision tree learning
- probabilistic classifiers / Probabilistic classifiers
- kernel methods / Kernel methods
- artificial neural networks / Artificial neural networks
- ensemble learning / Ensemble learning
- evaluating / Evaluating classification
- precision / Precision and recall
- recall / Precision and recall
- ROC curves / ROC curves
- data, using / Data
- data, loading / Loading data
- feature selection / Feature selection
- learning algorithms, selecting / Learning algorithms
- data, classifying / Classify new data
- evaluation / Evaluation and prediction error metrics
- prediction error metrics / Evaluation and prediction error metrics
- confusion matrix, examining / Confusion matrix
- algorithm, selecting / Choosing a classification algorithm
- classification algorithms
- weka.classifiers.rules.ZeroR / Choosing a classification algorithm
- weka.classifiers.trees.RandomTree / Choosing a classification algorithm
- weka.classifiers.trees.RandomForest / Choosing a classification algorithm
- weka.classifiers.lazy.IBk / Choosing a classification algorithm
- weka.classifiers.functions.MultilayerPerceptron / Choosing a classification algorithm
- weka.classifiers.bayes.NaiveBayes / Choosing a classification algorithm
- weka.classifiers.meta.AdaBoostM1 / Choosing a classification algorithm
- weka.classifiers.meta.Bagging / Choosing a classification algorithm
- classifier
- building / Building a classifier
- spurious transitions, reducing / Reducing spurious transitions
- plugging, into mobile app / Plugging the classifier into a mobile app
- class implementation
- reference link / Loading the data
- class unbalance / Class unbalance
- clustering
- about / Clustering, Clustering
- algorithms / Clustering algorithms
- evaluation / Evaluation
- clustering algorithms
- implementing / Clustering algorithms
- collaborative filtering
- about / Collaborative filtering
- implementing, with book-recommendation engine / Collaborative filtering
- user-based / User-based filtering
- item-based / Item-based filtering
- Comma Separated Value (CSV)
- about / Loading the data
- competitions
- about / Competitions
- conjugate gradient optimization algorithm
- building / Building a single-layer regression model
- content-based filtering
- about / Content-based filtering
- implementing, with book-recommendation engine / Content-based filtering
- Contrastive Divergence algorithm
- about / Restricted Boltzmann machine
- Convolutional Neural Network (CNN)
- about / Deep convolutional networks
- Core Motion framework, iOS
- URL / Mobile phone sensors
- correlation coefficient
- about / Correlation coefficient
- cosine distance
- about / Content-based filtering
- cost function
- about / Supervised learning
- Coursera
- URL / Online courses
- cross-industry applications, of affinity analysis
- about / Other applications in various areas
- medical diagnosis / Medical diagnosis
- protein sequences / Protein sequences
- census data / Census data
- customer relationship management (CRM) / Customer relationship management
- IT Operations Analytics / IT Operations Analytics
- cross-validation
- about / Cross-validation
- Cross Industry Standard Process for Data Mining (CRISP-DM)
- about / CRISP-DM
- CrowdANALYTIX
- URL / Competitions
- CSVLoader class
- URL / Loading the data
- curse of dimensionality
- about / The curse of dimensionality
- customer relationship database
- about / Customer relationship database
- challenge / Challenge
- dataset / Dataset
- evaluation / Evaluation
D
- data
- about / Data and problem definition
- Data and problem definition
- measurement scales / Measurement scales
- data and problem definition
- about / Data and problem definition
- data cleaning
- about / Data cleaning
- data collection
- about / Data collection
- data, observing / Find or observe data
- data, searching / Find or observe data
- data, generating / Generate data
- traps, sampling / Sampling traps
- from mobile phone / Collecting data from a mobile phone
- Android Studio, installing / Installing Android Studio
- data collector, loading / Loading the data collector
- training data, collecting / Collecting training data
- data collector
- loading / Loading the data collector
- URL / Loading the data collector
- feature extraction / Feature extraction
- Data Mining
- URL / Websites and blogs
- Data Mining Research
- URL / Websites and blogs
- data pre-processing
- about / Data pre-processing
- data cleaning / Data cleaning
- missing values, filling / Fill missing values
- outliers, removing / Remove outliers
- data transformation / Data transformation
- data reduction / Data reduction
- data reduction
- about / Data reduction
- data science
- Data Science Central
- URL / Websites and blogs
- Data Science CS109 (Harvard) by John A. Paulson
- URL / Online courses
- data scientist
- dataset rebalancing
- about / Dataset rebalancing
- datasets
- about / Datasets
- data transformation
- about / Data transformation
- Decision and Predictive Analytics (ADAPA) / Predictive Model Markup Language
- decision trees
- about / Underfitting and overfitting
- decision trees learning
- about / Decision tree learning
- deep belief network
- building / Building a deep belief network
- deep belief networks
- about / Artificial neural networks
- Deep Belief Networks (DBNs)
- about / Restricted Boltzmann machine
- deep convolutional networks
- about / Deep convolutional networks, MNIST dataset
- Deeplearning4j
- about / Deeplearning4j
- URL / Deeplearning4j
- org.deeplearning4j.base / Deeplearning4j
- org.deeplearning4j.berkeley / Deeplearning4j
- org.deeplearning4j.clustering / Deeplearning4j
- org.deeplearning4j.datasets / Deeplearning4j
- org.deeplearning4j.distributions / Deeplearning4j
- org.deeplearning4j.eval / Deeplearning4j
- org.deeplearning4j.exceptions / Deeplearning4j
- org.deeplearning4j.models / Deeplearning4j
- org.deeplearning4j.nn / Deeplearning4j
- org.deeplearning4j.optimize / Deeplearning4j
- org.deeplearning4j.plot / Deeplearning4j
- org.deeplearning4j.rng / Deeplearning4j
- org.deeplearning4j.util / Deeplearning4j
- deeplearning4java
- about / Deeplearning4j
- obtaining / Getting DL4J
- delta rule
- about / Perceptron
- directory
- text data, importing / Importing from directory
- Discrete Fourier Transform (DFT)
- about / Activity recognition pipeline
- distance measures
- Euclidean distances / Euclidean distances
- non-Euclidean distances / Non-Euclidean distances
- divide-and-conquer strategy
- about / FP-growth algorithm
- double evaluateLeftToRight method
- Instances heldOutDocuments component / Evaluating a model
- int numParticles component / Evaluating a model
- boolean useResampling component / Evaluating a model
- PrintStream docProbabilityStream component / Evaluating a model
- DrivenData / Competitions
- DropConnect neural network
- about / MNIST dataset
- DSGuide
- URL / Websites and blogs
- dynamic time warping (DTW)
- about / Java machine learning
E
- Eclipse
- Apache Mahout, configuring with Maven plugin / Configuring Mahout in Eclipse with the Maven plugin
- Eclipse IDE
- using / Before you start
- Edit distance
- about / Non-Euclidean distances
- elbow method
- about / Clustering
- email spam dataset
- URL / E-mail spam dataset
- email spam detection
- about / E-mail spam detection
- email spam dataset, collecting / E-mail spam dataset
- default pipeline, creating / Feature generation
- training / Training and testing
- testing / Training and testing
- model performance, evaluating / Model performance
- energy efficiency dataset
- URL / Loading the data
- ensambleSel.setOptions() method
- -L </path/to/modelLibrary> option / Model selection
- -W </path/to/working/directory> option / Model selection
- -B <numModelBags> option / Model selection
- -E <modelRatio> option / Model selection
- -V <validationRatio> option / Model selection
- -H <hillClimbIterations> option / Model selection
- -I <sortInitialization> option / Model selection
- -X <numFolds> option / Model selection
- -P <hillclimbMettric> option / Model selection
- -A <algorithm> option / Model selection
- -R option / Model selection
- -G option / Model selection
- -O option / Model selection
- -S <num> option / Model selection
- -D option / Model selection
- ensemble learning
- about / Ensemble learning
- ensembleLibrary package
- using / Before we start
- URL / Before we start
- ensembles
- used, for advanced modelling / Advanced modeling with ensembles
- Ensemble Selection algorithm
- about / Advanced modeling with ensembles
- environmental sensors
- about / Mobile phone sensors
- Euclidean distances
- about / Euclidean distances
- evaluate() method, parameters
- RecommenderBuilder / Evaluation
- DataModelBuilder / Evaluation
- DataModel / Evaluation
- trainingPercentage / Evaluation
- evaluationPercentage / Evaluation
- evaluation
- about / Generalization and evaluation
- Expectation Maximization (EM) clustering
- about / Clustering
- exploitation
- about / Exploitation versus exploration
- exploration
- about / Exploitation versus exploration
F
- Feature extraction
- feature map
- about / Deep convolutional networks
- feature selection
- about / Data reduction
- feedforward neural networks
- about / Feedforward neural networks
- file
- text data, importing / Importing from file
- Fourier transform
- reference link / Activity recognition pipeline
- FP-Growth
- about / Weka
- FP-growth algorithm
- about / FP-growth algorithm
- used, for discovering shopping patterns / FP-growth
- FP-tree structure
- about / FP-growth algorithm
- fraud detection, of insurance claims
- about / Fraud detection of insurance claims
- dataset, using / Dataset
- suspicious patterns, modelling / Modeling suspicious patterns
- frequent pattern (FP)
- about / FP-growth algorithm
G
- Geeking with Greg
- URL / Websites and blogs
- generalization
- about / Generalization and evaluation
- underfitting / Underfitting and overfitting
- overfitting / Underfitting and overfitting
- test set / Train and test sets
- train set / Train and test sets
- cross-validation / Cross-validation
- leave-one-out validation / Leave-one-out validation
- stratification / Stratification
- Generalized Sequential Patterns (GSP)
- about / Weka
- Generative Stochastic Networks (GSNs)
- about / Restricted Boltzmann machine
- Gibbs sampling
- about / Restricted Boltzmann machine
- GNU General Public License (GNU GPL)
- about / Weka
- Google Prediction API / Machine learning as a service
- Graphics Processing Unit (GPU)
- reference link / Build a Multilayer Convolutional Network
- about / Build a Multilayer Convolutional Network
- GraphX
- about / Apache Spark
H
- Hadoop
- Hadoop Distributed File System (HDFS)
- about / Apache Spark
- Hamming distance
- about / Non-Euclidean distances
- HBase
- Hidden layer
- about / Feedforward neural networks
- Hidden layer, issues
- vanishing gradients problem / Feedforward neural networks
- overfitting / Feedforward neural networks
- hidden Markov models (HMM)
- about / Apache Mahout
- Hidden Markov Models (HMMs)
- about / Transaction analysis
- hierarchical clustering
- about / Clustering
- histogram-based anomaly detection
- Hotspot
- about / Weka
- hybrid approach
- about / Hybrid approach
I
- IBM Research team
- about / Advanced modeling with ensembles
- IBM Watson Analytics / Machine learning as a service
- image classification
- about / Image classification
- deeplearning4java / Deeplearning4j
- MNIST dataset / MNIST dataset
- data, loading / Loading the data
- models, building / Building models
- ImageNet
- about / Introducing image recognition
- URL / Deep convolutional networks
- image recognition
- about / Introducing image recognition
- neural networks / Neural networks
- Infrastructure as a Service (IaaS) / Machine learning in the cloud
- Input layer
- about / Feedforward neural networks
- insurance claims
- fraud detection / Fraud detection of insurance claims
- interval data
- about / Measurement scales
- Intrusion Detection (ID)
- about / Transaction analysis
- item-based analysis
- item-based collaborative filtering
- about / Item-based filtering
J
- Jaccard distance
- about / Non-Euclidean distances
- Java
- need for / The need for Java
- Java-ML packages
- net.sf.javaml.classification / Java machine learning
- net.sf.javaml.clustering / Java machine learning
- net.sf.javaml.core / Java machine learning
- net.sf.javaml.distance / Java machine learning
- net.sf.javaml.featureselection / Java machine learning
- net.sf.javaml.filter / Java machine learning
- net.sf.javaml.matrix / Java machine learning
- net.sf.javaml.sampling / Java machine learning
- net.sf.javaml.tools / Java machine learning
- net.sf.javaml.utils / Java machine learning
- java -Xmx16g
- about / Performance evaluation
- Java API packages, Weka
- Java machine learning (Java-ML)
- about / Java machine learning
- URL / Java machine learning
K
- k-means clustering
- about / Clustering
- k-nearest neighbors
- about / Underfitting and overfitting
- Kaggle / Competitions
- KDD Cup
- URL / Getting the data
- KDnuggets / Machine learning as a service
- URL / Websites and blogs
- kernel methods
- about / Kernel methods
- known-knowns
- about / Unknown-unknowns
- known-unknowns
- about / Unknown-unknowns
L
- Latent Dirichlet
- about / MALLET
- Latent Dirichlet Allocation
- about / Topic modeling
- Latent Dirichlet Allocation (LDA)
- about / Modeling
- leave-one-out validation
- about / Leave-one-out validation
- Linear Discriminant Analysis (LDA)
- about / Modeling
- reference link / Evaluating a model
- linear regression
- about / Linear regression
- Local Outlier Factor (LOF)
- LOF algorithm
M
- machine learning
- about / Machine learning and data science
- advantages / What kind of problems can machine learning solve?
- supervised learning / What kind of problems can machine learning solve?
- unsupervised learning / What kind of problems can machine learning solve?
- reinforcement learning / What kind of problems can machine learning solve?
- in real life / Machine learning in real life
- noisy data / Noisy data
- class unbalance / Class unbalance
- feature selection / Feature selection is hard
- model chaining / Model chaining
- evaluation / Importance of evaluation
- models, in production / Getting models into production
- models, maintaining / Model maintenance
- in cloud / Machine learning in the cloud
- as service / Machine learning as a service
- machine learning application
- building / Building a machine learning application
- traditional machine learning / Traditional machine learning architecture
- big data, dealing with / Dealing with big data
- Machine Learning for Language Toolkit (MALLET)
- machine learning libraries
- about / Machine learning libraries
- Waikato Environment for Knowledge Analysis (Weka) / Weka
- Java machine learning (Java-ML) / Java machine learning
- Apache Mahout / Apache Mahout
- Apache Spark / Apache Spark
- Deeplearning4j / Deeplearning4j
- Machine Learning for Language Toolkit (MALLET) / MALLET
- comparing / Comparing libraries
- Machine Learning Mastery
- URL / Websites and blogs
- Mahalanobis distance
- Mahout interfaces, abstractions
- DataModel / Collaborative filtering
- UserSimilarity / Collaborative filtering
- ItemSimilarity / Collaborative filtering
- UserNeighborhood / Collaborative filtering
- Recommender / Collaborative filtering
- Mahout libraries
- org.apache.mahout.cf.taste / Apache Mahout
- org.apache.mahout.classifier / Apache Mahout
- org.apache.mahout.clustering / Apache Mahout
- org.apache.mahout.common / Apache Mahout
- org.apache.mahout.ep / Apache Mahout
- org.apache.mahout.math / Apache Mahout
- org.apache.mahout.vectorizer / Apache Mahout
- Mallet
- installing / Installing Mallet
- URL / Installing Mallet
- reference link / Pre-processing text data
- MALLET, packages
- Manhattan distance
- about / Java machine learning
- market basket analysis (MBA)
- about / Market basket analysis
- item affinity / Market basket analysis
- identification, of driver items / Market basket analysis
- trip classification / Market basket analysis
- store-to-store comparison / Market basket analysis
- revenue optimization / Market basket analysis
- marketing / Market basket analysis
- operations optimization / Market basket analysis
- affinity analysis / Affinity analysis
- Markov chain
- about / Restricted Boltzmann machine
- Maven plugin
- Apache Mahout, configuring with / Configuring Mahout in Eclipse with the Maven plugin
- mean absolute error
- about / Mean absolute error
- mean squared error
- about / Mean squared error
- measurement scales
- about / Measurement scales
- nominal data / Measurement scales
- ordinal data / Measurement scales
- interval data / Measurement scales
- ratio data / Measurement scales
- Microsoft Azure Machine Learning / Machine learning as a service
- Minkowski distance
- about / Java machine learning
- missing values
- filling / Fill missing values
- MLlib API library
- org.apache.spark.mllib.classification / Apache Spark
- org.apache.spark.mllib.clustering / Apache Spark
- org.apache.spark.mllib.linalg / Apache Spark
- org.apache.spark.mllib.optimization / Apache Spark
- org.apache.spark.mllib.recommendation / Apache Spark
- org.apache.spark.mllib.regression / Apache Spark
- org.apache.spark.mllib.stat / Apache Spark
- org.apache.spark.mllib.tree / Apache Spark
- org.apache.spark.mllib.util / Apache Spark
- MNIST dataset
- about / MNIST dataset
- mobile app
- classifier, plugging into / Plugging the classifier into a mobile app
- mobile phone
- data, collecting / Collecting data from a mobile phone
- mobile phone sensors
- about / Mobile phone sensors
- motion sensors / Mobile phone sensors
- environmental sensors / Mobile phone sensors
- position sensors / Mobile phone sensors
- URL, for Android / Mobile phone sensors
- URL, for Windows Phone / Mobile phone sensors
- model
- chaining / Model chaining
- in production / Getting models into production
- maintenance / Model maintenance
- models
- building / Building models
- single layer regression model, building / Building a single-layer regression model
- deep belief network, building / Building a deep belief network
- Multilayer Convolutional Network, building / Build a Multilayer Convolutional Network
- MongoDB
- motion sensors
- about / Mobile phone sensors
- Mozilla Thunderbird
- about / E-mail spam detection
- Multilayer Convolutional Network
- about / Building models
- building / Build a Multilayer Convolutional Network
- myrunscollector package
- Globals.java class / Loading the data collector
- CollectorActivity.java class / Loading the data collector
- SensorsService.java class / Loading the data collector
N
- Naive Bayes
- about / Underfitting and overfitting
- naive Bayes baseline
- implementing / Implementing naive Bayes baseline
- neural network
- about / Underfitting and overfitting
- neural networks
- about / Neural networks
- perceptron / Perceptron
- feedforward neural networks / Feedforward neural networks
- autoencoder / Autoencoder
- Restricted Boltzmann machine / Restricted Boltzmann machine
- deep convolutional networks / Deep convolutional networks
- nominal data
- about / Measurement scales
- non-Euclidean distance
- about / Non-Euclidean distances
O
- online courses
- about / Online courses
- online learning engine
- about / Online learning engine
- Oracle Database Online Documentation
- URL / Dataset
- ordinal data
- about / Measurement scales
- outliers
- removing / Remove outliers
- Output layer
- about / Feedforward neural networks
- overfits
- overfitting
- about / Underfitting and overfitting
P
- p-norm distance
- about / Euclidean distances
- PAPI
- part-of-speech (POS)
- about / Working with text data
- pattern analysis
- about / Pattern analysis
- Pearson coefficient
- about / Content-based filtering
- Pearson correlation coefficient
- about / Java machine learning
- perceptron
- plan recognition
- about / Plan recognition
- Portable Format for Analytics (PFA) / Predictive Model Markup Language
- position sensors
- about / Mobile phone sensors
- Pre-processing phase
- precision
- about / Precision and recall
- Prediction.IO / Machine learning as a service
- predictive apriori
- about / Weka
- Predictive Model Markup Language (PMML)
- about / Predictive Model Markup Language
- Principal component analysis (PCA)
- about / Data reduction
- Principal Component Analysis (PCA)
- Principal Components Analysis (PCA)
- about / Kernel methods
- probabilistic classifiers
- about / Probabilistic classifiers
R
- ratio data
- about / Measurement scales
- recall
- about / Precision and recall
- Receiver Operating Characteristics (ROC)
- about / ROC curves
- recommendation engine
- basic concepts / Basic concepts
- key concepts / Key concepts
- user-based analysis / User-based and item-based analysis
- item-based analysis / User-based and item-based analysis
- similarity, calculating / Approaches to calculate similarity
- exploitation / Exploitation versus exploration
- exploration / Exploitation versus exploration
- book-recommendation engine, building / Building a recommendation engine
- regression
- about / Regression, Underfitting and overfitting, Regression
- linear regression / Linear regression
- evaluating / Evaluating regression
- mean squared error / Mean squared error
- mean absolute error / Mean absolute error
- correlation coefficient / Correlation coefficient
- data, loading / Loading the data
- attributes, analyzing / Analyzing attributes
- model, building / Building and evaluating regression model
- model, evaluating / Building and evaluating regression model
- tips / Tips to avoid common regression problems
- regression model
- evaluating / Building and evaluating regression model
- building / Building and evaluating regression model
- linear regression / Linear regression
- regression trees / Regression trees
- regression trees
- about / Regression trees
- reinforcement learning
- Resilient Distributed Dataset (RDD)
- about / Apache Spark
- Restricted Boltzmann machine
- about / Restricted Boltzmann machine
- restricted Boltzmann machine
- about / Artificial neural networks
- restricted Boltzmann machines (RBM)
- about / Deeplearning4j
- ROC curves
- about / ROC curves
- RuleSetModel / Predictive Model Markup Language
S
- Scale Invariant Feature Transform (SIFT)
- about / Introducing image recognition
- score function
- about / Supervised learning
- similar items
- searching / Find similar items
- similarity calculation
- about / Approaches to calculate similarity
- collaborative filtering / Collaborative filtering
- content-based filtering / Content-based filtering
- hybrid approach / Hybrid approach
- SimRank
- about / Non-Euclidean distances
- single layer regression model
- building / Building a single-layer regression model
- Singular value decomposition (SVD)
- about / Data reduction
- Spark Streaming
- about / Apache Spark
- spatio-temporal patterns
- about / Transaction analysis
- Spearman's footrule distance
- about / Java machine learning
- stacked autoencoders
- about / Autoencoder
- standards and markup languages
- about / Standards and markup languages
- Statistics 110 (Harvard) by Joe Blitzstein
- URL / Online courses
- stratification
- about / Stratification
- sum transfer function
- about / Perceptron
- supermarket dataset
- about / The supermarket dataset
- shopping patterns, discovering / Discover patterns
- shopping patterns, discovering with Apriori algorithm / Apriori
- shopping patterns, discovering with FP-growth algorithm / FP-growth
- supervised learning
- about / What kind of problems can machine learning solve?, Supervised learning
- classification / Classification
- regression / Regression
- Support Vector Machine (SVM) model / Predictive Model Markup Language
- Support Vector Machines (SVM)
- about / Kernel methods
- survivorship bias
- about / Sampling traps
- suspicious behaviour detection
- suspicious pattern detection
- about / Suspicious pattern detection
- suspicious patterns, modelling
- about / Modeling suspicious patterns
- vanilla approach / Vanilla approach
- dataset rebalancing / Dataset rebalancing
- SVM
- about / Underfitting and overfitting
- Sample, Explore, Modify, Model, and Assess (SEMMA)
- about / SEMMA methodology
T
- target variables
- Tertius
- about / Weka
- test set
- about / Train and test sets
- text classification
- about / Text classification
- examples / Text classification
- text data
- extracting / Working with text data
- importing / Importing data
- importing, from directory / Importing from directory
- importing, from file / Importing from file
- pre-processing / Pre-processing text data
- text mining
- about / Introducing text mining
- topic modeling / Topic modeling
- text classification / Text classification
- time series data
- anomaly detection / Anomaly detection in time series data
- topic modeling
- about / Topic modeling
- topic modelling, for BBC news
- about / Topic modeling for BBC news
- BBC dataset, collecting / BBC dataset
- modeling / Modeling
- model, evaluating / Evaluating a model
- model, reusing / Reusing a model
- model, saving / Saving a model
- model, restoring / Restoring a model
- traditional machine learning
- architecture / Traditional machine learning architecture
- Training data
- training data
- collecting / Collecting training data
- train set
- about / Train and test sets
- transaction analysis
- about / Transaction analysis
- TreeModel / Predictive Model Markup Language
U
- UCI machine learning repository
- URL / Datasets
- Udemy
- URL / Online courses
- underfits
- underfitting
- about / Underfitting and overfitting
- Universal PMML Plug-in (UPPI) / Predictive Model Markup Language
- unknown-unknowns
- about / Unknown-unknowns
- unsupervised learning
- about / What kind of problems can machine learning solve?, Unsupervised learning
- similar items, searching / Find similar items
- clustering / Clustering
- user-based analysis
- user-based collaborative filtering
- about / User-based filtering
V
- vanilla approach
- about / Vanilla approach
W
- Waikato Environment for Knowledge Analysis (Weka)
- web resources and competitions
- about / Web resources and competitions, Competitions
- datasets / Datasets
- online courses / Online courses
- websites and blogs / Websites and blogs
- venues and conferences / Venues and conferences
- website traffic
- anomaly detection / Anomaly detection in website traffic
- weka.classifiers package
- Weka 3.6
- URL / Before you start
- downloading / Before you start
- WEKA Packages
- URL / Before we start
- word2vec
- about / Working with text data
- URL / Working with text data
- workflow, Applied Machine Learning
- data and problem definition / Applied machine learning workflow
- data collection / Applied machine learning workflow
- data preprocessing / Applied machine learning workflow
- data analysis and modeling / Applied machine learning workflow
- evaluation / Applied machine learning workflow
X
- Xiaming Chen
- URL / Datasets
Y
- Yahoo traffic dataset
- URL / Dataset