Scala for Machine Learning

By: Patrick R. Nicolas

Apache Spark

Apache Spark is a fast, general-purpose cluster computing system, initially developed at AMPLab/UC Berkeley as part of the Berkeley Data Analytics Stack (BDAS) (http://en.wikipedia.org/wiki/UC_Berkeley). It provides high-level APIs in the following programming languages, which make large, concurrent parallel jobs easy to write and deploy [12:11]:

Scala: http://spark.apache.org/docs/latest/api/scala/index.html
Java: http://spark.apache.org/docs/latest/api/java/index.html
Python: http://spark.apache.org/docs/latest/api/python/index.html

Note

The link to the latest information

The URLs, like any reference to Apache Spark, may change in future versions.

The core element of Spark is a resilient distributed dataset (RDD), which is a collection of elements partitioned across the nodes of a cluster and/or CPU cores of servers. An RDD can be created from a local data structure such as a list, array, or hash table, from the local filesystem or the Hadoop distributed file system (HDFS...
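
As a minimal sketch of the first case mentioned above (an RDD created from a local data structure), the following Scala snippet partitions a small list across two slices and applies a simple map/reduce. The object name `RddSketch`, the helper `sumOfSquares`, the application name, and the `local[2]` master setting are illustrative assumptions, not part of the original text; the snippet also assumes the Spark core library is on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names are chosen for illustration only.
object RddSketch {

  // Builds an RDD from a local Scala collection, then computes the
  // sum of squares of its elements with a map followed by a reduce.
  def sumOfSquares(sc: SparkContext, data: Seq[Double]): Double = {
    // parallelize splits the local collection into partitions
    // (here 2 slices, an arbitrary choice for the sketch)
    val rdd = sc.parallelize(data, numSlices = 2)
    rdd.map(x => x * x).reduce(_ + _)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: run locally with two worker threads instead of a cluster
    val conf = new SparkConf().setAppName("RddSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val result = sumOfSquares(sc, List(1.0, 2.0, 3.0, 4.0))
      println(s"Sum of squares: $result")
    } finally {
      sc.stop() // always release the context, even on failure
    }
  }
}
```

The same `SparkContext` would typically be reused for RDDs loaded from a file; the local master is only a convenience for experimenting on a single machine.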