Machine Learning with Scala Quick Start Guide

By: Md. Rezaul Karim, Ajay Kumar N

Overview of this book

Scala combines object-oriented and functional programming concepts in a way that makes it easy to build scalable, complex big data applications. This book is a handy guide for machine learning developers and data scientists who want to develop and train effective machine learning models in Scala. The book starts with an introduction to machine learning, covering the basics of both machine learning and deep learning. It then explains how to use Scala-based ML libraries to solve classification and regression problems using linear regression, generalized linear regression, logistic regression, support vector machine, and Naïve Bayes algorithms. It also covers tree-based ensemble techniques for solving both classification and regression problems. Moving ahead, it covers unsupervised learning techniques, such as dimensionality reduction, clustering, and recommender systems. Finally, it provides a brief overview of deep learning using a real-life example in Scala.
Table of Contents (9 chapters)

SVM for churn prediction

SVM is also a popular algorithm for classification. It is based on the concept of decision planes, which define the decision boundaries we discussed at the beginning of this chapter. The following diagram shows how the SVM algorithm works:

An SVM finds the linear hyperplane that separates the classes with the maximum margin; a kernel function lets it do so in a transformed feature space. The following diagram shows how data points belonging to two different classes (red versus blue) are separated by the decision boundary based on the maximum margin; the points closest to the boundary, which determine its position, are the support vectors:
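To make the idea of a margin concrete, here is a minimal sketch in plain Scala (no ML library). It does not train an SVM; it only measures the margin of a given boundary and picks out the points that sit closest to it. The object name `MarginSketch`, the toy points, and the values of `w` and `b` are all assumptions chosen for illustration, not data or code from the book.

```scala
// Minimal sketch: geometric margin of a fixed linear boundary w.x + b = 0.
// All names and numbers here are illustrative assumptions.
object MarginSketch {
  type Vec = Vector[Double]

  def dot(a: Vec, b: Vec): Double = a.zip(b).map { case (x, y) => x * y }.sum
  def norm(a: Vec): Double = math.sqrt(dot(a, a))

  // Signed distance from x to the hyperplane; the sign gives the predicted side.
  def signedDistance(w: Vec, b: Double, x: Vec): Double =
    (dot(w, x) + b) / norm(w)

  // Predicted class label in {-1, +1}.
  def predict(w: Vec, b: Double, x: Vec): Int =
    if (signedDistance(w, b, x) >= 0) 1 else -1

  // Margin of the boundary on a labeled set: the smallest y * signedDistance.
  // It is positive only if every point lies on its correct side.
  def margin(w: Vec, b: Double, data: Seq[(Vec, Int)]): Double =
    data.map { case (x, y) => y * signedDistance(w, b, x) }.min

  // Points that attain the margin (within a tolerance) play the role of
  // support vectors for this boundary.
  def closestPoints(w: Vec, b: Double, data: Seq[(Vec, Int)],
                    tol: Double = 1e-9): Seq[(Vec, Int)] = {
    val m = margin(w, b, data)
    data.filter { case (x, y) =>
      math.abs(y * signedDistance(w, b, x) - m) <= tol
    }
  }

  // Toy data: label +1 (say, "blue") and label -1 (say, "red").
  val data: Seq[(Vec, Int)] = Seq(
    (Vector(2.0, 2.0), 1),
    (Vector(3.0, 3.0), 1),
    (Vector(0.0, 0.0), -1),
    (Vector(-1.0, 0.0), -1)
  )

  // Candidate boundary: x1 + x2 - 2 = 0.
  val w: Vec = Vector(1.0, 1.0)
  val b: Double = -2.0
}
```

With these toy values, `MarginSketch.margin(MarginSketch.w, MarginSketch.b, MarginSketch.data)` comes out at about 1.414, and the closest points are (2, 2) and (0, 0). A different separating boundary such as `w = Vector(1.0, 0.0)`, `b = -1.0` also classifies every point correctly but has a margin of only 1.0, which is why a maximum-margin classifier would prefer the first one.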

The preceding support vector classifier can be represented as a dot product mathematically, as follows:
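
A common textbook way of writing a linear support vector classifier in dot-product form is sketched below. The symbols are assumptions and may differ from the book's own notation: $x_i$ are the $n$ training points, $\alpha_i$ are learned coefficients, and $\beta_0$ is the intercept.

```latex
f(x) = \beta_0 + \sum_{i=1}^{n} \alpha_i \langle x, x_i \rangle
```

Here $\langle x, x_i \rangle$ is the dot product between a new point $x$ and training point $x_i$. In this formulation $\alpha_i$ is nonzero only for the support vectors, so only those points contribute to the prediction, and the sign of $f(x)$ gives the predicted class.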

If the data is not linearly separable in its original space, the kernel trick uses a kernel function to implicitly map the data into a higher-dimensional feature space, where the classes can become linearly separable for classification...
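
The kernel trick can be checked on a small case. The sketch below, in plain Scala, uses a degree-2 polynomial kernel as an assumed example (the text does not say which kernel is meant) and shows two things: the kernel value equals a dot product in an explicit 3-dimensional feature space, and points that a straight line cannot separate in 2-D become separable by a linear rule after the mapping. The names `KernelSketch`, `phi`, `polyKernel`, and `liftedScore`, and the toy points, are all hypothetical.

```scala
// Minimal sketch of the kernel trick with a degree-2 polynomial kernel.
// All names and data points are illustrative assumptions.
object KernelSketch {
  type Vec = Vector[Double]

  def dot(a: Vec, b: Vec): Double = a.zip(b).map { case (x, y) => x * y }.sum

  // Explicit feature map for 2-D input:
  // phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2)
  def phi(x: Vec): Vec =
    Vector(x(0) * x(0), math.sqrt(2.0) * x(0) * x(1), x(1) * x(1))

  // Homogeneous polynomial kernel of degree 2: k(x, z) = (x . z)^2.
  // It equals dot(phi(x), phi(z)) without ever building phi.
  def polyKernel(x: Vec, z: Vec): Double = {
    val d = dot(x, z)
    d * d
  }

  // Inner points (label -1) lie inside the square spanned by the outer
  // points (label +1), so no straight line in 2-D separates the classes.
  val inner: Seq[Vec] = Seq(Vector(0.5, 0.0), Vector(0.0, -0.5))
  val outer: Seq[Vec] = Seq(
    Vector(2.0, 0.0), Vector(0.0, 2.0), Vector(-2.0, 0.0), Vector(0.0, -2.0)
  )

  // A linear rule in the lifted space: score = w . phi(x) + b with
  // w = (1, 0, 1) and b = -1, which amounts to x1^2 + x2^2 - 1.
  def liftedScore(x: Vec): Double = dot(Vector(1.0, 0.0, 1.0), phi(x)) - 1.0
}
```

For `x = Vector(1.0, 2.0)` and `z = Vector(3.0, 4.0)`, both `polyKernel(x, z)` and `dot(phi(x), phi(z))` give 121 (up to floating-point rounding). `liftedScore` is negative for every inner point and positive for every outer point, so the classes are split by a plane in the lifted space.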