Machine Learning with Scala Quick Start Guide

By: Md. Rezaul Karim, Ajay Kumar N

Overview of this book

Scala combines object-oriented and functional programming concepts in a way that makes it easy to build scalable, complex big data applications. This book is a handy guide for machine learning developers and data scientists who want to develop and train effective machine learning models in Scala. The book starts with an introduction to machine learning, covering the basics of both machine learning and deep learning. It then explains how to use Scala-based ML libraries to solve classification and regression problems using linear regression, generalized linear regression, logistic regression, support vector machines, and Naïve Bayes algorithms. It also covers tree-based ensemble techniques for solving both classification and regression problems. Moving ahead, it covers unsupervised learning techniques, such as dimensionality reduction, clustering, and recommender systems. Finally, it provides a brief overview of deep learning using a real-life example in Scala.
Table of Contents (9 chapters)

Clustering analysis through examples

One of the most important applications of clustering analysis is the analysis of genomic profiles to attribute individuals to specific ethnic populations, or the analysis of nucleotide haplotypes for disease susceptibility. Human ancestry from Asia, Europe, Africa, and the Americas can be separated based on genomic data. Research has shown that Y chromosome lineages can be geographically localized, which provides evidence for clustering the alleles of human genotypes. According to the National Cancer Institute (https://www.cancer.gov/publications/dictionaries/genetics-dictionary/def/genetic-variant):
"Genetic variants are an alteration in the most common DNA nucleotide sequence. The term variant can be used to describe an alteration that may be benign, pathogenic, or of unknown significance. The term variant is increasingly being...