9. Practical Machine Learning with Spark | Large Scale Machine Learning with Python

Large Scale Machine Learning with Python

By: Sjardin, Luca Massaron, Alberto Boschetti

Overview of this book

Large Python machine learning projects raise problems of specialized architecture and design that many data scientists have yet to tackle, yet the need for algorithms and platforms that can handle large datasets keeps growing. Data scientists have to manage and maintain increasingly complex data projects, and the rise of big data brings an increasing demand for computational and algorithmic efficiency. Large Scale Machine Learning with Python uncovers a new wave of machine learning algorithms that meet scalability demands while retaining high predictive accuracy. Dive into scalable machine learning and the three forms of scalability. Speed up algorithms that run on a desktop computer with tips on parallelization and memory allocation. Get to grips with new algorithms that are specifically designed for large projects and can handle bigger files, and learn about machine learning in big data environments. The book also covers the most effective machine learning techniques on a MapReduce framework in Hadoop and Spark, using Python.

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

1. First Steps to Scalability

Explaining scalability in detail

Python for large scale machine learning

Python packages

Summary

2. Scalable Learning in Scikit-learn

Out-of-core learning

Streaming data from sources

Stochastic learning

Feature management with data streams

Summary

3. Fast SVM Implementations

Datasets to experiment with on your own

Support Vector Machines

Feature selection by regularization

Including non-linearity in SGD

Hyperparameter tuning

Summary

4. Neural Networks and Deep Learning

The neural network architecture

Neural networks and regularization

Neural networks and hyperparameter optimization

Neural networks and decision boundaries

Deep learning at scale with H2O

Deep learning and unsupervised pretraining

Deep learning with theanets

Autoencoders and unsupervised learning

Summary

5. Deep Learning with TensorFlow

TensorFlow installation

Machine learning on TensorFlow with SkFlow

Keras and TensorFlow installation

Convolutional Neural Networks in TensorFlow through Keras

CNNs with an incremental approach

GPU Computing

Summary

6. Classification and Regression Trees at Scale

Bootstrap aggregation

Random forest and extremely randomized forest

Fast parameter optimization with randomized search

CART and boosting

XGBoost

Out-of-core CART with H2O

Summary

7. Unsupervised Learning at Scale

Unsupervised methods

Feature decomposition – PCA

PCA with H2O

Clustering – K-means

K-means with H2O

LDA

Summary

8. Distributed Environments – Hadoop and Spark

From a standalone machine to a bunch of nodes

Setting up the VM

The Hadoop ecosystem

Spark

Summary

9. Practical Machine Learning with Spark

Setting up the VM for this chapter

Sharing variables across cluster nodes

Data preprocessing in Spark

Machine learning with Spark

Summary

A. Introduction to GPUs and Theano

GPU computing

Theano – parallel computing on the GPU

Installing Theano

Index

Sharing variables across cluster nodes

Broadcast read-only variables
