R Deep Learning Cookbook

By: PKS Prakash, Achyutuni Sri Krishna Rao

Overview of this book

Deep learning, a branch of machine learning, has produced remarkable results in applications that involve large and complex data. At the same time, the R programming language is very popular among data miners and statisticians. This book helps you work through the problems you face when carrying out different tasks, and helps you understand hacks in deep learning, neural networks, and advanced machine learning techniques. It also takes you through complex deep learning algorithms and the various deep learning packages and libraries in R. It starts with the different deep learning packages and moves on to neural networks and their structures. You will also encounter applications in text mining and processing, along with a comparison of CPU and GPU performance. By the end of the book, you will have a logical understanding of deep learning and of the different deep learning packages, so that you can choose the most appropriate solution for your problems.

Installing H2O in R


H2O is another very popular open source library to build machine learning models. It is produced by H2O.ai and supports multiple languages including R and Python. The H2O package is a multipurpose machine learning library developed for a distributed environment to run algorithms on big data.

Getting ready

To set up H2O, the following system requirements must be met:

  • 64-bit Java Runtime Environment (version 1.6 or later)
  • Minimum 2 GB RAM

H2O can be called from R using the h2o package. The h2o package has the following dependencies:

  • RCurl
  • rjson
  • statmod
  • survival
  • stats
  • tools
  • utils
  • methods

On machines that do not have curl-config installed, installation of the RCurl dependency will fail in R; curl-config must first be installed outside R.
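Before installing, it can help to check these prerequisites from within R. The following is a minimal sketch using only base R; the variable names are illustrative, and looking for `java` and `curl-config` on the PATH is an assumption about how those tools are exposed on your system:

```r
# Dependencies of the h2o package, as listed above
deps <- c("RCurl", "rjson", "statmod", "survival",
          "stats", "tools", "utils", "methods")

# system.file() returns "" for a package that is not installed
installed <- vapply(deps,
                    function(p) nzchar(system.file(package = p)),
                    logical(1))
missing_deps <- names(installed)[!installed]

# Sys.which() returns "" when an executable is not found on the PATH
has_java        <- nzchar(Sys.which("java"))
has_curl_config <- nzchar(Sys.which("curl-config"))

cat("Missing R packages:",
    if (length(missing_deps)) paste(missing_deps, collapse = ", ") else "none",
    "\n")
cat("java on PATH:", has_java, "\n")
cat("curl-config on PATH:", has_curl_config, "\n")
```

Any package reported as missing is normally pulled in by the installation step that follows; a missing `curl-config` has to be resolved outside R.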

How to do it...

  1. H2O can be installed directly from CRAN. Setting the dependencies parameter to TRUE also installs all the R packages that the h2o package depends on:
install.packages("h2o", dependencies = TRUE)
  2. The following commands load the h2o package into the current R session and start H2O. The first time H2O is launched, the package automatically downloads the required JAR file, as shown in the following figure:
library(h2o)
localH2O <- h2o.init()

Starting H2O cluster

  3. The H2O cluster can be accessed using its IP address and port. The current H2O cluster is running on localhost at port 54321, as shown in the following screenshot:

H2O cluster running in the browser
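The address of the running cluster can also be assembled and checked from R. In the sketch below, `flow_url()` is a hypothetical helper (not part of the h2o package) that only formats the address; the commented lines assume a cluster has already been started with `h2o.init()`:

```r
# Hypothetical helper: build the browser address of an H2O cluster
flow_url <- function(ip = "localhost", port = 54321) {
  sprintf("http://%s:%d", ip, as.integer(port))
}

url <- flow_url()
print(url)

# With a running cluster, the following calls report its status and
# open the interface in the default browser:
# library(h2o)
# h2o.clusterInfo()   # prints details of the connected cluster
# browseURL(url)      # utils::browseURL opens the address in a browser
```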

Note

Models in H2O can be developed interactively in a browser or by scripting from R. Working in the H2O browser interface is similar to working in a Jupyter Notebook: you create a flow made up of operations such as importing data, splitting data, setting up a model, and scoring.

How it works...

Let's build a logistic regression interactively using the H2O browser.

  1. Start a new flow, as shown in the following screenshot:

Creating a new flow in H2O

  2. Import a dataset using the Data menu, as shown in the following screenshot:

Importing files to the H2O environment

  3. The imported file can be parsed into the hex format (the native file format for H2O) using the Parse these files action, which appears once the file has been imported into the H2O environment:

Parsing the file to the hex format

  4. The parsed data frame in H2O can be split into training and validation sets using the Data | Split Frame action, as shown in the following screenshot:

Splitting the dataset into training and validation

  5. Select the model from the Model menu and set up the model-related parameters. An example for a GLM model is shown in the following screenshot:

Building a model in H2O

  6. The Score | Predict action can be used to score another hex data frame in H2O:

Scoring in H2O
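The same sequence of steps can be scripted from R. The following is a minimal sketch rather than the flow shown in the screenshots: `run_glm_flow()` is a hypothetical wrapper name, the built-in `mtcars` data set stands in for an imported file, and the choice of `am` as the response with `wt` and `hp` as predictors is an assumption made purely for illustration:

```r
# Sketch of the Flow steps as an R function; run_glm_flow is a
# hypothetical name, and mtcars stands in for an imported file.
run_glm_flow <- function() {
  h2o::h2o.init()                      # start or connect to a local cluster

  # Import + parse: a factor response makes the GLM a logistic regression
  df <- mtcars
  df$am <- factor(df$am)
  frame <- h2o::as.h2o(df)

  # Split Frame: roughly 75% training, 25% validation
  parts <- h2o::h2o.splitFrame(frame, ratios = 0.75, seed = 1)
  train <- parts[[1]]
  valid <- parts[[2]]

  # Build Model: binomial GLM predicting am from two predictors
  model <- h2o::h2o.glm(x = c("wt", "hp"), y = "am",
                        training_frame = train,
                        validation_frame = valid,
                        family = "binomial")

  # Score: predict on the validation frame
  h2o::h2o.predict(model, newdata = valid)
}

# Usage, with the h2o package installed and Java available:
# preds <- run_glm_flow()
# head(preds)
```

Defining the steps inside a function keeps the sketch loadable on a machine without H2O; nothing contacts the cluster until the function is called.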

There's more...

For more complicated scenarios that involve a lot of preprocessing, H2O can be called from R directly. This book focuses on building models using H2O from R. If H2O is running somewhere other than localhost, R can connect to it by specifying the ip and port at which the cluster is running:

localH2O <- h2o.init(ip = "localhost", port = 54321, nthreads = -1)

Another critical parameter is nthreads, the number of threads used to build the model. In the h2o release described here, nthreads defaults to -2, which means that two cores will be used; the default can differ between h2o releases, so check ?h2o.init for your installed version. A value of -1 makes use of all available cores.
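To make the thread settings concrete, the following sketch maps an nthreads value to a core count according to the description above. `resolve_nthreads()` is a hypothetical helper written for illustration, not an h2o function, and `parallel::detectCores()` is an assumed stand-in for how the available cores are counted:

```r
# Hypothetical helper: translate an nthreads setting into a core count
resolve_nthreads <- function(nthreads, available = parallel::detectCores()) {
  if (nthreads == -1) {
    available             # -1: use every available core
  } else if (nthreads == -2) {
    2L                    # -2: use two cores, as described above
  } else if (nthreads > 0) {
    as.integer(nthreads)  # a positive value requests that many threads
  } else {
    stop("unsupported nthreads value: ", nthreads)
  }
}

# Example on a machine assumed to have 8 cores
resolve_nthreads(-1, available = 8L)
resolve_nthreads(-2, available = 8L)
```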

Note

http://docs.h2o.ai/h2o/latest-stable/index.html#gettingstarted is a very good resource for using H2O in interactive mode.