Hands-On Deep Learning with Apache Spark

By: Guglielmo Iozzia

Overview of this book

Deep learning is a subset of machine learning where datasets with several layers of complexity can be processed. Hands-On Deep Learning with Apache Spark addresses the sheer complexity of the technical and analytical parts and the speed at which deep learning solutions can be implemented on Apache Spark. The book starts with the fundamentals of Apache Spark and deep learning. You will set up Spark for deep learning, learn the principles of distributed modeling, and understand different types of neural nets. You will then implement deep learning models, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory (LSTM) networks, on Spark. As you progress through the book, you will gain hands-on experience of what it takes to understand the complex datasets you are dealing with. During the course of this book, you will use popular deep learning frameworks, such as TensorFlow, Deeplearning4j, and Keras, to train your distributed models. By the end of this book, you'll have gained experience with the implementation of your models on a variety of use cases.

Hands-on NLP with Keras model import into DL4J

In the Importing Python Models in the JVM with DL4J section of Chapter 10, Deploying on a Distributed System, we learned how to import existing Keras models into DL4J and use them to make predictions or re-train them in a JVM-based environment.

This also applies to the model we implemented and trained in Python, using Keras with a TensorFlow backend, in the Hands-on NLP with Keras and TensorFlow backend section. We need to modify the code for that example so that it serializes the trained model in HDF5 format, by doing the following:

model.save('sa_rnn.h5')

The resulting sa_rnn.h5 file needs to be copied into the resources folder of the Scala project that we are going to implement. The dependencies for that project are the DataVec API, the DL4J core, ND4J, and the DL4J model import library.
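
The copy step can also be scripted on the Python side. The following is a minimal sketch rather than code from the book: the helper names looks_like_hdf5 and stage_model are hypothetical, and the target directory is whatever resources folder your Scala project uses (src/main/resources is assumed here, as in a standard sbt or Maven layout). Before copying, it checks that the file starts with the 8-byte HDF5 signature, which should catch a model that was accidentally saved in a different format.

```python
import shutil
from pathlib import Path

# First 8 bytes of an HDF5 file that has no user block
# (assumed to be the case for files written by model.save).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file starts with the HDF5 format signature."""
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


def stage_model(model_file, resources_dir):
    """Copy a serialized Keras model into a project's resources folder.

    Raises ValueError if the file does not look like HDF5.
    Returns the path of the copied file.
    """
    model_file = Path(model_file)
    resources_dir = Path(resources_dir)
    if not looks_like_hdf5(model_file):
        raise ValueError(f"{model_file} does not start with the HDF5 signature")
    # Create the resources folder if the project skeleton lacks it.
    resources_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(model_file, resources_dir / model_file.name))
```

After model.save('sa_rnn.h5') has run, a call such as stage_model('sa_rnn.h5', 'my-scala-project/src/main/resources') would place the file where the Scala code can load it from the classpath; the project path in that call is only a placeholder.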

We need to import and transform the Large Movie Review database as explained...