Hands-On Deep Learning with Apache Spark

By: Guglielmo Iozzia

Overview of this book

Deep learning is a subset of machine learning in which multi-layered neural networks learn from complex datasets. Hands-On Deep Learning with Apache Spark addresses both the technical and analytical complexity of deep learning and the speed at which deep learning solutions can be implemented on Apache Spark. The book starts with the fundamentals of Apache Spark and deep learning. You will set up Spark for deep learning, learn the principles of distributed modeling, and understand different types of neural nets. You will then implement deep learning models, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory (LSTM) networks, on Spark. As you progress through the book, you will gain hands-on experience of what it takes to understand the complex datasets you are dealing with. Over the course of the book, you will use popular deep learning frameworks, such as TensorFlow, Deeplearning4j, and Keras, to train your distributed models. By the end of this book, you'll have gained experience implementing your models for a variety of use cases.
Table of Contents (19 chapters)
Appendix A: Functional Programming in Scala
Appendix B: Image Data Preparation for Spark

NLP Basics

The previous chapter covered several topics related to distributed DL training in a Spark cluster. The concepts presented there apply to any network model. From this chapter onward, the book looks at specific use cases, first for RNNs and LSTMs and then for CNNs. This chapter starts by introducing the following core concepts of Natural Language Processing (NLP):

  • Tokenizers
  • Sentence segmentation
  • Part-of-speech tagging
  • Named entity extraction
  • Chunking
  • Parsing
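
To make the first two concepts in the list concrete, here is a minimal Scala sketch of sentence segmentation and tokenization. It is not the approach used by the Stanford CoreNLP or Spark NLP libraries; it relies only on the JDK's `java.text.BreakIterator` as a simple stand-in, and the names `NlpBasicsSketch`, `spans`, `sentences`, and `tokens` are hypothetical, chosen for illustration only.

```scala
import java.text.BreakIterator
import java.util.Locale
import scala.collection.mutable.ListBuffer

// Hypothetical helper object; not part of any NLP library.
object NlpBasicsSketch {

  // Collects the text spans between consecutive boundaries
  // reported by the given BreakIterator.
  private def spans(text: String, it: BreakIterator): List[String] = {
    it.setText(text)
    val out = ListBuffer.empty[String]
    var start = it.first()
    var end = it.next()
    while (end != BreakIterator.DONE) {
      out += text.substring(start, end)
      start = end
      end = it.next()
    }
    out.toList
  }

  // Sentence segmentation: split raw text into trimmed, non-empty sentences.
  def sentences(text: String): List[String] =
    spans(text, BreakIterator.getSentenceInstance(Locale.ENGLISH))
      .map(_.trim)
      .filter(_.nonEmpty)

  // Tokenization: split a sentence into word tokens, dropping spans
  // that contain no letter or digit (whitespace, punctuation).
  def tokens(sentence: String): List[String] =
    spans(sentence, BreakIterator.getWordInstance(Locale.ENGLISH))
      .filter(_.exists(_.isLetterOrDigit))
}
```

A typical use would be to call `NlpBasicsSketch.sentences` on a raw string and then map `NlpBasicsSketch.tokens` over the result. Rule-based boundaries like these are only a rough approximation; abbreviations, for instance, can trigger spurious sentence breaks, which is one reason to use a dedicated NLP library.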

The theory behind each concept in the preceding list is detailed first. The chapter then presents two complete Scala examples of NLP: one using Apache Spark and the Stanford CoreNLP library, and the other using Spark core and the Spark NLP library (which is built on top of Apache Spark MLlib). The goal of the chapter is to make readers familiar with NLP, before moving...