R Deep Learning Essentials - Second Edition

By: Mark Hodnett, Joshua F. Wiley

Overview of this book

Deep learning is a powerful subset of machine learning that is very successful in domains such as computer vision and natural language processing (NLP). This second edition of R Deep Learning Essentials will open the gates for you to enter the world of neural networks by building powerful deep learning models using the R ecosystem. This book will introduce you to the basic principles of deep learning and teach you to build a neural network model from scratch. As you make your way through the book, you will explore deep learning libraries, such as Keras, MXNet, and TensorFlow, and create interesting deep learning models for a variety of tasks and problems, including structured data, computer vision, text data, anomaly detection, and recommendation systems. You’ll cover advanced topics, such as generative adversarial networks (GANs), transfer learning, and large-scale deep learning in the cloud. In the concluding chapters, you will learn about the theoretical concepts of deep learning projects, such as model optimization, overfitting, and data augmentation, together with other advanced topics. By the end of this book, you will be fully prepared and able to implement deep learning concepts in your research work or projects.

Some common myths about deep learning

There are many misconceptions, half-truths, and downright misleading opinions about deep learning. Here are some of the most common:

  • Artificial intelligence means deep learning and replaces all other techniques
  • Deep learning requires a PhD-level understanding of mathematics
  • Deep learning is hard to train, almost an art form
  • Deep learning requires lots of data
  • Deep learning has poor interpretability
  • Deep learning needs GPUs

The following paragraphs discuss these statements, one by one.

Deep learning is not synonymous with artificial intelligence, and it does not replace all other machine learning algorithms. It is only one family of algorithms in machine learning. Despite the hype, deep learning probably accounts for less than 1% of the machine learning projects in production right now. Most of the recommendation engines and online adverts that you encounter when you browse the web are not powered by deep learning. Most models used internally by companies to manage their subscribers, for example, churn analysis models, are not deep learning models. The models used by credit institutions to decide who gets credit do not use deep learning.

Deep learning does not require a deep understanding of mathematics unless your interest is in researching new deep learning algorithms and specialized architectures. Most practitioners apply existing deep learning techniques to their data by taking an existing architecture and modifying it for their work. This does not require a deep mathematical foundation; the mathematics used in deep learning is taught at high school level throughout the world. In fact, we demonstrate this in Chapter 3, Deep Learning Fundamentals, where we build an entire neural network from basic code in fewer than 70 lines!

Training deep learning models is difficult, but it is not an art form. It does require practice, but the same problems occur over and over again. Even better, there is often a prescribed fix for each problem. For example, if your model is overfitting, add regularization; if your model is not training well, build a more complex model and/or use data augmentation. We will look at this in more depth in Chapter 6, Tuning and Optimizing Models.
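To make the "add regularization" fix concrete, here is a minimal sketch of L2 regularization (weight decay) on a one-parameter linear model. It is written in plain Python purely for illustration and is not code from the book; the function name fit_linear, the toy data, and the hyperparameter values are all assumptions chosen for the example.

```python
def fit_linear(xs, ys, l2=0.0, lr=0.01, steps=2000):
    """Fit y ~ w * x by gradient descent on mean squared error + l2 * w**2."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # Gradient of the L2 penalty term l2 * w**2
        grad += 2 * l2 * w
        w -= lr * grad
    return w


# Toy data (illustrative only), roughly following y = 2 * x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

w_plain = fit_linear(xs, ys)          # no penalty
w_reg = fit_linear(xs, ys, l2=1.0)    # penalty pulls the weight toward zero
```

The penalized fit ends up with a smaller weight than the unpenalized one. In a deep network the same penalty is applied to every weight, which limits how closely the model can chase noise in the training data.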

There is a lot of truth to the statement that deep learning requires lots of data. However, you may still be able to apply deep learning to your problem by using a pre-trained network or by creating more training data from existing data (data augmentation). We will look at these later, in Chapter 6, Tuning and Optimizing Models, and Chapter 11, The Next Level in Deep Learning.
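As a rough illustration of data augmentation, the sketch below creates extra training examples from existing images by flipping and shifting them. It uses plain Python lists as stand-ins for images and is not the book's code; the helper names flip_horizontal, shift_right, and augment are hypothetical, and real projects would normally rely on the augmentation utilities of their deep learning library.

```python
def flip_horizontal(image):
    """Mirror each row of a 2D image (a list of pixel rows)."""
    return [row[::-1] for row in image]


def shift_right(image, pixels=1, fill=0):
    """Shift the image right by `pixels`, padding the left edge with `fill`."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]


def augment(images, labels):
    """Return the originals plus a flipped and a shifted copy of each image."""
    out_images, out_labels = [], []
    for image, label in zip(images, labels):
        for variant in (image, flip_horizontal(image), shift_right(image)):
            out_images.append(variant)
            out_labels.append(label)  # the label is assumed to be unchanged
    return out_images, out_labels


# A tiny 2x3 "image" with a hypothetical label
images = [[[1, 2, 3], [4, 5, 6]]]
labels = ["cat"]
aug_images, aug_labels = augment(images, labels)
```

This triples the number of examples without collecting any new data. The transformations must preserve the label: a horizontal flip is usually harmless for photographs of animals but would not be appropriate for, say, handwritten digits or text.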

Deep learning models are difficult to interpret. By this, we mean being able to explain how the models came to their decision. This is a problem in many machine learning algorithms, not just deep learning. In machine learning, there is generally an inverse relationship between accuracy and interpretability: the more accurate the model needs to be, the less interpretable it is. For some tasks, for example, online advertising, interpretability is not important and there is little cost to being wrong, so the most powerful algorithm is preferred. In some cases, for example, credit scoring, interpretability may be required by law; people could demand an explanation of why they were denied credit. In other cases, such as medical diagnoses, interpretability may be important so that a doctor can see why the model decided someone had a disease.

If interpretability is important, some methods can be applied to machine learning models to get an understanding of why they predicted the output for an instance. Some of them work by perturbing the data (that is, making slight changes to it) and trying to find which variables are most influential in the model coming to its decision. One such algorithm is called LIME (Local Interpretable Model-Agnostic Explanations) (Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. "Why should I trust you?": Explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016). It has been implemented in many languages, including R, where there is a package called lime. We will use this package in Chapter 6, Tuning and Optimizing Models.
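The perturbation idea can be sketched in a few lines. The example below is deliberately simpler than LIME: it nudges one feature at a time and measures how much the prediction moves, whereas LIME fits a local surrogate model over many random perturbations. It is written in plain Python for illustration only; toy_model and its coefficients are hypothetical stand-ins for a trained model, and perturbation_importance is a name invented for this sketch.

```python
def perturbation_importance(predict, instance, delta=0.1):
    """Score each feature by how much a small change to it moves the prediction."""
    baseline = predict(instance)
    scores = []
    for i in range(len(instance)):
        perturbed = list(instance)   # copy so the original stays untouched
        perturbed[i] += delta        # slightly change one feature only
        scores.append(abs(predict(perturbed) - baseline))
    return scores


def toy_model(features):
    """Hypothetical stand-in for a trained credit-scoring model."""
    income, age, debt = features
    return 0.5 * income + 0.0 * age - 2.0 * debt


# One applicant: [income, age, debt] in arbitrary units
scores = perturbation_importance(toy_model, [3.0, 4.0, 1.0])
most_influential = scores.index(max(scores))
```

For this toy model, the third feature (debt) has the largest score, and the second (age) has no effect at all, which matches the coefficients. With a real deep learning model the coefficients are not visible, so perturbing the inputs in this way is one of the few means of finding out which variables drove a prediction.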

Finally, while deep learning models can run on CPUs, the truth is that any real work requires a workstation with a GPU. This does not mean that you need to go out and purchase one, as you can use cloud computing to train your models. In Chapter 10, Running Deep Learning Models in the Cloud, we will look at using AWS, Azure, and Google Cloud to train deep learning models.