R Deep Learning Cookbook

By: PKS Prakash, Achyutuni Sri Krishna Rao

Overview of this book

Deep learning, a branch of machine learning, has produced remarkable results in applications with huge and complex data. At the same time, the R programming language is very popular among data miners and statisticians. This book helps you work through the problems you face when executing different tasks and understand hacks in deep learning, neural networks, and advanced machine learning techniques. It also takes you through complex deep learning algorithms and the various deep learning packages and libraries in R, starting with the different deep learning packages and moving on to neural networks and their structures. You will also encounter applications in text mining and processing, along with a comparison between CPU and GPU performance. By the end of the book, you will have a logical understanding of deep learning and of the different deep learning packages, so that you can choose the most appropriate solutions for your problems.

Performing preprocessing of textual data and extraction of sentiments


In this section, we will use Jane Austen's bestselling novel Pride and Prejudice, published in 1813, for our textual data preprocessing analysis. In R, we will use the tidytext package by Julia Silge and David Robinson to perform tokenization, stop word removal, sentiment extraction using predefined sentiment lexicons, and term frequency-inverse document frequency (tf-idf) matrix creation, and to understand pairwise correlations among n-grams.

In this section, instead of storing the text as a string, a corpus, or a document-term matrix (DTM), we process it into a tabular format with one token per row.
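To make the one-token-per-row format concrete, here is a minimal sketch on two lines from the novel's opening sentence rather than the full dataset. The names toy_text, toy_tokens, and toy_clean are illustrative assumptions and are not part of the recipe. The sketch assumes that unnest_tokens() is used with its default word tokenizer, which lowercases the text and strips punctuation.

```r
library(dplyr)     # provides %>%, anti_join(), and re-exports tibble()
library(tidytext)  # provides unnest_tokens() and the stop_words table

# Illustrative input (an assumption): one row per line of text
toy_text <- tibble(
  line_num = 1:2,
  text = c("It is a truth universally acknowledged,",
           "that a single man in possession of a good fortune")
)

# Split the text column into one lowercase word per row;
# the other columns (here line_num) are carried along
toy_tokens <- toy_text %>%
  unnest_tokens(output = word, input = text)

# Remove common stop words using the lexicon shipped with tidytext
toy_clean <- toy_tokens %>%
  anti_join(stop_words, by = "word")

print(toy_tokens)
print(toy_clean)
```

Because each token keeps its line_num, later steps such as joining a sentiment lexicon or counting terms per document can be written as ordinary dplyr joins and group-wise summaries.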

How to do it...

Here is how we go about preprocessing:

  1. Load the required packages:
load_packages <- c("janeaustenr", "tidytext", "dplyr", "stringr", "ggplot2", "wordcloud", "reshape2", "igraph", "ggraph", "widyr", "tidyr")
lapply(load_packages, require, character.only = TRUE)
  2. Load the Pride and Prejudice dataset. The line_num attribute is analogous to the line...