Modern R Programming Cookbook

By: Jaynal Abedin

Overview of this book

R is a powerful tool for statistics, graphics, and statistical programming. Tens of thousands of people use it daily to perform serious statistical analyses. It is a free, open source system whose implementation is the collective accomplishment of many intelligent, hard-working people. There are more than 2,000 available add-ons, and R is a serious rival to all commercial statistical packages. The objective of this book is to show how to work with the different programming aspects of R. Emerging R developers and data scientists may have strong programming knowledge but a limited understanding of R syntax and semantics. This book is a platform for developing practical, scalable solutions to real-world problems, backed by a solid understanding of the language. You will work with various versions of R libraries that are essential for scalable data science solutions. You will learn to handle input/output issues that arise when working with relatively large datasets. By the end of this book, you will also know how to work with databases from within R, and what metaprogramming is and how it helps in developing applications.
Table of Contents (10 chapters)

R for Text Processing

Every day, we produce a huge amount of text data, in either structured or unstructured plain format, through various media such as Facebook, Twitter, blog posts, and even scientific research articles. In financial markets, people's sentiment plays a vital role, and you can mine that sentiment by analyzing text data obtained from various sources. In this chapter, you will learn recipes for working with unstructured text data. This chapter covers the following recipes:

  • Extracting unstructured text data from a plain web page
  • Extracting text data from an HTML page
  • Extracting text data from an HTML page using the XML library
  • Extracting text data from PubMed
  • Importing unstructured text data from a plain text file
  • Importing plain text data from a PDF file
  • Pre-processing text data for topic modeling and sentiment analysis
  • Creating a word cloud to explore...