Mastering Social Media Mining with R

Overview of this book

As the number of users on the web has grown, the content they generate has increased substantially, creating a need to gain insights from the untapped gold mine that is social media data. For computational statistics, R has an advantage over other languages in providing readily available data extraction and transformation packages, making it easier to carry out your ETL tasks. In addition, its data visualization packages help users better understand the underlying data distributions, while its range of "standard" statistical packages simplifies analysis of the data. This book will teach you how powerful business cases are solved by applying machine learning techniques to social media data. You will learn about important and recent developments in the field of social media, along with a few advanced topics such as Open Authorization (OAuth). Through practical examples, you will access data from R using the APIs of various social media sites such as Twitter, Facebook, Instagram, GitHub, Foursquare, LinkedIn, Blogger, and other networks. We will provide detailed explanations of how to implement various use cases in R. With this handy guide, you will be ready to embark on your journey as an independent social media analyst.
Table of Contents (13 chapters)
Mastering Social Media Mining with R
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Preface
Index

Preface

In recent times, the popularity of social media has grown rapidly, and it is increasingly used as a channel for mass communication: brands treat it as a medium of promotion, and people largely use it for sharing content. With the increase in the number of users online, the data generated has increased manyfold, creating huge scope for gaining insights from the untapped gold mine that is social media data.

Mastering Social Media Mining with R provides a detailed step-by-step guide to accessing data using R and the APIs of various social media sites, such as Twitter, Facebook, Instagram, GitHub, Foursquare, LinkedIn, Blogger, and a few more networks. Most importantly, this book gives detailed explanations of how to implement various use cases in R; by reading it, you will be ready to embark on your journey as an independent social media analyst. The book is structured so that both newcomers to data mining and seasoned professionals can learn to solve powerful business cases by applying machine learning techniques to social media data.

What this book covers

Chapter 1, Fundamentals of Mining, introduces you to the concepts of social media mining, various social media platforms, the generic processes involved in accessing and processing the data, and the techniques that can be implemented, as well as the importance, challenges, and applications of social media mining.

Chapter 2, Mining Opinions, Exploring Trends, and More with Twitter, focuses on the steps involved in collecting tweets using the Twitter API and solving business cases, such as identifying trending topics, searching and collecting tweets, processing them, performing sentiment analysis, exploring a few business cases based on sentiment analysis, and visualizing sentiments in the form of word clouds.

Chapter 3, Find Friends on Facebook, discusses the usage of the Facebook API and uses the extracted data to measure click-through rate performance, detect spam messages, implement and explore the concepts of social graphs, and recommend pages to like using the Apriori algorithm.

Chapter 4, Finding Popular Photos on Instagram, helps you understand the procedure involved in pulling data using the Instagram API and helps you identify popular personalities and destinations, build different types of clusters, and implement a recommendation engine based on the user-based collaborative filtering approach.

Chapter 5, Let's Build Software with GitHub, teaches you to use the GitHub API from R and helps you understand how to answer business questions by performing graphical and nongraphical exploratory data analysis, which includes some basic charts, trend analysis, heat maps, scatter plots, and much more.

Chapter 6, More Social Media Websites, helps you understand the functioning of APIs of various social media websites and covers the business cases that can be solved.

What you need for this book

To follow along efficiently, you need a computer running Windows, macOS, or Ubuntu.

You need to download R to execute the code in this book. You can download and install R from the CRAN website at http://cran.r-project.org/. All the code was written using RStudio, an integrated development environment for R, which can be downloaded from http://www.rstudio.com/products/rstudio/.
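Once R is installed, a quick check such as the following can confirm that the setup works. This is only a minimal sketch using base R: the helper name missing_packages and the package names passed to it are assumptions for illustration, not a list taken from the book.

```r
# Hypothetical helper (not from the book): return the packages in `pkgs`
# that are not yet installed in the current R library paths.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# Print the running R version to confirm the installation works.
print(R.version.string)

# The package names below are placeholders; substitute the packages
# that each chapter asks you to load.
wanted <- c("stats", "utils")
to_install <- missing_packages(wanted)

if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "))
} else {
  message("All listed packages are available.")
}
```

Any packages reported as missing can then be installed from CRAN with install.packages().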

To access the social media APIs, you will need to create an app and follow certain instructions. These procedures are explained in the respective chapters.

Who this book is for

Mastering Social Media Mining with R is intended for readers who have a basic knowledge of R and its libraries and are aware of different machine learning techniques, as well as for data analysts interested in mining social media data. No prior knowledge of the APIs of social media websites is needed. This book will help you master obtaining the required social media data and turning it into actions that improve business value.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "We can include other contexts through the use of the include directive."

A block of code is set as follows:

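# page is assumed to be a data frame of posts fetched earlier, with an id column
post_id <- head(page$id, n = 100)  # keep the first 100 post IDs
head(post_id, n = 10)              # preview the first 10 of them
post_id <- as.matrix(post_id)      # convert the vector to a one-column matrix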

Any command-line input or output is written as follows:

# Location (Country)

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Clicking the Next button moves you to the next screen."

Note

Exercises for readers to try and notes appear in a box like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail , and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at , and we will do our best to address the problem.