R: Mining spatial, text, web, and social media data

By: Nathan H. Danneman, Richard Heimann, Pradeepta Mishra, Bater Makhabel

Overview of this book

Data mining is the first step toward understanding and making sense of large volumes of data. Properly mined data forms the basis of all data analysis and computing performed on it. This learning path will take you from the very basics of data mining to advanced data mining techniques, and will end with a specialized branch of data mining: social media mining.

You will learn how to manipulate data with R using code snippets and how to mine frequent patterns, associations, and correlations while working with R programs. You will discover how to write code for various prediction models, stream data, and time-series data. You will also be introduced to solutions written in R based on R Hadoop projects.

Once you are comfortable with data mining in R, you will move on to applying your knowledge through end-to-end data mining projects. You will learn how to apply different mining concepts to various statistical and data applications in a wide range of fields. At this stage, you will be able to complete complex data mining cases and handle any issues you might encounter during projects.

After this, you will gain hands-on experience of generating insights from social media data. You will get detailed instructions on how to obtain, process, and analyze a variety of socially generated data, along with the theoretical background needed to interpret your findings accurately. You will be shown R code and examples of data that can serve as a springboard for your own analyses of business, social, or political data.

This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products:

• Learning Data Mining with R by Bater Makhabel
• R Data Mining Blueprints by Pradeepta Mishra
• Social Media Mining with R by Nathan Danneman and Richard Heimann

Table of Contents (6 chapters)

Chapter 4. Potentials and Pitfalls of Social Media Data

Socially generated data, and especially social media data, comes with many complexities. Our ability to navigate these complexities as we describe and draw inferences from this data hinges on thinking carefully about the potentials and pitfalls that arise in it.

Opinion mining made difficult

In this chapter, we highlight some of the potentials and pitfalls inherent in using social media data, and also in the tools we use to process it. These pitfalls are serious enough to warrant devoting an entire chapter to their enumeration and description. Though some of them cannot be ameliorated, we at least hope to give would-be analysts a fair warning so that they can enjoy their findings with an appropriately sized grain of salt.

In The Empty Raincoat, Handy described the first step to measurement as follows:

"The first step is to...