Mastering Clojure Data Analysis

By: Eric Richard Rochester

Examining the results

First, let's examine the precision of the classifiers. Remember that precision measures how well a classifier does at returning only positive reviews. It is the percentage of the reviews that each classifier identified as positive that are actually positive in the test set.
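As a reminder of what this measure computes, here is a minimal sketch of precision in Clojure. The function name `precision`, the `:positive` and `:negative` labels, and the sample vectors are assumptions made up for illustration; they are not the code or the data used in the experiment.

```clojure
;; A hypothetical helper, not the chapter's own implementation.
(defn precision
  "Fraction of the items predicted as :positive that are actually
  :positive. Returns nil when nothing was predicted as :positive."
  [actual predicted]
  (let [pairs         (map vector actual predicted)
        ;; Keep only the reviews the classifier called positive.
        predicted-pos (filter (fn [[_ p]] (= p :positive)) pairs)
        ;; Of those, keep the ones that really are positive.
        true-pos      (filter (fn [[a _]] (= a :positive)) predicted-pos)]
    (when (seq predicted-pos)
      (/ (count true-pos) (count predicted-pos)))))

;; Made-up labels for illustration only.
(def actual    [:positive :negative :positive :negative :positive])
(def predicted [:positive :positive :positive :negative :negative])

(precision actual predicted)
;; => 2/3
```

Here, the classifier labeled three reviews as positive and two of them really were, so the precision is 2/3, or about 67 percent. Note that the positive review it missed does not affect precision at all.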

We need to remember a couple of things while looking at this graph. First, sentiment analysis is difficult compared to other categorization tasks. Most importantly, human raters only agree about 80 percent of the time. So, the bar seen in the preceding figure that almost reaches 65 percent is actually decent, if not great. Still, we can see that the naive Bayesian classifier generally outperforms the maxent one for this dataset, especially when using unigram features. It performed less well for the bigram and trigram features, and slightly less well for the POS-tagged unigrams.

We didn't try tagging the bigrams and trigrams with POS information, but that might have been an interesting experiment...