Learning Haskell Data Analysis

By: James Church

Overview of this book

Haskell is gaining traction in data science because it provides a powerful platform for robust data science practices. This book gives you the skills to handle large amounts of data, even if that data is in a less-than-perfect state. Each chapter builds a small library of code that is used to solve that chapter's problem. The book starts with creating databases out of existing datasets, cleaning that data, and interacting with databases within Haskell in order to produce charts for publications. It then moves toward more theoretical concepts that are fundamental to introductory data analysis, presented in the context of a real-world problem with real-world data. As you progress through the book, you will rely on code from previous chapters to create new solutions quickly. By the end of the book, you will be able to manipulate, find, and analyze large and small sets of data using your own Haskell libraries.

Credits

About the Author

About the Reviewers

www.PacktPub.com

Preface

Tools of the Trade

Welcome to Haskell and data analysis!

Why Haskell?

Getting ready

Nearly essential tools of the trade

Our first Haskell program

Interactive Haskell

Summary

Getting Our Feet Wet

Type is king – the implications of strict types in Haskell

Working with csv files

Converting csv files to the SQLite3 format

Summary

Cleaning Our Datasets

Structured versus unstructured datasets

Creating your own structured data

Counting the number of fields in each record

Filtering data using regular expressions

Searching fields based on a regular expression

Summary

Plotting

Plotting data with EasyPlot

Simplifying access to data in SQLite3

Plotting data from a SQLite3 database

Plotting multiple datasets

Plotting a moving average

Summary

Hypothesis Testing

Data in a coin

Does a home-field advantage really exist?

Summary

Correlation and Regression Analysis

The terminology of correlation and regression

Study – is there a connection between scoring and winning?

Regression analysis

The pitfalls of regression analysis

Summary

Naive Bayes Classification of Twitter Data

An introduction to Naive Bayes classification

Creating a Twitter application

Summary

Building a Recommendation Engine

Analyzing the frequency of words in tweets

Working with multivariate data

Preparing our environment

Performing linear algebra in Haskell

Principal Component Analysis in Haskell

Building a recommendation engine

Summary

Regular Expressions in Haskell

A crash course in regular expressions

Index

An introduction to Naive Bayes classification

Naive Bayes classification, which is built on Bayes' theorem, is a simple yet efficient method of classifying data. In the context of our example, tweets will be analyzed based on their individual words. Three factors go into a Naive Bayes classifier: prior knowledge, likelihood, and evidence. Together, they yield a proportional measurement of an unknown quality of an event based on something knowable.
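
One way to see how the three factors fit together is to write them out. The notation here is an assumption for illustration and not necessarily the book's: C stands for a candidate class (for example, a language) and w_1, ..., w_n for the words of a tweet. P(C) is the prior knowledge, P(w_1, ..., w_n | C) is the likelihood, and P(w_1, ..., w_n) is the evidence. The "naive" step assumes that the words are independent of one another given the class, and because the evidence is the same for every class, it can be dropped when classes are only being compared, which leaves a proportional measurement.

```latex
\[
P(C \mid w_1, \dots, w_n)
  = \frac{P(C)\, P(w_1, \dots, w_n \mid C)}{P(w_1, \dots, w_n)}
  \;\propto\; P(C) \prod_{i=1}^{n} P(w_i \mid C)
\]
```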

Prior knowledge

Prior knowledge allows us to contemplate our problem of discovering the language represented by a sentence without thinking about the features of the sentence. Think about answering the question blindly; that is, a sentence is spoken and you aren't allowed to see or hear it. What language was used? Of all of the tens of thousands of languages used across time, how could you ever guess this one? You are forced to play the odds. The top five most widely spoken languages are Mandarin, Spanish, English, Hindi, and Arabic. By selecting one of these languages...
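
The idea of playing the odds can be sketched in a few lines of Haskell. This is a minimal sketch and not the book's code: the names `toyCounts`, `priors`, and `blindGuess` are hypothetical, and the counts are made-up numbers chosen only to make the arithmetic easy to follow, not real speaker statistics. The prior for each language is its share of everything observed so far, and a blind guess picks the language with the largest share.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

type Language = String

-- Hypothetical, made-up counts of sentences seen per language.
-- These are illustrative numbers only, not real statistics.
toyCounts :: [(Language, Int)]
toyCounts = [("English", 50), ("Spanish", 30), ("Hindi", 20)]

-- The prior P(language): each language's share of all observations.
priors :: [(Language, Int)] -> [(Language, Double)]
priors counts = [ (lang, fromIntegral n / total) | (lang, n) <- counts ]
  where
    total = fromIntegral (sum (map snd counts))

-- A blind guess ignores the sentence entirely and returns the language
-- with the largest count (and therefore the largest prior).
-- Returns Nothing when there are no observations to go on.
blindGuess :: [(Language, Int)] -> Maybe Language
blindGuess []     = Nothing
blindGuess counts = Just (fst (maximumBy (comparing snd) counts))
```

Loaded into GHCi, `blindGuess toyCounts` evaluates to `Just "English"`, since English has the largest share of the toy counts. A list of pairs is used instead of a map to keep the sketch dependent on the base library only.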
