Getting Started with Haskell Data Analysis

By: James Church

Overview of this book

Every business and organization that collects data can tap into that data to gain insights into how to improve. Haskell is a purely functional and lazy programming language, well suited to handling large data analysis problems. This book takes you through the more difficult problems of data analysis in a hands-on manner and helps you get up to speed with the basics of data analysis and its approaches in the Haskell language. You'll learn about statistical computing, file formats (CSV and SQLite3), descriptive statistics, and charts, and then progress to more advanced concepts such as understanding the importance of the normal distribution. While mathematics is a big part of data analysis, we've tried to keep this book simple and approachable so that you can apply what you learn to the real world. By the end of this book, you will have a thorough understanding of data analysis and the different ways of analyzing data, along with a working command of the Haskell tools and techniques for effective data analysis.
Table of Contents (8 chapters)

What this book covers

Chapter 1, Descriptive Statistics, teaches you about the Text.CSV library. It also covers some of the descriptive statistics functions, such as mean, median, and mode.
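To give a flavor of the descriptive statistics named above, here is a minimal sketch of mean, median, and mode over plain Haskell lists. It is not taken from the book: the function names `mean`, `median`, and `modes` and the choice to return `Nothing` for empty input are assumptions made for illustration, and it works on in-memory lists rather than on data read with Text.CSV.

```haskell
import Data.List (group, sort)

-- | Arithmetic mean; Nothing for an empty list (illustrative choice).
mean :: [Double] -> Maybe Double
mean [] = Nothing
mean xs = Just (sum xs / fromIntegral (length xs))

-- | Median: the middle value of the sorted list, or the average of
--   the two middle values when the length is even.
median :: [Double] -> Maybe Double
median [] = Nothing
median xs
  | odd n     = Just (sorted !! mid)
  | otherwise = Just ((sorted !! (mid - 1) + sorted !! mid) / 2)
  where
    sorted = sort xs
    n      = length xs
    mid    = n `div` 2

-- | All most frequent values, in ascending order.
--   A dataset can have more than one mode, so a list is returned.
modes :: Ord a => [a] -> [a]
modes [] = []
modes xs = [v | (c, v) <- counts, c == best]
  where
    counts = [(length g, v) | g@(v:_) <- group (sort xs)]
    best   = maximum (map fst counts)
```

Loaded in GHCi, `median [4, 1, 3, 2]` averages the two middle values of the sorted list, and `modes` returns every value that ties for the highest count.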

Chapter 2, SQLite3, focuses on how to get the data from CSV into SQLite3. You will understand the data types of SQLite3 and how to fetch data using SQL statements. It also covers how to create your own custom module of descriptive statistics.

Chapter 3, Regular Expressions, introduces you to regular expression syntax, such as the dot and the pipe. It also covers character classes at length. Finally, it teaches you how to use regular expressions within a CSV file and an SQLite3 database.

Chapter 4, Visualizations, starts with the installation of gnuplot and the EasyPlot Haskell library. It covers how to use a moving average function to analyze stock data. Finally, it teaches you how to make publication-ready plots by adding legends and saving those plots to files.
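As an illustration of the moving average idea mentioned above, the following is a minimal sketch of a simple (unweighted) moving average over a list of values such as closing prices. The name `movingAverage` and the decision to emit only full windows are assumptions for this example and may differ from the function developed in the book; plotting with EasyPlot is not shown.

```haskell
import Data.List (tails)

-- | Simple moving average with window size n.
--   Only full windows are averaged, so a list of length m yields
--   m - n + 1 results (or none when the list is shorter than n).
movingAverage :: Int -> [Double] -> [Double]
movingAverage n xs
  | n <= 0    = []
  | otherwise = [sum w / fromIntegral n | w <- windows]
  where
    -- Every suffix truncated to n elements; stop at the first short one.
    windows = takeWhile ((== n) . length) (map (take n) (tails xs))
```

A larger window smooths the series more strongly but produces fewer output points, which matters when the result is plotted against the original dates.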

Chapter 5, Kernel Density Estimation, introduces you to the central limit theorem and the normal distribution and helps you to understand the difference between them. Later, it talks about the kernel density estimator and how to apply it to a dataset.
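For readers unfamiliar with the term, a kernel density estimator approximates the density at a point x by averaging a kernel centered on each sample: f(x) = (1 / (n * h)) * sum of K((x - x_i) / h), where h is the bandwidth. The sketch below assumes a Gaussian kernel; the names `gaussianKernel` and `kde`, and returning 0 for an empty sample or a non-positive bandwidth, are illustrative assumptions rather than the book's implementation.

```haskell
-- | Standard normal density, used here as the kernel K.
gaussianKernel :: Double -> Double
gaussianKernel u = exp (negate (u * u) / 2) / sqrt (2 * pi)

-- | Kernel density estimate at point x for bandwidth h:
--   (1 / (n * h)) * sum over samples of K((x - xi) / h).
--   Returns 0 for an empty sample or a non-positive bandwidth
--   (an illustrative choice; a Maybe result would be stricter).
kde :: Double -> [Double] -> Double -> Double
kde h samples x
  | h <= 0 || null samples = 0
  | otherwise =
      sum [gaussianKernel ((x - xi) / h) | xi <- samples]
        / (fromIntegral (length samples) * h)
```

Evaluating `kde h samples` over a grid of x values gives a smooth curve; a smaller bandwidth follows the samples more closely, while a larger one smooths them out.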

Chapter 6, Course Review, works on the MovieLens data by applying what you have learned in the first five chapters. In addition to what was covered in the earlier chapters, you will also explore a few more interesting techniques for analyzing the data.