Getting Started with Haskell Data Analysis [Video]

By: James Church

Overview of this book

<p>Data analysis is part computer science and part statistics. An important part of data analysis is validating your assumptions against real-world data to see whether there is a pattern or a particular user behavior that you can confirm. This video course will help you get up to speed with the basics of data analysis in the Haskell language. You'll learn about statistical computing, file formats (CSV and SQLite3), descriptive statistics, and charts, and then move on to more advanced concepts such as the importance of the normal distribution. While mathematics is a big part of data analysis, we’ve tried to keep this course simple and approachable so that you can apply what you learn to the real world.</p> <h1>Style and Approach:</h1> <p>The style of this course is driven by problem solving using real-world data. In some sections, we begin by seeking out datasets that are readily accessible on the Internet, downloading them, and then performing some analysis. Each video builds a little on the one before it, at a conversational pace. We use the Jupyter notebook system, which allows us to easily create and share notebooks of our analysis work. You can download the notebooks that we create alongside each of our videos.</p>
Table of Contents (6 chapters)
Chapter 2
SQLite3
Section 4
SQLite3 and Descriptive Statistics
We've learned how to pull data from an SQLite3 database. Now we'd like to take that data and use it in the module that we created in Section 1.
- We have lots of earthquake data, but we only want the earthquakes inside the area of Oklahoma. This is where SQL shines. We're going to issue a SELECT query to grab everything.
- Next, we will discuss the WHERE clause, which allows us to keep only the data that meets our criteria.
- Finally, we'll discuss the ORDER BY and LIMIT clauses in order to sort the data and pull just the values we want.
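As a rough sketch of how these three clauses fit together, the snippet below filters earthquakes to a bounding box, sorts them by magnitude, and keeps only the top few. Several things here are assumptions rather than details from the course: the use of the HDBC and HDBC-sqlite3 packages, the database file name `earthquakes.sqlite3`, the table `earthquakes` with columns `latitude`, `longitude`, and `mag`, and the approximate coordinates used for Oklahoma. The `mean` helper is only a stand-in for whatever the Section 1 module provides.

```haskell
import Database.HDBC (disconnect, fromSql, quickQuery', toSql)
import Database.HDBC.Sqlite3 (connectSqlite3)

-- Hypothetical schema: a table "earthquakes" with numeric columns
-- latitude, longitude, and mag. Adjust the names to match your data.
oklahomaQuery :: String
oklahomaQuery = unwords
  [ "SELECT mag FROM earthquakes"
  , "WHERE latitude BETWEEN ? AND ?"     -- WHERE keeps only matching rows
  , "AND longitude BETWEEN ? AND ?"
  , "AND mag IS NOT NULL"
  , "ORDER BY mag DESC"                  -- strongest earthquakes first
  , "LIMIT ?"                            -- keep only the first n rows
  ]

-- Approximate bounding box for Oklahoma (an assumption, not exact borders):
-- minimum latitude, maximum latitude, minimum longitude, maximum longitude.
oklahomaBox :: [Double]
oklahomaBox = [33.6, 37.0, -103.0, -94.4]

-- Fetch the n largest magnitudes recorded inside the bounding box.
strongestMagnitudes :: FilePath -> Int -> IO [Double]
strongestMagnitudes dbPath n = do
  conn <- connectSqlite3 dbPath
  -- quickQuery' is strict, so it is safe to disconnect right afterwards.
  rows <- quickQuery' conn oklahomaQuery (map toSql oklahomaBox ++ [toSql n])
  disconnect conn
  return [fromSql m | [m] <- rows]

-- Stand-in for a descriptive statistic from the Section 1 module.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

main :: IO ()
main = do
  mags <- strongestMagnitudes "earthquakes.sqlite3" 10
  print mags
  if null mags
    then putStrLn "No earthquakes found in the bounding box."
    else print (mean mags)
```

The `?` placeholders let the database driver bind the values, which avoids building the SQL string by hand. Leaving the filtering, sorting, and limiting to SQL means only the rows of interest ever reach the Haskell side.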