Functional Python Programming - Second Edition

By: Steven F. Lott

Overview of this book

If you’re a Python developer who wants to bring the power of functional programming (FP) into your own programs, this book is essential for you, even if you know next to nothing about the paradigm. Starting with a general overview of functional concepts, you’ll explore common functional features such as first-class and higher-order functions, pure functions, and more. You’ll see how these are accomplished in Python 3.6 to give you the core foundations you’ll build upon. After that, you’ll discover common functional optimizations for Python to help your apps reach even higher speeds. You’ll learn FP concepts such as lazy evaluation using Python’s generator functions and expressions. Moving forward, you’ll learn to design and implement decorators to create composite functions. You’ll also explore data preparation techniques and data exploration in depth, and see how the Python standard library fits the functional programming model. Finally, to top off your journey into the world of functional Python, you’ll look at the PyMonad project and some larger examples to put everything into perspective.

Cleaning raw data with generator functions

One of the tasks that arise in exploratory data analysis is cleaning up raw source data. This is often done as a composite operation applying several scalar functions to each piece of input data to create a usable dataset.
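
As a minimal sketch of this idea (not the book's own code), a generator function can apply a sequence of scalar functions to each raw item and yield the cleaned result lazily. The names clean and raw_values are hypothetical and chosen only for illustration:

```python
from typing import Any, Callable, Iterable, Iterator


def clean(raw: Iterable[Any], *steps: Callable[[Any], Any]) -> Iterator[Any]:
    """Lazily apply each scalar function in `steps`, in order, to every item.

    `clean` is a hypothetical helper name used for illustration only.
    """
    for item in raw:
        for step in steps:
            item = step(item)
        yield item


# Hypothetical raw input: numeric strings with stray whitespace.
raw_values = [" 10.0 ", "8.04\n", "\t13.0"]

# Composite operation: strip whitespace, then convert to float.
cleaned = list(clean(raw_values, str.strip, float))
print(cleaned)
```

Because clean is a generator function, no item is processed until the result is consumed, so the same pipeline could be applied to a large file one line at a time.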

Let's look at a simplified set of data that is commonly used to show techniques in exploratory data analysis. It's called Anscombe's quartet, and it comes from the article Graphs in Statistical Analysis by F. J. Anscombe, which appeared in The American Statistician in 1973. The following are the first few rows of a downloaded file with this dataset:

Anscombe's quartet
x     y     x     y     x     y      x    y
10.0  8.04  10.0  9.14  10.0  7.46   8.0  6.58
8.0   6.95  8.0   8.14  8.0   6.77   8.0  5.76
13.0  7.58  13.0  8.74  13.0  12.74  8.0  7.71

Sadly, we can't trivially process this with the csv module. We have to do a little bit of parsing to extract the useful information from this file. Since the data is properly tab...