Book Image

Functional Python Programming

By : Steven F. Lott, Steven F. Lott
Book Image

Functional Python Programming

By: Steven F. Lott, Steven F. Lott

Overview of this book

Table of Contents (23 chapters)
Functional Python Programming
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Using filter() to identify outliers


In the previous chapter, we defined some useful statistical functions to compute mean and standard deviation and normalize a value. We can use these functions to locate outliers in our trip data. What we can do is apply the mean() and stdev() functions to the distance value in each leg of a trip to get the population mean and standard deviation.

We can then use the z() function to compute a normalized value for each leg. If the normalized value is more than 3, the data is extremely far from the mean. If we reject this outliers, we have a more uniform set of data that's less likely to harbor reporting or measurement errors.

The following is how we can tackle this:

from stats import mean, stdev, z
dist_data = list(map(dist, trip))
μ_d = mean(dist_data)
σ_d = stdev(dist_data)
outlier = lambda leg: z(dist(leg),μ_d,σ_d) > 3
print("Outliers", list(filter(outlier, trip)))

We've mapped the distance function to each leg in the trip collection. As we'll do several...