A certain percentage of any dataset will consist of outliers: points or responses that fall beyond the reasonable ranges established for the data, based upon its context. Responding to outliers once they are found becomes increasingly challenging within big data initiatives.
In this chapter, we focus on dealing with outliers as they relate to big data visualization, introduce the Python language, and offer working examples that demonstrate how to deal effectively with outliers and other anomalies in big data using Python.
This chapter is organized into the following main sections:
About Python
Python and big data
Outliers
Some basic examples
More examples