Mastering Python for Data Science

By: Samir Madhavan

Chapter 3. Finding a Needle in a Haystack

Analyzing a dataset to find patterns is as much an art as it is a science. A dataset can have many metrics associated with it, and the goal is to find the needle in this haystack. For us, a needle is an insight within the data that we weren't aware of earlier. For instance, an insight could be that people who buy a particular brand of milk also buy a particular brand of cereal. The retail store can then place these products near each other.
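
To make the milk-and-cereal example concrete, here is a minimal sketch that counts how often pairs of products appear in the same basket. The baskets, the product names, and the `pair_counts` helper are all hypothetical and made up for illustration; simple pair counting is only one possible way to surface this kind of insight.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: each set holds the products bought in one store visit.
baskets = [
    {"milk_brand_a", "cereal_brand_x", "bread"},
    {"milk_brand_a", "cereal_brand_x"},
    {"milk_brand_b", "bread"},
    {"milk_brand_a", "cereal_brand_x", "eggs"},
    {"cereal_brand_y", "eggs"},
]


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives every pair a canonical order, so (a, b) and (b, a) match.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

In this toy data, the pair bought together most often is the one a store might consider placing side by side. Real transaction data would also call for checking how often each product sells on its own before reading much into a raw count.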

Whenever you analyze a dataset, you should have a detailed understanding of both the data and the domain it comes from. A simple dataset that is easy to understand can be analyzed directly, but if the dataset contains sensor data from a turbine, then understanding how turbines work and what is critical to their functioning will add richness to your analysis.

The understanding of a domain is like...