Learning Data Mining with Python


Big data


What makes big data different? Most big-data proponents talk about the four Vs of big data:

  1. Volume: The amount of data that we generate and store is growing at an increasing rate, and predictions generally suggest only further increases. Today's multi-gigabyte hard drives will turn into exabyte hard drives in a few years, and network traffic will increase as well. The signal-to-noise ratio can be quite low, with important data being lost in a mountain of unimportant data.

  2. Velocity: Related to volume, the velocity of data is increasing too. Modern cars have hundreds of sensors that stream data into their computers, and the information from these sensors needs to be analyzed within fractions of a second to operate the car. It isn't just a case of finding answers in the volume of data; those answers often need to come quickly.

  3. Variety: Nice datasets with clearly defined columns are only a small part of the data that we have these days...