Book Image

Principles of Data Science - Second Edition

By : Sinan Ozdemir, Sunil Kakade, Marco Tibaldeschi
Book Image

Principles of Data Science - Second Edition

By: Sinan Ozdemir, Sunil Kakade, Marco Tibaldeschi

Overview of this book

Need to turn programming skills into effective data science skills? This book helps you connect mathematics, programming, and business analysis. You’ll feel confident asking—and answering—complex, sophisticated questions of your data, making abstract and raw statistics into actionable ideas. Going through the data science pipeline, you'll clean and prepare data and learn effective data mining strategies and techniques to gain a comprehensive view of how the data science puzzle fits together. You’ll learn fundamentals of computational mathematics and statistics and pseudo-code used by data scientists and analysts. You’ll learn machine learning, discovering statistical models that help control and navigate even the densest datasets, and learn powerful visualizations that communicate what your data means.
Table of Contents (17 chapters)
16
Index

Overview of the five steps

The five essential steps to perform data science are as follows:

  1. Asking an interesting question
  2. Obtaining the data
  3. Exploring the data
  4. Modeling the data
  5. Communicating and visualizing the results

First, let's look at the five steps with reference to the big picture.

Asking an interesting question

This is probably my favorite step. As an entrepreneur, I ask myself (and others) interesting questions every day. I would treat this step as you would treat a brainstorming session. Start writing down questions, regardless of whether or not you think the data to answer these questions even exists. The reason for this is twofold. First off, you don't want to start biasing yourself even before searching for data. Secondly, obtaining data might involve searching in both public and private locations and, therefore, might not be very straightforward. You might ask a question and immediately tell yourself "Oh, but I bet there's no data out there that can help me&quot...