Principles of Data Science - Second Edition

By: Sinan Ozdemir, Sunil Kakade, Marco Tibaldeschi

Overview of this book

Need to turn programming skills into effective data science skills? This book helps you connect mathematics, programming, and business analysis. You’ll feel confident asking, and answering, complex questions of your data, turning abstract and raw statistics into actionable ideas. Going through the data science pipeline, you’ll clean and prepare data and learn effective data mining strategies and techniques to gain a comprehensive view of how the data science puzzle fits together. You’ll learn the fundamentals of computational mathematics and statistics, along with the pseudo-code used by data scientists and analysts. You’ll also learn machine learning, discovering statistical models that help you control and navigate even the densest datasets, and build visualizations that communicate what your data means.

The road thus far

So far in this chapter, we have looked at the differences between structured and unstructured data, as well as between qualitative and quantitative characteristics. These two simple distinctions can have drastic effects on the analysis that is performed. Allow me to summarize before moving on to the second half of the chapter.

Data as a whole can either be structured or unstructured, meaning that the data can either take on an organized row/column structure with distinct features that describe each row of the dataset, or exist in a free-form state that usually must be pre-processed into a form that is easily digestible.
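To make the distinction concrete, here is a minimal sketch of pre-processing free-form text into a row/column structure. The sample lines, the regular expression, and the feature names (`date`, `customer`, `quantity`, `item`) are all invented for illustration; real unstructured data is rarely this regular.

```python
import re

# Hypothetical free-form (unstructured) lines, invented for illustration.
raw_lines = [
    "2023-01-05 alice bought 3 apples",
    "2023-01-06 bob bought 12 oranges",
]

# Assumed layout: a date, a customer, the word "bought", a quantity, an item.
pattern = re.compile(r"(\S+) (\w+) bought (\d+) (\w+)")


def to_row(line):
    """Parse one free-form line into a dict of named features (one row)."""
    match = pattern.fullmatch(line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    date, customer, quantity, item = match.groups()
    return {
        "date": date,
        "customer": customer,
        "quantity": int(quantity),  # stored as a number, not as text
        "item": item,
    }


# Structured form: every row has the same distinct features (columns).
rows = [to_row(line) for line in raw_lines]

for row in rows:
    print(row)
```

Once the text is in this shape, each key behaves like a column that can be inspected on its own, which is what the quantitative/qualitative question relies on.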

If data is structured, we can look at each column (feature) of the dataset as being either quantitative or qualitative. Basically, can the column be described using mathematics and numbers, or not? The next part of this chapter will break down data into four very specific and detailed levels. At each level, we will apply more complicated rules of mathematics...
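As a rough first pass at the quantitative/qualitative question, one could check whether every value in a column is numeric. The sketch below assumes a small hypothetical dataset and a helper named `column_kind`, neither of which comes from the book, and the type check is only a heuristic.

```python
from numbers import Number

# Hypothetical structured dataset: each dict is a row, each key a column.
rows = [
    {"name": "Cafe A", "avg_revenue": 1200.50, "region": "North"},
    {"name": "Cafe B", "avg_revenue": 950.00, "region": "South"},
]


def column_kind(rows, column):
    """Label a column 'quantitative' if all of its values are numbers."""
    values = [row[column] for row in rows]
    # bool is a subclass of Number in Python, so exclude it explicitly.
    is_numeric = all(
        isinstance(value, Number) and not isinstance(value, bool)
        for value in values
    )
    return "quantitative" if is_numeric else "qualitative"


kinds = {column: column_kind(rows, column) for column in rows[0]}

for column, kind in kinds.items():
    print(f"{column}: {kind}")
```

A type check like this can mislead: a column of numeric codes, such as zip codes, would be labeled quantitative even though adding or averaging the values is meaningless. The better test is whether the column can sensibly be described with mathematics, so the result still needs a human judgment.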