Principles of Strategic Data Science

By: Peter Prevos

Overview of this book

Mathematics and computer science form an integral part of data science, and understanding them is crucial for efficiently managing data. This book is designed to take you through the entire data science pipeline and help you join the dots between mathematics, programming, and business analysis. You’ll start by learning what data science is and how organizations can use it to revolutionize the way they use their data. The book then covers the criteria for the soundness of data products and demonstrates how to effectively visualize information. As you progress, you’ll discover the strategic aspects of data science by exploring the five-phase framework that enables you to enhance the value you extract from data. Toward the concluding chapters, you’ll understand the role of a data science manager in helping an organization take the data-driven approach. By the end of this book, you’ll have a good understanding of data science and how it can enable you to extract value from your data.

Collecting Data

The ultimate purpose of collecting data is to enable decisions that improve reality or reduce the risk of future adverse impacts. Chapter 2, Good Data Science, discussed the importance of understanding the relationship between data and the reality it describes. This relationship is the essence of domain knowledge. Professionals are trained and experienced in measuring the reality they manage within their respective domains. Chapter 2, Good Data Science, also described some of the differences in measurement between the two domains.

The literature often distinguishes between raw data and processed data as the main ingredients of analysis. The idea that data can be raw and natural is misleading. There is no such thing as raw data, because every time we collect information from a physical or social process, we must decide how to collect it. These decisions are always informed by assumptions about what this reality looks like before we see the data. We cannot measure anything...