Principles of Data Science - Second Edition

By: Sinan Ozdemir, Sunil Kakade, Marco Tibaldeschi

Overview of this book

Need to turn programming skills into effective data science skills? This book helps you connect mathematics, programming, and business analysis. You’ll feel confident asking, and answering, complex questions of your data, and turning abstract, raw statistics into actionable ideas. Working through the data science pipeline, you’ll clean and prepare data and learn effective data mining strategies and techniques, gaining a comprehensive view of how the pieces of the data science puzzle fit together. You’ll learn the fundamentals of computational mathematics and statistics, along with the pseudo-code used by data scientists and analysts. You’ll also learn machine learning, discovering statistical models that help you control and navigate even the densest datasets, and build visualizations that communicate what your data means.

Chapter 12. Beyond the Essentials

In this chapter, we will discuss some of the more complicated parts of data science, the ones that can put some people off. Data science is not all fun and machine learning: sometimes we have to consider theoretical and mathematical paradigms and evaluate our own procedures.

This chapter explores these procedures step by step so that we understand each topic thoroughly. We will discuss topics such as the following:

  • Cross-validation
  • The bias/variance trade-off
  • Overfitting and underfitting
  • Ensembling techniques
  • Random forests
  • Neural networks

These are only some of the topics to be covered. At no point do I want you to be confused, so I will attempt to explain each procedure and algorithm with the utmost care, using many examples and visuals.