Python Data Science Essentials [Video]

By: Alberto Boschetti, Luca Massaron

Overview of this book

The Python Data Science Essentials video series takes you through all you need to know to succeed in data science using Python. Get insights into the core Python data science tools, including the latest versions of Jupyter Notebook, NumPy, pandas, and scikit-learn. In this course, you will build your essential Python 3.6 data science toolbox, using a single-source approach that also allows you to work with Python 2.7. Get to grips fast with data munging and preprocessing, and prepare for machine learning and visualization techniques.

The code bundle for the video course is available at https://github.com/PacktPublishing/Python-Data-Science-Essentials

Style and Approach

The course is structured as a data science project. Throughout, you will benefit from clear code, simplified examples that help you understand the underlying mechanics, and real-world datasets.
Table of Contents (4 chapters)
Chapter 3
The Data Pipeline
Section 7
Cross-Validation
Unfortunately, holding out separate validation and test samples makes the performance estimate uncertain, because it depends on a single split, and it reduces the number of examples left for training. Cross-validation is one solution, and scikit-learn offers a complete module for cross-validation and performance evaluation. Let's take a step further and see how to use these in our code.

- Use the three possible hypotheses for the digits dataset
- Use cross-validation iterators
- Perform sampling and bootstrapping
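
The three steps listed above can be sketched roughly as follows. This is only an illustration, not the course's own code: the three SVC configurations standing in for the "three possible hypotheses", the choice of 5 folds, and the variable names are all assumptions, since the page does not give them. The scikit-learn calls used (`load_digits`, `KFold`, `cross_val_score`, `resample`) belong to the library's public API.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVC
from sklearn.utils import resample

# Load the digits dataset (1,797 8x8 images flattened to 64 features).
digits = load_digits()
X, y = digits.data, digits.target

# Three candidate hypotheses. ASSUMPTION: these kernels and parameters are
# illustrative only; the course may use different models.
hypotheses = {
    "linear": SVC(kernel="linear", C=1.0),
    "poly2": SVC(kernel="poly", degree=2, C=1.0),
    "rbf": SVC(kernel="rbf", gamma=0.001, C=1.0),
}

# A cross-validation iterator: 5 shuffled folds with a fixed seed.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Score every hypothesis on the same folds so the results are comparable.
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    for name, model in hypotheses.items()
}
for name, fold_scores in scores.items():
    print("{}: mean={:.3f} std={:.3f}".format(
        name, fold_scores.mean(), fold_scores.std()))

# The iterator can also be consumed directly to get index arrays per fold.
fold_sizes = [(len(train_idx), len(test_idx))
              for train_idx, test_idx in cv.split(X)]

# Bootstrapping: draw a sample of the same size with replacement.
X_boot, y_boot = resample(X, y, replace=True, n_samples=len(X),
                          random_state=0)
```

Because every model is scored on the same folds, the mean and standard deviation of the fold scores give a like-for-like comparison, and no fixed slice of the data has to be permanently withheld from training.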