Book Image

The Python Workshop - Second Edition

By : Corey Wade, Mario Corchero Jiménez, Andrew Bird, Dr. Lau Cher Han, Graham Lee
4.7 (3)
Book Image

The Python Workshop - Second Edition

4.7 (3)
By: Corey Wade, Mario Corchero Jiménez, Andrew Bird, Dr. Lau Cher Han, Graham Lee

Overview of this book

Python is among the most popular programming languages in the world. It’s ideal for beginners because it’s easy to read and write, and for developers, because it’s widely available with a strong support community, extensive documentation, and phenomenal libraries – both built-in and user-contributed. This project-based course has been designed by a team of expert authors to get you up and running with Python. You’ll work though engaging projects that’ll enable you to leverage your newfound Python skills efficiently in technical jobs, personal projects, and job interviews. The book will help you gain an edge in data science, web development, and software development, preparing you to tackle real-world challenges in Python and pursue advanced topics on your own. Throughout the chapters, each component has been explicitly designed to engage and stimulate different parts of the brain so that you can retain and apply what you learn in the practical context with maximum impact. By completing the course from start to finish, you’ll walk away feeling capable of tackling any real-world Python development problem.
Table of Contents (16 chapters)
13
Chapter 13: The Evolution of Python – Discovering New Python Features

Testing data with cross-validation

In cross-validation, also known as CV, the training data is split into five folds (any number will do, but five is standard). The ML algorithm is fit on one fold at a time and tested on the remaining data. The result is five different training and test sets that are all representative of the same data. The mean of the scores is usually taken as the accuracy of the model.

Note

For cross-validation, 5 folds is only one suggestion. Any natural number may be used, with 3 and 10 also being fairly common.

Cross-validation is a core tool for ML. Mean test scores on different folds are more reliable than one mean test score on the entire set, which we performed in the first exercise. When examining one test score, there is no way of knowing whether it is low or high. Five test scores give a better picture of the true accuracy of the model.

Cross-validation can be implemented in a variety of ways. A standard approach is to use cross_val_score,...