The Python Workshop

By: Olivier Pons, Andrew Bird, Dr. Lau Cher Han, Mario Corchero Jiménez, Graham Lee, Corey Wade

Overview of this book

Have you always wanted to learn Python, but never quite known how to start? More applications than we realize are being developed using Python because it is easy to learn, read, and write. You can now start learning the language quickly and effectively with the help of this interactive tutorial. The Python Workshop starts by showing you how to correctly apply Python syntax to write simple programs, and how to use appropriate Python structures to store and retrieve data. You'll see how to handle files, deal with errors, and use classes and methods to write concise, reusable, and efficient code. As you advance, you'll understand how to use the standard library, debug code to troubleshoot problems, and write unit tests to validate application behavior. You'll gain insights into using the pandas and NumPy libraries for analyzing data, and the graphical libraries of Matplotlib and Seaborn to create impactful data visualizations. By focusing on entry-level data science, you'll build your practical Python skills in a way that mirrors real-world development. Finally, you'll discover the key steps in building and using simple machine learning algorithms. By the end of this Python book, you'll have the knowledge, skills and confidence to creatively tackle your own ambitious projects with Python.

Regularization: Ridge and Lasso

Regularization is an important concept in machine learning; it is used to counteract overfitting. In the world of big data, it's easy to overfit a model to the training set. When this happens, the model will often perform badly on the test set, as indicated by mean_squared_error or some other error metric.
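As a minimal sketch of the idea, the snippet below fits scikit-learn's LinearRegression, Ridge, and Lasso on the same training set and records the training and test mean_squared_error for each. The synthetic dataset, the split size, and the alpha values are assumptions chosen purely for illustration; they are not taken from the book.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): 60 rows and 40 features,
# of which only 5 actually influence the target.
X, y = make_regression(n_samples=60, n_features=40, n_informative=5,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

models = {
    "linear": LinearRegression(),   # no penalty
    "ridge": Ridge(alpha=10.0),     # L2 penalty; alpha picked arbitrarily
    "lasso": Lasso(alpha=1.0),      # L1 penalty; alpha picked arbitrarily
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = {
        "train_mse": mean_squared_error(y_train, model.predict(X_train)),
        "test_mse": mean_squared_error(y_test, model.predict(X_test)),
        # Size of the coefficient vector: regularization shrinks it.
        "l2_norm": float(np.linalg.norm(model.coef_)),
        # Lasso can set coefficients exactly to zero.
        "zero_coefs": int(np.sum(model.coef_ == 0)),
    }
    print(name, results[name])
```

The unpenalized model always has the lowest training error, because that is exactly what it minimizes. A large gap between its training and test error is the usual sign of overfitting. Ridge and Lasso trade a little training accuracy for smaller coefficients, which often, though not always, reduces the test error.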

You may wonder why a test set is kept aside at all. Wouldn't the most accurate machine learning model come from fitting the algorithm on all the data?

The answer, generally accepted by the machine learning community after years of research and experimentation, is probably not.
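One way to see why is to compare a model's error on the data it was trained on with its error on held-out data. The sketch below uses only the standard library and a deliberately extreme, hypothetical model that memorizes its training points (a 1-nearest-neighbour lookup). The data-generating rule and the 150/50 split are assumptions made for illustration.

```python
import random

rng = random.Random(0)


def make_point():
    # Hypothetical data: y = 3x plus Gaussian noise.
    x = rng.uniform(0.0, 10.0)
    return x, 3.0 * x + rng.gauss(0.0, 1.0)


# The points are generated independently, so a simple slice serves as a split.
data = [make_point() for _ in range(200)]
train, test = data[:150], data[150:]


def memorizer(x):
    """Predict the y of the closest training x (1-nearest-neighbour)."""
    _, nearest_y = min(train, key=lambda point: abs(point[0] - x))
    return nearest_y


def mse(points, predict):
    """Mean squared error of predict over (x, y) pairs."""
    return sum((y - predict(x)) ** 2 for x, y in points) / len(points)


train_mse = mse(train, memorizer)
test_mse = mse(test, memorizer)
print("train MSE:", train_mse)
print("test MSE:", test_mse)
```

Because the memorizer reproduces every training target exactly, its training error says nothing about how it will behave on new inputs. Only the held-out error gives an honest estimate.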

There are two main problems with fitting a machine learning model on all the data:

  • There is no way to test the model on unseen data. Machine learning models are powerful when they make good predictions on new data. Models are trained on known results, but they perform in the real world on data that has never been seen before. It's not vital to see...