Beginning Data Science with Python and Jupyter

By: Chris DallaVilla

Overview of this book

Getting started with data science doesn’t have to be an uphill battle. This step-by-step video course is ideal for beginners who know a little Python and want a fast-paced introduction. Get to grips with the skills you need for entry-level data science in this hands-on Python and Jupyter course. You’ll learn about some of the most commonly used libraries in the Anaconda distribution, then explore machine learning models with real datasets to gain the skills and exposure you need for the real world.

We’ll start with the basics of Jupyter and its standard features, and you’ll work through an example of a data analytics report. The next step is to implement multiple classification algorithms. We’ll then show you how easy it can be to scrape and gather your own data from the open web, so that you can apply your new skills in an actionable context. Finish up by learning to visualize these data interactively. The code bundle for this course is available at https://github.com/TrainingByPackt/Beginning-Data-Science-with-Python-and-Jupyter-eLearning
Table of Contents (3 chapters)
Chapter 2
Data Cleaning and Advanced Machine Learning
Section 4
Training Classification Models
As we saw in the previous lesson, libraries such as scikit-learn and platforms such as Jupyter make it possible to train predictive models in just a few lines of code. This works because the libraries abstract away the difficult computations involved in optimizing model parameters. In other words, we work with a black box whose internal operations are hidden. With this simplicity comes the danger of misusing algorithms, for example by overfitting during training or failing to test properly on unseen data. We'll show how to avoid these pitfalls while training classification models, and how to produce trustworthy results using k-fold cross validation and validation curves.

This video covers:
- Regression and Classification
- Binary and Multi-class Classification
- Classification Models
- Demo on Training Two-feature Classification Models with Scikit-learn
- Demo on Training k-nearest Neighbors For Our Model
- Demo on Training a Random Forest
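
To make the k-fold cross validation and validation curves mentioned in this section concrete, here is a minimal sketch using scikit-learn. It is not taken from the course's code bundle. The synthetic two-feature dataset, the choice of 5 folds, and the candidate `n_neighbors` values are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import (
    StratifiedKFold,
    cross_val_score,
    validation_curve,
)
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-feature, two-class data (an assumption for illustration;
# the course works with its own dataset).
X, y = make_classification(
    n_samples=500,
    n_features=2,
    n_informative=2,
    n_redundant=0,
    random_state=0,
)

# 5 stratified folds: each fold keeps roughly the same class balance.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# k-fold cross validation: the model is trained 5 times, each time scored
# on the one fold it did not see during training.
scores = cross_val_score(
    KNeighborsClassifier(n_neighbors=5), X, y, cv=cv, scoring="accuracy"
)
print("Fold accuracies:", scores)
print("Mean accuracy: %.3f (std %.3f)" % (scores.mean(), scores.std()))

# Validation curve: repeat the cross validation for several values of one
# hyperparameter and compare training accuracy with validation accuracy.
k_values = np.array([1, 5, 15, 45])  # hypothetical candidate values
train_scores, val_scores = validation_curve(
    KNeighborsClassifier(),
    X,
    y,
    param_name="n_neighbors",
    param_range=k_values,
    cv=cv,
    scoring="accuracy",
)

for k, tr, va in zip(k_values, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print("n_neighbors=%2d  train=%.3f  validation=%.3f" % (k, tr, va))
```

With `n_neighbors=1`, every training point is its own nearest neighbor, so the training accuracy is perfect. A validation accuracy noticeably below it is the kind of gap that signals overfitting, and comparing the two columns across hyperparameter values is what a validation curve is for.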