Book Image

Training Systems using Python Statistical Modeling

By : Curtis Miller
Book Image

Training Systems using Python Statistical Modeling

By: Curtis Miller

Overview of this book

Python's ease-of-use and multi-purpose nature has made it one of the most popular tools for data scientists and machine learning developers. Its rich libraries are widely used for data analysis, and more importantly, for building state-of-the-art predictive models. This book is designed to guide you through using these libraries to implement effective statistical models for predictive analytics. You’ll start by delving into classical statistical analysis, where you will learn to compute descriptive statistics using pandas. You will focus on supervised learning, which will help you explore the principles of machine learning and train different machine learning models from scratch. Next, you will work with binary prediction models, such as data classification using k-nearest neighbors, decision trees, and random forests. The book will also cover algorithms for regression analysis, such as ridge and lasso regression, and their implementation in Python. In later chapters, you will learn how neural networks can be trained and deployed for more accurate predictions, and understand which Python libraries can be used to implement them. By the end of this book, you will have the knowledge you need to design, build, and deploy enterprise-grade statistical models for machine learning using Python and its rich ecosystem of libraries for predictive analytics.
Table of Contents (9 chapters)

Binary Prediction Models

In this chapter, we will look at various methods for classifying data, and focus on binary data. We will start with a simple algorithm—the k-nearest neighbors algorithm. Next, we will move on to decision trees. We will then look at an ensemble method and combine multiple decision trees into a random forest classifier. After that, we will move on to linear classifiers, the first being the Naive Bayes algorithm. Then, we will see how to train support vector machines. Following this, we will look at another well-known and extensively used classifierlogistic regression. Finally, we will see how we can extend algorithms for binary classification to algorithms that are capable of multiclass classification.

The following topics will be covered in this chapter:

  • K-nearest neighbors classifier
  • Decision trees
  • Random forests
  • Naive Bayes
  • Support vector...