Machine Learning with Python

By: Oliver Theobald

Overview of this book

The course starts by laying a foundation with an introduction to machine learning, Python, and essential libraries, so that you grasp the basics before diving deeper. It then progresses through exploratory data analysis, data scrubbing, and pre-model algorithms, equipping you with the skills to understand and prepare your data for modeling. The journey continues with detailed walkthroughs on creating, evaluating, and optimizing machine learning models, covering key algorithms such as linear and logistic regression, support vector machines, k-nearest neighbors, and tree-based methods. Each section builds on the previous one, reinforcing both the concepts and their application. The course wraps up with next steps and an introduction to Python for newcomers.
Table of Contents (18 chapters)
1. FOREWORD
2. DATASETS USED IN THIS BOOK
3. INTRODUCTION
4. DEVELOPMENT ENVIRONMENT
5. MACHINE LEARNING LIBRARIES
6. EXPLORATORY DATA ANALYSIS
7. DATA SCRUBBING
8. PRE-MODEL ALGORITHMS
9. SPLIT VALIDATION
10. MODEL DESIGN
11. LINEAR REGRESSION
12. LOGISTIC REGRESSION
13. SUPPORT VECTOR MACHINES
14. k-NEAREST NEIGHBORS
15. TREE-BASED METHODS
16. NEXT STEPS
APPENDIX 1: INTRODUCTION TO PYTHON
APPENDIX 2: PRINT COLUMNS

DATA SCRUBBING

 

Similar to Swiss or Japanese watch design, a good machine learning model should run smoothly and contain no extra parts. This means avoiding syntax or other errors that prevent the code from executing and removing redundant variables that might clog up the model’s decision path.

This inclination towards simplicity extends to beginners coding their first model. When working with a new algorithm, it helps to create a minimal viable model and add complexity to the code later. If you find yourself at an impasse, look at the troublesome element and ask, “Do I need it?” If the model can’t handle missing values or multiple variable types, the quickest cure is to remove those variables. This should help the afflicted model spring to life and breathe normally. Once the model is working, you can go back and add complexity to your code.
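
As a rough illustration of that "quickest cure", the sketch below uses pandas to drop any column that contains missing values and then any column that is not numeric. The dataset and its column names (price, rooms, suburb, distance_km) are invented for this example and are not taken from the book's datasets. Treat it as one possible way to reach a minimal working feature set, not as the book's own code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; the column names are illustrative assumptions.
df = pd.DataFrame({
    "price": [250000, 310000, 199000, 420000],
    "rooms": [3, 4, np.nan, 5],          # contains a missing value
    "suburb": ["A", "B", "C", "A"],      # non-numeric (text) variable
    "distance_km": [5.2, 8.1, 12.4, 3.3],
})

# Step 1: drop every column that contains at least one missing value.
no_missing = df.dropna(axis=1)

# Step 2: keep only numeric columns, discarding text variables.
minimal = no_missing.select_dtypes(include="number")

print(list(minimal.columns))  # ['price', 'distance_km']
```

Once a model runs on the reduced table, the dropped columns (here rooms and suburb) are the candidates to bring back later, for example after filling in missing values or converting text to numbers.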

Let’s now take a look at specific data scrubbing techniques to prepare, streamline,...