Machine Learning Fundamentals

By: Hyatt Saleh

Overview of this book

As machine learning algorithms become popular, new tools that optimize these algorithms are also being developed. Machine Learning Fundamentals explains how to use the syntax of scikit-learn. You'll study the differences between supervised and unsupervised models, as well as the importance of choosing the appropriate algorithm for each dataset. You'll apply unsupervised clustering algorithms to real-world datasets to discover patterns and profiles, and explore the process of solving an unsupervised machine learning problem. The focus of the book then shifts to supervised learning algorithms. You'll learn to implement different supervised algorithms and develop neural network structures using the scikit-learn package. You'll also learn how to perform coherent result analysis to improve the performance of the algorithm by tuning hyperparameters. By the end of this book, you will have gained all the skills required to start programming machine learning algorithms.

Preface

Note

About

This section briefly introduces the author, the coverage of this book, the technical skills you'll need to get started, and the hardware and software required to complete all of the included activities and exercises.

About the Book

As machine learning algorithms become popular, new tools that optimize these algorithms are also being developed. Machine Learning Fundamentals explains the scikit-learn API, which is a package created to facilitate the process of building machine learning applications. You will learn how to explain the differences between supervised and unsupervised models, and how to apply some popular algorithms to real-life datasets.

You'll begin by learning how to use the syntax of scikit-learn. You'll study the differences between supervised and unsupervised models, as well as the importance of choosing the appropriate algorithm for each dataset. You'll apply an unsupervised clustering algorithm to real-world datasets to discover patterns and profiles, and explore the process of solving an unsupervised machine learning problem. Then, the focus of the book shifts to supervised learning algorithms. You'll learn how to implement different supervised algorithms and develop neural network structures using the scikit-learn package. You'll also learn how to perform coherent result analysis to improve the performance of the algorithm by tuning hyperparameters. By the end of this book, you will have the skills and confidence to start programming machine learning algorithms.

About the Author

After graduating from college as a business administrator, Hyatt Saleh discovered the importance of data analysis to understand and solve real-life problems. Since then, as a self-taught person, she has not only worked as a freelancer for many companies around the world in the field of machine learning, but has also founded an artificial intelligence company that aims to optimize everyday processes.

Objectives

  • Understand the importance of data representation

  • Gain insights into the differences between supervised and unsupervised models

  • Explore data using the Matplotlib library

  • Study popular algorithms, such as K-means, Mean-Shift, and DBSCAN

  • Measure model performance through different metrics

  • Study popular algorithms, such as Naïve Bayes, Decision Tree, and SVM

  • Perform error analysis to improve the performance of the model

  • Learn to build a comprehensive machine learning program

Audience

Machine Learning Fundamentals is designed for developers who are new to the field of machine learning and want to learn how to use the scikit-learn library to develop machine learning algorithms. You must have some knowledge and experience with Python programming, but you do not need any prior knowledge of scikit-learn or machine learning algorithms.

Approach

Machine Learning Fundamentals takes a hands-on approach to introduce beginners to the world of machine learning. It contains multiple activities that use real-life business scenarios for you to practice and apply your new skills in a highly relevant context.

Minimum Hardware Requirements

For the optimal student experience, we recommend the following hardware configuration:

  • Processor: Intel Core i5 or equivalent

  • Memory: 4 GB RAM or higher

Software Requirements

You'll also need the following software installed in advance:

  • Sublime Text (latest version), Atom IDE (latest version), or other similar text editor applications

  • Python 3

  • The following Python libraries: NumPy, SciPy, scikit-learn, Matplotlib, Pandas, jupyter, and seaborn, plus pickle, which ships with Python and needs no separate installation

Installation and Setup

Before you start this book, you'll need to install Python 3.6, pip, scikit-learn, and the other libraries used in this book. You will find the steps to install these here:

Installing Python

Install Python 3.6 by following the instructions at this link: https://realpython.com/installing-python/.

Installing pip

  1. To install pip, go to the following link and download the get-pip.py file: https://pip.pypa.io/en/stable/installing/.

  2. Then, use the following command to install it:

    python get-pip.py

You might need to use the python3 get-pip.py command instead if an earlier version of Python on your computer already uses the python command.
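If you are unsure which interpreter a given command starts, a small script can tell you. The following is a minimal sketch rather than part of the book's code bundle; the function name describe_interpreter and the idea of saving it as a file (for example, check_python.py, a hypothetical name) are assumptions made for illustration. Run it once with python and once with python3 to compare.

```python
import sys


def describe_interpreter():
    """Return the path of the running interpreter and its (major, minor) version."""
    return sys.executable, tuple(sys.version_info[:2])


path, (major, minor) = describe_interpreter()

# str.format is used instead of an f-string so the script also runs under Python 2
print("{} runs Python {}.{}".format(path, major, minor))

if major < 3:
    print("This command starts Python 2; try the python3 command instead.")
```

Whichever command reports Python 3 is the one to use for get-pip.py and for the pip commands that follow.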

Installing libraries

Using the pip command, install the following libraries:

python -m pip install --user numpy scipy matplotlib jupyter pandas seaborn

Installing scikit-learn

Install scikit-learn using the following command:

pip install -U scikit-learn
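Once both commands have finished, you may want to confirm that each library can actually be imported. The script below is a hedged sketch under a few assumptions: the helper name check_libraries and the REQUIRED list are introduced here for illustration, scikit-learn is checked under its import name sklearn, and jupyter is left out because it is normally started from the command line rather than imported.

```python
import importlib

# Import names for the libraries installed above.
# scikit-learn is imported as "sklearn"; pickle is part of the standard library.
REQUIRED = ["numpy", "scipy", "sklearn", "matplotlib", "pandas", "seaborn", "pickle"]


def check_libraries(names):
    """Map each import name to its version string, "unknown" if the module
    imports but exposes no __version__, or None if it cannot be imported."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            found[name] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    for name, version in check_libraries(REQUIRED).items():
        print("{:12s} {}".format(name, version if version else "NOT INSTALLED"))
```

Any library reported as NOT INSTALLED can be installed again with the pip commands shown earlier.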

Installing the Code Bundle

Copy the code bundle for the class to the C:/Code folder.

Additional Resources

The code bundle for this book is also hosted on GitHub at: https://github.com/TrainingByPackt/Machine-Learning-Fundamentals.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Conventions

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Import the iris toy dataset using scikit-learn's datasets package and store it in a variable named iris_data."

A block of code is set as follows:

from sklearn.datasets import load_iris
iris_data = load_iris()
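To see what the iris_data variable from the block above holds, you could inspect a few of its attributes. This is a short illustrative sketch, not taken from the book's exercises; it assumes scikit-learn is installed as described in the installation steps.

```python
from sklearn.datasets import load_iris

iris_data = load_iris()

# load_iris() returns a dictionary-like object with named attributes
print(iris_data.data.shape)     # feature matrix: (150, 4)
print(iris_data.target.shape)   # one class label per sample: (150,)
print(iris_data.feature_names)  # names of the four measured features
print(iris_data.target_names)   # names of the three iris species
```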

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Below the dataset's title, find the download section and click on Data Folder."