Machine Learning Fundamentals

By: Hyatt Saleh

Overview of this book

As machine learning algorithms become popular, new tools that optimize these algorithms are also developed. Machine Learning Fundamentals explains how to use the syntax of scikit-learn. You'll study the difference between supervised and unsupervised models, as well as the importance of choosing the appropriate algorithm for each dataset. You'll apply unsupervised clustering algorithms to real-world datasets to discover patterns and profiles, and explore the process of solving an unsupervised machine learning problem. The focus of the book then shifts to supervised learning algorithms. You'll learn to implement different supervised algorithms and develop neural network structures using the scikit-learn package. You'll also learn how to analyze results coherently and improve the performance of an algorithm by tuning its hyperparameters. By the end of this book, you will have gained all the skills required to start programming machine learning algorithms.

Summary


Using the knowledge from previous chapters, we started this chapter by analyzing the Census Income Dataset, with the objective of understanding the data available and making decisions about preprocessing. Three supervised learning classification algorithms (Naïve Bayes, Decision Tree, and SVM) were explained and then applied to the preprocessed dataset to create models that were fitted to the training data and intended to generalize to unseen data. Finally, we compared the performance of the three models on the Census Income Dataset by calculating the accuracy, precision, and recall on each set of data (training, validation, and testing).
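
As a reminder of what that comparison involves, the following is a minimal sketch in plain Python of how accuracy, precision, and recall can be computed for one model on each of the three sets. The function names (`classification_metrics`, `compare_on_sets`) and the labels are made up for illustration and are not results from the chapter. In practice, scikit-learn's `accuracy_score`, `precision_score`, and `recall_score` from `sklearn.metrics` would normally be used instead.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    # Count the outcomes relevant to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        # Precision: of the predicted positives, how many were right.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: of the actual positives, how many were found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def compare_on_sets(predictions):
    """Map each set name to its metrics; predictions is name -> (y_true, y_pred)."""
    return {
        name: classification_metrics(y_true, y_pred)
        for name, (y_true, y_pred) in predictions.items()
    }


# Hypothetical labels for a single model, for illustration only.
toy_predictions = {
    "training": ([1, 0, 1, 1, 0, 0], [1, 0, 1, 0, 0, 0]),
    "validation": ([1, 0, 0, 1], [1, 1, 0, 0]),
    "testing": ([0, 1, 1, 0], [0, 1, 0, 0]),
}

report = compare_on_sets(toy_predictions)
for set_name, scores in report.items():
    print(set_name, scores)
```

Repeating this for each model and placing the three reports side by side gives the kind of comparison summarized above. A large gap between training and validation scores suggests overfitting, while low scores on all sets suggest underfitting.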

In the next chapter, we will look at Artificial Neural Networks (ANNs), their different types, and their advantages and disadvantages. We will also use an ANN to solve the same data problem discussed here and compare its performance with that of the other supervised learning algorithms.