Machine Learning Quick Reference

By: Rahul Kumar

Overview of this book

Machine learning makes it possible to learn about the unknown and gain hidden insights into your datasets by mastering many tools and techniques. This book guides you through that process in a compact manner. After a quick overview of what machine learning is all about, Machine Learning Quick Reference jumps right into its core algorithms and demonstrates how they can be applied to real-world scenarios. From evaluating models to optimizing their performance, this book introduces you to the best practices in machine learning. You will also look at more advanced topics, such as training neural networks and working with different kinds of data, including text, time-series, and sequential data. Advanced methods and techniques, such as causal inference and deep Gaussian processes, are also covered. By the end of this book, you will be able to train fast, accurate machine learning models and keep the book close at hand as a point of reference.
Table of Contents (18 chapters)

Introduction


In the previous chapter, we learned what principal component analysis (PCA) is, how it works, and when we should deploy it. However, as a dimensionality reduction technique, do you think you can put it to use in every scenario? Can you recall the roadblock, or the underlying assumption behind it, that we discussed?

Yes, the most important assumption behind PCA is that it only works well for datasets that are linearly separable. In the real world, however, you don't come across this kind of dataset very often, so we need a method that can capture non-linear patterns in the data.
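To make this concrete, here is a small sketch (using scikit-learn's `make_circles` as a stand-in dataset, since the book's own example is not reproduced in this excerpt) showing that no single principal component can separate two concentric circles:

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA

# Two concentric circles: the classes are not linearly separable in 2D
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

# Project onto the first principal component
X_1d = PCA(n_components=1).fit_transform(X).ravel()

# The 1D projections of the two classes occupy overlapping ranges,
# so no threshold on this component can separate them
inner, outer = X_1d[y == 1], X_1d[y == 0]
overlap = min(inner.max(), outer.max()) - max(inner.min(), outer.min())
print(overlap > 0)  # True: the classes still overlap after PCA
```

Because the inner circle sits entirely inside the outer one, any linear projection maps both classes onto the same interval, which is exactly the roadblock described above.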


On the left-hand side, we have a dataset containing two classes. We can see that once we arrive at the projections and establish the components, PCA has little effect: it is still unable to separate the classes with a line in two dimensions. That is, PCA only functions well when we have low-dimensional, linearly separable data. The following plot shows the dataset of two classes:

This is why we...
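One way to capture such non-linear patterns is kernel PCA; the specific choices below (an RBF kernel with `gamma=10` on the same synthetic circles) are illustrative assumptions, not taken from this excerpt. Mapped through the kernel, the two circles become linearly separable, while plain PCA leaves a linear classifier near chance level:

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA
from sklearn.linear_model import LogisticRegression

X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

def linear_accuracy(features, labels):
    """Training accuracy of a linear decision boundary on the features."""
    clf = LogisticRegression().fit(features, labels)
    return clf.score(features, labels)

# Plain PCA in 2D is just a rotation, so the circles stay inseparable
acc_pca = linear_accuracy(PCA(n_components=2).fit_transform(X), y)

# An RBF kernel maps the circles to a space where a line separates them
acc_kpca = linear_accuracy(
    KernelPCA(n_components=2, kernel="rbf", gamma=10).fit_transform(X), y)

print(f"linear PCA accuracy: {acc_pca:.2f}")   # near chance level
print(f"kernel PCA accuracy: {acc_kpca:.2f}")  # close to 1.0
```

The comparison mirrors the argument above: the non-linear kernel, not the projection itself, is what makes the classes separable.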