Spark Cookbook

By: Rishi Yadav


Introduction


The following is Wikipedia's definition of unsupervised learning:

"In machine learning, the problem of unsupervised learning is that of trying to find hidden structure in unlabeled data."

In contrast to supervised learning, where we have labeled data to train an algorithm, in unsupervised learning we ask the algorithm to find structure on its own. Let's take a look at the following sample dataset:

As you can see from the preceding graph, the data points form two clusters, as follows:

In fact, clustering is the most common type of unsupervised learning algorithm.
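
To make the idea of an algorithm finding clusters on its own concrete, here is a minimal sketch of k-means clustering in plain Python. It is an illustration only, not Spark's MLlib implementation: the `kmeans` function, the six sample points, and the choice of k = 2 are all assumptions made for this example. No labels are supplied; the algorithm only sees coordinates.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm) on 2-D points.

    Hypothetical helper for illustration; returns (centroids, labels).
    """
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, labels

# Made-up, unlabeled sample data: two visually separate groups.
points = [(1.0, 1.2), (0.8, 1.0), (1.3, 0.9),
          (8.0, 8.1), (7.7, 8.4), (8.3, 7.9)]

centroids, labels = kmeans(points, k=2)
print(labels)     # one cluster index per point
print(centroids)  # one (x, y) center per cluster
```

The first three points should end up sharing one cluster index and the last three the other. Which group is numbered 0 and which is numbered 1 depends on the random starting centroids, which is typical of clustering: the algorithm discovers the groups but does not name them.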