#### Overview of this book

Most of us have heard the term Machine Learning, yet the question developers across the globe ask most often is, “How do I get started in Machine Learning?” One reason may be the vastness of the subject: people are often overwhelmed by the abstractness of ML and by terms such as regression, supervised learning, and probability density function. This book is a systematic guide that teaches you how to implement various Machine Learning techniques and apply them in day-to-day development. You will start with the basics of data and mathematical models, explained in easy-to-follow language, so you will feel at home while implementing the examples. The book introduces various libraries and frameworks used in Machine Learning and then gets straight to the point: you will implement regression, clustering, classification, neural networks, and more with fun examples. As you get to grips with the techniques, you’ll learn to apply them to real-world ML scenarios such as image analysis, natural language processing, and anomaly detection in time series data. By the end of the book, you will have learned various ML techniques for developing more efficient and intelligent applications.
Table of Contents (10 chapters)
Preface
Introduction - Machine Learning and Statistical Science
The Learning Process
Clustering
Linear and Logistic Regression
Neural Networks
Convolutional Neural Networks
Recurrent Neural Networks
Recent Models and Developments
Software Installation and Configuration

# Finding a common center - K-means

Here we go! After the necessary preparation, we can finally start learning from data; in this case, we want to label data that we observe in real life.

The method works with the following elements:

• A set of N-dimensional elements of numeric type
• A predetermined number of groups (this is tricky because we have to make an educated guess)
• A set of common representative points for each group (called centroids)

The main objective of this method is to split the dataset into the chosen number of clusters, each of which is represented by its centroid.
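To make these elements concrete, here is a minimal sketch of the usual K-means iteration in plain Python. It is an illustration under stated assumptions, not the book's implementation: the function names are hypothetical, the distance is squared Euclidean, and the centroids start at the first `k` points purely to keep the example deterministic (practical implementations typically start from random or smarter seeds).

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iterations=100):
    """Illustrative K-means loop: assign points, then recompute centroids."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption: start from the first k points so the run is reproducible.
    centroids = list(points[:k])
    labels = []
    for _ in range(max_iterations):
        # Assignment step: label each point with the index of its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # Keep the previous centroid if a cluster ends up empty.
            new_centroids.append(mean_point(members) if members else centroids[j])
        if new_centroids == centroids:
            break  # centroids stopped moving, so the labels are stable
        centroids = new_centroids
    return centroids, labels


# Two visibly separated groups of 2-dimensional points, and a guess of k = 2.
data = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = k_means(data, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
print(centroids)
```

The three inputs map directly onto the list above: `data` is the set of N-dimensional numeric elements, `k` is the predetermined number of groups, and `centroids` holds one representative point per group.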

The word centroid comes from mathematics and has been carried over into calculus and physics. The following figure shows a classical representation of the analytical calculation of a triangle's centroid:

Graphical depiction of the centroid finding scheme for a triangle
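As a small numeric companion to the figure, the centroid of a triangle can be computed as the arithmetic mean of its three vertices. The sketch below uses a hypothetical helper name and made-up coordinates; the same component-wise averaging is what a K-means centroid applies to the members of a cluster.

```python
def triangle_centroid(a, b, c):
    """Centroid of a triangle as the component-wise mean of its vertices."""
    return tuple((x + y + z) / 3 for x, y, z in zip(a, b, c))


# Illustrative right triangle with vertices at (0, 0), (6, 0) and (0, 3).
print(triangle_centroid((0, 0), (6, 0), (0, 3)))  # (2.0, 1.0)
```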

The centroid...