Data Science Algorithms in a Week

By: Dávid Natingga

Overview of this book

Machine learning applications are highly automated and self-modifying, and they continue to improve over time with minimal human intervention as they learn from more data. To address the complex nature of various real-world data problems, specialized machine learning algorithms have been developed to solve these problems effectively. Data science helps you gain new knowledge from existing data through algorithmic and statistical analysis.

This book addresses problems related to accurate and efficient data classification and prediction. Over the course of seven days, you will be introduced to seven algorithms, along with exercises that will help you learn different aspects of machine learning. You will see how to pre-cluster your data to optimize and classify it for large datasets. You will then find out how to predict data based on the existing trends in your datasets.

This book covers algorithms such as k-Nearest Neighbors, Naive Bayes, Decision Trees, Random Forest, k-Means, Regression, and Time-series. On completion of the book, you will understand which machine learning algorithm to pick for clustering, classification, or regression, and which is best suited for your problem.

Proof of Bayes' theorem and its extension

Bayes' theorem states the following:

P(A|B)=[P(B|A) * P(A)]/P(B)

Proof:

We can prove this theorem using elementary set theory on the probability space containing the events A and B. Here, a probability event is defined as a set of possible outcomes in the probability space:

Figure 2.1: Probability space for the two events
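From figure 2.1 above, we can state the following relationships, where P(AB) denotes the probability that both A and B occur (the intersection of the two sets) and both P(A) and P(B) are assumed to be non-zero: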

P(A|B)=P(AB)/P(B)

P(B|A)=P(AB)/P(A)

Rearranging these relationships, we get the following:

P(AB)=P(A|B)*P(B)

P(AB)=P(B|A)*P(A)

P(A|B)*P(B)=P(B|A)*P(A)

Dividing both sides of the last equation by P(B) gives Bayes' theorem:

P(A|B)=P(B|A)*P(A)/P(B)

This concludes the proof.
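
The set-based argument above can be checked numerically on a small finite probability space. The following is a minimal sketch, not part of the original text: the sample space (two fair dice), the events A and B, and the helper names prob and cond_prob are all illustrative assumptions. It computes P(A|B) directly from the definition P(AB)/P(B) and compares it with the right-hand side of Bayes' theorem, using exact fractions to avoid rounding errors.

```python
from fractions import Fraction

# Assumed sample space: the 36 equally likely outcomes of rolling two fair dice.
omega = {(i, j) for i in range(1, 7) for j in range(1, 7)}

# Illustrative events (hypothetical, chosen only for this example).
A = {o for o in omega if o[0] + o[1] == 8}  # the two dice sum to 8
B = {o for o in omega if o[0] % 2 == 0}     # the first die is even


def prob(event):
    """P(event) for equally likely outcomes: |event| / |omega|."""
    return Fraction(len(event), len(omega))


def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given), with P(given) > 0."""
    return prob(event & given) / prob(given)


# Left-hand side: P(A|B) computed directly from the definition.
lhs = cond_prob(A, B)

# Right-hand side: [P(B|A) * P(A)] / P(B), as stated by Bayes' theorem.
rhs = cond_prob(B, A) * prob(A) / prob(B)

print("P(A|B) from the definition:", lhs)
print("P(A|B) from Bayes' theorem:", rhs)
```

Both expressions evaluate to the same fraction because each reduces to P(AB)/P(B), which is exactly the step used in the proof.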

Extended Bayes' theorem

...