Practical Data Analysis

Bayesian classification

Probabilistic classification is a practical way to draw inferences from data: it uses statistical inference to find the most probable class for a given value. Given the probability distribution over the classes, we select the class with the highest probability. Bayes' theorem is the basic rule for drawing such inferences. It allows us to update the probability of an event in light of new data or observations. In other words, it allows us to update the prior probability P(A) to the posterior probability P(A|B). The prior probability is the probability assigned before the data is evaluated, and the posterior probability is the probability assigned after the data is taken into account. The following expression represents Bayes' theorem:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

Here P(B|A) is the likelihood of observing B given A, and P(B) is the overall probability of the observation B.
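As a minimal sketch of this update, the following Python function computes the posterior P(A|B) = P(B|A) P(A) / P(B), expanding P(B) with the law of total probability. The function name `bayes_posterior` and the spam-filter numbers are hypothetical, chosen only for illustration.

```python
def bayes_posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """Return P(A|B) from P(A), P(B|A) and P(B|not A)."""
    # P(B) by the law of total probability:
    # P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    evidence = (likelihood_b_given_a * prior_a
                + likelihood_b_given_not_a * (1.0 - prior_a))
    # Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B)
    return likelihood_b_given_a * prior_a / evidence


# Hypothetical numbers: 20% of messages are spam (event A); the word
# "offer" (event B) appears in 60% of spam and 5% of non-spam messages.
posterior = bayes_posterior(0.2, 0.6, 0.05)
print(round(posterior, 3))  # 0.75
```

With these assumed numbers, observing the word raises the probability of spam from the prior of 0.2 to a posterior of 0.75.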

Naïve Bayes algorithm

Naïve Bayes is the simplest of the Bayesian classification methods. In this algorithm, we simply need to learn the probabilities, under the assumption that the attributes A and B are independent; that's why...
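To make the independence assumption concrete, here is a minimal sketch of a word-based Naïve Bayes classifier in plain Python. It is not the book's implementation: the function names, the toy training data, and the use of add-one (Laplace) smoothing and log-probabilities are all assumptions made for this illustration. Because the attributes (words) are treated as independent given the class, the score of a class is just the log prior plus the sum of the per-word log probabilities.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """Count classes and words. documents: list of (words, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, label in documents:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def classify(words, model):
    """Return the label with the highest log posterior score."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        # log P(class)
        score = math.log(count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in words:
            # log P(word | class) with add-one smoothing (an assumption
            # of this sketch), so unseen words do not zero out the product
            numerator = word_counts[label][word] + 1
            denominator = total_words + len(vocabulary)
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set.
training = [
    (["cheap", "offer", "buy"], "spam"),
    (["limited", "offer", "now"], "spam"),
    (["meeting", "schedule", "today"], "ham"),
    (["project", "meeting", "notes"], "ham"),
]
model = train_naive_bayes(training)
print(classify(["cheap", "offer"], model))    # spam
print(classify(["meeting", "notes"], model))  # ham
```

Summing logarithms instead of multiplying raw probabilities avoids numerical underflow when a document contains many words.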