Bayesian Analysis with Python

Overview of this book

The purpose of this book is to teach the main concepts of Bayesian data analysis. We will learn how to use PyMC3, a Python library for probabilistic programming, to perform Bayesian parameter estimation and to check and validate models. The book begins by presenting the key concepts of the Bayesian framework and the main advantages of this approach from a practical point of view. Moving on, we will explore the power and flexibility of generalized linear models and how to adapt them to a wide array of problems, including regression and classification. We will also look into mixture models and clustering, and we will finish with advanced topics such as non-parametric models and Gaussian processes. With the help of Python and PyMC3, you will learn to implement, check, and expand Bayesian models to solve data analysis problems.

Occam's razor – simplicity and accuracy


Suppose we have two models for the same data or problem, and both seem to explain the data equally well. Which model should we choose? There is a guiding principle, or heuristic, known as Occam's razor, which loosely states that if we have two or more equivalent explanations for the same phenomenon, we should choose the simplest one. There are many justifications for this heuristic. One is related to the falsifiability criterion introduced by Popper; another takes a pragmatic perspective, since simpler models are easier to understand than more complex ones; and yet another is based on Bayesian statistics. Without getting into the details of these justifications, we are going to accept this criterion, for the moment, as a useful rule of thumb that sounds reasonable.

Another factor we should generally take into account when comparing models is their accuracy, that is, how well the model fits the data. We have already seen some measures...
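
As a rough illustration of how simplicity and accuracy pull against each other, the following sketch fits polynomials of increasing order to the same small dataset and reports the coefficient of determination, R², for each. Everything here is an assumption made for the example: the synthetic data, the choice of orders 1, 2, and 5, the helper name `r_squared`, and the use of ordinary least squares through NumPy rather than a PyMC3 model. It is not presented as the book's own code.

```python
import numpy as np

# Synthetic data (an assumption for this sketch): a mildly curved trend plus noise.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 20)
y = 0.5 * x + 0.3 * x**2 + rng.normal(0, 0.1, size=x.size)


def r_squared(x, y, order):
    """Fit a polynomial of the given order by least squares and
    return the in-sample coefficient of determination."""
    coeffs = np.polyfit(x, y, deg=order)
    y_hat = np.polyval(coeffs, x)
    ss_res = np.sum((y - y_hat) ** 2)     # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
    return 1 - ss_res / ss_tot


# Compare a simple, an intermediate, and a more complex model on the same data.
r2 = {order: r_squared(x, y, order) for order in (1, 2, 5)}
for order, value in r2.items():
    print(f"order {order}: R^2 = {value:.3f}")
```

Because each lower-order polynomial is a special case of the higher-order ones, the in-sample R² cannot decrease as the order grows. A better fit to the observed data is therefore not enough, on its own, to prefer the more complex model, which is where a principle such as Occam's razor comes in.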