Bayesian Analysis with Python

Overview of this book

The purpose of this book is to teach the main concepts of Bayesian data analysis. We will learn how to use PyMC3, a Python library for probabilistic programming, to perform Bayesian parameter estimation and to check and validate models. The book begins by presenting the key concepts of the Bayesian framework and the main advantages of this approach from a practical point of view. Moving on, we will explore the power and flexibility of generalized linear models and how to adapt them to a wide array of problems, including regression and classification. We will also look into mixture models and clustering data, and we will finish with advanced topics such as non-parametric models and Gaussian processes. With the help of Python and PyMC3, you will learn to implement, check, and expand Bayesian models to solve data analysis problems.

Predictive accuracy measures


In the previous example, it is fairly easy to see that the order 0 model is very simple and the order 5 model is too complex, but what about the other two? How can we distinguish between those options? We need a more principled way of balancing accuracy on one side against simplicity on the other. Two methods for estimating the out-of-sample predictive accuracy using only the within-sample data are:

  • Cross-validation: This is an empirical strategy based on dividing the available data into subsets that are used alternately for fitting and for evaluation

  • Information criteria: This is an umbrella term for several relatively simple expressions that can be considered as ways to approximate the results that we could have obtained by performing cross-validation
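As a rough illustration of the first strategy, the following sketch compares the within-sample error of polynomial fits with a K-fold cross-validated estimate of their out-of-sample error. It is not the book's own code: the synthetic quadratic data, the helper name `kfold_mse`, the choice of five folds, the polynomial orders 0, 1, 2, and 5, and the use of least-squares fits with NumPy (rather than a Bayesian model in PyMC3) are all assumptions made for the example.

```python
import numpy as np


def kfold_mse(x, y, order, k=5, seed=0):
    """Hypothetical helper: return (within-sample MSE, K-fold CV MSE)
    for a least-squares polynomial fit of the given order."""
    rng = np.random.default_rng(seed)
    # Shuffle the indices and split them into k roughly equal folds
    folds = np.array_split(rng.permutation(len(x)), k)

    # Within-sample error: fit and evaluate on the same data
    coefs = np.polyfit(x, y, order)
    in_mse = np.mean((np.polyval(coefs, x) - y) ** 2)

    # Cross-validation: each fold is held out once for evaluation,
    # while the remaining folds are used for fitting
    fold_errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coefs = np.polyfit(x[train], y[train], order)
        pred = np.polyval(coefs, x[test])
        fold_errors.append(np.mean((pred - y[test]) ** 2))

    return in_mse, np.mean(fold_errors)


# Synthetic data (an assumption): a quadratic trend plus Gaussian noise
rng = np.random.default_rng(42)
x = np.linspace(-1, 1, 30)
y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(0, 0.3, size=x.size)

results = {order: kfold_mse(x, y, order) for order in (0, 1, 2, 5)}
for order, (in_mse, cv_mse) in results.items():
    print(f"order {order}: within-sample MSE = {in_mse:.3f}, "
          f"5-fold CV MSE = {cv_mse:.3f}")
```

The within-sample error can only decrease as the order grows, because each higher-order polynomial contains the lower-order ones as special cases. The cross-validated error carries no such guarantee, which is what makes it useful for comparing models of different complexity.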

Cross-validation

On average, a model's within-sample accuracy will be higher than its out-of-sample accuracy. As we need data to fit the model and data to test it, one simple...