Bayesian Analysis with Python - Third Edition

By: Osvaldo Martin

Overview of this book

The third edition of Bayesian Analysis with Python serves as an introduction to the main concepts of applied Bayesian modeling using PyMC, a state-of-the-art probabilistic programming library, and other libraries that support and facilitate modeling, such as ArviZ, for exploratory analysis of Bayesian models; Bambi, for flexible and easy hierarchical linear modeling; PreliZ, for prior elicitation; PyMC-BART, for flexible non-parametric regression; and Kulprit, for variable selection. This updated edition includes a brief, conceptual introduction to probability theory, introduces new topics such as Bayesian additive regression trees (BART), and features updated examples. Refined explanations, informed by feedback and experience from previous editions, underscore the book's emphasis on Bayesian statistics. You will explore various models, including hierarchical models, generalized linear models for regression and classification, mixture models, Gaussian processes, and BART, using synthetic and real datasets. By the end of this book, you will possess a functional understanding of probabilistic modeling, enabling you to design and implement Bayesian models for your data science challenges. You'll be well-prepared to delve into more advanced material or specialized statistical modeling if the need arises.

1.5 Interpreting probabilities

Probabilities can be interpreted in various useful ways. For instance, we can think that P(A) = 0.125 means that if we repeat the survey many times, we would expect all three individuals to answer “yes” about 12.5% of the time. Here we are interpreting probabilities as the outcome of long-run experiments. This is a very common and useful interpretation. It can not only help us think about probabilities but also provide an empirical method to estimate them. Do we want to know the probability of a car tire exploding if filled with air beyond the manufacturer’s recommendation? Just inflate 120 tires or so, and you may get a good approximation. This is usually called the frequentist interpretation.
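To make the long-run idea concrete, here is a minimal simulation sketch. It assumes, as an illustration, that each of the three individuals answers “yes” independently with probability 0.5, which gives P(A) = 0.5³ = 0.125. The function name `estimate_all_yes` and its parameters are hypothetical and chosen only for this example.

```python
import random


def estimate_all_yes(n_surveys, n_people=3, p_yes=0.5, seed=0):
    """Estimate P(A) as the fraction of simulated surveys where everyone says "yes".

    Assumption for illustration: answers are independent and each person
    says "yes" with probability p_yes.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(n_surveys):
        # A survey counts as a hit only if all n_people answer "yes".
        if all(rng.random() < p_yes for _ in range(n_people)):
            hits += 1
    return hits / n_surveys


exact = 0.5 ** 3  # 0.125 under the independence assumption

# The relative frequency should settle near the exact value as repetitions grow.
for n in (120, 10_000, 200_000):
    estimate = estimate_all_yes(n)
    print(f"{n:>7} surveys: estimate = {estimate:.4f} (exact = {exact})")
```

With only 120 repetitions, as in the tire example, the estimate can be noticeably off; with many more repetitions it should get close to 0.125.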

Another interpretation of probability, usually called the subjective or Bayesian interpretation, states that probabilities can be interpreted as measures of an individual’s uncertainty about events. In this interpretation, probabilities are about our state of knowledge of the world and are not necessarily based on repeated trials. Under this definition of probability, it is valid and natural to ask about the probability of life on Mars, the probability of the mass of an electron being 9.1 × 10⁻³¹ kg, or the probability that the 9th of July of 1816 was a sunny day in Buenos Aires. All these are one-time events. We cannot re-create 1 million universes, each with one Mars, and check how many of them develop life. Of course, we can do this as a mental experiment, so long-term frequencies can still be a valid mental scaffold.

Sometimes the Bayesian interpretation of probabilities is described in terms of personal beliefs; I don’t like that. I think it can lead to unnecessary confusion, as beliefs are generally associated with the notion of faith or unsupported claims. This association can easily lead people to think that Bayesian probabilities, and by extension Bayesian statistics, are less objective or less scientific than the alternatives. I think it also helps to generate confusion about the role of prior knowledge in statistics and makes people think that being objective or rational means not using prior information.

Bayesian methods are as subjective (or objective) as any other well-established scientific method we have. Let me explain myself with an example: life on Mars exists or does not exist; the outcome is binary, a yes-no question. But given that we are not sure about that fact, a sensible course of action is trying to find out how likely life on Mars is. To answer this question, any honest and scientific-minded person will use all the relevant geophysical data about Mars, all the relevant biochemical knowledge about necessary conditions for life, and so on. The response will necessarily be about our epistemic state of knowledge, and others could disagree and even get different probabilities. But at least, in principle, they all will be able to provide arguments in favor of their data, their methods, their modeling decisions, and so on. A scientific and rational debate about life on Mars does not admit arguments such as “an angel told me about tiny green creatures.” Bayesian statistics, however, is just a procedure to make scientific statements using probabilities as building blocks.