Bayesian Inference

The Bayesian Philosophy

The frequentist (or classical) point of view is based on the following postulates:

  1. Probability refers to limiting relative frequencies. Probabilities are objective properties of the real world.
  2. Parameters are fixed, unknown constants. Because they are not fluctuating, no useful probability statements can be made about parameters.
  3. Statistical procedures should be designed to have well-defined long-run frequency properties. For example, a 95 percent confidence interval should trap the true value of the parameter with limiting frequency at least 95 percent.

There is another approach to inference called Bayesian inference. The Bayesian approach is based on the following postulates:

  1. Probability describes degree of belief, not limiting frequencies. As such, we can make probability statements about lots of things, not just data which are subject to random variation. For example, one can say that “the probability that Albert Einstein drank a cup of tea on August 1, 1948” is 0.35. This does not refer to any limiting frequency. It reflects the strength of belief that the proposition is true.
  2. We can make probability statements about parameters, even though they are fixed constants.
  3. We make inferences about a parameter $\theta$ by producing a probability distribution for $\theta$. Inferences such as point estimates and interval estimates may then be extracted from this distribution.

The Bayesian Method

Bayesian inference is usually carried out in the following way:

  1. We choose a probability density $f(\theta)$, called the prior distribution, that expresses our beliefs about a parameter $\theta$ before we see any data.
  2. We choose a statistical model $f(x \mid \theta)$ that reflects our beliefs about $x$ given $\theta$. Notice that we write this as $f(x \mid \theta)$ instead of $f(x; \theta)$.
  3. After observing the data $X_1, \ldots, X_n$, we update our beliefs and calculate the posterior distribution $f(\theta \mid X_1, \ldots, X_n)$.

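For a discrete random variable,

$$
\mathbb{P}(\Theta = \theta \mid X = x) = \frac{\mathbb{P}(X = x, \Theta = \theta)}{\mathbb{P}(X = x)} = \frac{\mathbb{P}(X = x \mid \Theta = \theta)\, \mathbb{P}(\Theta = \theta)}{\sum_{\theta'} \mathbb{P}(X = x \mid \Theta = \theta')\, \mathbb{P}(\Theta = \theta')}
$$

which is Bayes' theorem. For continuous random variables, a similar version is obtained with density functions:

$$
f(\theta \mid x) = \frac{f(x \mid \theta) f(\theta)}{\int f(x \mid \theta) f(\theta)\, d\theta}
$$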

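If we have $n$ IID observations $X_1, \ldots, X_n$, we replace $f(x \mid \theta)$ with

$$
f(x_1, \ldots, x_n \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta) = \mathcal{L}_n(\theta).
$$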

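Now, using $X^n$ to mean $(X_1, \ldots, X_n)$ and $x^n$ to mean $(x_1, \ldots, x_n)$,

$$
f(\theta \mid x^n) = \frac{f(x^n \mid \theta) f(\theta)}{\int f(x^n \mid \theta) f(\theta)\, d\theta} = \frac{\mathcal{L}_n(\theta) f(\theta)}{c_n}
$$

where $\mathcal{L}_n(\theta) = f(x^n \mid \theta)$ is the likelihood function and

$$
c_n = \int \mathcal{L}_n(\theta) f(\theta)\, d\theta
$$

is called the normalizing constant. Since $c_n$ does not depend on $\theta$, we can summarize this by writing

$$
f(\theta \mid x^n) \propto \mathcal{L}_n(\theta) f(\theta).
$$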

Tip

Posterior is proportional to Likelihood times Prior
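
To make the proportionality concrete, here is a minimal sketch that approximates a posterior on a grid. It assumes Bernoulli data and a flat prior on $(0, 1)$; the data, the grid size, and the helper name `grid_posterior` are all hypothetical choices for illustration, not part of the source. The normalizing constant is approximated by a Riemann sum.

```python
def grid_posterior(data, prior, grid):
    """Return normalized posterior density values on an evenly spaced grid.

    Assumes each observation is 0 or 1 (a Bernoulli model).
    """
    s = sum(data)   # number of successes
    n = len(data)   # number of trials
    # Unnormalized posterior: likelihood times prior at each grid point.
    unnorm = [theta**s * (1 - theta) ** (n - s) * prior(theta) for theta in grid]
    # Approximate the normalizing constant with a Riemann sum.
    step = grid[1] - grid[0]
    c_n = sum(unnorm) * step
    return [u / c_n for u in unnorm]


# Hypothetical data: 10 coin flips with 7 heads.
data = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
grid = [(i + 0.5) / 1000 for i in range(1000)]  # midpoints in (0, 1)
posterior = grid_posterior(data, lambda theta: 1.0, grid)

# The grid point with the highest posterior density approximates the mode.
mode = grid[max(range(len(grid)), key=posterior.__getitem__)]
print(f"approximate posterior mode: {mode:.3f}")
```

Because the prior is flat here, the posterior has the same shape as the likelihood, so the approximate mode sits next to the sample proportion. Swapping in a different `prior` function changes the result without touching the rest of the code.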

Now that we have the posterior distribution, we can get a point estimate by summarizing the center of the posterior. Typically, we use the mean or mode of the posterior.

Definition

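The posterior mean is

$$
\bar{\theta}_n = \int \theta f(\theta \mid x^n)\, d\theta = \frac{\int \theta\, \mathcal{L}_n(\theta) f(\theta)\, d\theta}{\int \mathcal{L}_n(\theta) f(\theta)\, d\theta}.
$$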
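
The posterior mean is a ratio of two integrals, so it can be approximated numerically without ever computing the normalizing constant on its own. The sketch below assumes Bernoulli data with `s` successes in `n` trials and a flat prior; in that case the posterior is a Beta(s + 1, n − s + 1) distribution, whose mean $(s + 1)/(n + 2)$ serves as a check. The function name and the data are hypothetical.

```python
def posterior_mean(s, n, prior, num_points=10_000):
    """Approximate the posterior mean for Bernoulli data by midpoint sums."""
    numerator = 0.0
    denominator = 0.0
    for i in range(num_points):
        theta = (i + 0.5) / num_points
        # Likelihood times prior at this grid point.
        weight = theta**s * (1 - theta) ** (n - s) * prior(theta)
        numerator += theta * weight
        denominator += weight
    # The grid spacing cancels in the ratio, so it is omitted.
    return numerator / denominator


s, n = 7, 10  # hypothetical: 7 successes in 10 trials
approx = posterior_mean(s, n, prior=lambda theta: 1.0)
exact = (s + 1) / (n + 2)  # mean of Beta(s + 1, n - s + 1) under a flat prior
print(f"numerical: {approx:.4f}, closed form: {exact:.4f}")
```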

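We can also obtain a Bayesian interval estimate. We find $a$ and $b$ such that

$$
\int_{-\infty}^{a} f(\theta \mid x^n)\, d\theta = \int_{b}^{\infty} f(\theta \mid x^n)\, d\theta = \frac{\alpha}{2}.
$$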

Definition

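Let $C = (a, b)$. Then

$$
\mathbb{P}(\theta \in C \mid x^n) = \int_a^b f(\theta \mid x^n)\, d\theta = 1 - \alpha
$$

so $C$ is a $1 - \alpha$ posterior interval.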
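
An equal-tail posterior interval of this kind can be read off a numerically accumulated posterior CDF. The sketch below again assumes Bernoulli data and a flat prior; `equal_tail_interval` is a hypothetical helper, and the endpoints are only accurate to about the grid spacing.

```python
def equal_tail_interval(s, n, alpha=0.05, num_points=10_000):
    """Approximate (a, b) with posterior mass alpha/2 in each tail.

    Assumes Bernoulli data (s successes in n trials) and a flat prior.
    """
    grid = [(i + 0.5) / num_points for i in range(num_points)]
    weights = [t**s * (1 - t) ** (n - s) for t in grid]
    total = sum(weights)
    a = b = None
    cumulative = 0.0
    for theta, w in zip(grid, weights):
        cumulative += w / total  # running approximation of the posterior CDF
        if a is None and cumulative >= alpha / 2:
            a = theta            # lower tail has reached mass alpha/2
        if b is None and cumulative >= 1 - alpha / 2:
            b = theta            # upper tail has mass alpha/2 left
            break
    return a, b


a, b = equal_tail_interval(s=7, n=10)  # hypothetical data
print(f"approximate 95% posterior interval: ({a:.3f}, {b:.3f})")
```

Unlike a confidence interval, the resulting statement is about the parameter given the observed data: under the assumed model and prior, the posterior probability that the parameter lies in the interval is approximately $1 - \alpha$.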

Functions of Parameters

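Suppose we want to make inferences about a function $\tau = g(\theta)$. The posterior CDF for $\tau$ is

$$
H(\tau \mid x^n) = \mathbb{P}(g(\theta) \le \tau \mid x^n) = \int_A f(\theta \mid x^n)\, d\theta
$$

where $A = \{\theta : g(\theta) \le \tau\}$. The posterior density is $h(\tau \mid x^n) = H'(\tau \mid x^n)$.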
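
When the integral over $A$ is awkward, one alternative is to estimate the posterior CDF of $g(\theta)$ by simulation: draw values of the parameter from its posterior and count how often the transformed draw falls at or below a given point. This is a Monte Carlo approximation rather than the integral itself. The sketch assumes Bernoulli data with a flat prior, so the posterior is Beta(s + 1, n − s + 1), and takes $g$ to be the log-odds; both choices, along with the function names, are illustrative assumptions.

```python
import math
import random


def posterior_cdf_of_tau(tau, draws, g):
    """Estimate P(g(theta) <= tau | data) as a fraction of posterior draws."""
    return sum(1 for theta in draws if g(theta) <= tau) / len(draws)


def log_odds(theta):
    return math.log(theta / (1 - theta))


random.seed(0)  # fixed seed so the sketch is reproducible
s, n = 7, 10    # hypothetical: 7 successes in 10 Bernoulli trials
# Under a flat prior, the posterior of theta is Beta(s + 1, n - s + 1).
draws = [random.betavariate(s + 1, n - s + 1) for _ in range(20_000)]

# Since log-odds is increasing, P(tau <= 0 | data) equals P(theta <= 1/2 | data).
estimate = posterior_cdf_of_tau(0.0, draws, log_odds)
print(f"estimated posterior probability that the log-odds is <= 0: {estimate:.3f}")
```

The same draws can be reused for any other function of the parameter by passing a different `g`.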

Sources

  • Wasserman, L. (2010). All of Statistics: A Concise Course in Statistical Inference. Chapter 11.
