Bayesian Inference
The Bayesian Philosophy
The frequentist (or classical) point of view is based on the following postulates:
- Probability refers to limiting relative frequencies. Probabilities are objective properties of the real world.
- Parameters are fixed, unknown constants. Because they are not fluctuating, no useful probability statements can be made about parameters.
- Statistical procedures should be designed to have well-defined long-run frequency properties. For example, a 95 percent confidence interval should trap the true value of the parameter with limiting frequency at least 95 percent.
There is another approach to inference called Bayesian inference. The Bayesian approach is based on the following postulates:
- Probability describes degree of belief, not limiting frequencies. As such, we can make probability statements about lots of things, not just data which are subject to random variation. For example, one can say that “the probability that Albert Einstein drank a cup of tea on August 1, 1948” is 0.35. This does not refer to any limiting frequency. It reflects the strength of belief that the proposition is true.
- We can make probability statements about parameters, even though they are fixed constants.
- We make inferences about a parameter $\theta$ by producing a probability distribution for $\theta$. Inferences such as point estimates and interval estimates may then be extracted from this distribution.
The Bayesian Method
Bayesian inference is usually carried out in the following way:
- We choose a probability density $f(\theta)$, called the prior distribution, that expresses our beliefs about a parameter $\theta$ before we see any data.
- We choose a statistical model $f(x \mid \theta)$ that reflects our beliefs about $x$ given $\theta$. Notice that we write this as $f(x \mid \theta)$ instead of $f(x; \theta)$.
- After observing the data $X_1, \dots, X_n$, we update our beliefs and calculate the posterior distribution $f(\theta \mid X_1, \dots, X_n)$.
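For discrete random variables, writing $\Theta$ for the parameter treated as a random variable and $X$ for a single observation,
$$
P(\Theta = \theta \mid X = x) = \frac{P(X = x \mid \Theta = \theta)\, P(\Theta = \theta)}{\sum_{\theta'} P(X = x \mid \Theta = \theta')\, P(\Theta = \theta')}
$$
which is Bayes' theorem. For continuous random variables, a similar version is obtained with density functions:
$$
f(\theta \mid x) = \frac{f(x \mid \theta)\, f(\theta)}{\int f(x \mid \theta)\, f(\theta)\, d\theta}
$$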
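As a small illustration of the discrete form of Bayes' theorem, the sketch below updates a prior over two candidate parameter values after a single observation. The function name `discrete_posterior` and the two-coin setup are hypothetical, chosen only for this example.

```python
def discrete_posterior(prior, likelihood):
    """Return P(Theta = theta | X = x) for each candidate theta.

    prior:      dict mapping theta -> P(Theta = theta)
    likelihood: dict mapping theta -> P(X = x | Theta = theta) for the observed x
    """
    # Numerator of Bayes' theorem for each candidate value of theta.
    joint = {theta: likelihood[theta] * prior[theta] for theta in prior}
    # Denominator: P(X = x), the sum of the numerators over all theta.
    evidence = sum(joint.values())
    return {theta: p / evidence for theta, p in joint.items()}


# Hypothetical example: a coin is either fair (theta = 0.5) or biased
# (theta = 0.8), with equal prior belief, and we observe one head (x = 1).
prior = {0.5: 0.5, 0.8: 0.5}
likelihood_of_head = {theta: theta for theta in prior}  # P(X = 1 | theta) = theta
posterior = discrete_posterior(prior, likelihood_of_head)
print(posterior)
```

Observing a head shifts belief toward the biased coin, because that value of the parameter makes the observation more likely.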
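If we have $n$ IID observations $X_1, \dots, X_n$, we replace the single-observation model $f(x \mid \theta)$ with the joint density
$$
f(x_1, \dots, x_n \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta) = \mathcal{L}_n(\theta).
$$
Now, using $x^n$ to denote $(x_1, \dots, x_n)$,
$$
f(\theta \mid x^n) = \frac{f(x^n \mid \theta)\, f(\theta)}{\int f(x^n \mid \theta)\, f(\theta)\, d\theta} = \frac{\mathcal{L}_n(\theta)\, f(\theta)}{c_n}
$$
where
$$
c_n = \int \mathcal{L}_n(\theta)\, f(\theta)\, d\theta
$$
is called the normalizing constant. Since $c_n$ does not depend on $\theta$, the posterior is determined up to a constant factor:
$$
f(\theta \mid x^n) \propto \mathcal{L}_n(\theta)\, f(\theta).
$$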
Tip
Posterior is proportional to Likelihood times Prior
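The proportionality can be checked numerically. The following is a minimal sketch, assuming Bernoulli data and a flat prior on $[0, 1]$ (both are assumptions made for illustration, not part of the text above). It multiplies likelihood by prior on a grid and then divides by a trapezoidal approximation of the normalizing constant. The name `grid_posterior` is hypothetical.

```python
def grid_posterior(data, prior, grid_size=1001):
    """Approximate the posterior density of a Bernoulli parameter on a grid."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    s = sum(data)   # number of successes
    n = len(data)   # number of observations
    # Unnormalized posterior: likelihood times prior at each grid point.
    unnorm = [(t ** s) * ((1 - t) ** (n - s)) * prior(t) for t in thetas]
    # Normalizing constant, approximated with the trapezoidal rule.
    h = thetas[1] - thetas[0]
    c_n = h * (sum(unnorm) - 0.5 * (unnorm[0] + unnorm[-1]))
    return thetas, [u / c_n for u in unnorm]


data = [1, 0, 1, 1, 0, 1, 1, 1]  # hypothetical Bernoulli observations
thetas, post = grid_posterior(data, prior=lambda t: 1.0)  # flat prior
mode = thetas[post.index(max(post))]
print(mode)
```

With a flat prior the posterior has the same shape as the likelihood, so the posterior mode lands on the sample proportion of successes.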
Now that we have the posterior distribution, we can get a point estimate by summarizing the center of the posterior. Typically, we use the mean or mode of the posterior.
Definition
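The posterior mean is
$$
\bar{\theta}_n = \int \theta\, f(\theta \mid x^n)\, d\theta
$$
where $x^n = (x_1, \dots, x_n)$ denotes the observed data.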
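We can also obtain a Bayesian interval estimate. We find $a$ and $b$ such that
$$
\int_{-\infty}^{a} f(\theta \mid x^n)\, d\theta = \int_{b}^{\infty} f(\theta \mid x^n)\, d\theta = \alpha / 2.
$$
Definition
Let $C = (a, b)$. Then
$$
P(\theta \in C \mid x^n) = \int_{a}^{b} f(\theta \mid x^n)\, d\theta = 1 - \alpha
$$
so $C$ is a $1 - \alpha$ posterior interval.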
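Both summaries can be approximated from a posterior evaluated on a grid. The sketch below is one possible approach, not a definitive implementation: it uses trapezoid weights for the integrals and walks the cumulative posterior mass to locate the two tail cut-offs. The function name `posterior_summary` and the example density (proportional to $\theta^6 (1 - \theta)^2$) are assumptions made for illustration.

```python
def posterior_summary(thetas, density, alpha=0.05):
    """Posterior mean and equal-tailed 1 - alpha interval from a gridded density."""
    h = thetas[1] - thetas[0]
    # Trapezoid weights, so weighted sums approximate integrals over the grid.
    weights = [h] * len(thetas)
    weights[0] = weights[-1] = h / 2
    mass = [w * d for w, d in zip(weights, density)]
    total = sum(mass)  # normalizing constant, in case density is unnormalized
    mean = sum(t * m for t, m in zip(thetas, mass)) / total
    # Walk the cumulative mass to find a and b with about alpha / 2 in each tail.
    a = b = None
    cum = 0.0
    for t, m in zip(thetas, mass):
        cum += m / total
        if a is None and cum >= alpha / 2:
            a = t
        if b is None and cum >= 1 - alpha / 2:
            b = t
    return mean, (a, b)


# Hypothetical unnormalized posterior on a grid over [0, 1].
thetas = [i / 2000 for i in range(2001)]
density = [t ** 6 * (1 - t) ** 2 for t in thetas]
mean, (a, b) = posterior_summary(thetas, density)
print(mean, a, b)
```

The endpoints are only accurate to the grid spacing, so a finer grid gives a tighter approximation of the interval.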
Functions of Parameters
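Suppose we want to make inferences about a function $\tau = g(\theta)$. The posterior CDF for $\tau$ is
$$
H(\tau \mid x^n) = P(g(\theta) \le \tau \mid x^n) = \int_{A} f(\theta \mid x^n)\, d\theta
$$
where $A = \{\theta : g(\theta) \le \tau\}$.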
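The posterior probability that a function of the parameter falls below a threshold can be approximated by integrating a gridded posterior over the set of parameter values that satisfy the inequality. The sketch below assumes a posterior proportional to $\theta^6 (1 - \theta)^2$ and takes the log-odds as the function of interest. Both choices, and the name `posterior_cdf_of_function`, are hypothetical.

```python
import math


def posterior_cdf_of_function(thetas, density, g, tau):
    """Approximate P(g(theta) <= tau | data) from a gridded posterior density."""
    h = thetas[1] - thetas[0]
    total = sum(density) * h  # Riemann-sum normalizing constant
    # Integrate the posterior over the set {theta : g(theta) <= tau}.
    in_set = sum(d for t, d in zip(thetas, density) if g(t) <= tau) * h
    return in_set / total


# Hypothetical setup: unnormalized posterior on a midpoint grid over (0, 1).
n_grid = 2000
thetas = [(i + 0.5) / n_grid for i in range(n_grid)]  # midpoints avoid 0 and 1
density = [t ** 6 * (1 - t) ** 2 for t in thetas]


def log_odds(t):
    return math.log(t / (1 - t))


prob = posterior_cdf_of_function(thetas, density, log_odds, tau=0.0)
print(prob)
```

Here the event that the log-odds is at most zero is the same as the event that the parameter is at most one half, which gives a simple sanity check on the result.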
Sources
- Wasserman, L. (2010). All of Statistics: A Concise Course in Statistical Inference. Chapter 11.