Bootstrapping

Definition

The bootstrap is a method for estimating standard errors and computing confidence intervals.

Let $T_n = g(X_1, \ldots, X_n)$ be a statistic, that is, $T_n$ is any function of the data. Suppose we want to know $\mathbb{V}_F(T_n)$, the variance of $T_n$. Note that the subscript $F$ emphasizes that the variance usually depends on the unknown distribution $F$. For example, if $T_n = \overline{X}_n$, then $\mathbb{V}_F(T_n) = \sigma^2/n$, where $\sigma^2 = \int (x - \mu)^2 \, dF(x)$ and $\mu = \int x \, dF(x)$. Thus, the variance of $T_n$ is a function of $F$.

Intuition

If we only use the observed sample directly, we can certainly compute the statistic itself, such as the sample mean, median, or correlation. But that gives us only one realized value of the statistic. It does not tell us how much that statistic would change if we drew a different sample from the population.

That is the real goal of the bootstrap: not to recompute the statistic for its own sake, but to approximate the sampling distribution of the statistic. Once we have an approximation to that sampling distribution, we can estimate standard errors, variances, and confidence intervals.

Since the true population distribution $F$ is unknown, we replace it by the empirical distribution $\widehat{F}_n$, which puts mass $1/n$ on each observed data point. Resampling with replacement from the observed sample is exactly the same as drawing an IID sample from $\widehat{F}_n$. Repeating this many times lets us see how $T_n$ would vary if $\widehat{F}_n$ were the population.
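As a minimal sketch in Python with NumPy (the normal sample and the choice of the median as the statistic are illustrative, not part of the method), resampling with replacement from the observed data is literally an IID draw from the empirical distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative observed sample of size n = 50 (any data would do).
x = rng.normal(loc=5.0, scale=2.0, size=50)

# One bootstrap sample: n draws with replacement from the observed data.
# This is exactly an IID sample of size n from the empirical distribution.
boot_sample = rng.choice(x, size=x.size, replace=True)

# Each bootstrap sample yields one realization of the statistic,
# here the median as an illustrative choice.
boot_median = np.median(boot_sample)
```

Repeating the last two lines many times produces many realizations of the statistic, which is the raw material for everything that follows.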

The bootstrap has two steps:

  1. Estimate $\mathbb{V}_F(T_n)$ with $\mathbb{V}_{\widehat{F}_n}(T_n)$.
  2. Approximate $\mathbb{V}_{\widehat{F}_n}(T_n)$ using simulation.

For $T_n = \overline{X}_n$, we have for Step 1 that $\mathbb{V}_{\widehat{F}_n}(T_n) = \widehat{\sigma}^2 / n$, where $\widehat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \overline{X}_n)^2$. In this case, Step 1 is enough. However, in more complicated cases, we cannot write down a simple formula for $\mathbb{V}_{\widehat{F}_n}(T_n)$, which is why we need Step 2.
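For the sample mean both steps can be carried out and compared. A minimal sketch (Python with NumPy; the exponential sample and $B = 20{,}000$ replications are illustrative choices): the simulated bootstrap variance should agree closely with the plug-in formula from Step 1.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=3.0, size=100)  # illustrative sample
n = x.size

# Step 1 for T_n = sample mean: plug-in variance sigma_hat^2 / n,
# with sigma_hat^2 = (1/n) * sum (X_i - Xbar)^2.
sigma2_hat = np.mean((x - x.mean()) ** 2)
plugin_var = sigma2_hat / n

# Step 2: approximate the same quantity by simulation.  Draw B bootstrap
# samples from the empirical distribution and take the variance of the
# resulting sample means.
B = 20_000
boot_means = np.array([rng.choice(x, size=n, replace=True).mean()
                       for _ in range(B)])
boot_var = boot_means.var()

# For the sample mean, both routes estimate the same quantity, so
# boot_var is close to plugin_var once B is large.
```

For a statistic like the median or a correlation, the plug-in formula in Step 1 is not available in closed form, but the simulation in Step 2 works unchanged.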

Simulation

Suppose we draw an IID sample $Y_1, \ldots, Y_B$ from a distribution $G$. By the Law of Large Numbers,

$$\overline{Y}_B = \frac{1}{B} \sum_{j=1}^{B} Y_j \xrightarrow{P} \int y \, dG(y) = \mathbb{E}(Y)$$

as $B \to \infty$. So if we draw a large sample from $G$, we can use the sample mean $\overline{Y}_B$ to approximate $\mathbb{E}(Y)$. In a simulation, we can make $B$ as large as we like, in which case the difference between $\overline{Y}_B$ and $\mathbb{E}(Y)$ is negligible.

More generally, if $h$ is any function with finite mean, then

$$\frac{1}{B} \sum_{j=1}^{B} h(Y_j) \xrightarrow{P} \int h(y) \, dG(y)$$

as $B \to \infty$. In particular,

$$\frac{1}{B} \sum_{j=1}^{B} \left( Y_j - \overline{Y}_B \right)^2 \xrightarrow{P} \int y^2 \, dG(y) - \left( \int y \, dG(y) \right)^2 = \mathbb{V}(Y).$$

Hence, we can use the sample variance of the simulated values to approximate $\mathbb{V}(Y)$.
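These limits are easy to check numerically. A minimal sketch, taking $G$ to be Uniform$(0, 1)$ as an illustrative choice (its mean is $1/2$ and its variance is $1/12$):

```python
import numpy as np

rng = np.random.default_rng(2)

# Draw a large IID sample from G = Uniform(0, 1).
B = 200_000
y = rng.uniform(0.0, 1.0, size=B)

mc_mean = y.mean()  # approximates E(Y) = 1/2
mc_var = y.var()    # approximates V(Y) = 1/12
```

In the bootstrap, the same idea is applied with $G = \widehat{F}_n$ and $Y_j$ replaced by the statistic computed on the $j$-th bootstrap sample.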

Limitations

The bootstrap is powerful, but it is not automatic or universally reliable.

  • It only uses the observed sample. If the sample is biased or unrepresentative, the bootstrap inherits that problem.
  • With small samples, the empirical distribution may be a poor approximation to the true distribution , so the bootstrap approximation can be inaccurate.
  • It cannot recover behavior that is missing from the data, such as rare events or poorly observed tail behavior.
  • The ordinary bootstrap assumes the data are IID. For dependent data, such as time series or clustered observations, the usual bootstrap should not be used without modification.
  • Some statistics are not well approximated by the bootstrap, especially irregular or non-smooth ones such as maxima, minima, or post-model-selection estimators.
  • Bootstrap confidence intervals can have poor coverage in small samples or highly skewed settings unless more careful variants are used.
  • It can be computationally expensive when the statistic is costly to recompute many times.
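The failure for non-smooth statistics can be seen directly for the sample maximum: a large fraction of bootstrap replications reproduce the observed maximum exactly, so the bootstrap distribution has a point mass that the true, continuous sampling distribution lacks. A minimal sketch, with Uniform$(0, 1)$ data as an illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(3)
n, B = 100, 5_000
x = rng.uniform(0.0, 1.0, size=n)  # illustrative Uniform(0, 1) data
x_max = x.max()

# Fraction of bootstrap replications whose maximum equals the observed
# maximum.  Theory: this tends to 1 - (1 - 1/n)^n, roughly 0.63, for
# large n, since a bootstrap sample misses the largest point with
# probability (1 - 1/n)^n.
hits = sum(rng.choice(x, size=n, replace=True).max() == x_max
           for _ in range(B))
point_mass = hits / B

# The maximum of a sample from a continuous distribution has no point
# mass, so this large atom shows the bootstrap approximation breaking
# down for this statistic.
```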

So the bootstrap is most useful when the sample is reasonably representative, the IID assumption is plausible, and the statistic has a complicated sampling distribution but behaves regularly.