The Permutation Test

The permutation test is a nonparametric method for testing whether two distributions are the same. This test is exact, meaning that it is not based on large sample theory or approximations.

Suppose that $X_{1}, \dots, X_{m} \sim F_{X}$ and $Y_{1}, \dots, Y_{n} \sim F_{Y}$ are two independent samples and $H_{0}$ is the hypothesis that the two samples are identically distributed. This is the type of hypothesis we would consider when testing whether a treatment differs from a placebo. More precisely, we are testing

$H_{0} : F_{X} = F_{Y} \quad \text{versus} \quad H_{1} : F_{X} \neq F_{Y} .$

Let $T (x_{1}, \dots, x_{m}, y_{1}, \dots, y_{n})$ be some test statistic. For example,

$T (X_{1}, \dots, X_{m}, Y_{1}, \dots, Y_{n}) = \overline{X}_{m} - \overline{Y}_{n} .$

Let $N = m + n$ and consider forming all $N!$ permutations of the data $X_{1}, \dots, X_{m}, Y_{1}, \dots, Y_{n}$ . For each permutation, compute the test statistic $T$ . Denote these values by $T_{1}, \dots, T_{N!}$ . Under the null hypothesis, each of these values is equally likely. The distribution $P_{0}$ that puts mass $1/ N!$ on each $T_{j}$ is called the permutation distribution of $T$ . Let $t_{obs}$ be the observed value of the test statistic. Assuming we reject when $T$ is large, the p-value is

$\text{p-value} = P_{0} (T > t_{obs}) = \frac{1}{N!} \sum_{j=1}^{N!} I (T_{j} > t_{obs}) .$
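For very small samples the permutation distribution can be enumerated directly. The following is a minimal sketch, assuming the difference-in-means statistic from the example above; the function name `exact_permutation_p_value` and the toy data are hypothetical and chosen only so that $N! = 120$ stays small.

```python
from itertools import permutations
from math import factorial


def exact_permutation_p_value(x, y):
    """Exact p-value P_0(T > t_obs) for T = mean(first m) - mean(last n)."""
    m = len(x)
    pooled = list(x) + list(y)

    def statistic(values):
        first, second = values[:m], values[m:]
        return sum(first) / len(first) - sum(second) / len(second)

    # Observed value: the statistic on the data in its original order.
    t_obs = statistic(pooled)

    # Enumerate all N! orderings; each carries mass 1/N! under H_0.
    count = sum(1 for perm in permutations(pooled) if statistic(perm) > t_obs)
    return count / factorial(len(pooled))


# Hypothetical toy data (m = 3, n = 2, so N! = 120).
x = [15.0, 12.0, 10.0]
y = [14.0, 9.0]
# 36 of the 120 orderings give T > t_obs, so this prints 0.3.
print(exact_permutation_p_value(x, y))
```

Since this $T$ depends only on which values fall in the first $m$ positions, enumerating the $\binom{N}{m}$ possible splits would give the same p-value with far less work; the full $N!$ enumeration is used here only to mirror the definition.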

Usually it is not practical to evaluate all $N!$ permutations. We can approximate the p-value by sampling randomly from the set of permutations. The fraction of times $T_{j} > t_{obs}$ among these samples approximates the p-value.

Algorithm for Permutation Test

Compute the observed value of the test statistic

$t_{obs} = T (X_{1}, \dots, X_{m}, Y_{1}, \dots, Y_{n}) .$

Randomly permute the data. Compute the statistic again using the permuted data.

Repeat the previous step $B$ times and let $T_{1}, \dots, T_{B}$ denote the resulting values.

The approximate p-value is

$\frac{1}{B} \sum_{j=1}^{B} I (T_{j} > t_{obs}) .$
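The algorithm can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: it assumes the difference-in-means statistic, and the names `permutation_test`, `num_permutations` (playing the role of $B$) and `seed` are hypothetical.

```python
import random


def permutation_test(x, y, num_permutations=10_000, seed=0):
    """Approximate P_0(T > t_obs) for T = mean(first m) - mean(last n)."""
    rng = random.Random(seed)
    m = len(x)
    pooled = list(x) + list(y)

    def statistic(values):
        first, second = values[:m], values[m:]
        return sum(first) / len(first) - sum(second) / len(second)

    # Observed value of the test statistic on the original data.
    t_obs = statistic(pooled)

    # Randomly permute the pooled data B times, recomputing T each time.
    exceed = 0
    for _ in range(num_permutations):
        rng.shuffle(pooled)
        if statistic(pooled) > t_obs:
            exceed += 1

    # Fraction of permuted statistics strictly larger than t_obs.
    return t_obs, exceed / num_permutations


# Hypothetical toy data.
t_obs, p_value = permutation_test([15.0, 12.0, 10.0], [14.0, 9.0])
print(t_obs, p_value)
```

The comparison is strict, matching $I (T_{j} > t_{obs})$. With non-integer data, floating-point rounding can make permutations that tie with $t_{obs}$ compare as slightly larger or smaller, so results near ties should be read with some care.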

Tip

In large samples, the permutation test usually gives similar results to a test that is based on large sample theory. The permutation test is thus most useful for small samples.
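One way to see this is to compare the two p-values on a moderately large simulated data set. The sketch below makes several assumptions that are not part of the text above: the large-sample test is taken to be a one-sided normal-approximation test for the difference in means, the data are simulated standard normal draws, and all function names are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance


def permutation_p_value(x, y, num_permutations, rng):
    """Approximate permutation p-value for T = mean(x) - mean(y)."""
    m = len(x)
    pooled = list(x) + list(y)

    def statistic(values):
        return sum(values[:m]) / m - sum(values[m:]) / (len(values) - m)

    t_obs = statistic(pooled)
    exceed = 0
    for _ in range(num_permutations):
        rng.shuffle(pooled)
        if statistic(pooled) > t_obs:
            exceed += 1
    return exceed / num_permutations


def large_sample_p_value(x, y):
    """One-sided p-value from a normal approximation (an assumed choice)."""
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    z = (mean(x) - mean(y)) / se
    return 1.0 - NormalDist().cdf(z)


# Simulated data under H_0: both samples come from the same distribution.
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [rng.gauss(0.0, 1.0) for _ in range(200)]

print(permutation_p_value(x, y, 4000, rng))
print(large_sample_p_value(x, y))
```

With 200 observations per sample the two printed values would be expected to be close, differing mainly by the Monte Carlo error of the permutation estimate.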

Jake Tuero
