The Wald Test

Let $\theta$ be a scalar parameter, let $\hat{\theta}$ be an estimate of $\theta$, and let $\widehat{\mathrm{se}}$ be the estimated standard error of $\hat{\theta}$.

In the general hypothesis testing framework, a test is built by choosing a test statistic $T$ and a critical value $c$, then rejecting $H_0$ when $T$ looks too extreme under $H_0$. The Wald test is one of the most common ways to do this when an estimator is approximately Normal.

Definition (The Wald Test)

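Consider testing

$$H_0 : \theta = \theta_0 \quad \text{versus} \quad H_1 : \theta \neq \theta_0.$$

Assume that $\hat{\theta}$ is asymptotically Normal:

$$\frac{\hat{\theta} - \theta_0}{\widehat{\mathrm{se}}} \rightsquigarrow N(0, 1).$$

The size $\alpha$ Wald test is: reject $H_0$ when $|W| > z_{\alpha/2}$, where $z_{\alpha/2} = \Phi^{-1}(1 - \alpha/2)$ is the upper $\alpha/2$ quantile of the standard Normal distribution and

$$W = \frac{\hat{\theta} - \theta_0}{\widehat{\mathrm{se}}}.$$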

Theorem

Asymptotically, the Wald test has size $\alpha$, that is,

$$\mathbb{P}_{\theta_0}\left( |W| > z_{\alpha/2} \right) \to \alpha$$

as $n \to \infty$.
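The theorem can be checked numerically. The following is a minimal simulation sketch, not part of the original notes: it assumes Normal data with unit variance, the sample mean as $\hat{\theta}$, and the sample standard deviation divided by $\sqrt{n}$ as $\widehat{\mathrm{se}}$. The function name `wald_rejection_rate` and all default values are illustrative choices.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def wald_rejection_rate(n=100, reps=2000, alpha=0.05, theta0=0.0, seed=0):
    """Estimate by simulation how often the Wald test rejects a true null."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    rejections = 0
    for _ in range(reps):
        # Draw a sample for which H0 holds: the true mean equals theta0.
        x = [rng.gauss(theta0, 1.0) for _ in range(n)]
        theta_hat = mean(x)
        se_hat = stdev(x) / math.sqrt(n)  # estimated standard error of the mean
        w = (theta_hat - theta0) / se_hat
        if abs(w) > z:
            rejections += 1
    return rejections / reps


rate = wald_rejection_rate()
print(f"empirical rejection rate under H0: {rate:.3f}")
```

With a moderately large $n$, the printed rate should land close to the nominal $\alpha = 0.05$, up to simulation noise.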

Intuition

The Wald statistic

$$W = \frac{\hat{\theta} - \theta_0}{\widehat{\mathrm{se}}}$$

measures how many estimated standard errors the observed estimate $\hat{\theta}$ is away from the null value $\theta_0$.

If $H_0$ is true, then $\theta = \theta_0$, so we expect

$$W \approx N(0, 1).$$

That means values of $W$ near $0$ are typical under the null, while very large positive or negative values are unlikely under the null. So the Wald test says:

Intuition

Pretend the null value is the truth. Then ask: would an estimate this far from $\theta_0$, relative to its usual noise level, be surprising?

If yes, reject $H_0$. If no, keep $H_0$.
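The decision rule just described can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `wald_test` and the numbers passed to it are hypothetical, and the critical value comes from the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist


def wald_test(theta_hat, se_hat, theta0, alpha=0.05):
    """Return (W, critical value, reject?) for the two-sided Wald test."""
    w = (theta_hat - theta0) / se_hat        # distance from the null in standard errors
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    return w, z, abs(w) > z


# Hypothetical numbers: estimate 1.3, estimated standard error 0.5, null value 0.
w, z, reject = wald_test(theta_hat=1.3, se_hat=0.5, theta0=0.0)
print(f"W = {w:.2f}, critical value = {z:.2f}, reject H0: {reject}")
```

Here the estimate sits 2.6 estimated standard errors from the null value, which exceeds the critical value of about 1.96, so the sketch reports a rejection at the 5% level.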

This is exactly the general hypothesis testing template with

$$T = |W|, \qquad c = z_{\alpha/2}.$$

So the rejection region is

$$R = \left\{ x : |W(x)| > z_{\alpha/2} \right\}.$$

Why Standardize?

The raw difference $\hat{\theta} - \theta_0$ by itself is not enough, because the same absolute difference can be very meaningful or not meaningful depending on the estimator’s variability.

  • If $\widehat{\mathrm{se}}$ is small, even a modest difference from $\theta_0$ is strong evidence against $H_0$.
  • If $\widehat{\mathrm{se}}$ is large, the same difference may just be sampling noise.

Dividing by $\widehat{\mathrm{se}}$ puts the discrepancy on a common scale: “how many noise levels away from the null are we?”
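To make the point concrete, the short sketch below holds the raw difference fixed and varies only the standard error. The difference of 0.5 and the three standard errors are made-up values chosen for illustration.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)  # two-sided 5% critical value, about 1.96
difference = 0.5                 # hypothetical value of theta_hat - theta0

results = {}
for se_hat in (0.05, 0.2, 1.0):  # hypothetical estimated standard errors
    w = difference / se_hat
    results[se_hat] = w
    verdict = "reject H0" if abs(w) > z else "do not reject H0"
    print(f"se = {se_hat}: W = {w:.1f} -> {verdict}")
```

The same difference of 0.5 leads to a rejection when the standard error is 0.05 or 0.2, and to no rejection when it is 1.0.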

Connection to Confidence Intervals

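The Wald test is the testing version of a normal-based confidence interval. The usual approximate $1 - \alpha$ interval is

$$C = \left( \hat{\theta} - z_{\alpha/2}\, \widehat{\mathrm{se}},\ \hat{\theta} + z_{\alpha/2}\, \widehat{\mathrm{se}} \right).$$

The null hypothesis is rejected exactly when $\theta_0$ is outside this interval, since

$$|W| = \frac{|\hat{\theta} - \theta_0|}{\widehat{\mathrm{se}}} > z_{\alpha/2}$$

is equivalent to saying that $\theta_0$ lies more than $z_{\alpha/2}$ standard errors away from $\hat{\theta}$.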

This is often the cleanest way to remember the Wald test:

Note

A Wald test rejects a null value precisely when that null value is not plausible according to the corresponding normal-based confidence interval.
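The duality between the test and the interval can be verified directly. The sketch below uses hypothetical values for the estimate and its standard error, and illustrative helper names `wald_reject` and `wald_interval`. For each candidate null value it compares the test decision with whether that value falls outside the interval.

```python
from statistics import NormalDist


def wald_reject(theta_hat, se_hat, theta0, alpha=0.05):
    """Two-sided Wald test decision for the null value theta0."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return abs((theta_hat - theta0) / se_hat) > z


def wald_interval(theta_hat, se_hat, alpha=0.05):
    """Normal-based approximate 1 - alpha confidence interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return theta_hat - z * se_hat, theta_hat + z * se_hat


theta_hat, se_hat = 2.0, 0.4  # hypothetical estimate and standard error
lo, hi = wald_interval(theta_hat, se_hat)
print(f"95% interval: ({lo:.3f}, {hi:.3f})")

agreement = []
for theta0 in (0.5, 1.0, 1.5, 2.5, 3.0):  # hypothetical null values
    outside = not (lo <= theta0 <= hi)
    rejected = wald_reject(theta_hat, se_hat, theta0)
    agreement.append(rejected == outside)
    print(f"theta0 = {theta0}: outside interval = {outside}, rejected = {rejected}")
```

For every null value tried, the two columns agree: the test rejects exactly the values that the interval excludes.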

Example

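Suppose $X_1, \ldots, X_n \sim \text{Bernoulli}(p)$ and we want to test

$$H_0 : p = p_0 \quad \text{versus} \quad H_1 : p \neq p_0.$$

Let $\hat{p} = \frac{1}{n} \sum_{i=1}^{n} X_i$. Since

$$\mathbb{V}(\hat{p}) = \frac{p(1 - p)}{n},$$

we estimate the standard error by

$$\widehat{\mathrm{se}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}.$$

The Wald statistic is then

$$W = \frac{\hat{p} - p_0}{\widehat{\mathrm{se}}} = \frac{\hat{p} - p_0}{\sqrt{\hat{p}(1 - \hat{p})/n}}.$$

If $|W| > z_{\alpha/2}$, we reject $H_0$.

Informally: if the observed sample proportion $\hat{p}$ is several estimated standard errors away from the hypothesized proportion $p_0$, then $H_0$ is not a convincing explanation for the data.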
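A numerical version of this example might look as follows. The data (60 successes in 100 trials) and the null value 0.5 are invented for illustration, and `wald_test_proportion` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def wald_test_proportion(successes, n, p0, alpha=0.05):
    """Two-sided Wald test for a Bernoulli proportion.

    Assumes 0 < successes < n; otherwise the estimated standard error is 0
    and the statistic is undefined.
    """
    p_hat = successes / n
    se_hat = math.sqrt(p_hat * (1 - p_hat) / n)
    w = (p_hat - p0) / se_hat
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return p_hat, se_hat, w, abs(w) > z


# Hypothetical data: 60 successes in 100 trials, testing p0 = 0.5.
p_hat, se_hat, w, reject = wald_test_proportion(60, 100, 0.5)
print(f"p_hat = {p_hat:.2f}, se = {se_hat:.4f}, W = {w:.2f}, reject H0: {reject}")
```

With these numbers the estimated standard error is about 0.049 and the statistic is about 2.04, just above the 5% critical value of about 1.96, so the null is rejected.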

Caveat

The Wald test is simple and widely used, but it can be inaccurate in small samples, for skewed estimators, or when the parameter is near the boundary of its parameter space. In those settings, score tests, likelihood-ratio tests, or bootstrap-based methods can behave better.
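The small-sample and boundary issue can be illustrated for the Bernoulli case by computing the exact rejection probability under the null from the binomial distribution, with no simulation. This is a sketch under one stated assumption: when $\widehat{\mathrm{se}} = 0$ (all failures or all successes), $|W|$ is treated as infinite, so the test rejects. Other conventions are possible and would change the numbers. The function name `exact_wald_size` and the chosen values of $n$ and $p_0$ are illustrative.

```python
import math
from statistics import NormalDist


def exact_wald_size(n, p0, alpha=0.05):
    """Exact probability that the Wald test for a proportion rejects when p = p0."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    size = 0.0
    for x in range(n + 1):
        p_hat = x / n
        se_hat = math.sqrt(p_hat * (1 - p_hat) / n)
        if se_hat == 0.0:
            # Assumed convention: |W| is infinite, so reject unless p_hat == p0.
            reject = p_hat != p0
        else:
            reject = abs(p_hat - p0) / se_hat > z
        if reject:
            # Binomial probability of observing x successes when p = p0.
            size += math.comb(n, x) * p0**x * (1 - p0) ** (n - x)
    return size


for n in (20, 50, 200):
    print(f"n = {n:3d}, p0 = 0.1: actual size = {exact_wald_size(n, 0.1):.3f}")
```

For $n = 20$ and $p_0 = 0.1$, the outcome of zero successes alone has probability $0.9^{20} \approx 0.12$ under the null, so under the assumed convention the actual size already exceeds the nominal 0.05.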