Errors & Power in Hypothesis Testing

No statistical test is perfect. We must understand the risks of making the wrong decision.

Type I and Type II Errors

When we decide to Reject or Fail to Reject $H_0$, there are four possible outcomes:

| Decision | $H_0$ is True | $H_0$ is False |
|---|---|---|
| Reject $H_0$ | Type I Error ($\alpha$): False Positive | Correct Decision ($1-\beta$): Power |
| Fail to Reject $H_0$ | Correct Decision ($1-\alpha$): Confidence | Type II Error ($\beta$): False Negative |
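
These error rates can be checked by simulation. The sketch below is a minimal illustration, assuming a two-sided one-sample Z-test with known $\sigma$ (all numbers are made up): it repeatedly samples from a population where $H_0$ is actually true and counts how often the test rejects, and that fraction should land near $\alpha$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05                 # significance level
mu0, sigma, n = 50, 10, 30   # hypothesized mean, known SD, sample size (illustrative)
n_sims = 10_000

rejections = 0
for _ in range(n_sims):
    sample = rng.normal(mu0, sigma, n)       # H0 is true: data really come from mu0
    z = (sample.mean() - mu0) / (sigma / np.sqrt(n))
    p_value = 2 * stats.norm.sf(abs(z))      # two-sided p-value
    if p_value <= alpha:
        rejections += 1                      # rejecting a true H0 = Type I error

print(f"Estimated Type I error rate: {rejections / n_sims:.3f}")  # should be near alpha
```

If the samples were instead drawn from a different true mean, the same counter would estimate power, which is the topic below.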
Legal Analogy

$H_0$: Defendant is Innocent.

  • Type I Error: Convicting an innocent person. (Very bad, so we keep $\alpha$ low).
  • Type II Error: Letting a guilty person go free.

Power of a Test ($1 - \beta$)

Power is the probability of correctly rejecting a false null hypothesis; in other words, it is the ability of the test to detect an effect that actually exists.

How to increase Power:

  • Increase the sample size ($n$), as shown in the simulation sketch after this list.
  • Increase the significance level ($\alpha$), though this also increases the Type I error risk.
  • Reduce variability ($\sigma$).
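
The sample-size effect can be made concrete with a small simulation. This is a minimal sketch, assuming the same two-sided one-sample Z-test with $\mu_0 = 50$, a true mean of 53, and known $\sigma = 10$ (all values are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, mu0, mu_true, sigma = 0.05, 50, 53, 10   # H0 mean, true mean, known SD (illustrative)
n_sims = 5_000

for n in (10, 30, 100, 300):
    rejects = 0
    for _ in range(n_sims):
        sample = rng.normal(mu_true, sigma, n)           # H0 is false: true mean is mu_true
        z = (sample.mean() - mu0) / (sigma / np.sqrt(n))
        if 2 * stats.norm.sf(abs(z)) <= alpha:
            rejects += 1                                 # correct rejection of a false H0
    print(f"n = {n:4d}  estimated power = {rejects / n_sims:.3f}")
```

The estimated power climbs toward 1 as $n$ grows, which is why sample size is usually the first lever to pull.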

The P-Value Approach

Instead of comparing a calculated test statistic (such as a Z-score) against a critical value, we often compare probabilities directly.

P-Value: The probability of seeing data at least as extreme as what was observed, assuming $H_0$ is true.

Decision Rule:

  • If $P\text{-value} \le \alpha \rightarrow$ Reject $H_0$. (Result is statistically significant).
  • If $P\text{-value} > \alpha \rightarrow$ Fail to Reject $H_0$.

"If P is low, the Null must go."

Test Yourself

Q1: A "False Positive" (detecting an effect that isn't there) corresponds to which error?

  • Type I Error
  • Type II Error
  • Standard Error

Q2: If you want to increase the Power of your test without changing $\alpha$, what should you do?

  • Decrease sample size
  • Increase sample size
  • Increase variance