Analysis of Variance (ANOVA)
Comparing the means of 3 or more groups simultaneously. Why not just do multiple t-tests? Because that increases the risk of Type I error!
The Concept
ANOVA splits the total variation in the data into two parts:
- Between-Group Variation: Differences caused by the treatment (e.g., different drugs).
- Within-Group Variation: Random noise or error.
$$ F = \frac{\text{Variance Between Groups}}{\text{Variance Within Groups}} $$
If $F$ is large, the treatment has a significant effect.
One-Way ANOVA
Used when there is only one factor (independent variable). Example: Effect of 3 different fertilizers on crop yield.
ANOVA Table
| Source | Sum of Squares (SS) | df | Mean Square (MS) | F |
|---|---|---|---|---|
| Between | SSB | $k-1$ | $MSB = SSB/df$ | $MSB/MSW$ |
| Within | SSW | $N-k$ | $MSW = SSW/df$ | |
| Total | SST | $N-1$ |
Two-Way ANOVA
Used when there are two factors. Example: Effect of Fertilizer AND Watering Frequency on crop yield.
It allows us to test for Interaction Effects (e.g., maybe Fertilizer A works best only with high watering).
Test Yourself
Q1: If the F-statistic is close to 1, what does it mean?