Chi-Square Tests ($\chi^2$)
Used for categorical (count) data. They answer two questions: does the data fit an expected distribution, and are two variables related?
Goodness of Fit Test
Tests whether the observed frequencies in a sample match a hypothesized population distribution (e.g., is a die fair?).
$$ \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} $$
$O_i$ = Observed Frequency
$E_i$ = Expected Frequency
Fair Die Example
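Roll a die 60 times. If the die is fair, the expected frequency is $E = 60/6 = 10$ for each face. If we get 15 ones, 5 twos, etc., we calculate $\chi^2$ and compare it to the critical value for $df = 6 - 1 = 5$ to see if the deviation is too large to attribute to chance.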
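The fair-die calculation can be sketched in a few lines of Python. This is a minimal illustration, not a full test routine: the counts for faces three to six are hypothetical (the example only specifies 15 ones and 5 twos), the function name `chi_square_gof` is made up for this sketch, and the comparison uses the standard table value 11.070 for $df = 5$ at the 5% significance level.

```python
def chi_square_gof(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# 60 rolls; counts for faces 3 to 6 are hypothetical, chosen to sum to 60.
die_observed = [15, 5, 10, 10, 12, 8]
n_faces = len(die_observed)

# Under the fair-die hypothesis every face is expected 60 / 6 = 10 times.
die_expected = [sum(die_observed) / n_faces] * n_faces

die_stat = chi_square_gof(die_observed, die_expected)
die_df = n_faces - 1              # goodness of fit: categories minus one
CRITICAL_5PCT_DF5 = 11.070        # chi-square table value, df = 5, alpha = 0.05

reject_fair_die = die_stat > CRITICAL_5PCT_DF5
print(f"chi2 = {die_stat:.2f}, df = {die_df}, reject fairness: {reject_fair_die}")
```

With these counts the statistic is $(25 + 25 + 0 + 0 + 4 + 4)/10 = 5.8$, which is below 11.070, so the deviation is not large enough to reject the hypothesis that the die is fair.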
Test of Independence
Tests whether two categorical variables are related (e.g., gender vs. preference for coffee or tea).
We use a Contingency Table.
Expected Frequency for a cell:
$$ E_{ij} = \frac{(\text{Row Total}) \times (\text{Column Total})}{\text{Grand Total}} $$
Degrees of freedom: $df = (r-1)(c-1)$, where $r$ is the number of rows and $c$ is the number of columns in the contingency table.
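As a rough sketch of how the expected frequencies, the $\chi^2$ statistic, and the degrees of freedom fit together, the Python below works through a 2×2 table. The counts are hypothetical (invented for illustration, not survey data), the function names are made up for this sketch, and 3.841 is the standard table value for $df = 1$ at the 5% significance level.

```python
def expected_counts(table):
    """E_ij = (row total * column total) / grand total for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)   # (r - 1)(c - 1)
    return stat, df

# Hypothetical counts. Rows: two gender groups; columns: coffee, tea.
preference_table = [
    [30, 20],
    [20, 30],
]

ind_expected = expected_counts(preference_table)
ind_stat, ind_df = chi_square_independence(preference_table)
CRITICAL_5PCT_DF1 = 3.841         # chi-square table value, df = 1, alpha = 0.05

reject_independence = ind_stat > CRITICAL_5PCT_DF1
print(f"chi2 = {ind_stat:.2f}, df = {ind_df}, reject independence: {reject_independence}")
```

Here every expected cell is $50 \times 50 / 100 = 25$, each cell contributes $(\pm 5)^2 / 25 = 1$, and the statistic is 4.0. Since 4.0 exceeds 3.841, these made-up counts would lead us to reject independence at the 5% level.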
Test Yourself
Q1: If $\chi^2 = 0$, what does that mean?