Requires:
- All conditions contain independent samples
- The dependent scores are normally distributed interval or ratio scores
- The variances of the populations are homogeneous
The n of each condition doesn't need to be equal, but it's way easier if they are. ANOVA only tests a two-tailed hypothesis (the means differ in either direction), but it's carried out as a one-tailed test because deviations in either direction produce the same positive F, and F can never be below 0.
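These requirements can be sanity-checked in code. A minimal sketch using SciPy (the score lists are the example data from the table further down; `shapiro` and `levene` are standard `scipy.stats` functions):

```python
from scipy import stats

# Example scores from the data table below, one list per condition.
easy      = [9, 12, 4, 8, 7]
medium    = [4, 6, 8, 2, 10]
difficult = [1, 3, 4, 5, 2]

# Shapiro-Wilk: rough check that each condition's scores look normal.
for name, scores in [("easy", easy), ("medium", medium), ("difficult", difficult)]:
    w, p = stats.shapiro(scores)
    print(f"{name}: Shapiro-Wilk p = {p:.3f}")

# Levene's test: homogeneity of variance across the conditions.
stat, p = stats.levene(easy, medium, difficult)
print(f"Levene p = {p:.3f}")  # a large p means no evidence the variances differ
```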
Diagram of an example study
Factor A: independent variable of perceived difficulty

Level A1: Easy | Level A2: Medium | Level A3: Difficult |
---|---|---|
X | X | X |
X | X | X |

k = 3 b/c there are 3 conditions.
This is similar to the two-sample t-test, but we can't just do a bunch of pairwise t-tests because the probability of making a Type I error gets too high: each comparison carries its own .05 chance of a Type I error, and those chances compound across comparisons. ANOVA limits the experiment-wise Type I probability to $\alpha$ (here .05).
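To see how that aggregation works: if each of $c$ comparisons is run at $\alpha = .05$ and we treat them as independent (an approximation), the chance of at least one Type I error is $1 - (1 - .05)^c$. A quick sketch:

```python
# Experiment-wise Type I error rate for c comparisons, each at alpha = .05,
# treating the comparisons as independent (an approximation).
alpha = 0.05
for c in (1, 3, 10):
    print(f"{c} comparisons -> P(at least one Type I error) = {1 - (1 - alpha) ** c:.3f}")
# 1  -> 0.050
# 3  -> 0.143   (the 3 pairwise t-tests for k = 3 conditions)
# 10 -> 0.401
```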
“sum of squares” or “SS” is really short for “sum of the squared deviations”
Definitions
experiment-wise error rate : The probability of making a Type I error somewhere among the comparisons in an experiment.
Tukey’s HSD test : HSD = Honestly Significant Difference, a post-hoc procedure done after ANOVA to compare the means of all pairs of levels when all levels have equal n's.
Example
$H_0$: $\mu_1 = \mu_2 = \mu_3$. $H_a$: not all $\mu$ are equal.
So given the above table:
easy | medium | difficult | |
---|---|---|---|
9 | 4 | 1 | |
12 | 6 | 3 | |
4 | 8 | 4 | |
8 | 2 | 5 | |
7 | 10 | 2 | totals |
sum(X): 40 | 30 | 15 | 85 |
sum(X^2): 354 | 220 | 55 | 629 |
n: 5 | 5 | 5 | N=15 |
xbar: 8 | 6 | 3 | k=3 |
Calculate the total sum of squares:

$SS_{tot} = \Sigma X^2_{tot} - \frac{(\Sigma X_{tot})^2}{N} = 629 - \frac{85^2}{15} = 629 - 481.67 = 147.33$

Then calculate the sum of squares between groups:

$SS_{bn} = \Sigma \frac{(\Sigma X \text{ in column})^2}{n \text{ in column}} - \frac{(\Sigma X_{tot})^2}{N} = \left(\frac{40^2}{5} + \frac{30^2}{5} + \frac{15^2}{5}\right) - 481.67 = 545 - 481.67 = 63.33$

Then calculate the sum of squares within groups:

$SS_{wn} = SS_{tot} - SS_{bn} = 147.33 - 63.33 = 84$

Compute the degrees of freedom:

$df_{bn} = k - 1 = 2$, $df_{wn} = N - k = 12$, $df_{tot} = N - 1 = 14$

Then get the mean square between groups ($MS_{bn}$):

$MS_{bn} = \frac{SS_{bn}}{df_{bn}} = \frac{63.33}{2} = 31.67$

Within groups ($MS_{wn}$):

$MS_{wn} = \frac{SS_{wn}}{df_{wn}} = \frac{84}{12} = 7$

then:

$F_{obt} = \frac{MS_{bn}}{MS_{wn}} = \frac{31.67}{7} = 4.52$
Which leaves us with:
Source | Sum of squares | df | mean square | f_obt |
---|---|---|---|---|
between | 63.33 | 2 | 31.67 | 4.52 |
within | 84 | 12 | 7 | |
total | 147.33 | 14 |
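As a cross-check, the whole summary table can be reproduced from the raw scores with a few lines of NumPy. A sketch (nothing here is assumed beyond the data table above):

```python
import numpy as np

groups = {
    "easy":      np.array([9, 12, 4, 8, 7]),
    "medium":    np.array([4, 6, 8, 2, 10]),
    "difficult": np.array([1, 3, 4, 5, 2]),
}

all_scores = np.concatenate(list(groups.values()))
N, k = all_scores.size, len(groups)                 # N = 15, k = 3

correction = all_scores.sum() ** 2 / N              # (sum X_tot)^2 / N = 481.67
ss_total   = (all_scores ** 2).sum() - correction   # 147.33
ss_between = sum(g.sum() ** 2 / g.size for g in groups.values()) - correction  # 63.33
ss_within  = ss_total - ss_between                  # 84

ms_between = ss_between / (k - 1)                   # 63.33 / 2 = 31.67
ms_within  = ss_within / (N - k)                    # 84 / 12   = 7
f_obt      = ms_between / ms_within                 # 4.52

print(round(ss_between, 2), round(ss_within, 2), round(ss_total, 2), round(f_obt, 2))
```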
Distribution
Unlike the t and z distributions, the F-distribution is positively skewed, b/c there's no upper limit on how big F can be, but it can never be lower than 0. Also unlike those distributions, finding $F_{crit}$ requires both $df_{bn}$ and $df_{wn}$ to look it up in the "F-table".
For the example above, $F_{crit}(2, 12) \approx 3.89$ at $\alpha = .05$, and $F_{obt} = 4.52$ exceeds it, so this is sufficient evidence to reject the null hypothesis.
This leads us to conclude that there does appear to be a relationship between perceived difficulty and score. We don't know which specific conditions differ from each other, though; that's what the post-hoc test is for.
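The same F and its p-value come straight out of SciPy, and $F_{crit}$ can be pulled from the F-distribution instead of a printed table. A sketch using `scipy.stats.f_oneway` and `scipy.stats.f`:

```python
from scipy import stats

easy, medium, difficult = [9, 12, 4, 8, 7], [4, 6, 8, 2, 10], [1, 3, 4, 5, 2]

# One-way ANOVA across the three conditions.
f_obt, p_value = stats.f_oneway(easy, medium, difficult)
print(f"F_obt = {f_obt:.2f}, p = {p_value:.3f}")   # F_obt = 4.52, p ~ .034

# Critical F at alpha = .05 with df_bn = 2 and df_wn = 12.
f_crit = stats.f.ppf(1 - 0.05, 2, 12)
print(f"F_crit = {f_crit:.2f}")                    # ~ 3.89
# F_obt > F_crit (equivalently p < .05), so reject the null hypothesis.
```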
Post-hoc test
The $q_k$ value for Tukey's HSD is found in a table called "Values of the Studentized Range Statistic" (table 5 in the appendix of Behavioral Sciences Stats). For $k = 3$ and $df_{wn} = 12$ at $\alpha = .05$, $q_k = 3.77$. Then $HSD = q_k\sqrt{\frac{MS_{wn}}{n}} = 3.77\sqrt{\frac{7}{5}} = 4.46$.
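SciPy (1.7 or newer) also exposes the studentized range distribution, so the table lookup and the HSD can be reproduced directly; a sketch, assuming a recent SciPy:

```python
from math import sqrt
from scipy.stats import studentized_range

k, df_wn, ms_wn, n = 3, 12, 7, 5

# q_k: the alpha = .05 critical value of the studentized range for k means and df_wn.
q_k = studentized_range.ppf(1 - 0.05, k, df_wn)   # ~ 3.77
hsd = q_k * sqrt(ms_wn / n)                       # ~ 4.46
print(f"q_k = {q_k:.2f}, HSD = {hsd:.2f}")
```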
Get all the differences of level-mean combinations:
$\bar{X}_1 = 8$ (easy), $\bar{X}_2 = 6$ (medium), $\bar{X}_3 = 3$ (difficult)

$\bar{X}_1 - \bar{X}_2 = 2$
$\bar{X}_1 - \bar{X}_3 = 5$
$\bar{X}_2 - \bar{X}_3 = 3$
Compare each difference to the HSD. If the absolute difference is greater than the HSD, the two means differ significantly (so only easy vs. difficult, with a difference of 5 > 4.46, is a significant difference).
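For comparison, statsmodels can run the full post-hoc in one call, which is a handy cross-check on the hand computation; a sketch assuming statsmodels is installed (`pairwise_tukeyhsd` lives in `statsmodels.stats.multicomp`):

```python
from statsmodels.stats.multicomp import pairwise_tukeyhsd

scores = [9, 12, 4, 8, 7,  4, 6, 8, 2, 10,  1, 3, 4, 5, 2]
groups = ["easy"] * 5 + ["medium"] * 5 + ["difficult"] * 5

# Tukey HSD over every pair of levels at alpha = .05.
result = pairwise_tukeyhsd(endog=scores, groups=groups, alpha=0.05)
print(result.summary())   # only easy vs. difficult should show reject = True
```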