Post-hoc pairwise comparisons are commonly performed after significant effects have been found when there are three or more levels of a factor. After an ANOVA, you may know that the means of your response variable differ significantly across your factor, but you do not know which pairs of the factor levels are significantly different from each other. At this point, you can conduct pairwise comparisons.

We will demonstrate the how to conduct pairwise comparisons in R and the different
options for adjusting the p-values of these comparisons given the number of
tests conducted. We will be using the **hsb2** dataset and looking at the
variable **write** by **ses**. We will first look at the means and
standard deviations by **ses**.

hsb2<-read.table("https://stats.idre.ucla.edu/stat/data/hsb2.csv", sep=",", header=T) attach(hsb2) ses <- factor(ses) levels(ses) <- c("low","medium","high") female <- factor(female) levels(female) <- c("male","female") tapply(write, ses, mean)low medium high 50.61702 51.92632 55.91379tapply(write, ses, sd)low medium high 9.490391 9.106044 9.442874

## One-Way ANOVA

In R, we can run the ANOVA with the **aov** command.

a1 <- aov(write ~ ses) summary(a1)Df Sum Sq Mean Sq F value Pr(>F) ses 2 859 429.4 4.97 0.00784 ** Residuals 197 17020 86.4 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

From this output, we can see that **ses** is significant in the 2-degrees
of freedom test, but we do not know which pairs of **ses** levels are
significantly different from each other. However, this will require three
tests (high vs. low, high vs. middle, low vs. middle), so we wish to adjust what
we consider to be statistically significant to account for this multiplicity of
tests. For an one-way ANOVA (ANOVA with a single factor) We can first see the unadjusted p-values using the **
pairwise.t.test** command and indicating no adjustment of p-values:

pairwise.t.test(write, ses, p.adj = "none")Pairwise comparisons using t tests with pooled SD data: write and ses low medium medium 0.4306 - high 0.0041 0.0108 P value adjustment method: none

With this same command, we can adjust the p-values according to a variety of methods. Below we show Bonferroni and Holm adjustments to the p-values and others are detailed in the command help.

pairwise.t.test(write, ses, p.adj = "bonf")Pairwise comparisons using t tests with pooled SD data: write and ses low medium medium 1.000 - high 0.012 0.032 P value adjustment method: bonferronipairwise.t.test(write, ses, p.adj = "holm")Pairwise comparisons using t tests with pooled SD data: write and ses low medium medium 0.431 - high 0.012 0.022 P value adjustment method: holm

We can see that the adjustments all lead to increased p-values, but consistently the high-low and high-middle pairs appear to be significantly different at alpha = .05. The **pairwise.t.test **command does not offer Tukey post-hoc tests, but there are other R commands that allow for Tukey comparisons. Below, we show code for using the **TukeyHSD** (Tukey Honest Significant Differences).

TukeyHSD(a1)Tukey multiple comparisons of means 95% family-wise confidence level Fit: aov(formula = write ~ ses) $ses diff lwr upr p adj medium-low 1.309295 -2.6052575 5.223847 0.7096950 high-low 5.296772 0.9887256 9.604818 0.0114079 high-medium 3.987477 0.3296892 7.645265 0.0289035

We can see that these results are significant with what we saw using other adjustments for the p-values.

## Two (or more) Factor ANOVA

You may be fitting an ANOVA with multiple factors. Below we look at **
write** on **ses** and **female**.

a2 <- aov(write ~ ses + female) summary(a2)Df Sum Sq Mean Sq F value Pr(>F) ses 2 859 429.4 5.387 0.00528 ** female 1 1398 1398.1 17.541 4.25e-05 *** Residuals 196 15622 79.7 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

We can look at pair-wise comparisons of the **ses** levels after adjusting
for female. The **TukeyHSD** command still works well, though now we must
specify which factor is of interest.

TukeyHSD(a2, "ses")Tukey multiple comparisons of means 95% family-wise confidence level Fit: aov(formula = write ~ ses + female) $ses diff lwr upr p adj medium-low 1.309295 -2.4507360 5.069325 0.6896199 high-low 5.296772 1.1587797 9.434764 0.0079527 high-medium 3.987477 0.4740753 7.500879 0.0216707