
Repeated Measures ANOVA

This tutorial covers the repeated measures analysis of variance (ANOVA) test, which has traditionally been used to extend the paired samples t-test to three or more related measurements (for example, the same participants measured at several time points). Note that newer, arguably better methods are now also in use, namely linear mixed-effects models (these will be covered in the next tutorial). Accordingly, this tutorial will be rather brief.

Description of sample data

These data comprise L2 English essays written over a two-year period by nine middle-school-aged Dutch children studying at an English/Dutch bilingual school in the Netherlands. Essays were collected three times a year (roughly every four months) over two academic years, giving six time points per child. Included in the dataset are holistic scores for each essay (“Score”) and mean length of T-unit (“MLT”) values. In this tutorial, we will explore the relationship between holistic scores and time spent studying English, with the alternative hypothesis that holistic essay scores will increase as a function of time. For further reference, see Kyle (2016).

mydata <- read.csv("data/RM_sample.csv", header = TRUE)
#First, we create a new variable that is the categorical version of Time
mydata$FTime <- factor(mydata$Time)
summary(mydata)
##  Participant             Time         Score           MLT         FTime
##  Length:54          Min.   :1.0   Min.   :1.00   Min.   : 6.895   1:9  
##  Class :character   1st Qu.:2.0   1st Qu.:3.00   1st Qu.: 9.438   2:9  
##  Mode  :character   Median :3.5   Median :4.00   Median :10.976   3:9  
##                     Mean   :3.5   Mean   :4.13   Mean   :11.517   4:9  
##                     3rd Qu.:5.0   3rd Qu.:5.00   3rd Qu.:12.906   5:9  
##                     Max.   :6.0   Max.   :7.00   Max.   :18.889   6:9
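The summary shows 54 rows and nine observations at each level of FTime, which is what we expect if every child contributed exactly one essay per time point. A quick way to confirm this is to cross-tabulate participants by time point. The sketch below uses a small stand-in data frame (`toy`, with made-up scores and hypothetical participant labels) so that it runs on its own; with the real data, the same `table()` call would be applied to `mydata`.

```r
# Hypothetical stand-in for mydata: 9 participants x 6 time points.
# Participant labels and scores are invented for illustration only.
set.seed(1)
toy <- expand.grid(Participant = paste0("P", 1:9), Time = 1:6)
toy$Score <- sample(1:7, nrow(toy), replace = TRUE)
toy$FTime <- factor(toy$Time)

# Cross-tabulate participants by time point
design <- table(toy$Participant, toy$FTime)

# TRUE if every participant has exactly one observation per time point
all(design == 1)
```

If any cell in the cross-tabulation is 0 or greater than 1, the design has missing or duplicated observations, which is worth knowing before fitting a repeated measures model.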

Visualizing the data

First, we can look at the distribution of scores at each time point. Note that a boxplot shows the median and quartiles (not the mean) for each group.

library(ggplot2)
#One box per time point, showing the spread of holistic scores
ggplot(data = mydata, aes(x = FTime, y = Score, group = Time)) +
  geom_boxplot()
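Because the boxplot displays medians rather than means, it can be useful to compute the mean score at each time point directly. The following is a minimal sketch using base R's `aggregate()`. The data frame `toy` and its values are hypothetical stand-ins so that the example is self-contained; with the real data, `mydata` would take its place.

```r
# Hypothetical stand-in data: 3 participants x 3 time points (invented scores)
toy <- data.frame(
  Participant = rep(paste0("P", 1:3), times = 3),
  FTime = factor(rep(1:3, each = 3)),
  Score = c(2, 3, 4, 3, 4, 5, 4, 5, 6)
)

# Mean score at each time point
time_means <- aggregate(Score ~ FTime, data = toy, FUN = mean)
time_means
```

To overlay the means on the boxplot itself, one option is to add `stat_summary(fun = mean, geom = "point")` to the ggplot call (the `fun` argument assumes ggplot2 version 3.3.0 or later).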