Loading [MathJax]/jax/output/HTML-CSS/jax.js

Motivation

Regression models attempt to explain the variation in an outcome (dependent variable) through its relationship with a set of one or more predictors (independent variables).

Although a regression coefficient quantifies the strength of the relationship between outcome and predictor (effect size), it is expressed in units specific to the predictor, so direct comparison to other predictors’ coefficients is not straightforward.

Standardized effect sizes, like Cohen’s d and R2, are unitless and thus permit direct comparison between predictors measured in different units, even acros different studies.

Multilevel regression models are commonly used to model data that are hierarchically structured, which are usually characterized by non-independence of observations collected within aggregate units.

However, currently no set of standardized effect size measures have been widely adopted by researchers using multilevel models.

Purpose

This workshop discusses a new set of R2 measures introduced by Rights and Sterba in their 2019 article ” Quantifying Explained Variance in Multilevel Models: An Integrative Framework for Defining R-Squared Measures”.

The authors of the r2mlm package have themselves said that using their new R2 measures requires a solid understanding of multilevel models (Shaw et al., 2023).

To this end, we will discuss:

Linear Regression

Understanding variation

Statistical models like linear regression use a set of predictors to explain the variation in the outcome.

Later in this workshop we will learn how to quantify how much variation has been explained, but first we need to understand how variation is measured.

Imagine we have an outcome, yi, measured for units i=1,...,N.

One common way to quantify variation is to measure deviations from the mean, yiˉy, where

ˉy=1nni=1yi

Sums of squares

The sum of the squared deviations from the mean, or sums of squares, is a simple measure of variability.

For variable yi, the total sums of squares (TSS) is:

TSS=ni=1(yiˉy)2

Sums of squares always increase when we add data (because squared deviations are always non-negative), so we might want a measure of variability that we can use to compare variables with different sample sizes.

Variance

Dividing the TSS for yi by n yields the variance of yi, σ2y:

σ2y=TSSn=1nni=1(yiˉy)2

The variance is thus the average squared deviation from the mean and will thus not necessarily increase as data are added.

The formula above can be used for a population. If our yi is instead a sample from a population and we wish to estimate σ2y in the population, we should divide TSS by n1 instead of n.

ˆσ2y=1n1ni=1(yiˉy)2 ^ signifies that we are estimating a population parameter from a sample (and above, ˉy=ˆμy).

When estimating a population parameter from a sample of size n, we should subtract 1 from n for each parameter estimated as part of the calculation. In estimating the variance of yi, we estimated the population mean of yi with the sample mean ˉy, so we subtract one yielding n1 degrees of freedom.

The standard deviation is the square root of the variance, or σy=σ2y.

# deviations from mean
small$dev <- small$y - mean(small$y)

# sum of squared deviations
TSS <- sum(small$dev^2)
TSS
## [1] 1198.444
# sample size
n <- nrow(small)

# variance of y
TSS/n
## [1] 66.58025
# estimate of variance in population
TSS/(n-1)
## [1] 70.49673
# var() function divides by n-1
var(small$y)
## [1] 70.49673

Linear regression

Now that know how to measure the variability of yi, we can use linear regression to explain the variability with relevant predictors.

Below we model yi to be predicted by a single predictor, xi:

yi=β0+β1xi+ϵi where

  • β0 is the intercept
  • β1 is the regression coefficient quantifying the linear relationship between xi and yi
  • ϵi is the error term, representing unexplained variation, or the difference between yi and its prediction based on β0+β1x1 alone.

In linear regression, the errors are assumed to be normally distributed with a zero mean and variance σ2ϵ:

ϵiN(0,σ2ϵ).

As we add relevant predictors to the model, we expect the errors to decrease in magnitude, reflected by a smaller σ2ϵ.

# use lm() for linear regression
m <- lm(y ~ x, data=small)
summary(m)
## 
## Call:
## lm(formula = y ~ x, data = small)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -12.113  -7.009   1.564   5.941  11.244 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  33.1008    11.6468   2.842   0.0118 *
## x             0.3574     0.2353   1.519   0.1482  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.091 on 16 degrees of freedom
## Multiple R-squared:  0.1261, Adjusted R-squared:  0.07144 
## F-statistic: 2.308 on 1 and 16 DF,  p-value: 0.1482

Prediction

After fitting the model, we can calculate the predicted outcome for each observations, ˆyi, based on the estimated coefficients:

ˆyi=ˆβ0+ˆβ1xi

For a single continuous predictor, the predictions ˆyi will lie on a line:

Explained sums of squares

Given no predictors, the intercept β0 is the only coefficient estimated in a linear regression:

yi=β0+ϵi Its estimate will equal the sample mean:

ˆβ0=ˉy

Therefore, all predictions will be the sample mean:

ˆyi=^β0=ˉy

This no-intercept model serves as starting model where none of the variability in yi is explained.

As we add predictors to the model, we expect the predictions ˆyi on average to move away from the sample mean ˉy and closer to their observed values yi.

The explained sums of squares (ESS) quantifies how much the predicted values ˆyi from a regression model have moved away from ˉy:

ESS=ni=1(ˆyiˉy)2

As it contains no explanatory predictors, the ESS for the intercept-only model is 0:

ESSnull=ni=1(ˆyiˉy)2=ni=1(ˉyˉy)2=0

On the other hand, if our model can perfectly predict the outcome such that ˆyi=yi, then ESS=TSS:

ESSperfect=ni=1(ˆyiˉy)2=ni=1(yiˉy)2=TSS

Generally, though, 0<ESS<TSS

#TSS
TSS <- sum((small$y - mean(small$y))^2)
TSS
## [1] 1198.444
#ESS
#fitted() returns predicted values for observed data
small$yhat <- fitted(m)
ESS <- sum((small$yhat - mean(small$y))^2)
ESS
## [1] 151.0758

Residuals

We cannot directly observe or measure the errors of a linear regression model, ϵi, but we can estimate them from the residuals, ˆϵi, the difference between the observed and predicted values:

ˆϵi=yiˆyi

Residuals represent the variation in the outcome that cannot be explained by the model.

Residual sums of squares and variance

The residual sums of squares quantifies the amount of unexplained variation:

RSS=ni=1ˆϵi=ni=1(yiˆyi)2

# RSS
# residuals() returns residuals of model
RSS <- sum(residuals(m)^2)
RSS
## [1] 1047.369

Dividing RSS by np give the residual variance, ˆσ2ϵ, where p is the number of coefficients estimated in the model:

ˆσ2ϵ=RSSnp=ni=1ˆϵinp

Because the p coefficients are used to calculate the residuals and their variance ˆσ2e, the degrees of freedom are np.

# Residual standard error is square root of residual variance
summary(m)
## 
## Call:
## lm(formula = y ~ x, data = small)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -12.113  -7.009   1.564   5.941  11.244 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  33.1008    11.6468   2.842   0.0118 *
## x             0.3574     0.2353   1.519   0.1482  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.091 on 16 degrees of freedom
## Multiple R-squared:  0.1261, Adjusted R-squared:  0.07144 
## F-statistic: 2.308 on 1 and 16 DF,  p-value: 0.1482

Partitioning the total sums of squares

In regression, we can partition total variation into explained and residual variation:

TSS=ESS+RSS

As we add relevant predictors to the model, ESS grows while RSS shrinks.

Make sure example data has no duplicate x-values

R2

R2 expresses the proportion of TSS that is ESS:

R2=ESSTSS

Because TSS=ESS+RSS

R2=1RSSTSS R2 is thus interpreted as the proportion of outcome variation explained by the model predictors.

# R^2 reported in output
summary(m)
## 
## Call:
## lm(formula = y ~ x, data = small)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -12.113  -7.009   1.564   5.941  11.244 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  33.1008    11.6468   2.842   0.0118 *
## x             0.3574     0.2353   1.519   0.1482  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.091 on 16 degrees of freedom
## Multiple R-squared:  0.1261, Adjusted R-squared:  0.07144 
## F-statistic: 2.308 on 1 and 16 DF,  p-value: 0.1482
# confirming calculation
ESS/TSS
## [1] 0.1260599

Previously, we showed that ESS for the intercept-only model is 0. What will R2 for this model be?

# R^2 for intercept-only model
summary(lm(y ~ 1, data=small))$r.squared
## [1] 0

R2 properties

  • allows comparison between predictors and between models as a standardized effect size
  • varies between 0 and 1, with values closer to 1 implying more of the variation is explained
  • only increases (or stays same) as more predictors are added to the model
  • alternatively interpreted as the squared correlation between the observed and predicted outcomes

R2=corr(yi,ˆyi)

# equals R^2
cor(small$y, small$yhat)^2
## [1] 0.1260599
# plot of observed vs predicted outcome
ggplot(small, aes(x=y, y=yhat)) + 
  geom_point() +
  labs(x=expression(y[i]), y=expression(hat(y)[i]), 
       title=expression(paste("Observed and predicted scores become more correlated as ", R^2, " increases")))

Variance explained explained

What determines how much variance is explained by predictor xi?

Holding everything else constant, variance explained increases as:

  • the magnitude of the regression coefficient increases
    • sex explains more variation in height than whether someone is born in the first or second half of the year (which presumably has no relationship to height)
  • the variance of the predictor increases
    • sex and a rare disease (affects 4%) may have effects of similar size on height, but sex will explain more variation because it varies more
# height data
height[4:10,]
##      height male before_july disease
## 4  70.33877    0           1       1
## 5  62.55333    0           0       0
## 6  64.84237    0           1       0
## 7  61.73652    0           0       0
## 8  64.03135    0           1       0
## 9  64.42657    0           0       0
## 10 66.27407    0           1       0
# male and disease have similar coefficients
# before_july has nearly zero effect
summary(lm(height ~ male + before_july + disease, data=height))
## 
## Call:
## lm(formula = height ~ male + before_july + disease, data = height)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.6396 -1.7393  0.0938  1.4779  6.9888 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  64.2630     0.2760 232.816  < 2e-16 ***
## male          4.8323     0.3165  15.266  < 2e-16 ***
## before_july   0.1727     0.3165   0.546    0.586    
## disease       4.9490     0.8077   6.128 4.81e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.238 on 196 degrees of freedom
## Multiple R-squared:  0.5802, Adjusted R-squared:  0.5738 
## F-statistic:  90.3 on 3 and 196 DF,  p-value: < 2.2e-16

Below we graph the individual contributions to R2 by predictor:

Adjusted R2

R2 does not account for the complexity of the model (number of predictors)

  • Selecting models based on R2 can lead to overfitting
  • Tends to overestimate population R2

Previously we saw that ˆσ2y and ˆσ2e are unbiased estimators of the population variance of yi and of the population error variance, respectively.

Adjusted R2 uses ˆσ2e instead of RSS to represent unexplained variability:

R2=1ˆσ2ϵˆσ2y=1RSSnpTSSn1

# model without before_july
summary(lm(height ~ male + disease, data=height))
## 
## Call:
## lm(formula = height ~ male + disease, data = height)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.5532 -1.7575  0.0699  1.4502  6.9024 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  64.3493     0.2257 285.059  < 2e-16 ***
## male          4.8323     0.3160  15.294  < 2e-16 ***
## disease       4.9490     0.8062   6.139  4.5e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.234 on 197 degrees of freedom
## Multiple R-squared:  0.5796, Adjusted R-squared:  0.5753 
## F-statistic: 135.8 on 2 and 197 DF,  p-value: < 2.2e-16
# model with before_july has higher R^2 but lower Adjusted R^2
summary(lm(height ~ male + disease + before_july, data=height))
## 
## Call:
## lm(formula = height ~ male + disease + before_july, data = height)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.6396 -1.7393  0.0938  1.4779  6.9888 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  64.2630     0.2760 232.816  < 2e-16 ***
## male          4.8323     0.3165  15.266  < 2e-16 ***
## disease       4.9490     0.8077   6.128 4.81e-09 ***
## before_july   0.1727     0.3165   0.546    0.586    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.238 on 196 degrees of freedom
## Multiple R-squared:  0.5802, Adjusted R-squared:  0.5738 
## F-statistic:  90.3 on 3 and 196 DF,  p-value: < 2.2e-16

Linear multilevel models

Multilevel data

Multilevel data are hierarchically structured where sampled observations (level-1) can be aggregated into higher level units (level-2).

Multilevel data arise when:

  • individuals are sampled in clusters (e.g. students within schools, patients within hospitals)
    • individuals are level-1 and clusters are level-2
  • subjects are repeatedly measured (i.e., longitudinal data)
    • timepoints are level-1 and subjects are level-2

We generally assume that level-2 units are independent of each other but that level-1 observations within the same level-2 unit are not independent, violating the independence assumption of typical single level models.

Note: Henceforth, we will refer to multilevel data arising from clustered sampling but all concepts discussed apply equally to repeated measures data as well.

Example multilevel data

We will discuss multilevel model with a simulated dataset of worker stress, with 15 companies sampled and 10 employess within each company sampled.

Variables

  • company_id
  • employee_id
  • stress: the outcome, ranges 1-100
  • salary: employee’s salary (level-1) in thousands of dollars
  • pct_open: percent of company’s position that are unfilled (level-2)
# empstress data
head(empstress)
##   company_id employee_id salary pct_open stress
## 1          1           1 76.594   11.082     48
## 2          1           2 80.259   11.082     41
## 3          1           3 67.790   11.082     43
## 4          1           4 82.514   11.082     47
## 5          1           5 76.245   11.082     54
## 6          1           6 78.169   11.082     41

Multilevel model notation

The multilevel model explains variation in outcome yij, measured for individual i=1,...,nj within level-2 unit j=1,...,J.

Each of the J level-2 unit has nj individuals sampled from it, and the total sample size is N=Jj=1nj.

Predictors in multilevel models can be measured at level-1 or level-2.

  • Level-1 predictors, subscripted with ij, vary across individuals
  • Level-2 predictors, subscripted with j, only vary across level-2 units and are constant across level-1 units within the same level-2 unit

For the empstress data

  • We will model variation in stressij, measured for employee i=1,...,nj within company j=1,...,J
  • J=15 companies were sampled, and nj=10 employees were sampled for each company, resulting in N=150
  • level-1 predictor salaryij varies across employees
  • level-2 predictor pct_openj only varies across companies
# empstress
head(empstress)
##   company_id employee_id salary pct_open stress
## 1          1           1 76.594   11.082     48
## 2          1           2 80.259   11.082     41
## 3          1           3 67.790   11.082     43
## 4          1           4 82.514   11.082     47
## 5          1           5 76.245   11.082     54
## 6          1           6 78.169   11.082     41

Means in multilevel data

For variables measured at level-1 like yij, we can calculate 2 kinds of means:

  • the overall (sample) mean y..

ˉy..=1NJj=1nji=1yij

  • the cluster means y.j for each cluster j:

ˉy.j=1njnji=1yij

Below we see a graph of stress scores (black points), its overall mean (line) and company means (larger red points).

# add cluster means of stress to data set
# mutate() with .by= makes adding cluster means to data easy
empstress <- empstress %>% 
  mutate(.by=company_id, stress_m=mean(stress))

# graph of stress, overall mean and cluster means
ggplot(empstress, aes(x=company_id, y=stress)) + 
  geom_point() +
  geom_hline(yintercept = mean(empstress$stress), linetype=2) +
  geom_point(aes(y=stress_m), color="red", size=3) +
  scale_x_continuous(breaks=1:15, labels=1:15) +
  labs(title="Stress scores (black) by company and company stress means (red)")

Total variation decomposition in multilevel models

We can decompose a level-1 outcome yij into the sum of a cluster mean y.j and the deviation from that cluster mean:

yij=ˉy.j+(yijˉy.j)yij=cluster mean + deviation from cluster mean

We can estimate the variation of each of the terms in the equation above:

  • var(yij) represents Total Variation
  • var(cluster mean) represents Between-Cluster Variation
  • var(deviation from cluster mean) represents Within-Cluster Variation

The total variation of yij in a multilevel model can thus be decomposed as:

Total Variation = Between-Cluster Variation + Within-Cluster Variation


Importantly, both Between-Cluster Variation and Within-Cluster Variation can decomposed into explained and residual variation:

Between-Cluster Variation = Explained Between-Cluster Variation + Residual Between-Cluster Variation


Within-Cluster Variation = Explained Within-Cluster Variation + Residual Within-Cluster Variation

Within-cluster variation

Describes variation of individuals within the same cluster (e.g. employees’ stress scores varying within a company).

Level-1 predictors explain within-cluster variation in the outcome (e.g. differences in employees’ salaries within a company explain differences in their stress scores).

Between-cluster variation

Describes variation of cluster means about the overall mean (e.g. company stress score means varying around the overall mean).

ggplot(empstress, aes(x=company_id, y=stress_m)) + 
  geom_point(color="red", size=3) +
  geom_hline(yintercept = mean(empstress$stress), linetype=2) +
  scale_x_continuous(breaks=1:15, labels=1:15) +
  labs(title="Between-company variation of stress means (red) about overall mean (dashed line)")

Level-2 predictors explain between-cluster variation in the outcome (e.g. differences in percent unfilled positions can explain differences in mean stress across companies)

ggplot(empstress, aes(x=pct_open, y=stress_m)) + 
  geom_point(color="red", size=3) +
  labs(title="Level-2 predictor percent unfilled can explain between-company variation in mean stress", x="percent unfilled positions")

Random intercept model

The random intercept model is the simplest multilevel model and decomposes the total variation into unexplained or residual within-cluster and between-cluster variation:

yij=β0j+ϵijβ0j=γ00+u0j

  • β0j is the intercept for cluster j (β0j=ˉy.j in this model)
  • ϵij is the level-1 error, representing individual i’s deviation from the predicted score based on model
    • as in linear regression, ϵijN(0,σ2ϵ)
    • σ2ϵ from this model measures the total within-cluster variation
  • γ00 is the fixed intercept, the mean of all clusters’ intercepts β0j
  • u0j is the random intercept for cluster j, the deviation of cluster j’s intercept from the mean intercept γ00
    • random intercepts are also assumed to be normally distributed, u0jN(0,τ00)
    • τ00 from this model measures the total between-cluster variation
# random intercept model 
m_ri <- lmer(stress ~ 1 + (1|company_id), data=empstress)
summary(m_ri)
## Linear mixed model fit by REML ['lmerMod']
## Formula: stress ~ 1 + (1 | company_id)
##    Data: empstress
## 
## REML criterion at convergence: 1107.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2379 -0.6096 -0.0342  0.5618  3.3270 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  company_id (Intercept) 116.08   10.774  
##  Residual                73.41    8.568  
## Number of obs: 150, groups:  company_id, 15
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   45.000      2.868   15.69

In the output above, we see estimates of the residual variance and random intercept variance, ˆσ2ϵ=27.3 and ˆτ00=64.36, respectively.

Intraclass correlation coefficient

The random intercept model is often used to calculate the intraclass correlation coefficient (ICC), which quantifies the proportion of total variation in yij due to between-cluster variation.

The ICC is calculated by dividing ˆτ200, the between-cluster variance, by ˆτ200+ˆσ2ϵ, which estimates the total variance.

ICC=ˆτ200ˆτ200+ˆσ2ϵ The ICC can also be interpreted as the correlation of the outcome from 2 randomly sampled individuals from the same level-2 unit.

ICC>.5 suggests that cluster means vary more than individual scores within cluster, or ˆτ200>ˆσ2ϵ. Here, level-2 predictors will be important to explain variation in yij.

ICC<.5 suggests that cluster means vary less than individual scores within cluster, or ˆτ200<ˆσ2ϵ. Here, level-1 predictors will be important to explain variation in yij.

Within-cluster centering

Like the outcome yij, a predictor measured at level-1 can be decomposed into a cluster mean and a deviation from the cluster mean:

xij=(xijˉx.j)+ˉx.j

For convenience, we superscript with w to signify the deviation from the cluster mean, as in xwij, which is often called group-mean-centered or cluster-mean-centered:

xwij=(xijˉx.j)

and thus

xij=xwij+ˉx.j

As a predictor, xwij can only predict level-1 variation in an outcome like yij.

The cluster means for group-mean centered variables like xwij are all 0, so they do not vary at level-2 (i.e., have group differences removed).

# to create group-mean centered salary, salary_c:
# 1. add cluster means of predictor salary (salary_m) to data
# 2. subtract salary_m from original salary to create salary_c
empstress <- empstress %>% 
  mutate(.by=company_id, # process data by company_id
         salary_m = mean(salary)) %>%  
  mutate(salary_c = salary-salary_m) 

# graph of salary_c and company means of salary_c (all zero)
ggplot(empstress, aes(x=company_id, y=salary_c)) +
  geom_point() +
  stat_summary(geom="point", fun="mean", color="red", size=3) +
  scale_x_continuous(breaks=1:20, labels=1:20) +
  labs(title="group-mean centering removes between-company differences")

The cluster means x.j can themselves be added as level-2 predictors to explain level-2 variation in the outcome.

Within-cluster and between-cluster effects

Recall that

xij=xwij+ˉx.j

In the model in which xwij and ˉx.j are predictors:

yij=β0j+β1xwij+β2ˉx.j+ϵij

  • β1 is interpreted as the within-cluster effect of xij, (e.g., differences in math scores between students within the same class explains differences in their writing scores)
  • β2 is interpreted as the between-cluster effect of xij, (e.g. differences in mean math scores across classes explains differences in mean writing scores across classes)

Importantly, the above model is genearlly not equivalent to the model in which the original xij is the predictor:

yij=β0j+β1xij+ϵij

Implying that:

β1β1β1β2

Instead, the coefficient for the original xij, β1, will be some uninterpretable combination of the within-cluster and between-cluster effects, β1 and β2, respectively.

  • If β1 and β2 have opposite signs, β1 may be close to 0 and thus modeling xij may lead to a misleading inference of little or no effect
  • Thus, it is often recommended that level-1 predictors be decomposed into their within (xwij) and between (ˉx.j) components so that coefficients are more cleanly interpretable.

Compare the coefficients of the model where group_centered salary_c and company means salary_m are entered as predictor to the model where the original salary is entered:

# salary_c and salary_m coefficients have clear within-company and between-company interpretations
m_salary_c_m <- lmer(stress ~ salary_c + salary_m +
                       (1 | company_id),
                     data=empstress)

summary(m_salary_c_m)
## Linear mixed model fit by REML ['lmerMod']
## Formula: stress ~ salary_c + salary_m + (1 | company_id)
##    Data: empstress
## 
## REML criterion at convergence: 1072
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.14261 -0.58580  0.02166  0.46986  3.14261 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  company_id (Intercept) 22.71    4.766   
##  Residual               64.73    8.046   
## Number of obs: 150, groups:  company_id, 15
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 125.7257    11.9571  10.515
## salary_c      0.7414     0.1696   4.371
## salary_m     -1.0053     0.1479  -6.798
## 
## Correlation of Fixed Effects:
##          (Intr) slry_c
## salary_c  0.000       
## salary_m -0.993  0.000
# salary coefficient is uninterpretable average of coefficients above
m_salary <- lmer(stress ~ salary +
                   (1 | company_id),
                 data=empstress)

summary(m_salary)
## Linear mixed model fit by REML ['lmerMod']
## Formula: stress ~ salary + (1 | company_id)
##    Data: empstress
## 
## REML criterion at convergence: 1102.9
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.23736 -0.62578  0.00259  0.53223  3.10754 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  company_id (Intercept) 239.80   15.485  
##  Residual                65.62    8.101  
## Number of obs: 150, groups:  company_id, 15
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   4.6013    13.3741   0.344
## salary        0.5031     0.1587   3.170
## 
## Correlation of Fixed Effects:
##        (Intr)
## salary -0.953

Fixed effects

Multilevel models can include fixed and random effects.

The intercept and any regression coefficient for a level-1 predictor can be either

  • constrained to be the same across all clusters (fixed effect only)
    • e.g., β1=γ10
  • allowed to vary across clusters (fixed + random effect)
    • e.g., β1j=γ10+u1j

In either case, the fixed effect, e.g. γ10, is interpreted as the mean coefficient across clusters.

# fixed slope of salary_c to explain variation in stress
m_salary_c_fixed <- lmer(stress ~ salary_c + 
                           (1 | company_id),
                         data=empstress)

summary(m_salary_c_fixed)
## Linear mixed model fit by REML ['lmerMod']
## Formula: stress ~ salary_c + (1 | company_id)
##    Data: empstress
## 
## REML criterion at convergence: 1091.2
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.22676 -0.61828 -0.01386  0.49847  3.00557 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  company_id (Intercept) 116.95   10.814  
##  Residual                64.73    8.046  
## Number of obs: 150, groups:  company_id, 15
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  45.0000     2.8685  15.688
## salary_c      0.7414     0.1696   4.371
## 
## Correlation of Fixed Effects:
##          (Intr)
## salary_c 0.000

The coefficient β1=γ100.741 is interpreted as the fixed slope of group-mean centered salary, salary_c, which suggests that employees with higher salaries than other employees within the same company have more stress (perhaps because they have more responsibilities).

Random effects

Random effects in multilevel models are typically used to model heterogeneity (variation) around the fixed effect

  • Random intercepts model heterogeneity in the cluster means of the outcome
  • Random slopes model heterogeneity in the coefficients for level-1 predictors

In the model below with level-1 predictors xwij and zwij:

yij=β0j+β1jxwij+β2zwij+ϵijβ0j=γ00+u0jβ1j=γ10+u1jβ2=γ20

  • β0j is the intercept for cluster j, and is the sum of the fixed (mean) intercept γ00 and each cluster’s random intercept u0j
  • β1j is the slope coefficient for xwij for cluster j, and is the sum of the fixed mean slope γ10 and each cluster’s random slope u1j
  • β2 is the slope coefficient for zwij for all clusters as there is only a fixed component, γ20

Random effects like u0j and u1j are typically assumed to have a multivariate normal distribution, where the random effects are assumed to have zero means and some covariance matrix quantifying their variances and covariances:

(u0ju1j)N((00),(τ200τ01τ01τ211)) * The elements of the covariance matrix τ200, τ211, τ01=τ10 are parameters estimated in the multilevel model.

# fixed and random slope of salary_c to explain variation in stress
m_salary_c_random <- lmer(stress ~ salary_c + 
                           (1 + salary_c | company_id),
                         data=empstress)

summary(m_salary_c_random)
## Linear mixed model fit by REML ['lmerMod']
## Formula: stress ~ salary_c + (1 + salary_c | company_id)
##    Data: empstress
## 
## REML criterion at convergence: 1087.6
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.31751 -0.58010 -0.05927  0.47469  2.48187 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev. Corr
##  company_id (Intercept) 117.4425 10.837       
##             salary_c      0.3329  0.577   0.19
##  Residual                59.3291  7.703       
## Number of obs: 150, groups:  company_id, 15
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  45.0000     2.8679  15.691
## salary_c      0.8178     0.2245   3.643
## 
## Correlation of Fixed Effects:
##          (Intr)
## salary_c 0.126
  • fixed slope coefficient γ100.818 is interpreted as the mean slope of salary_c across companies.
  • The variance of the slopes is estimated at tau2110.333, suggesting that a company’s slope β1j deviates from the fixed slope γ10 by about 0.3330.577.
# use coef() to see each cluster's intercepts and slopes
coef(m_salary_c_random)
## $company_id
##    (Intercept)    salary_c
## 1     46.17442  0.48449509
## 2     27.79865  0.79443093
## 3     41.50957  0.95495935
## 4     31.01874  0.73785965
## 5     53.72239  0.71527322
## 6     51.56692  0.87882989
## 7     44.06388  0.89459952
## 8     52.49329  1.26991889
## 9     62.81527  1.60314719
## 10    39.82638  0.58183403
## 11    39.10364  0.78680564
## 12    27.18478  0.55404316
## 13    55.85613 -0.08883465
## 14    50.70981  0.86780080
## 15    51.15612  1.23142129
## 
## attr(,"class")
## [1] "coef.mer"

Random slopes may provide a better fit of level-1 predictors to the data than the fixed slope.

## $title
## [1] "Random slopes (red) vary around fixed mean slope (dashed black)\nDisplaying first 6 companies"
## 
## attr(,"class")
## [1] "labels"

Random effects as residuals

Random effects represent residual (unexplained) variation in model coefficients across clusters, and thus their variance estimates (e.g. τ00) quantify residual variation.

yij=β0j+β1jxwij+β2zwij+ϵijβ0j=γ00+u0jβ1j=γ10+u1j

In the model above, the random effects u0j and u1j represent unexplained heterogeneity in the intercepts and slopes, respectively.

Adding level-2 predictors (Aj below) to the model can explain the heterogeneity in the intercepts, and adding interactions of level-2 predictors with level-1 predictors can be used to explain heterogeneity in the slope coefficients.

yij=β0j+β1jxwij+β2zwij+ϵijβ0j=γ00+γ01Aj+u0jβ1j=γ10+γ11Aj+u1j

If level-2 predictor Aj is predicitve, the random effects will be smaller (representing less unexplained heterogeneity) resulting in smaller variance estimates, ˆτ00 and ˆτ11.

Excercise 1

Run random intercept model

Generate cluster means and group-centered predictor

Run model with just within predictor, take note of variances and interpret coefficient

Add cluster means and do same

Variance decompositions in linear multilevel models

Total variance decomposition in multilevel models

Rights and Sterba (2019) decompose the model-implied total variance of the outcome as follows:

^var(yij)=f1+f2+v+m+ˆσ2ϵ Explained variance components

  • f1 is variance explained by level-1 fixed effects
  • f2 is variance explained by level-2 fixed effects
  • v is variance explained by random (varying) slopes
  • m is random intercept (mean) variance (ˆτ200)††

Residual variance components

  • ˆσ2ϵ is level-1 residual variance


This complete variance decomposition allowed Rights and Sterba (2019) to develop a general framework for defining multilevel R2 measures that are clearly interpretable and that subsumes most multilevel R2 measures previously defined in the literature.

In this decomposition, all level-1 predictors are assumed to be cluster-mean-centered

†† Random intercept variation can be considered residual, unexplained between-cluster variation but it can also be considered variation explained by cluster IDs.

Within-cluster variance decomposition

Within this framework, we can also fully decompose the model-implied within-cluster variance decomposition:

^var(yijˉy.j)=f1+v+ˆσ2ϵ

Explained variance components

  • f1: (group-centered) level-1 predictor fixed (mean) effects explain within-cluster differences
  • v random (varying) slopes may provide a closer fit to outcomes than the fixed slope and thus explain more variance

Residual variance component

  • ˆσ2ϵ quantifies unexplained within-cluster variation


Below we see graphs that show how the fixed slope and random slope explain increasingly more within-company variation, reflected by the decreasing residuals.

Notice how the residual variances decrease of the 3 models depicted above decreases as we add salary_c fixed slope and random slope to the model (while random intercept variance does not decrease).

# VarCorr() returns random effect and residual standard deviations
#  wrapping VarCorr() in print() with comp="Variance" returns variances
# random intercept model
print(VarCorr(m_ri), comp="Variance") 
##  Groups     Name        Variance
##  company_id (Intercept) 116.083 
##  Residual                73.412
# fixed slope of salary_c
print(VarCorr(m_salary_c_fixed), comp="Variance")  
##  Groups     Name        Variance
##  company_id (Intercept) 116.951 
##  Residual                64.732
# fixed and random slope of salary_c
print(VarCorr(m_salary_c_random), comp="Variance")  
##  Groups     Name        Variance Cov    
##  company_id (Intercept) 117.4425        
##             salary_c      0.3329   1.216
##  Residual                59.3291

Between-cluster variance decomposition

^var(ˉy.jˉy..)=f2+m Explained variance component

  • f2: level-2 predictor fixed effects explain between-cluster differences

Residual variance component

  • m (τ200) quantifies unexplained between-cluster variation

To demonstrate, we will regress stress on level-2 pct_open:

# level-2 pct_open explains between-company variation in mean stress
m_pct_open <- lmer(stress ~ pct_open + 
                     (1 | company_id),
                   data=empstress)

summary(m_pct_open)
## Linear mixed model fit by REML ['lmerMod']
## Formula: stress ~ pct_open + (1 | company_id)
##    Data: empstress
## 
## REML criterion at convergence: 1103.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2748 -0.6105 -0.0242  0.5497  3.2955 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  company_id (Intercept) 105.77   10.284  
##  Residual                73.41    8.568  
## Number of obs: 150, groups:  company_id, 15
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  30.3667    10.0800   3.013
## pct_open      1.4508     0.9616   1.509
## 
## Correlation of Fixed Effects:
##          (Intr)
## pct_open -0.962

Compare the random intercept variances (and residual variances) for the model with just random intercepts to the model where we add pct_open as a predictor.

# random intercept model
print(VarCorr(m_ri), comp="Variance")
##  Groups     Name        Variance
##  company_id (Intercept) 116.083 
##  Residual                73.412
# level-2 pct_open as predictor
print(VarCorr(m_pct_open), comp="Variance")
##  Groups     Name        Variance
##  company_id (Intercept) 105.770 
##  Residual                73.412

The residuals below represent unexplained between-company variation in mean stress (i.e., random intercept variation), and we see how pct_open explains away some of this unexplained variation.

New R2 measures for linear multilevel models

Rights and Sterba (2019) R2 Notation

Rights and Sterba (2019) developed a set of R2 measures, which can include different terms in the numerator and denominator, so notation is required to distinguish among them:

  • Superscripts denote sources of explained variation (f1, f2, v, or m)
  • Subscripts denote whether the variance being explained is total, within, or between (t, w, or b)

R2(f1)w is thus the proportion of within-cluster variance explained by level-1 fixed effects.

Proportion of total variance explained by fixed effects

Proportion of total variance explained by level-1 (within-cluster) fixed effects

R2(f1)t=f1f1+f2+v+m+ˆσ2ϵ

Proportion of total variance explained by level-2 (between-cluster) fixed effects

R2(f2)t=f2f1+f2+v+m+ˆσ2ϵ

Proportion of total variance explained by fixed effects

R2(f)t=R2(f1)t+R2(f2)t=f1+f2f1+f2+v+m+ˆσ2ϵ

Superscript f is used to represent level-1 and level-2 fixed effects together.

Proportion of total variance explained by random variation

Proportion of total variance explained by random slope variation

R2(v)t=vf1+f2+v+m+ˆσ2ϵ

Proportion of total variance explained by random intercept variation

R2(m)t=mf1+f2+v+m+ˆσ2ϵ

Proportion of total variance explained by predictors and model

Proportion of total variance explained by predictors (fixed effects and random slopes)

R2(fv)t=R2(f1)t+R2(f2)t+R2(v)t=f1+f2+vf1+f2+v+m+ˆσ2ϵ

Proportion of total variance explained by model (all fixed and random effects)

R2(fvm)t=R2(f1)t+R2(f2)t+R2(v)t+R2(m)t=f1+f2+v+mf1+f2+v+m+ˆσ2ϵ

Proportion of within-cluster variance explained

Proportion of within-cluster variance explained by level-1 fixed effects

R2(f1)w=f1f1+v+ˆσ2ϵ

Proportion of within-cluster variance explained by random slopes

R2(v)w=vf1+v+ˆσ2ϵ

Proportion of within-cluster variance explained by level-1 predictors (fixed effects and random slopes)

R2(f1v)w=R2(f1)w+R2(v)w=f1+vf1+v+ˆσ2ϵ

Proportion of between-cluster variance explained

Proportion of between-cluster variance explained by level-2 fixed effects

R2(f2)b=f2f2+m

Proportion of between-cluster variance explained by random intercept (cluster mean) variation

R2(m)b=mf2+m

We do not estimate R2(f2m)b because it always equals 1.

Using the r2mlm package

The r2mlm package

The r2mlm package was developed by Rights and Sterba to calculate these new R2 measures. In this section we learn to use the following functions in the r2mlm package:

  • r2mlm(): calculates and visualizes all possible R2 measures for a model
  • r2mlm_ci(): calculates bootstrapped confidence intervals for R2 measures
  • r2mlm_comp()compares 2 models and calculates differences in all R2 measures

The r2mlm package can calculate R2 measures for models run using lmer() from the lme4 package, so we load both packages now.

# load packages
library(lme4)
library(r2mlm)

or using nlme() from the accompanying package nlme

Note: Much of the material in this section is adapted from Shaw et al. (2023), an article discussing the features and usage of the r2mlm package

Example data

To demonstrate r2mlm usage, we will use a dataset that loads with the package called teachsat, a simulated dataset of variables related to teacher job satisfaction.

Sample is 300 schools (level-2), each with 30 teachers (level-1), for a total N=9000.

Variables

  • schoolID
  • teacherID
  • satisfaction: teacher job satisfaction (outcome), 1-10 scale
  • control_c: school-mean-centered teacher’s rating of control over curriculum
  • salary_c: school-mean-centered teacher’s salary (in thousands of dollars)
  • control_m: school mean rating of teacher’s rating of control over curriculum
  • salary_m: school mean teacher’s salary
  • s_t_ratio: student-to-teacher ratio for the school
# teachsat multilevel data
head(teachsat)
##   schoolID teacherID satisfaction  control_c  salary_c control_m salary_m
## 1        1         1     8.582175  2.7239726  7.274769  5.474631 77.69538
## 2        1         2     7.428820 -0.6604034 13.321313  5.474631 77.69538
## 3        1         3     7.990401 -1.3002809 20.131688  5.474631 77.69538
## 4        1         4     5.100618 -0.6278236 -9.810052  5.474631 77.69538
## 5        1         5     7.256876  1.5190521  1.205659  5.474631 77.69538
## 6        1         6     5.154663 -1.9307584  1.778745  5.474631 77.69538
##   s_t_ratio
## 1        30
## 2        30
## 3        30
## 4        30
## 5        30
## 6        30

Random intercept model

We often start with the random intercept model to quantify the amount of variation within and between clusters.

yij=β0j+ϵijβ0j=γ00+u0j

# random intercept model
m_m <- lmer(satisfaction ~ 1 + (1|schoolID),
             data=teachsat)

summary(m_m)
## Linear mixed model fit by REML ['lmerMod']
## Formula: satisfaction ~ 1 + (1 | schoolID)
##    Data: teachsat
## 
## REML criterion at convergence: 29426.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.8004 -0.6500 -0.0083  0.6370  3.4825 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  schoolID (Intercept) 0.8376   0.9152  
##  Residual             1.3955   1.1813  
## Number of obs: 9000, groups:  schoolID, 300
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  5.99852    0.05429   110.5

The ICC is calculated by dividing the random intercept variance by the sum of the random intercept variance and residual variance:

#ICC
.8376/(.8376+1.3955)
## [1] 0.375084

About 37.5% of the total variation in satisfaction is between-cluster (differences in mean satisfaction across schools), while the remaining 62.5% is within-cluster (differences in satisfaction between teachers in the same school).

To calculate all relevant R2 using r2mlm(), simply supply the lmer object as an argument:

# R^2 for random intercept model
r2mlm(m_m)

## $Decompositions
##                     total within between
## fixed, within   0.0000000      0      NA
## fixed, between  0.0000000     NA       0
## slope variation 0.0000000      0      NA
## mean variation  0.3750827     NA       1
## sigma2          0.6249173      1      NA
## 
## $R2s
##         total within between
## f1  0.0000000      0      NA
## f2  0.0000000     NA       0
## v   0.0000000      0      NA
## m   0.3750827     NA       1
## f   0.0000000     NA      NA
## fv  0.0000000      0      NA
## fvm 0.3750827     NA      NA

The output is divided into 3 sections:

Graphs

  • The three bars, left to right, represent total, within-school, and between-school variation of satisfaction
  • from the legend, we see the only sources of variation are random intercept variation (blue stripes) and residual variation (white)
  • random intercept variation (m) explains a bit less than 40% (~38%) of total variation, and all between-cluster variation (since there are no level 2 fixed effects)

$Decompositions

Shows the proportion of variance explained by f1, f2, v, m, and σ2ϵ, respectively, for each of total, within-school, and between-school variance.

  • mean variation (random intercept variation) accounts for 37.5% of the total variance
  • sigma2 (residual variation) accounts for the remaining 62.5% of total variance
  • All of the within-school variation is residual variation
  • All of the between-school variation is random intercept variation
  • NA means this source (row) cannot explain the variation captured by the column (e.g. f2 cannot explain within-school variance)

$R2s

Displays all R2 measures calculable for this model:

  • R2(m)t0.375, 37.5% of the total variance is explained by random intercept variation
  • R2(fvm)t0.375, 37.5% of the total variance is explained by the model
  • R2(m)b=1, 100% of the between-school variance is explained by random intercept variation

Adding fixed effects

Now we add fixed effect of level-1 predictor salary_c and a level-2 predictor salary_m to the random intercept model:

yij=β0j+β1salary_c+ϵijβ0j=γ00+γ01salary_m+u0jβ1=γ10

# fixed effects and random intercept model
m_f.m <- lmer(satisfaction ~ salary_c + salary_m + (1 | schoolID),
             data=teachsat)
summary(m_f.m)
## Linear mixed model fit by REML ['lmerMod']
## Formula: satisfaction ~ salary_c + salary_m + (1 | schoolID)
##    Data: teachsat
## 
## REML criterion at convergence: 26428.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.4422 -0.6710 -0.0137  0.6522  3.7630 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  schoolID (Intercept) 0.5984   0.7736  
##  Residual             0.9980   0.9990  
## Number of obs: 9000, groups:  schoolID, 300
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 3.104077   0.267652   11.60
## salary_c    0.071763   0.001219   58.88
## salary_m    0.039240   0.003575   10.98
## 
## Correlation of Fixed Effects:
##          (Intr) slry_c
## salary_c  0.000       
## salary_m -0.985  0.000

The coefficient for salary_c suggests that teachers that have higher salaries than other teachers within the same school are more satisfied.

The coefficient for salary_m suggests that schools with higher mean teacher salaries have higher mean satisfaction.

Both the random intercept variance and residual variance have been reduced from the random intercept model (.5984 and .9980, down from .8376 and 1.3955, respectively), suggesting a significant proportion of each of the variances have been explained by the fixed effects.

Let’s see the R2 measures for this model:

# R^2 for fixed effects and random intercept model
r2mlm(m_f.m)

## $Decompositions
##                     total    within   between
## fixed, within   0.1720308 0.2780736        NA
## fixed, between  0.1135477        NA 0.2977533
## slope variation 0.0000000 0.0000000        NA
## mean variation  0.2678004        NA 0.7022467
## sigma2          0.4466212 0.7219264        NA
## 
## $R2s
##         total    within   between
## f1  0.1720308 0.2780736        NA
## f2  0.1135477        NA 0.2977533
## v   0.0000000 0.0000000        NA
## m   0.2678004        NA 0.7022467
## f   0.2855784        NA        NA
## fv  0.2855784 0.2780736        NA
## fvm 0.5533788        NA        NA

Graphs

  • The contributions of the level-1 fixed effect (solid red) and level-2 fixed effect (solid blue) are now visualized
  • More than 50% of the total variance has been explained, and the 2 fixed effects combine to explain a bit less than 30%
  • The fixed effect of salary_c (level-1) explains a little less than 30% of the within-school variance
  • The fixed effect of salary_m (level-2) explains approximately 30% of the between-school variance

$Decompositions

  • The level-2 fixed effect explains more of the total variance than the level-1 fixed effect
  • mean variation and sigma2, level-2 and level-1 residual variances, respectively, make up less of the total variance compared to the random intercept model because the fixed effects are explaining some of the unexplained variance away

$R2s

  • As we saw in the left-most bar graph, the model explains more than 50% of the total variance, reflected by R2(fvm)t.553
  • We can verify that the R2 measures with multiple sources of explained variance are sums of R2 with single terms, for example:

R2(f)t=R2(f1)t+R2(f2)t

and

R2(fvm)t=R2(f1)t+R2(f2)t+R2(v)t+R2(m)t

Adding a random slope

The final source of explained variation to examine with r2mlm() is v, random slopes.

We allow the effect of salary_c to vary by school via random slopes to attempt to explain more within-school variation in teacher satisfaction.

yij=β0j+β1jsalary_c+ϵijβ0j=γ00+γ01salary_m+u0jβ1j=γ10+u1j

# random slope, fixed effects and random intercept model
m_v.f.m <- lmer(satisfaction ~ salary_c + salary_m + (1 + salary_c | schoolID),
             data=teachsat)
summary(m_v.f.m)
## Linear mixed model fit by REML ['lmerMod']
## Formula: satisfaction ~ salary_c + salary_m + (1 + salary_c | schoolID)
##    Data: teachsat
## 
## REML criterion at convergence: 25376.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3085 -0.6585 -0.0081  0.6432  3.6531 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  schoolID (Intercept) 0.604140 0.77726      
##           salary_c    0.002291 0.04786  0.04
##  Residual             0.826908 0.90934      
## Number of obs: 9000, groups:  schoolID, 300
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 3.086462   0.267511   11.54
## salary_c    0.071689   0.002987   24.00
## salary_m    0.039479   0.003573   11.05
## 
## Correlation of Fixed Effects:
##          (Intr) slry_c
## salary_c  0.006       
## salary_m -0.985  0.000

Although the salary_c slope variance appears small at 0.0023, the standard deviation is 0.0479 (0.00230.0479), more than 50% the size of the coefficient itself (0.0717).

The residual variance has again decreased compared to the model with just fixed effects, suggesting that the random slope variation is explaining some of the total and within-school variance.

Now let’s see all possible R2 measures calculated for this model:

# model has all possible sources of explained variance
r2mlm(m_v.f.m)

## $Decompositions
##                      total    within   between
## fixed, within   0.17107144 0.2776837        NA
## fixed, between  0.11452685        NA 0.2982981
## slope variation 0.07624695 0.1237643        NA
## mean variation  0.26940742        NA 0.7017019
## sigma2          0.36874735 0.5985520        NA
## 
## $R2s
##          total    within   between
## f1  0.17107144 0.2776837        NA
## f2  0.11452685        NA 0.2982981
## v   0.07624695 0.1237643        NA
## m   0.26940742        NA 0.7017019
## f   0.28559828        NA        NA
## fv  0.36184523 0.4014480        NA
## fvm 0.63125265        NA        NA

Graphs

  • The model now explains a bit more than 60% of the total variance
  • salary_c slope variation (red stripes) explains a little less than 10% of the total variance
  • salary_c slope variation explains a bit more than 10% of the within-cluster variance, which is now about 40% explained by the model

$Decompositions

  • salary_c slope variation is the smallest source of explained variance
  • No values are 0, meaning all sources of variation are incorporated into this model

$R2s

  • R2(fvm)t0.631, the model explains about 63.1% of the total variance
  • R2(fv)t0.362, the predictors together explain about 36.2% of the total variance (via fixed and random slopes)
  • R2(fv)w0.401, salary_c explains about 40.1% of the within-school variance (via its fixed and random slope)

Confidence intervals for multilevel R2 measures

The function r2mlm_ci() calculates bootstrapped confidence intervals for each R2.

Using r2mlm_ci() requires the bootmlm package, which resamples multilevel data using the bootstrap.

The bootmlm package is still in development, so is not yet available for download from CRAN, but can be downloaded from github using the following code:

if (!require("remotes")) {
  install.packages("remotes")
}
remotes::install_github("marklhc/bootmlm")

The following arguments are required when calling r2mlm_ci()

  • model: an lmer() or nlme() object
  • nsim: number of bootstrap samples (e.g. 500 or 1000)
  • boottype: "parametric", which assumes normally distributed residuals, or "residual", which does not
  • confinttype: type of confidence interval, one of "norm" (normal), "basic", or "perc" (percentile)

Below, we request 95% percentile confidence intervals using 1000 residual bootstrap samples:

# set random seed to reproduce results
set.seed(1932)
# this takes a long time to run!
r2mlm_ci(m_v.f.m, nsim=1000, boottype="residual", confinttype = "perc",
         progress=F) # suppresses a progress bar from appearing in output
##                 2.5 %     97.5 %
## f1_total   0.14921313 0.19447585
## f2_total   0.08144853 0.15445686
## v_total    0.06288597 0.09266709
## m_total    0.23435219 0.30509080
## f_total    0.25205535 0.32365693
## fv_total   0.32817014 0.40108991
## fvm_total  0.60408394 0.65694501
## f1_within  0.24738847 0.30914729
## v_within   0.10441981 0.14750445
## fv_within  0.37114323 0.43443146
## f2_between 0.22192875 0.38347671
## m_between  0.61652329 0.77807125

As of writing this workshop, the r2mlm_ci() function appears to be still somewhat in development.

Comparing models

Although r2mlm() combines individual sources of explained variance of the same type (e.g. all level-1 fixed effects), we may be interested in the unique contribution of a single variable or smaller subset of variables.

Or, we may wish to compare R2 measures between multilevel models with completely different sets of predictors.

For both purposes we wish to calculate the difference in all R2 measures betwen 2 models, which r2mlm_comp() can do with 2 lmer() objects specified as arguments.

Example: calculating unique contributions to R2

Assume we are interested in calculating how much various R2 measures increase when we add the following to our previous model with fixed and random slopes of salary_c, fixed slope of salary_m, and random intercepts:

  • control_c fixed and random slope
  • control_m fixed slope

First, we fit the model with all of these components:

# model with additional predictors
m_all <-  lmer(satisfaction ~ salary_c + salary_m + control_c + control_m +
                 (1 + salary_c + control_c | schoolID),
             data=teachsat)
summary(m_all)
## Linear mixed model fit by REML ['lmerMod']
## Formula: satisfaction ~ salary_c + salary_m + control_c + control_m +  
##     (1 + salary_c + control_c | schoolID)
##    Data: teachsat
## 
## REML criterion at convergence: 22463.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3888 -0.6513 -0.0009  0.6350  3.9643 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr       
##  schoolID (Intercept) 0.609125 0.7805              
##           salary_c    0.002362 0.0486    0.07      
##           control_c   0.028816 0.1698    0.04 -0.13
##  Residual             0.553253 0.7438              
## Number of obs: 9000, groups:  schoolID, 300
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 2.692568   0.345176   7.801
## salary_c    0.072820   0.002960  24.604
## salary_m    0.040016   0.003557  11.251
## control_c   0.278388   0.011053  25.186
## control_m   0.072244   0.042491   1.700
## 
## Correlation of Fixed Effects:
##           (Intr) slry_c slry_m cntrl_c
## salary_c   0.008                      
## salary_m  -0.787  0.000               
## control_c  0.004 -0.107  0.000        
## control_m -0.637  0.000  0.044  0.000 
## optimizer (nloptwrap) convergence code: 0 (OK)
## Model failed to converge with max|grad| = 0.00825855 (tol = 0.002, component 1)

The large t value value for control_c suggests that the data provide strong evidence that it explains a significant portion of within-school variance, but the small t value for control_m suggests that the data do not provide strong evidence that it explains much variance (even though its coefficient is larger than that for salary_m).

Nest we use r2mlm_comp to calculate the difference in R2 measures between the two models:

r2mlm_comp(m_v.f.m, m_all)

## $`Model A R2s`
##          total    within   between
## f1  0.17107144 0.2776837        NA
## f2  0.11452685        NA 0.2982981
## v   0.07624695 0.1237643        NA
## m   0.26940742        NA 0.7017019
## f   0.28559828        NA        NA
## fv  0.36184523 0.4014480        NA
## fvm 0.63125265        NA        NA
## 
## $`Model B R2s`
##         total    within   between
## f1  0.2567172 0.4194249        NA
## f2  0.1180306        NA 0.3042572
## v   0.1102090 0.1800596        NA
## m   0.2698998        NA 0.6957428
## f   0.3747479        NA        NA
## fv  0.4849568 0.5994845        NA
## fvm 0.7548567        NA        NA
## 
## $`R2 differences, Model B - Model A`
##            total     within      between
## f1  0.0856457841 0.14174120           NA
## f2  0.0035037875         NA  0.005959122
## v   0.0339620354 0.05629528           NA
## m   0.0004924254         NA -0.005959122
## f   0.0891495715         NA           NA
## fv  0.1231116070 0.19803648           NA
## fvm 0.1236040324         NA           NA

The output contains:

  • Visualizations of the variance decompositions for both models
  • Visual comparisons between the models of the total, within, and between variance decompositions
    • control_c appears to have increased the contributions of both fixed slopes and slope variation to within-school variance
    • control_m appears to very slightly incresae the contribution of fixed slopes to between-school variance
  • R2 measures for each model
    • the fixed slope of salary_c explains about 8.6% more of the total variance (than the reduced model), and about 14.2% more of the within-school variance
    • the random slope of salary_c explains about 3.4% more of the total variance and about 5.6% more of the within-school variance
    • the fixed slope of salary_m explains about 0.35% more of total variance, and about 0.6% of the between-school variance
    • together, salary_c and salary_m explain about 12.4% more of the total variance

Example: comparing different sets of predictors

Suppose we are interested in whether salary or curriculum control explain more of the variation in teacher satisfaction.

To assess this, we will compare 2 models:

  • salary_c with fixed and random slopes, salary_m with fixed slope
  • control_c with fixed and random slopes, control_m with fixed slope

If you compare 2 non-nested models with r2mlm_comp(), you must specify the dataset as well.

# salary model
m_salary <- lmer(satisfaction ~ salary_c + salary_m +
                   (1 + salary_c | schoolID),
                 data=teachsat)

# curriculum control model
m_control <- lmer(satisfaction ~ control_c + control_m +
                   (1 + control_c | schoolID),
                 data=teachsat)

# compare models
r2mlm_comp(m_salary, m_control, data=teachsat)

## $`Model A R2s`
##          total    within   between
## f1  0.17107144 0.2776837        NA
## f2  0.11452685        NA 0.2982981
## v   0.07624695 0.1237643        NA
## m   0.26940742        NA 0.7017019
## f   0.28559828        NA        NA
## fv  0.36184523 0.4014480        NA
## fvm 0.63125265        NA        NA
## 
## $`Model B R2s`
##          total     within     between
## f1  0.07973695 0.12847381          NA
## f2  0.00128136         NA 0.003377757
## v   0.03112197 0.05014435          NA
## m   0.37807113         NA 0.996622243
## f   0.08101831         NA          NA
## fv  0.11214028 0.17861816          NA
## fvm 0.49021141         NA          NA
## 
## $`R2 differences, Model B - Model A`
##           total      within    between
## f1  -0.09133449 -0.14920993         NA
## f2  -0.11324548          NA -0.2949203
## v   -0.04512498 -0.07361995         NA
## m    0.10866371          NA  0.2949203
## f   -0.20457997          NA         NA
## fv  -0.24970495 -0.22282988         NA
## fvm -0.14104124          NA         NA
  • salary_c explains more within-school variation than control_c
  • salary_m explains more between-school variation than salary_m
  • Overall, salary explains about 14.1% more total variance than curriculum control

Using uncentered level-1 predictors

Despite general recommendations to group-mean center level-1 predictors, r2mlm() can be used with uncentered level-1 predictors.

Uncentered level-1 predictors potentially explain variation at both levels, thus:

  • f1 and f2 cannot be calculated, so f, the total variance explained by fixed effects is used instead
  • only total variance is decomposed, because within-cluster variance and between-cluster variance are defined using f1 and f2, respectively
  • random slope variation v can also explain variation at both levels
  • the model implied total variance is now

^var(yij)xuncentered=f+v+m+ˆσ2ϵ * Only R2(f)t, R2(v)t, R2(m)t, R2(fv)t, and R2(fvm)t will be reported (see Table 5 of Rights and Sterba(2019) for definitions)

Below, we sum control_c and control_m to create uncentered control, which we then model as a predictor:

# create uncentered level-1 control
teachsat$control <- teachsat$control_c + teachsat$control_m


# fixed and random slopes for control
m_control_un <- lmer(satisfaction ~ control + 
                              (1 + control | schoolID),
                            data=teachsat)

summary(m_control_un)
## Linear mixed model fit by REML ['lmerMod']
## Formula: satisfaction ~ control + (1 + control | schoolID)
##    Data: teachsat
## 
## REML criterion at convergence: 28017.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.2656 -0.6153 -0.0006  0.6128  3.9539 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  schoolID (Intercept) 1.50889  1.2284        
##           control     0.02637  0.1624   -0.63
##  Residual             1.14230  1.0688        
## Number of obs: 9000, groups:  schoolID, 300
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  4.69931    0.08008   58.68
## control      0.26333    0.01178   22.34
## 
## Correlation of Fixed Effects:
##         (Intr)
## control -0.705

We see that we only get decompositions and R2 measures related to total variance for a model with uncentered level-1 predictors:

# r2mlm with uncentered level-1 predictor
r2mlm(m_control_un)

## $Decompositions
##                      total
## fixed           0.10497843
## slope variation 0.03992823
## mean variation  0.38130827
## sigma2          0.47378506
## 
## $R2s
##          total
## f   0.10497843
## v   0.03992823
## m   0.38130827
## fv  0.14490667
## fvm 0.52621494
  • About 10.5% of the total variance of satisfaction is explained by the fixed slope of control
  • About 4% of the total variance is explained by variation of the random slope of control
  • After accounting for the fixed and random slopes of control, the random intercept variation explains about 38.1% of the total variance
  • The model explains about 52.6% of the total variance of satisfaction

Which R2 to report?

Rights and Sterba (2019) explicitly caution against reporting R2 measures that combine several sources (e.g. R2(fvm)t) without reporting the R2 measures that capture a single source.

They provide an example where R2(fvm)t=0.8 is reported alone, and thus it is unclear how much each of, f1, f2, v and m contribute.

The following figure illustrates the issue, where a model includes the fixed and random slope of a level-1 predictor and random intercepts:

Although R2(f,v,m)t=0.8 for all models, the interpretations are very different:

  • the left-most image depicts the level-1 fixed slope being the sole source of explained variance, suggesting a strong mean effect of the predictor but no random variation around this mean effect
  • the second image from the left depicts level-1 random slopes being the sole source of explained variance, suggesting a zero mean (fixed) effect of the predictor
  • the third image from the left depicts random intercept variation being the sole source of explained variance, with neither fixed nor random effects of the predictor
  • the right-most image depicts equal contributions of level-1 fixed slope, random slope variation and random intercept variation

The interpretation of R2(f,v,m)t=0.8 could be thus clarified by also reporting R2(f)t, R2(v)t, and R2(m)t.

Excerise 2

Use r2mlm() to calculate R2 for model with models run in Exercise 1.

Add additional predictors and do same

References

Original theoretical article introducing new R2 measures

Rights, J. D., & Sterba, S. K. (2019). Quantifying explained variance in multilevel models: An integrative framework for defining R-squared measures. Psychological methods, 24(3), 309.

Article that reviews multilevel models and discusses usage of r2mlm in detail

Shaw, M., Rights, J. D., Sterba, S. S., & Flake, J. K. (2023). r2mlm: An R package calculating R-squared measures for multilevel models. Behavior Research Methods, 55(4), 1942-1964.

Extension: R2 for multilevel models with more than 2 levels

Rights, J. D., & Sterba, S. K. (2023). R-squared measures for multilevel models with three or more levels. Multivariate behavioral research, 58(2), 340-367.

Thank you!