Decomposing, Probing, and Plotting Interactions in R

Purpose

This seminar will show you how to decompose, probe, and plot two-way interactions in linear regression using the emmeans package in the R statistical programming language. For users of Stata, refer to Decomposing, Probing, and Plotting Interactions in Stata.

Outline

Throughout the seminar, we will be covering the following types of interactions:

Continuous by continuous
Continuous by categorical
Categorical by categorical

We can probe or decompose each of these interactions by asking the following research questions:

What is the predicted Y given a particular X and W? (predicted value)
What is relationship of X on Y at particular values of W? (simple slopes/effects)
Is there a difference in the relationship of X on Y for different values of W? (comparing simple slopes)

Proceed through the seminar in order or click on the hyperlinks below to go to a particular section:

Main vs. Simple effects (slopes)
Predicted Values vs. Slopes
- Understanding slopes in regression
- Plotting a regression slope
Continuous by Continuous
Continuous by Categorical
Categorical by Categorical

This seminar page was inspired by Analyzing and Visualizing Interactions in SAS. This page covers two way and three way interaction decompositions in the SAS programming language.

Requirements

Before beginning the seminar, please make sure you have R and RStudio installed.

Please also make sure to have the following R packages installed, and if not, run these commands in R (RStudio).

install.packages("emmeans", dependencies=TRUE) 
install.packages("ggplot2", dependencies=TRUE)

The dataset used in the seminar can be found here: exercise.csv. You can also import the data directly into R via the URL using the following code:

dat <- read.csv("https://stats.idre.ucla.edu/wp-content/uploads/2019/03/exercise.csv")

Before you begin the seminar, load the data as above and convert gender and prog (exercise type) into factor variables:

dat$prog <- factor(dat$prog,labels=c("jog","swim","read"))
dat$gender <- factor(dat$gender,labels=c("male","female"))

Download the complete R code

You may download the complete R code here: interactions.r

After clicking on the link, you can copy and paste the entire code into R or RStudio.

Motivation

Suppose you are doing a simple study on weight loss and notice that people who spend more time exercising lose more weight. Upon further analysis you notice that those who spend the same amount of time exercising lose more weight if they are more effortful. The more effort people put into their workouts, the less time they need to spend exercising. This is popular in workouts like high intensity interval training (HIIT).

You know that hours spent exercising improves weight loss, but how does it interact with effort? Here are three questions you can ask based on hypothetical scenarios.

I’m just starting out and don’t want to put in too much effort. How many hours per week of exercise do I need to put in to lose 5 pounds?
I’m moderately fit and can put in an average level of effort into my workout. For every one hour increase per week in exercise, how much additional weight loss do I expect?
I’m a crossfit athlete and can perform with the utmost intensity. How much more weight loss would I expect for every one hour increase in exercise compared to the average amount of effort most people put in?

Additionally, we can visualize the interaction to help us understand these relationships.

Weight Loss Study

This is a hypothetical study of weight loss for 900 participants in a year-long study of 3 different exercise programs, a jogging program, a swimming program, and a reading program which serves as a control activity. Variables include

loss: weight loss (continuous), positive = weight loss, negative scores = weight gain

hours: hours spent exercising (continuous)

effort: effort during exercise (continuous), 0 = minimal physical effort and 50 = maximum effort

prog: exercise program (categorical)

jogging=1
swimming=2
reading=3

gender: participant gender (binary)

male=1
female=2

Definitions

What exactly do I mean by decomposing, probing, and plotting an interaction?

decompose: to break down the interaction into its lower order components (i.e., predicted means or simple slopes)
probe: to use hypothesis testing to assess the statistical significance of simple slopes and simple slope differences (i.e., interactions)
plot: to visually display the interaction in the form of simple slopes such as values of the dependent variable are on the y-axis, values of the predictor is on the x-axis, and the moderator separates the lines or bar graphs

Let’s define the essential elements of the interaction in a regression:

DV: dependent variable (Y), the outcome of your study (e.g., weight loss)
IV: independent variable (X), the predictor of your outcome (e.g., time exercising)
MV: moderating variable (W) or moderator, a predictor that changes the relationship of the IV on the DV (e.g, effort)
coefficient: estimate of the direction and magnitude of the relationship between an IV and DV
continuous variable: a variable that can be measured on a continuous scale, e.g., weight, height
categorical or binary variable: a variable that takes on discrete values, binary variables take on exactly two values, categorical variables can take on 3 or more values (e.g., gender, ethnicity)
main effects or slopes: effects or slopes for models that do not involve interaction terms
simple slope: when a continuous IV interacts with an MV, its slope at a particular level of an MV
simple effect: when a categorical IV interacts with an MV, its effect at a particular level of an MV

Purpose

Outline

Requirements

Download the complete R code

Motivation

Weight Loss Study

Definitions

Main vs. Simple effects (slopes)

Predicted Values vs. Slopes

Understanding slopes in regression

Exercise

Plotting a regression slope

Continuous by Continuous

Be wary of extrapolation

Simple slopes for a continuous by continuous model

Plotting a continuous by continuous interaction

Testing simple slopes in a continuous by continuous model

Testing differences in predicted values at a particular level of the moderator

(Optional) Manually calculating the simple slopes

Exercise

(Optional) Creating a publication quality graph with ggplot

Exercise

Continuous by Categorical

Interpreting the coefficients of the continuous by categorical interaction

Obtaining simple slopes by each level of the categorical moderator

Exercise

(Optional) Flipping the moderator (MV) and the independent variable (IV)

Plotting the continuous by categorical interaction

(Optional) Exercise

Categorical by Categorical

Simple effects in a categorical by categorical interaction

Exercise

Exercise

Plotting the categorical by categorical interaction

(Optional) Plotting simple effects using bar graphs with ggplot

Conclusion

Final Exercise