Purpose
This seminar will show you how to decompose, probe, and plot two-way interactions in linear regression using the margins command in Stata. This page is based on the seminar Decomposing, Probing, and Plotting Interactions in R.
Outline
Throughout the seminar, we will be covering the following types of interactions:
- Continuous by continuous
- Continuous by categorical
- Categorical by categorical
We can probe or decompose each of these interactions by asking the following research questions:
- What is the predicted Y given a particular X and W? (predicted value)
- What is the relationship of X on Y at particular values of W? (simple slopes/effects)
- Is there a difference in the relationship of X on Y for different values of W? (comparing simple slopes)
Proceed through the seminar in order or click on the hyperlinks below to go to a particular section:
- Main vs. Simple effects (slopes)
- Predicted Values vs. Slopes
- Continuous by Continuous
- Be wary of extrapolation
- Simple slopes for a continuous by continuous model
- Plotting a continuous by continuous interaction
- Testing simple slopes in a continuous by continuous model
- Testing differences in predicted values at a particular level of the moderator
- (Optional) Manually calculating the simple slopes
- Continuous by Categorical
- Categorical by Categorical
This seminar page was inspired by Analyzing and Visualizing Interactions in SAS and parallels Decomposing, Probing, and Plotting Interactions in R.
Requirements
Before beginning the seminar, please make sure you have Stata installed. The dataset used in the seminar can be found here as a Stata file exercise.dta or as a CSV file exercise.csv. You can also import the data directly into Stata via the URL using the following code:
use https://stats.idre.ucla.edu/wp-content/uploads/2020/06/exercise, clear
Before you begin the seminar, load the data as above and create value labels for gender and prog (exercise type):

label define progl 1 "jog" 2 "swim" 3 "read"
label define genderl 1 "male" 2 "female"
label values prog progl
label values gender genderl
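To verify that the labels were defined and attached as intended, the label list command prints the label definitions:

label list progl genderl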
Download links
- PowerPoint slides: interactions_stata.pptx
- PowerPoint slides as a PDF: interactions_stata.pdf
- Complete Stata code: interactions.do
- Part 2 + Lab PowerPoint slides: interactions_stata_lab.pptx
- Part 2 + Lab PowerPoint slides as a PDF: interactions_stata_lab.pdf
- (Bonus) Three-way interactions PowerPoint slides: three-way interactions.pptx
- (Bonus) Three-way interactions Stata Do File: threeway-interactions.do
Motivation
Suppose you are doing a simple study on weight loss and notice that people who spend more time exercising lose more weight. Upon further analysis you notice that, among those who spend the same amount of time exercising, people who exert more effort lose more weight. The more effort people put into their workouts, the less time they need to spend exercising. This idea is popular in workouts like high intensity interval training (HIIT).
You know that hours spent exercising improves weight loss, but how does it interact with effort? Here are three questions you can ask based on hypothetical scenarios.
- I’m just starting out and don’t want to put in too much effort. How many hours per week of exercise do I need to put in to lose 5 pounds?
- I’m moderately fit and can put in an average level of effort into my workout. For every one hour increase per week in exercise, how much additional weight loss do I expect?
- I’m a crossfit athlete and can perform with the utmost intensity. How much more weight loss would I expect for every one hour increase in exercise compared to the average amount of effort most people put in?
Additionally, we can visualize the interaction to help us understand these relationships.
Weight Loss Study
This is a hypothetical study of weight loss for 900 participants in a year-long study of 3 different exercise programs: a jogging program, a swimming program, and a reading program, which serves as a control activity. Variables include
loss: weight loss (continuous), positive scores = weight loss, negative scores = weight gain
hours: hours spent exercising (continuous)
effort: effort during exercise (continuous), 0 = minimal physical effort and 50 = maximum effort
prog: exercise program (categorical)
- jogging=1
- swimming=2
- reading=3
gender: participant gender (binary)
- male=1
- female=2
Definitions
What exactly do I mean by decomposing, probing, and plotting an interaction?
- decompose: to break down the interaction into its lower order components (i.e., predicted means or simple slopes)
- probe: to use hypothesis testing to assess the statistical significance of simple slopes and simple slope differences (i.e., interactions)
- plot: to visually display the interaction in the form of simple slopes, such that values of the dependent variable are on the y-axis, values of the predictor are on the x-axis, and the moderator separates the lines or bar graphs
Let’s define the essential elements of the interaction in a regression:
- DV: dependent variable (Y), the outcome of your study (e.g., weight loss)
- IV: independent variable (X), the predictor of your outcome (e.g., time exercising)
- MV: moderating variable (W) or moderator, a predictor that changes the relationship of the IV on the DV (e.g., effort)
- coefficient: estimate of the direction and magnitude of the relationship between an IV and DV
- continuous variable: a variable that can be measured on a continuous scale, e.g., weight, height
- categorical or binary variable: a variable that takes on discrete values, binary variables take on exactly two values, categorical variables can take on 3 or more values (e.g., gender, ethnicity)
- main effects or slopes: effects or slopes for models that do not involve interaction terms
- simple slope: when a continuous IV interacts with an MV, its slope at a particular level of an MV
- simple effect: when a categorical IV interacts with an MV, its effect at a particular level of an MV
Main vs. Simple effects (slopes)
Let’s do a brief review of multiple regression. Suppose you have an outcome $Y$, and two continuous independent variables $X$ and $W$.
$$\hat{Y}= \hat{b}_0 + \hat{b}_1 X + \hat{b}_2 W$$
We can interpret the coefficients as follows:
- $\hat{b}_0$: the intercept, or the predicted outcome when $X=0$ and $W=0$.
- $\hat{b}_1$: the slope (or main effect) of $X$; for a one unit change in $X$ the predicted change in $Y$
- $\hat{b}_2$: the slope (or main effect) of $W$; for a one unit change in $W$ the predicted change in $Y$
Here only the intercept is interpreted at zero values of the IV’s. For a more thorough explanation of multiple regression look at Section 1.5 of the seminar Introduction to Regression with SPSS.
Interactions are formed by the product of any two variables.
$$\hat{Y}= \hat{b}_0 + \hat{b}_1 X + \hat{b}_2 W + \hat{b}_3 XW$$
Each coefficient is interpreted as:
- $\hat{b}_0$: the intercept, or the predicted outcome when $X=0$ and $W=0$.
- $\hat{b}_1$: the simple effect or slope of $X$, for a one unit change in $X$ the predicted change in $Y$ at $W=0$
- $\hat{b}_2$: the simple effect or slope of $W$, for a one unit change in $W$ the predicted change in $Y$ at $X=0$
- $\hat{b}_3$: the interaction of $X$ and $W$, the change in the slope of $X$ for a one unit increase in $W$ (or vice versa)
For an interaction model, not only is the intercept fixed at 0 of $X$ and $W$, but each coefficient of an IV interacted with an MV is interpreted at zero of the MV. We say that the effect of $X$ varies by levels of $W$ (and equivalently, that the effect of $W$ varies by levels of $X$). Focusing on $X$ as our IV and $W$ as our MV, with a rearrangement of the terms in our model,
$$\hat{Y}= \hat{b}_0 + \hat{b}_2 W + (\hat{b}_1+\hat{b}_3 W)X$$
we can see that the coefficient for $X$ is now $\hat{b}_1+\hat{b}_3 W$, which means the coefficient of $X$ is a function of $W$. For example, if $W=0$ then the slope of $X$ is $\hat{b}_1$; but if $W=1$, the slope of $X$ is $\hat{b}_1+\hat{b}_3$. The coefficient $\hat{b}_3$ is thus the additional increase in the effect or slope of $X$ as $W$ increases by one unit.
Predicted Values vs. Slopes
After fitting a regression model, we are often interested in the predicted mean given a fixed value of the IV’s or MV’s. For example, suppose we want to know the predicted weight loss after putting in two hours of exercise. We fit the main effects model,
$$\hat{\mbox{WeightLoss}}= \hat{b}_0 + \hat{b}_1*\mbox{Hours}.$$
Using the command regress, we specify the following syntax, where the dependent variable comes before the independent variables:
regress loss hours
and obtain the following (shortened output):
------------------------------------------------------------------------------
        loss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       hours |   2.469591   .9478805     2.61   0.009     .6092722    4.32991
       _cons |    5.07572   1.955005     2.60   0.010     1.238809   8.912632
------------------------------------------------------------------------------
We can use the coefficients obtained from the table to specify the equation form and get:
$$\hat{\mbox{WeightLoss}}= 5.08 + 2.47*\mbox{Hours}.$$
We can plug in $\mbox{Hours}=2$ to get
$$\hat{\mbox{WeightLoss}}= 5.08 + 2.47 (2) = 10.02.$$
The predicted weight loss is 10.02 pounds from 2 hours of exercise (remember, this is a hypothetical study). This is an example that we can work by hand, but we can also ask margins to help us.
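Before turning to margins, note that you can also let Stata do this arithmetic, since regress stores its estimated coefficients, which can be referenced as _b[name]. A minimal sketch using the unrounded coefficients:

* predicted weight loss at hours = 2, using coefficients stored by regress
display _b[_cons] + _b[hours]*2

This returns 10.0149, which is what margins reports below.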
The margins command (introduced in Stata 11) is a post-estimation command to obtain marginal means, predicted values, and simple slopes. Post-estimation means that you must run a type of linear model, here the regress command, before running margins. In our example, we are requesting predicted values using the at option. In Stata, command options follow after a comma.
margins, at(hours=2)
Expression   : Linear prediction, predict()
at           : hours           =           2

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |    10.0149   .4685077    21.38   0.000     9.095405    10.9344
------------------------------------------------------------------------------
We get a predicted value of 10.01, which matches our hand-computed value of 10.02 (the small difference comes from rounding the coefficients). Losing 10 pounds of weight for 2 hours of exercise seems a little unrealistic. Maybe this study was conducted on the moon.
Understanding slopes in regression
Now that we understand predicted values, how do we obtain a slope? A slope is defined as
$$ b = \frac{\mbox{change in } Y }{\mbox{change in } X } =\frac{\Delta Y }{\Delta X } $$
where $\Delta Y = y_2 - y_1$ and $\Delta X = x_2 - x_1$. In regression, we usually talk about a one-unit change in $X$, so that $\Delta X = 1 - 0 = 1$. This makes our slope formula simply $$ b = \Delta Y.$$ So how do we find our slope? Going back to our original equation,
$$\hat{\mbox{WeightLoss}}= 5.08 + 2.47 * \mbox{Hours}. $$
We can interpret $\hat{b}_1 = 2.47$ as a slope, since $\hat{b}_1$ is the change in $Y$ for a one unit change in $X$. In our case, for a one hour increase in exercise time, we expect 2.47 additional pounds of weight loss. In fact, we can derive the slope by obtaining two predicted values, one for Hours = 0 and another for Hours = 1. The symbol "|" means given, which means holding a variable at a particular value. Plugging $X=0$ into our original regression equation,
$$\hat{\mbox{WeightLoss}} | _{\mbox{Hours}= 0} = 5.08 + 2.47 (0) = 5.08.$$ Similarly, we can predict weight loss for Hours = 1 $$\hat{\mbox{WeightLoss}} | _{\mbox{Hours} = 1} = 5.08 + 2.47 (1) = 7.55.$$
Now we have $y_2 = 7.55$ and $y_1 = 5.08$; using our slope formula, $b = 7.55 - 5.08 = 2.47$. We obtain 2.47, which equals our value of $\hat{b}_1$. This means that $\hat{b}_1$ is in fact the slope of Hours, given a one hour change. Of course we can easily obtain the slope from the regress output, but for edification purposes, let's see how we can obtain this using margins. However, instead of using the at option, we use the option dydx, which stands for the partial derivative. Without going into calculus, you can think of the partial derivative in this case as the slope of hours.
margins, dydx(hours)

Expression   : Linear prediction, predict()
dy/dx w.r.t. : hours

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       hours |   2.469591   .9478805     2.61   0.009     .6092722    4.32991
------------------------------------------------------------------------------
We are telling margins to calculate the partial derivative, or the slope, for the variable Hours. Since this is a main effect (or slope), the slope does not vary as we vary the value of hours. In the Continuous by Continuous section, we will see how the slope can change in the presence of an interaction.
Quiz: (True or False) In the margins command, the option dydx is used to estimate predicted values and at is used to estimate simple slopes.

Answer: False. The option dydx estimates simple slopes and at estimates predicted values.
Exercise
Predict two values of weight loss for Hours = 10 and Hours = 20 using at, then calculate the slope by hand. How do the results compare with dydx?
Plotting a regression slope
Visualizing is always a good thing. Let's suppose we want to create a plot of the relationship of Hours and Weight Loss. The command marginsplot allows us to easily plot this. Note that marginsplot must always follow margins. First, decide on a range of values of Hours so that we have enough points on the x-axis to create a line. Recall that to obtain predicted values of Weight Loss at specific Hours we need the at option. Here the code at(hours=(0(1)4)) tells Stata that we want the minimum to be 0 and the maximum to be 4; the 1 in the parentheses means increment by 1. This is equivalent to specifying at(hours=(0 1 2 3 4)). Follow this code with marginsplot to generate the graph; Stata automatically knows to put Hours on the x-axis and Weight Loss on the y-axis.
margins, at(hours=(0(1)4))
marginsplot
The plot looks like the following and can be interpreted as: for every one unit increase in Hours, there is a 2.47 pound increase in Weight Loss.
In the next section we will discuss how to estimate and interpret slopes that vary with levels of another variable (i.e., a simple slope).
Continuous by Continuous
We know that amount of exercise is positively related with weight loss. Given the recent news about the efficacy of high intensity interval training (HIIT), perhaps we can achieve the same weight loss goals in a shorter time interval if we increase our exercise intensity. The question we ask is: does Effort (W) moderate the relationship of Hours (X) on Weight Loss (Y)? The model to address this research question is
$$\hat{\mbox{WeightLoss}}= \hat{b}_0 + \hat{b}_1 \mbox{Hours} + \hat{b}_2 \mbox{Effort} + \hat{b}_3 \mbox{Hours*Effort}.$$
We can fit this in Stata with the following code:
regress loss c.hours##c.effort
Two things to note. First, when you specify an interaction in Stata, it's preferable to also specify whether each predictor is continuous or categorical (by default Stata assumes interaction variables are categorical). A c. precedes a continuous variable and an i. precedes a categorical one. Here we have two continuous variables, so we specify c.hours and c.effort. Second, the ## between the two variables specifies a two-way interaction and is equivalent to adding the lower order terms to the interaction term specified by a single #:
regress loss hours effort c.hours#c.effort
You then obtain the following shortened output:
----------------------------------------------------------------------------------
            loss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
           hours |  -9.375681   5.663921    -1.66   0.098    -20.49178   1.740415
          effort |  -.0802763   .3846469    -0.21   0.835    -.8351902   .6746375
                 |
c.hours#c.effort |   .3933468   .1875044     2.10   0.036     .0253478   .7613458
                 |
           _cons |   7.798637   11.60362     0.67   0.502    -14.97479   30.57207
----------------------------------------------------------------------------------
The interaction Hours*Effort is significant, which suggests that the relationship of Hours on Weight loss varies by levels of Effort. Before decomposing the interaction let’s interpret each of the coefficients.
- $\hat{b}_0$ (_cons): the intercept, or the predicted outcome when Hours = 0 and Effort = 0.
- $\hat{b}_1$ (hours): the simple slope of Hours; for a one unit change in Hours, the predicted change in weight loss at Effort = 0.
- $\hat{b}_2$ (effort): the simple slope of Effort; for a one unit change in Effort, the predicted change in weight loss at Hours = 0.
- $\hat{b}_3$ (c.hours#c.effort): the interaction of Hours and Effort; the change in the slope of Hours for every one unit increase in Effort (or vice versa).
Before proceeding, let's discuss our finding for $\hat{b}_1$, which is negative. Although we might expect the slope of Hours to be positive, remember that this slope is interpreted at Effort = 0. As we see in our data, that value is improbable: the minimum value of effort is 12.95.
summarize effort

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
      effort |        900    29.65922    5.142764     12.949   44.07604
Be wary of extrapolation
Suppose we want to find the predicted weight loss given two hours of exercise and an effort of 30. As before, use the at option after margins to specify 2 for Hours and 30 for Effort:
margins, at(hours=2 effort=30)
The output we obtain is:
Expression   : Linear prediction, predict()
at           : hours           =           2
               effort          =          30

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   10.23979   .4530791    22.60   0.000     9.350572   11.12901
------------------------------------------------------------------------------
The results show that predicted weight loss is 10.2 pounds if we put in two hours of exercise and an effort level of 30; this seems reasonable. Let’s see what happens when we predict weight loss for two hours of exercise given an effort level of 0.
margins, at(hours=2 effort=0)

Expression   : Linear prediction, predict()
at           : hours           =           2
               effort          =           0

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |  -10.95273   2.647602    -4.14   0.000    -16.14895  -5.756502
------------------------------------------------------------------------------
The predicted weight change at Effort=0 and Hours=2 is now a gain of 11 pounds. It is unreasonable to assume that a person who exercises for 2 hours will, first of all, exert no effort and, second of all, gain weight! This is a demonstration of extrapolation, which means making predictions beyond what the data can support. This is why we should always choose reasonable values of our predictors in order to interpret our results properly.
In order to achieve plausible values, some researchers may choose to do what is known as centering, which is subtracting a constant $c$ from the variable so that $X^{*} = X-c$. Since $X^{*} = 0$ implies $X=c$, the intercept, simple slopes and simple effects are interpreted at $X=c$. A popular constant is $c=\bar{x}$, so that all intercepts, simple slopes and simple effects are interpreted at the mean of $X$. Extrapolation then becomes a non-issue.
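As a sketch of what centering looks like in Stata (the variable name effortc is our own choice), one could write:

* store the mean of effort, then subtract it to create a centered copy
summarize effort, meanonly
generate effortc = effort - r(mean)
* refit the interaction using the centered moderator
regress loss c.hours##c.effortc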
Simple slopes for a continuous by continuous model
We know to choose reasonable values when predicting values. The same concept applies when decomposing an interaction. Our output suggests that the slope of Hours varies by levels of Effort. Since effort is continuous, we can choose an infinite set of values at which to fix effort. For ease of presentation, some researchers pick three representative values of Effort at which to estimate the slope of Hours. This is often known as spotlight analysis, made popular by Leona Aiken and Stephen West's book Multiple Regression: Testing and Interpreting Interactions (1991). Traditional spotlight analysis was designed for a continuous by categorical interaction, but we will borrow the same concept here. What three values should we choose? Aiken and West recommend plotting three lines for Hours: one at the mean level of Effort, a second at one standard deviation above the mean level of Effort, and a third at one standard deviation below the mean level of Effort. In symbols, we have
$$ \begin{eqnarray} \mbox{EffA} & = & \overline{\mbox{Effort}} + \sigma({\mbox{Effort}}) \\ \mbox{Eff} & = & \overline{\mbox{Effort}} \\ \mbox{EffB} & = & \overline{\mbox{Effort}} - \sigma({\mbox{Effort}}). \end{eqnarray} $$
In Stata, we can use summarize to store the mean and standard deviation of effort in a list of returned results. The command return list then displays the objects in that list:
summarize effort
return list

scalars:
                  r(N) =  900
              r(sum_w) =  900
               r(mean) =  29.65921892801921
                r(Var) =  26.44801661342404
                 r(sd) =  5.142763519103716
                r(min) =  12.94899940490723
                r(max) =  44.07604217529297
                r(sum) =  26693.29703521729
The object r(mean) gives us the mean and r(sd) gives us the standard deviation. We can then store these values into what is known as a global variable, which allows the user to recall the value of that variable for future use. In the commands below, we create three global variables, effa, eff, and effb, which correspond to "high", "medium", and "low" values of effort. For ease of presentation in the plot we will create later, the function round(x,0.1) rounds the value we obtain to the tenths digit.
global effa = round(r(mean) + r(sd),0.1)
global eff  = round(r(mean),0.1)
global effb = round(r(mean) - r(sd),0.1)
The mean of effort is about 30. A quick way to check the values of one standard deviation above and one standard deviation below is to make sure that the former (34.8) is higher than the latter (24.5). The command display allows us to view the values of the newly created global variables.
display $effa
34.8
display $effb
24.5
Recall that our simple slope is the relationship of hours on weight loss fixed at particular values of effort. In Stata, we can obtain simple slopes using the margins option dydx. First, we specify the three values of effort we found above in preparation for spotlight analysis using the at option.
margins, dydx(hours) at(effort=($effa $eff $effb))

Expression   : Linear prediction, predict()
dy/dx w.r.t. : hours
1._at        : effort          =        34.8
2._at        : effort          =        29.7
3._at        : effort          =        24.5

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
hours        |
         _at |
          1  |   4.312787   1.308387     3.30   0.001     1.744927   6.880646
          2  |   2.306719   .9148823     2.52   0.012      .511157   4.102281
          3  |   .2613151   1.352052     0.19   0.847    -2.392243   2.914874
------------------------------------------------------------------------------
The margins command is a post-estimation command that follows regress loss c.hours##c.effort. Note that our global and return commands do not interfere with regress because only the latter is an estimation command. The dydx(hours) option means that we want the simple slope of hours, and the at option specifies that we want these simple slopes at three values of effort, "low", "medium", and "high" (here 24.5, 29.7, and 34.8).
Looking at the output, we get three separate (simple) slopes for hours. As we increase levels of Effort, the relationship of hours on weight loss seems to increase. Looking at the t-tests, the p-values indicate that the simple slopes of hours at "high" and "medium" effort are significant, but not at "low". We can confirm this with [95% Conf. Interval]: if the interval contains zero, then the simple slope is not significant. Here it seems that the simple slope of Hours is significant only for mean Effort levels and above. Although not shown in the output, in case the user is curious, the degrees of freedom is 896 because we have a sample size of 900, 3 predictors and 1 intercept: $df= n-p-1 = 900-3-1=896.$ Now that we understand the output, let's see what these three simple slopes look like visually.
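Rather than computing the degrees of freedom by hand, you can also ask Stata directly, since regress stores the residual degrees of freedom in e(df_r):

* residual degrees of freedom stored by the last regress; 896 here
display e(df_r)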
Plotting a continuous by continuous interaction
In order to plot our interaction, we want the IV (Hours) on the x-axis and the MV (Effort) to separate the lines. For the x-axis, we need a sequence of values to span a reasonable range of Hours, but we need only three values of Effort for spotlight analysis. First, use the at option to create a sequence of Hours values from 0 to 4, incremented by 1, with Effort assigned one standard deviation below the mean, at the mean, and one standard deviation above the mean. The code and output are listed below. We advise checking the margins output to confirm that you specified the list correctly. Then we can follow margins with marginsplot. Stata knows to plot Hours on the x-axis and separate lines by Effort because of the order in which we specified the at option. If we had used at(effort=($effa $eff $effb) hours=(0(1)4)), then Effort would have been on the x-axis separated by Hours.
margins, at(hours=(0(1)4) effort=($effa $eff $effb))
marginsplot
Quiz: (True or False) The command margins, at(hours=(0(1)4) effort=($effa $eff $effb)) tells Stata to plot Hours as the independent variable and Effort as the moderator.
Answer: True
Here is the plot we obtain:
Just as we observed from margins, dydx(hours) at(effort=($effa $eff $effb)), the simple slope of Hours at "low" effort is flat, but is positive for "medium" effort and above. The results suggest that hours spent exercising is only effective for weight loss if we put in more effort, which supports the rationale for high intensity interval training. At the highest levels of Effort, we achieve greater weight loss for a given time input.
(Optional) Obtaining confidence interval bands
In case you prefer confidence bands over confidence bars, you can modify the marginsplot code using the option recast. The option recast(line) draws the predicted regression line but removes the points associated with each value on the x-axis; this lets us draw a continuous line, which looks more aesthetically pleasing when we overlay the confidence band. The option recastci(rarea) "recasts" the confidence interval (CI) bars into confidence ribbons, and finally ciopt(color(%30)) adds transparency to the ribbons. The lower the %##, the more transparent the ribbon. Note that transparency is available in Stata 15 or higher.
margins, at(hours=(0(1)4) effort=($effa $eff $effb))
marginsplot, recast(line) recastci(rarea) ciopt(color(%30))
These commands result in the following plot:
Testing simple slopes in a continuous by continuous model
Looking at the graph, we may think that only the slope of “low” effort is significantly different from the others. Let’s confirm if this is true with the following syntax:
margins, dydx(hours) at(effort=($effa $eff $effb)) pwcompare(effects)
First recall that dydx(hours) requests the simple slopes of hours and at(effort=($effa $eff $effb)) requests these simple slopes at low, mean, and high values of Effort. The new option pwcompare requests pairwise differences of the simple slopes of Hours across the levels of effort. Finally, (effects) is a sub-option of pwcompare that requests the unadjusted p-values as well as the 95% confidence intervals for each simple slope comparison.
The output we obtain is as follows:
Expression   : Linear prediction, predict()
dy/dx w.r.t. : hours
1._at        : effort          =        34.8
2._at        : effort          =        29.7
3._at        : effort          =        24.5

------------------------------------------------------------------------------
             |   Contrast Delta-method    Unadjusted           Unadjusted
             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
hours        |
         _at |
     2 vs 1  |  -2.006068   .9562721    -2.10   0.036    -3.882862  -.1292739
     3 vs 1  |  -4.051472   1.931295    -2.10   0.036    -7.841861  -.2610826
     3 vs 2  |  -2.045404    .975023    -2.10   0.036    -3.958999  -.1318087
------------------------------------------------------------------------------
We can see that even though the actual pairwise comparisons differ, the t-statistics and p-values for all three pairwise comparisons are the same. Furthermore, this p-value matches that of the interaction term in regress loss c.hours##c.effort
----------------------------------------------------------------------------------
            loss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
... (output omitted) ...
c.hours#c.effort |   .3933468   .1875044     2.10   0.036     .0253478   .7613458
----------------------------------------------------------------------------------
This is not a coincidence: for a continuous by continuous interaction, all comparisons of simple slopes result in the same p-value as the interaction itself (it has to do with the slope formula from above and the fact that each slope difference is proportional to the interaction coefficient). Back to our intuition that the slope for the "low" effort group is lower than that of the "medium" or "high" group: based on the tests above, yes, the magnitude of the slope difference is larger between "low" and "high" than between "medium" and "high", but the p-values are the same.
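Here is a sketch of why the p-values must match. The difference between the simple slopes of Hours at any two effort values $w_1$ and $w_2$ is

$$ (\hat{b}_1 + \hat{b}_3 w_1) - (\hat{b}_1 + \hat{b}_3 w_2) = \hat{b}_3 (w_1 - w_2), $$

a fixed multiple of $\hat{b}_3$. Its standard error is scaled by the same factor $|w_1 - w_2|$, so the t-statistic, and therefore the p-value, is identical to that of $\hat{b}_3$ itself.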
Testing differences in predicted values at a particular level of the moderator
Although the p-values of the differences in the Hours slopes are identical across levels of Effort in a continuous by continuous interaction, the predicted weight loss at a given number of hours may still differ by levels of Effort. From the plot, we may suspect that for a person who has exercised for four hours, there is a difference in predicted weight loss at low effort (one SD below the mean) versus high effort (one SD above).
In order to test this, we generate a new list where hours=4 and effort is specified only at the low and high levels (look back at the predicted values section). We are no longer testing simple slopes but predicted values, so we use the at option and not dydx. First, check that the list is correctly specified at Hours = 4 and Effort at the low and high levels.
margins, at(hours=4 effort=($effa $effb))
Looking at 1._at and 2._at, the output corresponds to the predicted Weight Loss at Hours = 4 and Effort = 34.8 and 24.5, respectively. The values we obtain are 22.26 and 6.88.
Expression   : Linear prediction, predict()
1._at        : hours           =           4
               effort          =        34.8
2._at        : hours           =           4
               effort          =        24.5

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   22.25617   2.683877     8.29   0.000     16.98875   27.52359
          2  |   6.877127   2.789889     2.47   0.014     1.401648   12.35261
------------------------------------------------------------------------------
Next, we request the pairwise difference of these two predicted values at the Hours and Effort values specified, using the option and sub-option pwcompare(effects) after margins.
margins, at(hours=4 effort=($effa $effb)) pwcompare(effects)
The output we obtain is as follows
... (output omitted) ...
------------------------------------------------------------------------------
             |            Delta-method    Unadjusted           Unadjusted
             |   Contrast   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
     2 vs 1  |  -15.37904   3.972949    -3.87   0.000    -23.17641   -7.58167
------------------------------------------------------------------------------
From the pairwise contrast we verify that 6.88 - 22.26 = -15.38, and the p-value indicates that the contrast is significant, verifying our observation from the graph that for 4 hours of exercise, high effort results in about 15 pounds more weight loss than low effort. Note that Unadjusted means the p-value and 95% confidence interval have not been corrected for Type 1 error. Since we only made one comparison, no Type 1 error correction is required.
(Optional) Manually calculating the simple slopes
Recall that for a continuous by continuous interaction model we can re-arrange the equation such that
$$\hat{Y}= \hat{b}_0 + \hat{b}_2 W + (\hat{b}_1+\hat{b}_3 W)X.$$
Rearranging the terms gives the simple slope of $X$, namely $(\hat{b}_1+\hat{b}_3 W)$, the coefficient multiplying $X$, which means that the slope of $X$ depends on the value of $W$. Let's see how we can derive the simple slope by hand (without calculus).
First, we consider the definition of the simple slope of $X$, which is defined as the slope of $X$ for a fixed value of $W=w$. Recall that the slope (which we call $m$) is the change in $Y$ over the change in $X$,
$$ m = \frac{\mbox{change in } Y }{\mbox{change in } X } =\frac{\Delta Y }{\Delta X }.$$
To simplify our notation, we write $Y$ for the model's predicted value $\hat{Y}$. Let $\Delta {Y} = Y|_{X=1, W=w} - Y|_{X=0, W=w}$ and $\Delta {X} = X_1 - X_0$. In regression we consider a one unit change in $X$, so $\Delta {X}=1$; then the slope depends only on the change in $Y$. Since a simple slope depends on the level of $W$, we write $m_{W=w}$ for the slope calculated at a fixed constant $w$. Let's calculate two simple slopes, $m_{W=0}$ and $m_{W=1}$,
$$ \begin{eqnarray} m_{W=1} & = & Y|_{X=1, W=1} - Y|_{X=0, W=1} \\ m_{W=0} & = & Y|_{X=1, W=0} - Y|_{X=0, W=0}. \end{eqnarray} $$
Each $Y$ is a predicted value, for a given $X=x$ and $W=w$. Simply plug in each value into our original interaction model,
$$ \begin{eqnarray} Y|_{X=1, W=1} &=& \hat{b}_0 + \hat{b}_1 (X=1) + \hat{b}_2 (W=1) + \hat{b}_3 (X=1)*(W=1) &=& \hat{b}_0 + \hat{b}_1 + \hat{b}_2 + \hat{b}_3\\ Y|_{X=0, W=1} &=& \hat{b}_0 + \hat{b}_1 (X=0) + \hat{b}_2 (W=1) + \hat{b}_3 (X=0)*(W=1) &=& \hat{b}_0 + \hat{b}_2 \\ Y|_{X=1, W=0} &=& \hat{b}_0 + \hat{b}_1 (X=1) + \hat{b}_2 (W=0) + \hat{b}_3 (X=1)*(W=0) &=& \hat{b}_0 + \hat{b}_1 \\ Y|_{X=0, W=0} &=& \hat{b}_0 + \hat{b}_1 (X=0) + \hat{b}_2 (W=0) + \hat{b}_3 (X=0)*(W=0) &=& \hat{b}_0. \end{eqnarray} $$
Substituting these four equations back into the simple slopes we get:
$$ \begin{eqnarray} m_{W=1} & = & Y|_{X=1, W=1} - Y|_{X=0, W=1} & = & (\hat{b}_0 + \hat{b}_1 + \hat{b}_2 + \hat{b}_3) - (\hat{b}_0 + \hat{b}_2) & = & \hat{b}_1 + \hat{b}_3 \\ m_{W=0} & = & Y|_{X=1, W=0} - Y|_{X=0, W=0} & = & (\hat{b}_0 + \hat{b}_1) - (\hat{b}_0) &=& \hat{b}_1. \end{eqnarray} $$
Now consider our simple slope formula $\hat{b}_1+\hat{b}_3 W$. If we plug in $W=1$ we obtain $\hat{b}_1+\hat{b}_3$, which equals $m_{W=1}$, and if we plug in $W=0$ we obtain $\hat{b}_1$, which equals $m_{W=0}$. Isn't math beautiful?
Finally, to get the interaction for a continuous by continuous model, take the difference of the simple slopes $m_{W=1}$ and $m_{W=0}$ (any one unit change in $W$ gives the same result). Therefore $m_{W=1}-m_{W=0}= (\hat{b}_1 + \hat{b}_3) - \hat{b}_1 = \hat{b}_3$, which is exactly the interaction coefficient $\hat{b}_3$.
Exercise
The following exercise will guide you through deriving the interaction term using predicted values.
- Obtain predicted values for the following values and store the results into global variables y00, y10, y01, y11:
- Hours = 0, Effort = 0 (y00)
- Hours = 1, Effort = 0 (y10)
- Hours = 0, Effort = 1 (y01)
- Hours = 1, Effort = 1 (y11)
The following steps will walk you through the process. We continue to use margins, except that we add the option post so that we can recall the results later, and coeflegend, which requests the label of each corresponding predicted value.

margins, at(hours=(0 1) effort=(0 1)) post coeflegend

Expression   : Linear prediction, predict()
1._at        : hours           =           0
               effort          =           0
2._at        : hours           =           0
               effort          =           1
3._at        : hours           =           1
               effort          =           0
4._at        : hours           =           1
               effort          =           1

------------------------------------------------------------------------------
             |     Margin   Legend
-------------+----------------------------------------------------------------
         _at |
          1  |   7.798637   _b[1bn._at]
          2  |   7.718361   _b[2._at]
          3  |  -1.577045   _b[3._at]
          4  |  -1.263974   _b[4._at]
------------------------------------------------------------------------------
We can see from the output that the predicted Weight Loss at Hours=0 and Effort=0 is 7.80, assigned the legend label _b[1bn._at]. Now we are ready to assign each of these four values to its corresponding global variable using the following code.

global y00 = _b[1bn._at]
global y01 = _b[2._at]
global y10 = _b[3._at]
global y11 = _b[4._at]
To ensure that we've assigned the correct values, use the display command, with the $ prefix indicating that we are referring to a previously assigned global variable in Stata. We can request multiple displays on the same line by separating the global variables with commas.

display $y00, $y01, $y10, $y11
7.7986369 7.7183605 -1.5770445 -1.2639741
You can also manually calculate the predicted values to aid understanding using the following equations.
\begin{eqnarray} y_{00} &=& 7.80 + (-9.38) \mbox{(Hours=0)} + (-0.08) \mbox{(Effort=0)} + (0.393) \mbox{(Hours=0)*(Effort=0)} \\ y_{10} &=& 7.80 + (-9.38) \mbox{(Hours=1)} + (-0.08) \mbox{(Effort=0)} + (0.393) \mbox{(Hours=1)*(Effort=0)} \\ y_{01} &=& 7.80 + (-9.38) \mbox{(Hours=0)} + (-0.08) \mbox{(Effort=1)} + (0.393) \mbox{(Hours=0)*(Effort=1)} \\ y_{11} &=& 7.80 + (-9.38) \mbox{(Hours=1)} + (-0.08) \mbox{(Effort=1)} + (0.393) \mbox{(Hours=1)*(Effort=1)} \end{eqnarray}
- Take the following differences. What coefficients do these differences correspond to in regress loss c.effort##c.hours? (Hint: one of the differences is a sum of two coefficients)
  - y10 - y00
  - y11 - y01
- Take the difference of the two differences in Step 2. Which coefficient does this correspond to in regress loss c.effort##c.hours?
Answers: 1) y00 = 7.80, y10 = -1.58, y01 = 7.72, y11 = -1.26; 2) y10 - y00 is $\hat{b}_1=-9.38$, the simple slope of Hours at Effort = 0; y11 - y01 is $\hat{b}_1+\hat{b}_3 = -8.98$, the simple slope of Hours at Effort = 1; 3) $\hat{b}_3=0.393$ is the interaction.
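As a quick check of the hand calculations above, you can evaluate the arithmetic for, say, $y_{10}$ in Stata (coefficients copied from the earlier regress output):

* b0 + b1*(Hours=1) + b2*(Effort=0) + b3*(1*0); returns -1.577044
display 7.798637 + (-9.375681)*1 + (-.0802763)*0 + (.3933468)*1*0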
Exercise
Create a plot of hours on the x-axis with effort = 0 using marginsplot. The figure should look like the one below. Do the results seem plausible?
Continuous by Categorical
We know that exercise time positively relates to weight loss from prior studies but suppose we suspect gender differences in this relationship. The research question here is, do men and women (W) differ in the relationship between Hours (X) and Weight loss? This can be modeled by a continuous by categorical interaction where Gender is the moderator (MV) and Hours is the independent variable (IV). Before talking about the model, we have to introduce a new concept called dummy coding which is the default method of representing categorical variables in a regression model.
Suppose we only have two genders in our study, male and female. Dummy coding can be defined as
$$ D = \left\{ \begin{array}{ll} 1 & \mbox{if } X = x \\ 0 & \mbox{if } X \ne x \end{array} \right. $$
Let’s take the example of our variable gender, which can be represented by two dummy codes. The first dummy code $D_{female} = 1$ if $X= \mbox{Female}$, and $D_{female} = 0$ if $X= \mbox{Male}$. The second dummy code $D_{male}= 1$ if $X= \mbox{Male}$ and $D_{male}= 0$ if $X = \mbox{Female}$. The $k=2$ categories of Gender are represented by two dummy variables. However, only $k-1$ or 1 dummy code is needed to uniquely identify gender.
To see why only one dummy code is needed for a binary variable, suppose you have a female participant. Then for that particular participant, $D_{female} = 1$ which means we automatically know that she is not male, so $D_{male}=0$ (of course in the real world gender can be non-binary). If we entered both dummy variables into our regression, it would result in what is known as perfect collinearity (or redundancy) in our estimates and most statistical software programs would simply drop one of the dummy codes from your model.
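You can watch Stata do this dropping yourself with a small sketch (the variable names dmale and dfemale are our own):

* hand-made dummy codes from gender (1 = male, 2 = female)
generate dmale   = (gender == 1)
generate dfemale = (gender == 2)
* one dummy is dropped because dmale + dfemale equals the constant
regress loss dmale dfemale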
Now we are ready to fit our model, recalling that we only enter one of the dummy codes. We have to be careful when choosing which dummy code to omit because the omitted group is also known as the reference group. We arbitrarily choose female to be the reference (or omitted) group which means we include the dummy code for males:
$$\hat{\mbox{WeightLoss}}= \hat{b}_0 + \hat{b}_1 \mbox{Hours} + \hat{b}_2 D_{male}+ \hat{b}_3 \mbox{Hours}*D_{male}$$
Quiz: What would the equation look like if we made males the reference group?
Answer: replace $D_{male}$ with $D_{female}$.
Recall that we assigned value labels to the gender variable in the Requirements section at the beginning of the seminar. This step is recommended for categorical variables, and the tabulate command can help us see whether the labeling was effective.
tab gender

     gender |      Freq.     Percent        Cum.
------------+-----------------------------------
       male |        450       50.00       50.00
     female |        450       50.00      100.00
------------+-----------------------------------
      Total |        900      100.00
Here we have two levels of gender, male and female, each with a sample size of 450.
Note that these are value labels we have assigned. What's more important is the underlying numeric value. In order to see this, you can specify the option nolabel:

tab gender, nolabel

     gender |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |        450       50.00       50.00
          2 |        450       50.00      100.00
------------+-----------------------------------
      Total |        900      100.00
Here we see that Male is coded 1 and Female is coded 2. The underlying numerical value is important for understanding how Stata handles dummy codes, because for any linear model command like regress, Stata takes the lowest value and assigns it to the reference group. This means that if we use the prefix i. for gender, Stata specifies Male to be the omitted or reference group; internally, Stata recodes gender so that Male=0 and Female=1. Let's confirm whether this is true:
regress loss i.gender
The (shortened) output we obtain is:
------------------------------------------------------------------------------
        loss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |
     female  |  -.1841939    .940519    -0.20   0.845    -2.030065   1.661677
       _cons |   10.11293   .6650473    15.21   0.000     8.807707   11.41816
------------------------------------------------------------------------------
The appearance of female under gender means that Stata is including the dummy code for females and omitting the dummy code for males.
Quiz: Write out the equation for the model above.
Answer: $\hat{\mbox{WeightLoss}}= \hat{b}_0 + \hat{b}_1 D_{female}$
Interpreting the coefficients of the continuous by categorical interaction
Now that we understand how Stata handles categorical variables in regress and the i. notation, let's go back to our original interaction model. In our original model we entered $D_{male}$, which means we want to omit females. Recall that gender is coded so that Male=1 and Female=2, which means males would be the reference group if you use regress loss i.gender. In order to change the reference group so that females are omitted, we can use ib2.gender, which changes the "base" (reference) group of the categorical variable to level 2. Internally, Stata recodes the variable so that Male=1 and Female=0. The results of the change of base are shown below.
regress loss ib2.gender

... (output omitted) ...
------------------------------------------------------------------------------
        loss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |
       male  |   .1841939    .940519     0.20   0.845    -1.661677   2.030065
       _cons |   9.928741   .6650473    14.93   0.000     8.623513   11.23397
------------------------------------------------------------------------------
The important difference between the output from regress loss ib2.gender and the output from regress loss i.gender is that male now appears in the regression output, which means Female is the reference group.

Quiz: (True or False) Since the original gender variable is coded Male=1 and Female=2, the code regress loss ib2.gender means that female will appear in the regression output.

Answer: False. The code ib2.gender makes the level coded 2 in the original variable the omitted or reference group; in this case female is omitted and male appears.
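If you would rather not type ib2. in every command, Stata's fvset command can set the base level of a factor variable once for the dataset; a sketch:

* make level 2 (female) the base for subsequent factor-variable commands
fvset base 2 gender
regress loss i.gender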
Let’s return to the continuous by categorical interaction model where we predict Weight Loss by the Hours (continuous) by Gender (reference group=Female) interaction.
regress loss c.hours##ib2.gender

... (output omitted) ...
--------------------------------------------------------------------------------
          loss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
         hours |   3.315068   1.331649     2.49   0.013     .7015529   5.928582
               |
        gender |
         male  |   3.571061   3.914762     0.91   0.362     -4.11211   11.25423
               |
gender#c.hours |
         male  |  -1.723931   1.897895    -0.91   0.364    -5.448768   2.000906
               |
         _cons |   3.334652    2.73053     1.22   0.222    -2.024328   8.693632
--------------------------------------------------------------------------------
The interaction Hours*Gender is not significant, which suggests that the relationship of Hours on Weight Loss does not vary by Gender. Before moving on, let’s interpret each of the coefficients.
- $\hat{b}_0$ (_cons): the intercept, or the predicted weight loss when Hours = 0 in the reference group of Gender, which is $D_{male}=0$, or females.
- $\hat{b}_1$ (hours): the simple slope of Hours for the reference group $D_{male}=0$, or females.
- $\hat{b}_2$ (male): the simple effect of Gender, or the difference in weight loss between males and females at Hours = 0.
- $\hat{b}_3$ (gender#c.hours): the interaction of Hours and Gender, the difference in the simple slopes of Hours for males versus females.
The most difficult term to interpret is $\hat{b}_3$. Another way to think about it is if we know the slope of Hours for females, this is the additional slope for males. Since the Hours slope of females is $\hat{b}_1$, the Hours slope of males is $\hat{b}_1 + \hat{b}_3$. The interaction is not significant, but we decide to probe the interaction anyway for demonstration purposes.
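One way to obtain the male slope $\hat{b}_1 + \hat{b}_3$ directly is lincom, which tests linear combinations of stored coefficients. A sketch (run the regression with the coeflegend option first if you want to confirm the coefficient names):

* male simple slope of Hours = b1 + b3; should return 1.59
lincom _b[hours] + _b[1.gender#c.hours]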
Obtaining simple slopes by each level of the categorical moderator
Since our goal is to obtain simple slopes of Hours by gender, we use the dydx(hours) option after margins. Since gender is a categorical variable, we specify it before the comma. Remember that continuous variables come after the comma (usually specified at particular values with the at option) and categorical variables usually come before the comma.
margins gender, dydx(hours)
An equivalent, alternative syntax is to specify over as an option after the comma. Conceptually this may make more sense because we are obtaining the simple slopes of Hours over values of Gender. The output is identical to the previous command.

margins, dydx(hours) over(gender)
The output we obtain is:
Expression   : Linear prediction, predict()
dy/dx w.r.t. : hours

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
hours        |
      gender |
       male  |   1.591136     1.3523     1.18   0.240    -1.062908   4.245181
     female  |   3.315068   1.331649     2.49   0.013     .7015529   5.928582
------------------------------------------------------------------------------
The simple slope for females is 3.32, which is exactly $\hat{b}_1$, and for males it is 1.59, which is $\hat{b}_1 + \hat{b}_3 = 3.32 + (-1.72)$. The 95% confidence interval does not contain zero for females but does for males, so the simple slope is significant for females but not for males.
A common misconception is that since the simple slope of Hours is significant for females but not males, we should have seen a significant interaction. However, the interaction tests the difference of the Hours slope for males and females and not whether each simple slope is different from zero (which is what we have from the output above).
Quiz: (True or False) If both simple slopes of Hours for males and females are significantly different from zero, it implies that the interaction of Hours*Gender is not significant.
Answer: False. The test of simple slopes is not the same as the test of the interaction, which tests the difference of simple slopes.
To test the difference in slopes, we add pwcompare(effects) to tell margins that we want the pairwise difference in the simple slope of Hours for females versus males, requesting the unadjusted 95% confidence interval and p-value along the way.
margins gender, dydx(hours) pwcompare(effects)
Expression   : Linear prediction, predict()
dy/dx w.r.t. : hours

---------------------------------------------------------------------------------
                |   Contrast Delta-method    Unadjusted           Unadjusted
                |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
hours           |
         gender |
female vs male  |   1.723931   1.897895     0.91   0.364    -2.000906   5.448768
---------------------------------------------------------------------------------
What do we notice about the p-value and the estimate? Recall from our summary table that this is exactly the same as the interaction, which verifies that we have in fact obtained the interaction coefficient $\hat{b}_3$. The only difference is that the sign is flipped because we are taking female - male (females have the higher Hours slope) whereas the interaction takes male - female. Take a look at the shortened summary table below and verify the p-value and the sign of the coefficient.
--------------------------------------------------------------------------------
          loss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
... (output omitted) ...
gender#c.hours |
         male  |  -1.723931   1.897895    -0.91   0.364    -5.448768   2.000906
               |
         _cons |   3.334652    2.73053     1.22   0.222    -2.024328   8.693632
--------------------------------------------------------------------------------
Quiz: (True or False) The command margins gender, dydx(hours) requests the simple effect of Gender split by levels of Hours.

Answer: False. We are not obtaining the simple effect of Gender but the simple slopes of Hours. The statement dydx(hours) indicates the simple slope we are requesting. Since gender is categorical, it comes before the comma, which means we want the simple slope of Hours by Gender.
Quiz: (True or False) The command margins gender, dydx(hours) pwcompare(effects) requests pairwise differences in the predicted values of Hours for females versus males.

Answer: False. This is the pairwise difference in the slope of Hours for females versus males. Recall that dydx(hours) obtains simple slopes and at obtains predicted values.
Exercise
Relevel gender using ib#. so that male is now the reference group, refit the regress model for Weight Loss with the interaction of Hours by Gender, and use margins to compare the simple slopes.
a) Spell out the new regression equation using a dummy code for gender.
b) Interpret each of the coefficients in the new model.
c) What is the main difference in the output compared to using $D_{male}$? What does the naming convention in the coefficient table represent? Note: you should get output from regress and margins that looks like this:
--------------------------------------------------------------------------------
          loss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
         hours |   1.591136     1.3523     1.18   0.240    -1.062908   4.245181
               |
        gender |
       female  |  -3.571061   3.914762    -0.91   0.362    -11.25423    4.11211
               |
gender#c.hours |
       female  |   1.723931   1.897895     0.91   0.364    -2.000906   5.448768
               |
         _cons |   6.905713   2.805274     2.46   0.014     1.400039   12.41139
--------------------------------------------------------------------------------

Expression   : Linear prediction, predict()
dy/dx w.r.t. : hours

---------------------------------------------------------------------------------
                |   Contrast Delta-method    Unadjusted           Unadjusted
                |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
hours           |
         gender |
female vs male  |   1.723931   1.897895     0.91   0.364    -2.000906   5.448768
---------------------------------------------------------------------------------
Answers: a) $\hat{\mbox{WeightLoss}}= \hat{b}_0 + \hat{b}_1 \mbox{Hours} + \hat{b}_2 D_{female}+ \hat{b}_3 \mbox{Hours}*D_{female}$; b) $\hat{b}_0= 6.906$ is the intercept for males, $\hat{b}_1 = 1.59$ is the Hours slope for males, $\hat{b}_2=-3.57$ is the difference in weight loss between females versus males at Hours=0, and $\hat{b}_3=1.72$ is the additional slope for females, which makes $\hat{b}_1+\hat{b}_3=3.31$ the female Hours slope; c) the margins output shows that the signs are now consistent with regress. The appearance of female in the regression output means that we are using $D_{female}$ and males now belong to the omitted or reference group.
(Optional) Flipping the moderator (MV) and the independent variable (IV)
An interaction is symmetric, which means we can also flip the moderator (gender) so that gender is now the categorical IV and Hours is now the MV. This will reveal to us why $\hat{b}_2$ is the effect of Gender at Hours = 0. The interaction model is exactly the same, but we decompose the interaction differently. Let’s recall the regression model where we make females the reference group.
regress loss c.hours##ib2.gender
Suppose we want to see how the effect of Gender (IV) varies by levels of Hours (MV). Since Hours is continuous, we could hypothetically choose an unlimited number of values. For simplicity, let's choose three plausible levels of Hours: 0, 2, and 4.
The simple effect of Gender is the difference of two predicted values. In Stata, obtaining simple effects is the same as obtaining simple slopes: we can use the option dydx(gender) even though gender is binary. Then we specify that we want the simple effect of Gender at the three levels of Hours using at(hours=(0 2 4)).
margins, dydx(gender) at(hours=(0 2 4))
Quiz: (True or False) The margins option dydx(gender) at(hours=(0 2 4)) specifies the predicted weight loss for males and females at Hours = 0, 2, and 4.

Answer: False. dydx(gender) requests simple effects of gender, which is the difference of the predicted value of males versus females at a particular level of Hours.
The output we obtain from margins is the following:
Expression   : Linear prediction, predict()
dy/dx w.r.t. : 2.gender
1._at        : hours           =           0
2._at        : hours           =           2
3._at        : hours           =           4

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.gender     |
         _at |
          1  |   3.571061   3.914762     0.91   0.362     -4.11211   11.25423
          2  |   .1231992    .937961     0.13   0.896    -1.717657   1.964056
          3  |  -3.324663   3.905153    -0.85   0.395    -10.98898    4.33965
-------------+----------------------------------------------------------------
2.gender     |  (base outcome)
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
Note that in this case, 2.gender is the (base outcome). We can verify from the output that at Hours=0 the effect 3.571 matches the $\hat{b}_2$ coefficient, male, confirming that the interpretation of this coefficient is the Gender effect at Hours = 0. From the other simple effects we can see that as Hours increases, the male versus female difference becomes more negative (females are losing more weight than males).
Quiz: (True or False) Suppose we want to use $D_{female}$ (i.e., male is the reference group). If we specified ib1.gender in regress and then requested margins, dydx(gender), we would get the female - male difference, which corresponds to the coefficient of $D_{female}$.

Answer: True. By specifying ib1.gender, the (base outcome) in the margins output would be 1.gender, which corresponds to males. Then the simple effect would be females - males, which corresponds to the $D_{female}= -3.57$ coefficient and is exactly opposite in sign to the regression where we used $D_{male}$.
Take the quiz below to make sure you understand why we don't see a significant interaction for gender#c.hours.
Quiz: If the difference of the simple effects for gender (males versus females) is growing more negative as Hours increases, why isn’t our overall interaction significant? (Hint: look at the p-values and standard errors of each simple effect).
Answer: Although the simple effect of Gender changes as Hours varies, the standard errors are so large that we cannot statistically conclude a difference. See the plot below for the large overlap in confidence intervals.
Plotting the continuous by categorical interaction
To see why the interaction is not significant, let's visualize it with a plot. Thankfully, this is easy to accomplish using marginsplot. First we specify the margins command. We want the x-axis to span Hours from 0 to 4 in increments of 1 using at(hours=(0 1 2 3 4)). Since Gender is a categorical variable, it comes before the comma; Stata automatically uses the categorical variable as the moderator that separates the lines.
margins gender, at(hours=(0 1 2 3 4))
marginsplot
Quiz: What is an equivalent way to specify the margins command above, so that we are clear that gender is the moderator?
Answer: margins, at(hours=(0 1 2 3 4)) over(gender)
The resulting figure is shown below:
Although it looks like there’s a cross-over interaction, the large overlap in confidence intervals (especially near Hours=2) suggests that the slope of Hours is not different between males and females. We confirm that the standard error of the interaction coefficient (1.898) is large relative to the absolute value of the coefficient itself (1.724).
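If you want to polish the figure, marginsplot accepts standard graph options such as titles; a sketch (the labels here are only suggestions):

margins gender, at(hours=(0 1 2 3 4))
marginsplot, title("Predicted weight loss by hours and gender") xtitle("Hours of exercise") ytitle("Weight loss (lbs)")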
Quiz: (True or False) Looking at the plot, since Hours is on the x-axis it is the IV and Gender separates the lines so it is the moderator (MV).
Answer: True.
Categorical by Categorical
Now that we understand how one categorical variable interacts with an IV, let's explore how the interaction of two categorical variables is modeled. From our previous analysis, we found that there were no gender differences in the relationship between time spent exercising and weight loss. Perhaps females and males respond differently to different types of exercise (here we make gender the IV and exercise type the MV). The question we ask is: does type of exercise (W) moderate the gender effect (X)? In other words, do males and females lose weight differently depending on the type of exercise they engage in? Just as before, we must dummy code gender into $D_{male}$ and $D_{female}$, and we choose to omit $D_{female}$, making females the reference group. The difference between this model and the previous one is that we now have two categorical variables: the IV is gender and the MV is exercise type, with levels jogging, swimming, and the control group “reading”.
In order to create the dummy codes for exercise type (labeled prog in the data), recall that we form as many dummy codes as there are categories but retain only $k-1$ of them. In this case, we have $D_{jog} = 1$ if jogging, $D_{swim} = 1$ if swimming, and $D_{read} = 1$ if reading. It makes sense to make our control group the reference group, so we choose to omit $D_{read}$. Therefore we retain $D_{male}$ for males, along with $D_{jog}$ and $D_{swim}$, which correspond to jogging and swimming.
To see why the dummy code for reading is redundant, suppose we know that for a particular person $D_{jog}=0$ and $D_{swim}=0$. Since this person is not in the jogging or swimming condition, we can conclude that this person is in the reading condition. Therefore, knowing $D_{read}=1$ provides no additional information beyond knowing $D_{jog}=0$ and $D_{swim}=0$. Here is the exhaustive list of all membership categories using just the jogging and swimming dummy codes:
- $D_{jog}=1,D_{swim}=0$: the participant is in the jogging condition.
- $D_{jog}=0,D_{swim}=1$: the participant is in the swimming condition.
- $D_{jog}=0,D_{swim}=0$: the participant is not in jogging and not in swimming, therefore she must be in reading.
- $D_{jog}=1,D_{swim}=1$: the participant is in both jogging and swimming; we assume this is impossible, since each participant belongs to exactly one group.
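To make the dummy coding concrete, here is a minimal sketch that constructs $D_{jog}$ and $D_{swim}$ by hand (the variable names jog and swim are our own; in practice Stata's factor-variable notation below does this for you):

generate jog  = (prog == 1)   // 1 if jogging, 0 otherwise
generate swim = (prog == 2)   // 1 if swimming, 0 otherwise
tab jog swim                  // reading observations fall in the jog = 0, swim = 0 cell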
We are now ready to set up the interaction of two categorical variables. Since the interaction of two IVs is their product, we multiply the included dummy code for Gender by the included dummy codes for Exercise. Since $D_{male}$ is included for Gender and $D_{jog}$ and $D_{swim}$ are included for Exercise, their products are $D_{male} * D_{jog}$ and $D_{male} * D_{swim}$. Including the product terms and the lower order terms in our model, we have:
$$\hat{\mbox{WeightLoss}}= \hat{b}_0 + \hat{b}_1 D_{male} + \hat{b}_2 D_{jog} + \hat{b}_3 D_{swim} + \hat{b}_4 D_{male}*D_{jog} + \hat{b}_5 D_{male}*D_{swim}$$
Before fitting this model in Stata, we have to choose which category to make the reference (or omitted) group. Since we are using $D_{male}$, we are making gender=2, or females, the reference group. How about the second categorical variable, prog? Let's explore its frequency table, first with the labels and then without.
tab prog

       prog |      Freq.     Percent        Cum.
------------+-----------------------------------
        jog |        300       33.33       33.33
       swim |        300       33.33       66.67
       read |        300       33.33      100.00
------------+-----------------------------------
      Total |        900      100.00

tab prog, nolabel

       prog |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |        300       33.33       33.33
          2 |        300       33.33       66.67
          3 |        300       33.33      100.00
------------+-----------------------------------
      Total |        900      100.00
Since our goal is to use $D_{jog}$ and $D_{swim}$ in our regression, we want to omit, or set as the base reference group, prog = read, or 3. Note that Stata's i. notation does not distinguish between a binary variable and a categorical variable with more than two categories. No matter how many categories $k$ there are, Stata requires only one i. statement; it knows internally to fit $k-1$ dummy codes in your regression model, omitting the reference category. Therefore we only need to put ib2.gender and ib3.prog into our regression model.
Quiz: T/F When we specify ib2.prog, Stata internally creates two dummy variables for categories 1 and 3 (i.e., jogging and reading) and omits category 2 (i.e., swimming).
Answer: True.
Now that we've defined our reference groups using ib#., we are ready to fit the categorical by categorical interaction model in Stata with the following code:
regress loss ib2.gender##ib3.prog
Equivalently, you can spell out the main effects and use the single # notation for the interaction, as below. Note that Stata will automatically relevel the simple effects based on the interaction.
regress loss i.gender i.prog ib2.gender#ib3.prog
Quiz: T/F We can also specify regress loss gender prog ib2.gender#ib3.prog as equivalent syntax.
Answer: False. Without the i. prefix for the simple effects, Stata treats gender and prog as continuous variables, despite the correct ib#. specification in the interaction term.
From the syntax above we obtain the following shortened output:
------------------------------------------------------------------------------
        loss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |
       male  |  -.3354569   .7527049    -0.45   0.656    -1.812731    1.141818
             |
        prog |
        jog  |   7.908831   .7527049    10.51   0.000     6.431556    9.386106
       swim  |   32.73784   .7527049    43.49   0.000     31.26057    34.21512
             |
 gender#prog |
   male#jog  |   7.818803   1.064486     7.35   0.000     5.729621    9.907985
  male#swim  |  -6.259851   1.064486    -5.88   0.000    -8.349033   -4.170669
             |
       _cons |   -3.62015   .5322428    -6.80   0.000     -4.66474   -2.575559
------------------------------------------------------------------------------
There are two interaction terms, one for male by jogging and the other for male by swimming, and both are significant. Let's interpret each term in the model:
- $\hat{b}_0$ _cons: the intercept, or the predicted weight loss when $D_{male} = 0$ and $D_{jog}=0, D_{swim}=0$ (i.e., the intercept for females in the reading program)
- $\hat{b}_1$ male: the simple effect of males when $D_{jog}=0, D_{swim}=0$ (i.e., the male – female difference in weight loss in the reading group)
- $\hat{b}_2$ jog: the simple effect of jogging when $D_{male} = 0$ (i.e., the difference in weight loss between jogging and reading for females)
- $\hat{b}_3$ swim: the simple effect of swimming when $D_{male} = 0$ (i.e., the difference in weight loss between swimming and reading for females)
- $\hat{b}_4$ male#jog: the interaction of $D_{male}$ and $D_{jog}$: the male effect (male – female) in the jogging condition versus the male effect in the reading condition; equivalently, the jogging effect (jogging – reading) for males versus the jogging effect for females
- $\hat{b}_5$ male#swim: the interaction of $D_{male}$ and $D_{swim}$: the male effect (male – female) in the swimming condition versus the male effect in the reading condition; equivalently, the swimming effect (swimming – reading) for males versus the swimming effect for females
The last two coefficients are the most difficult to interpret. We can think of $\hat{b}_4$ as the additional male effect of going from reading to jogging and $\hat{b}_5$ as the additional male effect going from reading to swimming:
- $\hat{b}_1+\hat{b}_4$ (male + male#jog) is the simple effect of males in the jogging group,
- $\hat{b}_1+\hat{b}_5$ (male + male#swim) is the simple effect of males in the swimming group.
We can confirm this is true for jogging if we subtract the interaction term $\hat{b}_4$, the additional male effect for jogging, from $(\hat{b}_1+\hat{b}_4)$. Then $(\hat{b}_1+\hat{b}_4) – \hat{b}_4 = \hat{b}_1$ which from above we know is the male effect in the reading group.
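These sums of coefficients can be tested directly with lincom after the regression; a sketch, assuming the factor-variable names produced by regress loss ib2.gender##ib3.prog:

lincom 1.gender + 1.gender#1.prog   // male effect in the jogging group; should match 7.483
lincom 1.gender + 1.gender#2.prog   // male effect in the swimming group; should match -6.595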
Quiz: Use the coefficients from the categorical by categorical interaction to derive the female (female – male) effect for the swimming group. (Hint: flip the signs of the coefficients)
Answer: $-\hat{b}_1+(-\hat{b}_5)=-(\hat{b}_1+\hat{b}_5)=-(-0.336+(-6.26))=6.60.$
Simple effects in a categorical by categorical interaction
Although the coefficients in the categorical by categorical regression model are a bit difficult to interpret, it is surprisingly easy to obtain simple effects from margins. First we want to obtain the predicted values for each combination of Gender and Program. Typically, continuous variables would require the at option; however, recall that categorical variables come before the comma in margins, and since both gender and prog are categorical, we omit the at option. Since gender and prog are both defined as categorical with the corresponding ib#. prefixes, the syntax is as follows:
margins gender#prog
This will obtain the predicted values at each combination of the levels of gender and prog.
Expression   : Linear prediction, predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 gender#prog |
   male#jog  |   11.77203   .5322428    22.12   0.000     10.72744    12.81662
  male#swim  |   22.52238   .5322428    42.32   0.000     21.47779    23.56697
  male#read  |  -3.955606   .5322428    -7.43   0.000    -5.000197   -2.911016
 female#jog  |   4.288681   .5322428     8.06   0.000     3.244091    5.333272
female#swim  |   29.11769   .5322428    54.71   0.000      28.0731    30.16228
female#read  |   -3.62015   .5322428    -6.80   0.000     -4.66474   -2.575559
------------------------------------------------------------------------------
These are the predicted values of weight loss for all combinations of Gender and Exercise. For example, females in the reading program have an estimated weight gain of 3.62 pounds, whereas males in the swimming program have an average weight loss of 22.52 pounds (how nice if this were true!).
Exercise
Try to reproduce each predicted value from margins using the coefficient table alone. Do you notice a pattern in the coefficient terms?
Answer: Working through each combination: \begin{eqnarray} (\hat{b}_0) & & &=& -3.62 \\ (\hat{b}_0 + \hat{b}_1) &=& -3.62 + (-0.336) &=& -3.96 \\ (\hat{b}_0) + \hat{b}_2 &=& -3.62 + 7.91 &=& 4.29 \\ (\hat{b}_0 + \hat{b}_1) + \hat{b}_2 + \hat{b}_4 &=& -3.62 + (-0.336) + 7.91 + 7.82 &=& 11.77 \\ (\hat{b}_0)+\hat{b}_3 &=& -3.62 + 32.74 &=& 29.12 \\ (\hat{b}_0+\hat{b}_1)+\hat{b}_3+\hat{b}_5 &=& -3.62 + (-0.336)+32.74+(-6.26) &=& 22.52 \end{eqnarray}
There are many possible patterns, but one is to start with $(\hat{b}_0)$ for females or $(\hat{b}_0+\hat{b}_1)$ for males, then add on the additional terms. For females the additional terms do not involve interaction terms, but for males they do. For example, for females in the jogging group, start with $\hat{b}_0$, the predicted weight loss for females in the reading group, then add the jogging effect for females, $\hat{b}_2$. For males in the jogging group, first obtain the predicted value for males in the reading group, $(\hat{b}_0+\hat{b}_1)$, then add $(\hat{b}_2 + \hat{b}_4)$, the jogging effect for males ($\hat{b}_4$ is the jogging versus reading effect for males over and above the jogging versus reading effect for females).
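Any single cell can also be checked with lincom; for instance, a sketch for males in the jogging group, which should reproduce 11.77:

lincom _cons + 1.gender + 1.prog + 1.gender#1.prog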
In order to understand the interaction, we need to obtain the simple effects, which are differences of the predicted values. This can be accomplished using the dydx option of margins. The key is to know which independent variable (IV) you want to make the moderator (MV). Suppose our interest is in how the Gender effect (IV) varies by levels of the intervention program (MV). The hypothetical code is then margins mv, dydx(iv). Note that both the MV and IV need to be categorical for this syntax to work. In our example, we have
margins prog, dydx(gender)
which gives us the following output:
Expression   : Linear prediction, predict()
dy/dx w.r.t. : 1.gender

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.gender     |
        prog |
        jog  |   7.483346   .7527049     9.94   0.000     6.006072    8.960621
       swim  |  -6.595308   .7527049    -8.76   0.000    -8.072582   -5.118033
       read  |  -.3354569   .7527049    -0.45   0.656    -1.812731    1.141818
-------------+----------------------------------------------------------------
2.gender     |  (base outcome)
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
Recall that we specified ib2.gender in regress, which means female is our reference group; the comparisons in the output table are males – females. We see that the male effects for jogging and swimming are significant at the 0.05 level (with no adjustment of p-values), but the effect for the reading condition is not. Additionally, males lose more weight in the jogging condition (positive effect), whereas females lose more weight in the swimming condition (negative effect).
Quiz: (True or False) The interaction is the male effect for a particular exercise type.
Answer: False. See below.
The male effect alone does not capture the interaction; the interaction is the difference of simple effects. Going back to the output from regress loss ib2.gender##ib3.prog, let's see if we can recreate $\hat{b}_5=-6.26$, the interaction of male by swimming, using the output from margins.
------------------------------------------------------------------------------
        loss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
... (output omitted) ...
   male#swim |  -6.259851   1.064486    -5.88   0.000    -8.349033   -4.170669
             |
       _cons |   -3.62015   .5322428    -6.80   0.000     -4.66474   -2.575559
------------------------------------------------------------------------------
Recall that the interaction is the difference of simple effects. In this case, we want the difference between the male effect (males vs. females) in the swimming condition and the male effect in the reading condition. From margins, the gender effect for swimming is -6.595 and the gender effect for reading is -0.335. The difference of the two simple effects is -6.595 – (-0.335) = -6.26, which matches the output from the original regression. We verify therefore that the interaction is the pairwise difference of the male effects (male – female) for swimming versus reading: it is essentially a difference of differences.
Exercise
Find the interaction that is not automatically generated by the original regression output and obtain its effect by manually calculating the difference of differences using the output from margins. Confirm your answer with a regression. (Hint: you need to change the reference group.)
Answer: $7.48-(-6.595) = 14.08$, the interaction of jogging versus swimming by gender. Running regress loss ib2.gender##ib2.prog should produce the following shortened output:
------------------------------------------------------------------------------
        loss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |
       male  |  -6.595308   .7527049    -8.76   0.000    -8.072582   -5.118033
             |
        prog |
        jog  |  -24.82901   .7527049   -32.99   0.000    -26.30628   -23.35173
       read  |  -32.73784   .7527049   -43.49   0.000    -34.21512   -31.26057
             |
 gender#prog |
   male#jog  |   14.07865   1.064486    13.23   0.000     11.98947    16.16784
  male#read  |   6.259851   1.064486     5.88   0.000     4.170669    8.349033
             |
       _cons |   29.11769   .5322428    54.71   0.000      28.0731    30.16228
------------------------------------------------------------------------------
Optional Exercise: We have been focusing mostly on the gender effect (IV) split by exercise type (MV). However, an interaction is symmetric, which means we can also look at the effect of exercise type (IV) split by gender (MV). Obtain the same interaction term using margins with gender as the moderator.
Answer to Optional Exercise: margins gender, dydx(prog), with $-10.75-(-24.83) = 14.08$. This is the difference in the jogging (versus swimming) effect for males versus females (a difference of differences), treating Gender as the MV. In the previous exercise we treated Exercise as the MV, so the interpretation was the difference in the gender effect for jogging versus swimming. The output should look like the following:
Expression   : Linear prediction, predict()
dy/dx w.r.t. : 1.prog 3.prog

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.prog       |
      gender |
       male  |  -10.75036   .7527049   -14.28   0.000    -12.22763   -9.273081
     female  |  -24.82901   .7527049   -32.99   0.000    -26.30628   -23.35173
-------------+----------------------------------------------------------------
2.prog       |  (base outcome)
-------------+----------------------------------------------------------------
3.prog       |
      gender |
       male  |  -26.47799   .7527049   -35.18   0.000    -27.95526   -25.00072
     female  |  -32.73784   .7527049   -43.49   0.000    -34.21512   -31.26057
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
Plotting the categorical by categorical interaction
Finally, the best way to understand an interaction is to plot it. The marginsplot command follows margins and plots the independent variable (IV) on the x-axis, splitting the lines by levels of the moderating variable (MV). Assuming you've previously run regress and both variables are categorical, the hypothetical syntax is margins iv#mv. In our case we want prog to be the moderator, so we specify the following syntax:
margins gender#prog
marginsplot
Quiz: (True or False) The code margins prog#gender tells marginsplot that we want prog on the x-axis with lines corresponding to levels of gender.
Answer: True.
We obtain the following interaction plot:
Here we see the results confirming the predicted values and simple effects we found before. Each point on the plot is a predicted value, and each line connecting two points is a simple effect. Exercise type (prog) is the moderator of the gender effect (gender): we see a negative effect for swimming (females lose more weight swimming) and a positive effect for jogging (males lose more weight jogging). Females and males lose about the same amount of weight in the reading condition. Since we have simple effects rather than simple slopes, some researchers prefer bar graphs for representing categorical changes in effects; see the sketch below.
Quiz: How would we plot exercise type along the x-axis split by gender? Which plot makes more sense to you?
Answer: margins prog#gender followed by marginsplot. The independent variable (IV) should always be the focus of the study, and the moderator (MV) is the variable that changes the primary relationship of the IV on the DV. If exercise type is on the x-axis, then the researcher is primarily interested in how exercise type influences weight loss, but is also interested in whether males and females respond differently to various exercise modalities.
Conclusion
As a researcher, the question you ask should determine which interaction model you choose. Hopefully we have exhausted the types of research questions you can ask with the interaction of two variables. Recall that coefficients for IVs interacted with an MV are interpreted as simple effects (or slopes) with the MV fixed at 0, whereas coefficients for IVs not interacted with others are main effects (or slopes), meaning their effects do not vary by the level of another IV. The interaction itself always involves differences of simple effects or slopes. Researchers who are just starting out with interaction hypotheses often confuse testing a simple slope (or effect) against zero with testing the interaction, which tests whether the difference of simple slopes (or effects) is different from zero. Another common point of confusion is the idea of a predicted value versus a simple slope (or effect). A predicted value is a single point on the graph, whereas a simple slope (or effect) is a difference of two predicted values, and an interaction is a difference of two simple slopes (or effects). To summarize these concepts geometrically:
- Predicted values are points on the graph.
- Simple effects or slopes are the difference between two predicted values (i.e., two points).
- Interactions are the difference between simple effects or slopes:
- for an interaction involving a continuous IV, it's the difference of two slopes (i.e., two lines)
- for a categorical by categorical interaction, it's the difference in the differences of the predicted values (i.e., four points).
It may be instructive to plot the regression and rephrase your research question using the geometric representations of the graph.
Final Exercise
Return to a plot in each type of interaction and identify four predicted values, two corresponding simple slopes or effects and the corresponding interaction.
Thank you for taking the time to read this seminar. We hope the material will help you in your research endeavors.