How can I create a new variable that contains the slopes from a regression analysis by group?

Suppose we have run a simple regression analysis by group and want to create a new column in the original data set that contains the slope for each group. We show here how to accomplish this using the OMS (Output Management System) utility. The example generalizes to many other situations, such as multiple regression or other types of regression models. The data set used in the example, hsb2.sav, can be downloaded by clicking on the link.

In the example below, we will perform a simple regression analysis of writing score (variable write) on math score (variable math) by race. Our end goal is to have a new variable containing the regression coefficient of math by race group. We will show it in a step-by-step fashion.

For the steps below to work correctly, we have chosen to display the output using the following setup, reached via the pull-down menu Edit -> Options.

Image byvar_1

Step 1: Creating a new group variable that numbers the groups in consecutive order. It is needed in a later step to merge the regression results into the original data set. We have chosen to save the entire data set to a new file since we want to keep our original data set as is.

get file ='D:workdataspsshsb2.sav'.
sort cases by race.
* Number the race groups 1, 2, 3, ... in sorted order.
compute byrace = 1.
if lag(race) ~=  race   byrace = lag(byrace) + 1.
if lag(race)   = race   byrace = lag(byrace).
exe.
save outfile = 'd:workdataspssbyrace.sav'.
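
As a quick sanity check on Step 1, a cross-tabulation along the following lines can confirm that the numbering came out as intended. This is only a sketch; it assumes the variables race and byrace exist in the active data set exactly as created above.

```spss
* Check that each value of race pairs with exactly one value of byrace.
crosstabs
  /tables = race by byrace.
```

If the numbering is correct, every row of the table has a single non-empty cell, and the byrace values run 1, 2, 3, and so on with no gaps.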

Step 2: Getting ready to perform the analysis by group. This is done using the command "split file". In general, we need to sort the data by the grouping variable before splitting the file. Since we have already sorted the data by race in Step 1, we can go straight to splitting the data file by race.

split file separate by race.
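
For data that have not already been sorted, the general pattern would look roughly like the sketch below. It uses the same variable, race, as the example; rerunning it here is harmless because the data are already in sorted order.

```spss
* General pattern when the data are not yet sorted by the grouping variable.
sort cases by race.
split file separate by race.
```

Split processing stays in effect for subsequent procedures until it is turned off with the command split file off or a different data file is opened.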

Step 3: Setting up the OMS utility. This can be done in two ways, either by writing the syntax directly or via point-and-click, which also generates syntax like that shown below. We have included the option /columns sequence =[RALL CALL LALL] to request that all the estimates be in a single row for each group. The regression coefficients will be stored in a data set that we have named slope1.

* OMS.
dataset declare slope1.
oms
  /select tables
  /if commands =['Regression'] SUBTYPES=['Coefficients']
  /destination format =SAV numbered=byrace outfile ='slope1'
  /columns sequence =[RALL CALL LALL].
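
Before running the regression, it can be useful to confirm that the request was registered. One way to do this, offered here as an optional check rather than a required step, is the OMSINFO command, which lists the currently active OMS requests.

```spss
* Optional: list the active OMS requests to confirm the one above is in effect.
omsinfo.
```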

Step 4: Running the regression and writing the regression results to the output data set slope1 that was set up in the previous step. We also need to end the OMS utility at this point using the command omsend.

regression  
  /dependent write
  /method =enter math.
omsend.
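
The variable names that OMS writes to slope1 are built from the table's row and column labels, so they may differ from the ones used in the next step if the output settings differ. A minimal way to inspect them, assuming slope1 was created as above, is:

```spss
* List the variable names that OMS wrote to slope1.
dataset activate slope1.
display names.
```

The name of the column holding the unstandardized coefficient for math is the one to use in the rename in Step 5.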

Step 5: Processing the data obtained in the previous step and merging it with the original data set.

* The name @1_math_B may depend on the output settings chosen under Edit -> Options.
dataset activate slope1.
save outfile = 'd:workdataspssslope1.sav'
  /rename = (@1_math_B = slope_math)
  /keep = byrace slope_math.

get file ='D:workdataspssbyrace.sav'.
match files 
/file = *  
/table='D:workdataspssslope1.sav' 
/by byrace. 
exe.
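
To check the merge, one option is to summarize the new variable within each group. The sketch below assumes the merged data set is active and contains race and slope_math as created above.

```spss
* Check that slope_math is constant within each race group.
means tables = slope_math by race
  /cells = count min max.
```

Within each race group the minimum and maximum should be identical, and the count should match the number of cases in that group.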

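At this point, the active data set (read from byrace.sav) has all the original variables together with the new variable slope_math, which contains the slope from our regression analysis for each race group. The merged data exist only in the active data set until they are saved to a file.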

The syntax file for this example can be downloaded as a text file.