SPSS Class Notes: Managing Data

1.0 SPSS commands used in this unit

select if	keeps selected cases in the current data file
descriptives	procedure for obtaining means, standard deviations, etc.
save outfile	saves the current data file with a new name
display	displays requested file information
frequencies	calculates frequencies
add files	appends (stacks) data files (adds cases)
sort cases	sorts cases in the data file
match files	merges data files (adds variables)

2.0 Demonstration and explanation

In this unit we will illustrate methods for subsetting data (in other words, using only some of the cases), appending data (adding cases from another SPSS data file), and merging data (adding variables from another SPSS data file).

2.1 Subsetting cases

Let’s open the data file.

File
 Open
  Data
   choose c:\temp\hs1.sav

get file "c:\temp\hs1.sav".

Let’s pretend that we are working on our honors thesis and that we want to study just “good readers”, defined as those with reading scores of 60 or higher. With the file open, we will “select cases” to keep only the students with reading scores of 60 or higher.

Data
 Select Cases
  click "If condition is satisfied" and click the "If" box
   read >= 60
    Continue
      choose "Delete unselected cases"

* keeping cases for which students have a reading
* score of 60 or higher.
select if read >=60.
descriptives 
 /var=read.

Notice that the undesired cases have now been deleted. Now we will save our data.

```
File
 Save as
  c:\temp\hsgoodread.sav
```

* saving the data file.
save outfile "c:\temp\hsgoodread.sav".
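
Because select if permanently removes the unselected cases from the working file, it can be useful to restrict the cases for a single procedure only. The following is a minimal sketch, assuming hs1.sav is in c:\temp as above. It uses the temporary command, which is not otherwise covered in these notes, so that the selection applies only to the next procedure.

```spss
* sketch: analyze good readers without deleting anyone.
get file "c:\temp\hs1.sav".

* temporary limits the select if to the next procedure only.
temporary.
select if read >= 60.
descriptives
 /var=read.

* all cases are available again for this second run.
descriptives
 /var=read.
```

The first descriptives table should describe only the good readers, while the second should describe every case in hs1.sav.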

2.2 Subsetting variables

Let’s open the hs1 data file again.

File
 Open
  Data
   choose c:\temp\hs1.sav

get file "c:\temp\hs1.sav".

We want to keep just some of the variables: id, female, read, and write. We choose these variables in the same step that we use to save the data file. Notice that you can also choose “keep all” if that is more helpful to you.

File
 Save as
  choose c:\temp\hskept.sav
   variables
    drop all
     click check boxes next to id, female, read, write
      Save

File
 Display Data File Information
  Working File

* pretend we have 2000 variables and we want to keep just
* some of the variables.  We want to keep just the variables
* id female read write.
save outfile = "c:\temp\hskept.sav" 
 /keep=id  female read write.
display names.
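
Keep in mind that save outfile with /keep changes only the file written to disk. The working file still holds every variable, so the display names above lists them all. The following is a minimal sketch for checking the saved file, assuming the paths used above. It also shows that /keep is accepted on get file, which reads just the listed variables into the working file.

```spss
* sketch: read back the saved subset and confirm its variables.
get file "c:\temp\hskept.sav".
display names.

* alternatively, read only the needed variables from the full file.
get file "c:\temp\hs1.sav"
 /keep=id female read write.
display names.
```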

2.3 Appending

Let’s suppose we are working on our master’s thesis. There are two files, one for the males (hsmale.sav) and one for the females (hsfemale.sav). We would like to combine these files. We will start by opening the file with the data for the males.

```
File
 Open
  Data
   c:\temp\hsmale.sav
```

* have one file with males, females in another file 
* and need to "append" the files.
get file "c:\temp\hsmale.sav".

A frequency table shows that the variable female (which indicates gender) is a constant in this file. This is what we would expect in a file with data only for males.

Analyze
 Descriptive Statistics
  Frequencies
   select female

freq 
 /var=female.

Now we can append the files.

Data
 Merge Files
  Add Cases
   An external SPSS Statistics data file
    choose c:\temp\hsfemale.sav
     Continue

add files 
 /file=* 
 /file="c:\temp\hsfemale.sav".

We will now save the data file with a new name.

```
File
 Save As
  c:\temp\hsmasters.sav
```

save outfile "c:\temp\hsmasters.sav".
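
When appending, it can help to record which file each case came from. The sketch below is one possible way to do this, assuming the same two files as above. The names frommale and fromfemale are hypothetical variables created by the /in subcommand (1 if the case came from that file, 0 otherwise).

```spss
* sketch: append both files from disk and flag the source of each case.
add files
 /file="c:\temp\hsmale.sav"
 /in=frommale
 /file="c:\temp\hsfemale.sav"
 /in=fromfemale.

* the source flags should line up with the female variable.
crosstabs
 /tables=female by frommale.
```

Reading both files from disk, rather than using /file=* for the working file, means the result does not depend on which file happens to be open.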

2.4 Merging

Now let’s suppose that we are working on our dissertation. The data are in two files, one with the demographic information (hsdem.sav) and one with the test scores (hstest.sav). We would like to match merge these files based on id. Before we can match merge these files, we need to open each file, sort it on id, and then save the sorted file.

```
File
 Open
  Data
   c:\temp\hsdem.sav
```
```
Data
 Sort Cases
  choose id
```
```
File
 Save As
  c:\temp\hsdem.sav
```

* one file has demographic information, the other has 
* test scores and we want to "match merge" the files.
get file "c:\temp\hsdem.sav".


sort cases by id.

save outfile "c:\temp\hsdem.sav".

Now that we have sorted and saved the first file (hsdem.sav), we will do the same thing for the second file (hstest.sav).

```
File
 Open
  Data
   c:\temp\hstest.sav
```
```
Data
 Sort Cases
  choose id
```
```
File
 Save As
  c:\temp\hstest.sav
```

get file "c:\temp\hstest.sav".

sort cases by id.

save outfile "c:\temp\hstest.sav".
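
A one-to-one match merge assumes that each id appears only once in each file. As a hedged sketch (not part of the original notes), the following counts how often each id occurs in the working file. The name n_id is a hypothetical variable. The check is run after the save above so that the extra variable is not written to hstest.sav.

```spss
* sketch: count the cases that share each id value.
aggregate
 /outfile=* mode=addvariables
 /break=id
 /n_id=n.

* every case should show n_id = 1 if the ids are unique.
freq
 /var=n_id.
```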

Finally, we will open the first file (hsdem.sav) and merge it with the second file (hstest.sav). We will save the merged data file with the name hsdiss.sav.

```
File
 Open
  Data
   c:\temp\hsdem.sav
```

get file "c:\temp\hsdem.sav".

It is important that we merge the data sets by the same variable on which we sorted the two files.

Data
 Merge Files
  Add Variables
   An external SPSS Statistics data file
    choose c:\temp\hstest.sav
     Continue
      click "match cases on key variable"
       move id as key variable
        click "Indicate case source as variable" 
        (name it fromtest)

match files 
 /file=* 
 /in=fromdem 
 /file="c:\temp\hstest.sav" 
 /in=fromtest
 /by id.

Finally, we will save the data file with a new name.

```
File
 Save As
  c:\temp\hsdiss.sav
```

save outfile "c:\temp\hsdiss.sav".
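
The fromdem and fromtest flags created by /in make it possible to inspect cases that did not match. The following is a minimal sketch, assuming the merged file from the match files syntax above is still the working file. It uses temporary (not otherwise covered in these notes) so that no cases are deleted.

```spss
* sketch: show ids that appear in only one of the two files.
temporary.
select if fromdem = 0 or fromtest = 0.
list variables=id fromdem fromtest.
```

If every id is present in both files, the listing should contain no cases.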

3.0 Syntax version

* working on honors thesis.
* want to make a subset just keeping those who have read >= 60.
get file "c:\temp\hs1.sav".

* keeping cases for which students have a reading score of 60 or higher.
select if read >=60.
descriptives
 /var=read.
save outfile "c:\temp\hsgoodread.sav".

* pretend we have 2000 variables and we want to keep just some of the variables.
* we want to keep just the variables id female read write.
* reopen the full file first, as in section 2.2.
get file "c:\temp\hs1.sav".
save outfile = "c:\temp\hskept.sav"
 /keep=id female read write.
display names.
get file "c:\temp\hskept.sav".
display names.

* extra example not in point and click.
* we want to drop just the variables ses and prog.
get file "c:\temp\hsgoodread.sav".
save outfile "c:\temp\hsdropped.sav"
 /drop=ses prog.
display names.
get file "c:\temp\hsdropped.sav".
display names.

* have one file with males, females in another file and need to "append" the files.
get file "c:\temp\hsmale.sav".
freq
 /var=female.

add files
 /file=*
 /file="c:\temp\hsfemale.sav".
freq
 /var=female.

save outfile "c:\temp\hsmasters.sav".

* one file has demographic information, the other has test scores and we want to "match merge" the files.
get file "c:\temp\hsdem.sav".
list cases from 1 to 10.

sort cases by id.
save outfile "c:\temp\hsdem.sav".

get file "c:\temp\hstest.sav".
list cases from 1 to 10.

sort cases by id.
save outfile "c:\temp\hstest.sav".

get file "c:\temp\hsdem.sav".
match files
 /file=*
 /in=fromdem
 /file="c:\temp\hstest.sav"
 /in=fromtest
 /by id.

list cases from 1 to 10.

list variables id fromdem fromtest.
crosstab
 /tables=fromdem by fromtest.

save outfile "c:\temp\hsdiss.sav".

4.0 For more information

SPSS Programming and Data Management, Fourth Edition, Chapter 4

SPSS Learning Modules