1. Reading dates in data
This module will show how to read date variables, use date functions, and use date display formats in SPSS. You are assumed to be familiar with reading data into SPSS and with using compute to create new variables. We will begin with the example data file below, containing the names of four people and their birthdays.
John01/01/1960
Mary11/07/1955
Kate25/11/1962
Mark08/06/1959
The program below reads the data into SPSS. The (date10) format is used to read the birthdays, telling SPSS that the dates are 10 columns wide.
DATA LIST / name 1-4 (A) birthday (date10).
BEGIN DATA.
John01/01/1960
Mary11/07/1955
Kate25/11/1962
Mark08/06/1959
END DATA.
LIST.
The output of the list command is presented below. You can compare the dates in the data to the values of birthday and see that the dates were read properly.
NAME  BIRTHDAY
John  01-JAN-60
Mary  11-JUL-55
Kate  25-NOV-62
Mark  08-JUN-59
We could read the same data file using the program below by indicating that birthday occupies columns 5-14 and should be read using the (date) format.
DATA LIST / name 1-4 (A) birthday 5-14 (date).
BEGIN DATA.
John01/01/1960
Mary11/07/1955
Kate25/11/1962
Mark08/06/1959
END DATA.
LIST.
The output of the list command below shows that the dates were read correctly. You can read dates in either of these styles, but notice that SPSS expects the dates to occupy a fixed set of columns. In the first example, the (date10) format indicated that the date occupied up to 10 columns, and in the second example the 5-14 indicated the columns that birthday occupied.
Also, notice that the dates in these examples followed the format of "day month year". The date format expects the dates to have the day, followed by month, followed by year.
NAME  BIRTHDAY
John  01-JAN-60
Mary  11-JUL-55
Kate  25-NOV-62
Mark  08-JUN-59
SPSS can read dates in other formats. Below we show an example of reading dates in month, day, year format. The dates in the example contain a month, day, and year separated by spaces, dashes, slashes, decimal points, or commas. Even though the dates all look very different, SPSS can read them as long as they follow this basic format. The only exception is the last date (the one for Carl), which has no separators between the month, day, and year.
DATA LIST / name 1-4 (A) birthday (adate14).
BEGIN DATA.
John4-12-1990
MaryApr.12.1990
BethApr 12,1990
KateApr 12, 1990
MarkApril 12, 1990
Fran4/12.1990
CarlApr121990
END DATA.
LIST.
The output is shown below. As we would have expected, the birthday for Carl is set to missing ( . ) because there was not a separator between the month day and year.
NAME  BIRTHDAY
John  04/12/1990
Mary  04/12/1990
Beth  04/12/1990
Kate  04/12/1990
Mark  04/12/1990
Fran  04/12/1990
Carl           .
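Other date formats follow the same pattern. As a minimal sketch, assuming the four birthdays from the first example had instead been recorded in year/month/day order, the sdate format (yyyy/mm/dd) could be combined with the column-style specification shown earlier. The data lines are rewritten here purely for illustration and are not part of the original data file.

```spss
* Illustrative data: the same four birthdays rewritten as yyyy/mm/dd.
DATA LIST / name 1-4 (A) birthday 5-14 (SDATE).
BEGIN DATA.
John1960/01/01
Mary1955/07/11
Kate1962/11/25
Mark1959/06/08
END DATA.
LIST.
```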
Suppose you had a file that did not have separators between the month, day, and year. You can’t read such dates directly into SPSS using date formats, but you can read the month, day, and year as separate variables and then convert those components into dates. The example below illustrates this by reading the day of birth as dd, the month of birth as mm, and the year of birth as yy. The compute command uses the date.mdy function to convert mm, dd, and yy into a date variable called birthday.
DATA LIST / name 1-4 (A) dd 5-6 mm 7-8 yy 9-12.
BEGIN DATA.
John01011960
Mary11071955
Kate25111962
Mark08061959
END DATA.
COMPUTE birthday = DATE.MDY(mm,dd,yy).
LIST.
The output of the list command is not what we would expect, as we see below.
NAME  DD  MM  YY    BIRTHDAY
John  1   1   1960  1.19E+10
Mary  11  7   1955  1.18E+10
Kate  25  11  1962  1.20E+10
Mark  8   6   1959  1.19E+10
SPSS actually stores dates as the number of seconds that have elapsed since midnight on October 14, 1582. When you read a date value using a date format (as we did in the first examples), SPSS automatically uses a date format to display the dates, but when you create a date yourself (as we did in this example using the date.mdy function), you need to tell SPSS to use a date format to display the data, as illustrated below.
FORMATS birthday (DATE).
LIST.
After using the formats command the birthday displays as we would expect.
NAME  DD  MM  YY    BIRTHDAY
John  1   1   1960  01-JAN-1960
Mary  11  7   1955  11-JUL-1955
Kate  25  11  1962  25-NOV-1962
Mark  8   6   1959  08-JUN-1959
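The display format can be changed at any time without altering the stored value. The sketch below assumes birthday still holds the date values computed above; it switches between a few of SPSS's date formats and then shows the underlying number of seconds. The variable raw_secs is a hypothetical name introduced only for this illustration.

```spss
* American style, mm/dd/yyyy.
FORMATS birthday (ADATE10).
LIST name birthday.

* European style, dd.mm.yyyy.
FORMATS birthday (EDATE10).
LIST name birthday.

* Copy the stored value and display it as a plain number of seconds.
COMPUTE raw_secs = birthday.
FORMATS raw_secs (F12.0).
LIST name raw_secs.

* Restore the dd-mmm-yyyy display used in the rest of this module.
FORMATS birthday (DATE11).
```

For John, born January 1, 1960, the stored value works out to 137,775 days × 86,400 seconds per day = 11,903,760,000 seconds, which is consistent with the 1.19E+10 shown before the date format was applied.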
2. Computations with elapsed dates
Because of the way that SPSS stores date variables, you can compute elapsed time quite easily. For example, let’s calculate everyone’s age on January 1, 2000 below.
COMPUTE age_sec = (DATE.MDY(01,01,2000) - birthday).
COMPUTE age_day = (DATE.MDY(01,01,2000) - birthday) / (60*60*24).
COMPUTE age_year = (DATE.MDY(01,01,2000) - birthday) / (60*60*24*365.25).
COMPUTE age_sec2 = DATEDIFF(DATE.MDY(01,01,2000), birthday, "seconds").
COMPUTE age_day2 = DATEDIFF(DATE.MDY(01,01,2000), birthday, "day").
COMPUTE age_year2 = DATEDIFF(DATE.MDY(01,01,2000), birthday, "year").
LIST.
The first command, computing age_sec, takes the difference between January 1, 2000 and the person’s birthday, which gives the person’s age in seconds on January 1, 2000. This is probably not very useful. The second command computes age_day and is just like the prior command, except that it divides the result by the number of seconds in a day (60 * 60 * 24). The third command computes age_year, the person’s age in years, by dividing by the number of seconds in a year (60 * 60 * 24 * 365.25). Alternatively, you can use SPSS’s built-in DATEDIFF function. DATEDIFF truncates values so that there are no decimal places. This may be useful in certain cases and problematic in others, so we demonstrate both how to use DATEDIFF and how to do the calculation by hand.
name  dd  mm  yy    birthday     age_sec   age_day   age_year  age_sec2  age_day2  age_year2
John  1   1   1960  01-JAN-1960  1.3E+009  14610.00  40.00     1.3E+009  14610.00  40.00
Mary  11  7   1955  11-JUL-1955  1.4E+009  16245.00  44.48     1.4E+009  16245.00  44.00
Kate  25  11  1962  25-NOV-1962  1.2E+009  13551.00  37.10     1.2E+009  13551.00  37.00
Mark  8   6   1959  08-JUN-1959  1.3E+009  14817.00  40.57     1.3E+009  14817.00  40.00
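The division by 60*60*24 can also be written with SPSS's CTIME.DAYS function, which converts a time interval in seconds into days. The sketch below assumes the variables created above still exist; age_day3 and age_whole are hypothetical names introduced for illustration. TRUNC(age_year) simply drops the decimals of the 365.25-day approximation and is shown only for comparison with the DATEDIFF result.

```spss
* Elapsed days via CTIME.DAYS instead of dividing by 60*60*24.
COMPUTE age_day3 = CTIME.DAYS(DATE.MDY(1,1,2000) - birthday).
* Whole years from the 365.25-day approximation, for comparison.
COMPUTE age_whole = TRUNC(age_year).
LIST name age_day age_day3 age_year age_year2 age_whole.
```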
3. Other useful date functions
Sometimes you would like to extract the day, month, year or day of week for a given date. The following program demonstrates the use of these functions.
COMPUTE d=XDATE.MDAY(birthday).
COMPUTE m=XDATE.MONTH(birthday).
COMPUTE y=XDATE.YEAR(birthday).
COMPUTE wkday=XDATE.WKDAY(birthday).
LIST name birthday d m y wkday.
As the output below illustrates, d contains the day of month, m contains the month number, y contains the year, and wkday contains the day of week (where 1=Sunday and 7=Saturday).
NAME  BIRTHDAY     D      M      Y        WKDAY
John  01-JAN-1960  1.00   1.00   1960.00  6.00
Mary  11-JUL-1955  11.00  7.00   1955.00  2.00
Kate  25-NOV-1962  25.00  11.00  1962.00  1.00
Mark  08-JUN-1959  8.00   6.00   1959.00  2.00
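The numeric codes returned by these functions can be displayed as names by giving the variables the WKDAY and MONTH display formats, and other components, such as the quarter or the day of the year, can be extracted in the same way. This is a minimal sketch that assumes birthday is defined as above; wkname, moname, qtr and doy are hypothetical variable names.

```spss
COMPUTE wkname = XDATE.WKDAY(birthday).
COMPUTE moname = XDATE.MONTH(birthday).
COMPUTE qtr = XDATE.QUARTER(birthday).
COMPUTE doy = XDATE.JDAY(birthday).
* Display the weekday and month codes as names rather than numbers.
FORMATS wkname (WKDAY9).
FORMATS moname (MONTH9).
FORMATS qtr doy (F3.0).
LIST name birthday wkname moname qtr doy.
```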
The following example illustrates how you can use dates in a logical expression. For example, if you would like to select only the cases with dates that occur prior to a given date, you could use the code illustrated below. Please note that this code deletes from your data set all of the cases not selected, so you may want to use a copy of your data set when doing this. In this example, only cases with a date prior to July 1, 2002 are kept; any case dated July 1, 2002 or later is deleted from the data set. The lt is the SPSS abbreviation for "less than".
SELECT IF date lt date.mdy(7,1,2002).
execute.
You could use similar logic for computing the difference between dates, using dates in a filter (in which case the dates not selected would not be deleted from your data set) and in other SPSS commands.
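As a sketch of the filter approach, the commands below flag the cases to keep and then filter on that flag, so the remaining cases are excluded from procedures but stay in the data set. The variable name before_cut is hypothetical, and date stands for whatever date variable is being tested, as in the SELECT IF example.

```spss
* Flag is 1 when date falls before July 1, 2002 and 0 otherwise (missing if date is missing).
COMPUTE before_cut = (date LT DATE.MDY(7,1,2002)).
FILTER BY before_cut.
LIST.
* Turn the filter off to work with all cases again.
FILTER OFF.
```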
4. Summary
- Dates are read with date formats, most commonly date or adate.
- Date functions can be used to create date values from their components, such as date.mdy(mm,dd,yy), and to extract the components from a date value, such as xdate.mday(birthday).
5. Problems to look out for
- Dates do not have separators between the day, month, and year. Solution: Read each component as a separate variable and then use the date.mdy function to convert the month, day, and year into a date variable.
6. For more information
- For information on reading data into SPSS, see the SPSS Learning Module Inputting raw data into SPSS and the SPSS Library page Inputting and manipulating dates in SPSS.