Stata makes it very simple to calculate the amount of time between two dates, because it internally represents each date as a number. This is convenient when you need to calculate the number of days between patient appointments, for example.
Example 1, dates in wide format
In this example, the start and end dates are in different variables (columns).
clear input id m1 d1 m2 d2, 0 1 2 3 4 0 5 6 7 8 0 9 10 11 12 1 4 5 6 7 1 8 9 10 11 end gen date1 = mdy(m1, d1, 2001) gen date2 = mdy(m2, d2, 2002) drop m1-d2 format date1 %td format date2 %td
To get the number of days in between, just subtract one date from the other:
. /* For wide format*/ . gen between = date2 - date1 . li +------------------------------+ | id date1 date2 between | |------------------------------| 1. | 0 14977 15403 426 | 2. | 0 15101 15529 428 | 3. | 0 15228 15656 428 | 4. | 1 15070 15498 428 | 5. | 1 15196 15624 428 | +------------------------------+
Example 2, dates in long format
Continuing the example above, we can also calculate the number of days between different observations. This is appropriate when data is stacked. Here we lag date1 by one, and again take the difference with the unlagged date1.
. /* For long format*/ . sort date1 . gen between2 = date1 - date1[_n-1] (1 missing value generated) . li +-------------------------------------------------+ | id date1 date2 between between2 | |-------------------------------------------------| 1. | 0 02jan2001 04mar2002 426 . | 2. | 1 05apr2001 07jun2002 428 93 | 3. | 0 06may2001 08jul2002 428 31 | 4. | 1 09aug2001 11oct2002 428 95 | 5. | 0 10sep2001 12nov2002 428 32 | +-------------------------------------------------+
Example 3, dates in long format by group
In previous example, we see that id is not in order, and the differences span those rows. This might not make sense, eg if the two observations came from two different subjects. To account for this we can use the by command.
. /* For long format by id */ . sort id date2 . by id: gen between3 = date2 - date2[_n-1] (2 missing values generated) . li +------------------------------------------------------------+ | id date1 date2 between between2 between3 | |------------------------------------------------------------| 1. | 0 02jan2001 04mar2002 426 . . | 2. | 0 06may2001 08jul2002 428 31 126 | 3. | 0 10sep2001 12nov2002 428 32 127 | 4. | 1 05apr2001 07jun2002 428 93 . | 5. | 1 09aug2001 11oct2002 428 95 126 | +------------------------------------------------------------+