Let’s read in some dates in string format.
clear input str10 date_str "4/16/2005" "3/29/11" "8/15/2009" "5/14/07" "5/3/08" "10/10/2010" end list, clean date_str 1. 4/16/2005 2. 3/29/11 3. 8/15/2009 4. 5/14/07 5. 5/3/08 6. 10/10/2010
As you can see there is a mixture of 2-digit and 4-digit years. Data entry issues like this are more common than you might suspect. So, we will use the date function along with the %td format.
generate date=date(date_str,"MDY") (3 missing values generated) format %td date list, clean date_str date 1. 4/16/2005 16apr2005 2. 3/29/11 . 3. 8/15/2009 15aug2009 4. 5/14/07 . 5. 5/3/08 . 6. 10/10/2010 10oct2010
All three of the dates with 4-digit years came out fine. We could use some string function or regular expression to insert the value 20 in front of the 2-digit years. However, a simper way is to use the date function with the mask “MD20Y“.
replace date=date(date_str,"MD20Y") (3 real changes made) list, clean date_str date 1. 4/16/2005 16apr2005 2. 3/29/11 29mar2011 3. 8/15/2009 15aug2009 4. 5/14/07 14may2007 5. 5/3/08 03may2008 6. 10/10/2010 10oct2010
That worked just fine, but what if you had dates that go back to the previous century? Check out these dates.
clear input str10 date_str "4/16/2005" "3/29/11" "8/15/2009" "5/14/07" "5/3/08" "10/10/2010" "7/13/99" "2/19/1997" end list, clean date_str 1. 4/16/2005 2. 3/29/11 3. 8/15/2009 4. 5/14/07 5. 5/3/08 6. 10/10/2010 7. 7/13/99 8. 2/19/1997
We will start out as before.
generate date=date(date_str,"MDY") (4 missing values generated) format %td date list, clean date_str date 1. 4/16/2005 16apr2005 2. 3/29/11 . 3. 8/15/2009 15aug2009 4. 5/14/07 . 5. 5/3/08 . 6. 10/10/2010 10oct2010 7. 7/13/99 . 8. 2/19/1997 19feb1997
Next, instead of one replace, we will use two. We will use the mask “MD20Y” if the last two values of date_str are less than “50”. When the last two values are greater than “50” we will use the mask “MD19Y”.
replace date=date(date_str,"MD20Y") if substr(date_str,-2,2)<"50" (3 real changes made) replace date=date(date_str,"MD19Y") if substr(date_str,-2,2)>"50" (1 real change made) list, clean date_str date 1. 4/16/2005 16apr2005 2. 3/29/11 29mar2011 3. 8/15/2009 15aug2009 4. 5/14/07 14may2007 5. 5/3/08 03may2008 6. 10/10/2010 10oct2010 7. 7/13/99 13jul1999 8. 2/19/1997 19feb1997