1. Introduction
A SAS function returns a value from a computation or system manipulation that requires zero or more arguments. Most functions use arguments supplied by the user; however, a few obtain their arguments from the operating system. Here is the syntax of a function:
function-name(argument1, argument2)
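As a small illustration of the two kinds of functions described above, the sketch below calls one function that takes a user-supplied argument (SQRT) and one that takes no arguments and reads its value from the system clock (TODAY). The variable names today and root are chosen only for this example.

```sas
/* Sketch: one function with a user-supplied argument, one with none. */
DATA _NULL_;
  root  = SQRT(16);          /* argument supplied by the user */
  today = TODAY();           /* no arguments; value comes from the system clock */
  PUT root= today= DATE9.;   /* write both values to the SAS log */
RUN;
```

DATA _NULL_ runs the statements without creating a data set, which is convenient for quick experiments with a function.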
We will illustrate some functions using the following dataset that includes name, x, test1, test2, and test3.
DATA getdata;
  INPUT name $14. x test1 test2 test3;
DATALINES;
John Smith      4.2 86.5 84.55 81
Samuel Adams    9.0 70.3 82.37  .
Ben Johnson    -6.2 82.1 84.81 87
Chris Adraktas  9.5 94.2 92.64 93
John Brown        . 79.7 79.07 72
;
RUN;
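The DATA step below creates the data set funct1, adding new variables computed with the INT, ROUND, and MEAN numeric functions. What happens to tave when test3 is missing? MEAN ignores missing arguments, so for observation 2 tave is the average of test1 and test2 only.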
DATA funct1;
  SET getdata;
  t1int = INT(test1);                  /* integer part of a number */
  t2int = INT(test2);
  t1rnd = ROUND(test1);                /* round to nearest whole number */
  t2rnd = ROUND(test2, .1);            /* round to nearest tenth */
  tave  = MEAN(test1, test2, test3);   /* mean across variables */
RUN;

PROC PRINT DATA=funct1;
  VAR test1 test2 test3 t1int t2int t1rnd t2rnd tave;
RUN;

OBS  TEST1  TEST2  TEST3  T1INT  T2INT  T1RND  T2RND     TAVE
 1    86.5  84.55     81     86     84     87   84.6  84.0167
 2    70.3  82.37      .     70     82     70   82.4  76.3350
 3    82.1  84.81     87     82     84     82   84.8  84.6367
 4    94.2  92.64     93     94     92     94   92.6  93.2800
 5    79.7  79.07     72     79     79     80   79.1  76.9233
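To make the effect of the missing test3 value easier to see, the following sketch compares MEAN with an average written out by hand. It assumes getdata has been created as above; the data set name funct1b and the variables tave2, ntests, and nmissing are illustrative and not part of the original program.

```sas
/* Sketch: MEAN() versus a hand-written average when a score is missing. */
DATA funct1b;                               /* hypothetical data set name */
  SET getdata;
  tave     = MEAN(test1, test2, test3);     /* ignores missing arguments */
  tave2    = (test1 + test2 + test3) / 3;   /* missing if any score is missing */
  ntests   = N(test1, test2, test3);        /* number of nonmissing arguments */
  nmissing = NMISS(test1, test2, test3);    /* number of missing arguments */
RUN;

PROC PRINT DATA=funct1b;
  VAR name test1 test2 test3 tave tave2 ntests nmissing;
RUN;
```

For Samuel Adams, tave is still computed from the two available scores, while tave2 is missing because arithmetic operators propagate missing values.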
Now let’s try some more math functions. What happens when there is a missing or negative value of x?
DATA funct2;
  SET getdata;
  xsqrt = SQRT(x);   /* square root */
  xlog  = LOG(x);    /* log base e */
  xexp  = EXP(x);    /* e raised to the power */
RUN;

PROC PRINT DATA=funct2;
  VAR x xsqrt xlog xexp;
RUN;

OBS     X    XSQRT     XLOG      XEXP
 1    4.2  2.04939  1.43508     66.69
 2    9.0  3.00000  2.19722   8103.08
 3   -6.2        .        .      0.00
 4    9.5  3.08221  2.25129  13359.73
 5      .        .        .         .
This time we’ll try some string functions. In particular, look closely at the substr function that is used in fname and lname.
DATA funct3;
  SET getdata;
  c1    = UPCASE(name);                     /* convert to upper case */
  c2    = SUBSTR(name,3,8);                 /* substring */
  len   = LENGTH(name);                     /* length of string */
  ind   = INDEX(name,' ');                  /* position in string */
  fname = SUBSTR(name,1,INDEX(name,' '));
  lname = SUBSTR(name,INDEX(name,' '));
RUN;

PROC PRINT DATA=funct3;
  VAR name c1 c2 len ind fname lname;
RUN;

OBS  NAME            C1              C2        LEN  IND  FNAME   LNAME
 1   John Smith      JOHN SMITH      hn Smith   10    5  John    Smith
 2   Samuel Adams    SAMUEL ADAMS    muel Ada   12    7  Samuel  Adams
 3   Ben Johnson     BEN JOHNSON     n Johnso   11    4  Ben     Johnson
 4   Chris Adraktas  CHRIS ADRAKTAS  ris Adra   14    6  Chris   Adraktas
 5   John Brown      JOHN BROWN      hn Brown   10    5  John    Brown
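Because INDEX returns the position of the blank itself, the fname value above ends with that blank and the lname value begins with it. The sketch below shows two possible ways to leave the blank out: shifting the SUBSTR positions by one, or using the SCAN function to pick out blank-delimited words. It assumes every name contains exactly one embedded blank, as in getdata; the data set name funct3b and the variables pos, fname2, and lname2 are illustrative.

```sas
/* Sketch: splitting name into first and last name without the blank. */
DATA funct3b;                          /* hypothetical data set name */
  SET getdata;
  pos    = INDEX(name, ' ');           /* position of the first blank */
  fname  = SUBSTR(name, 1, pos - 1);   /* stop just before the blank */
  lname  = SUBSTR(name, pos + 1);      /* start just after the blank */
  fname2 = SCAN(name, 1, ' ');         /* first blank-delimited word */
  lname2 = SCAN(name, 2, ' ');         /* second blank-delimited word */
RUN;

PROC PRINT DATA=funct3b;
  VAR name pos fname lname fname2 lname2;
RUN;
```

SCAN avoids the position arithmetic altogether, which may be easier to read when a name could contain more than two words.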
2. Random numbers in SAS
Random numbers are more useful than you might imagine. They are used extensively in Monte Carlo studies, as well as in many other situations. We will look at two of SAS’s random number functions.
- UNIFORM(SEED) – generates random values from a uniform distribution between 0 and 1
- NORMAL(SEED) – generates random values from a normal distribution with mean 0 and standard deviation 1
The statements IF x>.5 THEN coin = 'heads'; and ELSE coin = 'tails'; create a variable called coin that takes the value 'heads' or 'tails' at random. The data sets random1 and random2 use a seed value of -1. A negative seed makes SAS initialize the generator from the system clock, so different random numbers are generated each time the program runs.
DATA random1;
  x = UNIFORM(-1);
  y = 50 + 3*NORMAL(-1);
  IF x>.5 THEN coin = 'heads';
  ELSE coin = 'tails';
RUN;

DATA random2;
  x = UNIFORM(-1);
  y = 50 + 3*NORMAL(-1);
  IF x>.5 THEN coin = 'heads';
  ELSE coin = 'tails';
RUN;

PROC PRINT DATA=random1;
  VAR x y coin;
RUN;

PROC PRINT DATA=random2;
  VAR x y coin;
RUN;

OBS        X        Y   COIN
 1   0.24441  49.7470   tails

OBS        X        Y   COIN
 1   0.16922  49.1155   tails
Sometimes we will want to generate the same random numbers each time so that we can debug our programs. To do this we just enter the same positive number as the seed value. The data sets random3 and random4 illustrate how to generate the same results each time.
DATA random3;
  x = UNIFORM(123456);
  y = 50 + 3*NORMAL(123456);
  IF x>.5 THEN coin = 'heads';
  ELSE coin = 'tails';
RUN;

DATA random4;
  x = UNIFORM(123456);
  y = 50 + 3*NORMAL(123456);
  IF x>.5 THEN coin = 'heads';
  ELSE coin = 'tails';
RUN;

PROC PRINT DATA=random3;
  VAR x y coin;
RUN;

PROC PRINT DATA=random4;
  VAR x y coin;
RUN;

OBS        X        Y   COIN
 1   0.73902  48.7832   heads

OBS        X        Y   COIN
 1   0.73902  48.7832   heads
Now let’s generate 100 random coin tosses and compute a frequency table of the results.
DATA random5;
  DO i = 1 TO 100;
    x = UNIFORM(123456);
    IF x>.5 THEN coin = 'heads';
    ELSE coin = 'tails';
    OUTPUT;
  END;
RUN;

PROC FREQ DATA=random5;
  TABLES coin;
RUN;

                                Cumulative  Cumulative
COIN    Frequency    Percent     Frequency     Percent
------------------------------------------------------
heads          48       48.0            48        48.0
tails          52       52.0           100       100.0
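The same DO loop pattern can serve as a very small Monte Carlo experiment with NORMAL. The sketch below draws 1000 values of 50 + 3*NORMAL(seed) and summarizes them with PROC MEANS; the data set name random6 and the sample size of 1000 are choices made only for this illustration.

```sas
/* Sketch: simulate 1000 normal draws and check their mean and SD. */
DATA random6;                      /* hypothetical data set name */
  DO i = 1 TO 1000;
    y = 50 + 3*NORMAL(123456);     /* shift and scale a standard normal draw */
    OUTPUT;                        /* write one observation per iteration */
  END;
RUN;

PROC MEANS DATA=random6 N MEAN STD;
  VAR y;
RUN;
```

The sample mean and standard deviation should come out close to 50 and 3, though not exactly equal to them, and the positive seed makes the run repeatable.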
3. Problems to look out for
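Watch out for math errors such as division by zero, the square root of a negative number, and the log of zero or a negative number. In each case SAS sets the result to missing and writes a note to the log, so check the log whenever unexpected missing values appear.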
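One way to avoid these notes is to test the argument before calling the function. The sketch below is a minimal example that assumes the getdata data set from section 1; the data set name funct2b and the variable ratio are illustrative.

```sas
/* Sketch: guard each calculation so invalid arguments are never passed. */
DATA funct2b;                          /* hypothetical data set name */
  SET getdata;
  IF x >= 0 THEN xsqrt = SQRT(x);      /* skipped when x is negative or missing */
  IF x > 0  THEN xlog  = LOG(x);       /* skipped when x is zero, negative, or missing */
  IF NOT MISSING(test3) AND test3 NE 0
    THEN ratio = test1 / test3;        /* skipped when the divisor is missing or zero */
RUN;

PROC PRINT DATA=funct2b;
  VAR name x xsqrt xlog test1 test3 ratio;
RUN;
```

A variable whose assignment is skipped keeps its missing value for that observation, so the printed results match the unguarded version but the log stays free of invalid-argument notes. The comparisons work because SAS treats a missing numeric value as smaller than any number.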
4. For more information
For more information on functions in SAS, consult the SAS Language manual.
5. Web notes
You can view the SAS program associated with this module by clicking funct.sas. While viewing the file, you can save it by choosing File then Save As from the pull-down menu of your web browser. In the Save As dialog box, change the file name to funct.sas, choose the directory where you want to save the file, and then click Save.