We will be using SAS-callable SUDAAN for this seminar. This means that
SUDAAN runs from within your SAS session. You do not need to "turn on"
SUDAAN or do anything like that. Once SAS is running, you access SUDAAN
by running a SUDAAN command. Your SAS and SUDAAN code will be in the same
program and there is no need to differentiate the code in any way. If you have a
copy of stand-alone SUDAAN, most of what is presented here will work for you, except,
of course, the SAS code. Also, you will not need the **run** statements
at the end of the proc steps.

As you may have guessed, SUDAAN code looks much like SAS code. There
are eight analysis procs in SUDAAN. All of your data management must be
done in another package. Because we are using SAS-callable SUDAAN, we will
assume that you will do your data management in SAS, and we assume that you are
familiar with the basics of doing data management in SAS. If you are not,
or if you would like some additional information, please see our page on
data management in SAS. Perhaps the most
important data management issue that you will encounter is that SUDAAN considers
values of 0 to be missing for all procs except **proc rlogist** (used for
logistic regression) when the
variable is used as the dependent variable. This means that if you have a
variable called **female** that is coded 1 for females and 0 for males, all
of the males in the data set will be considered missing. You can either
recode such variables in SAS or you can use the **recode** statement in SUDAAN.
We have some examples of how to use the **recode** statement in our FAQ,
"How can I use the recode statement in SUDAAN?"
Another important data management issue is how missing values are coded
in your data set. If you are using a public-use data file, you can look at
the codebook for this information. Frequently, values such as 999, 888,
-999 and -888 are used to indicate missing values. You will need to recode
these before using them in SUDAAN, as there is no way to tell SUDAAN to consider
such values missing. We have included some SAS code at the end of this
seminar that can be used as a template for recoding missing values in all
numerical variables in a data set.
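The 0/1 recoding described above can be sketched in a SAS data step. This is only an illustration: the data set names **mydata** and **mydata2** and the variables **female** and **sex12** are hypothetical and are not part of the CHIS data.

```sas
/* Hypothetical example: mydata, mydata2, female and sex12 are illustrative names. */
data mydata2;
  set mydata;
  /* female is assumed to be coded 1 = female, 0 = male.        */
  /* Recode so that no valid category has the value 0.          */
  if female = 1 then sex12 = 1;        /* females stay 1        */
  else if female = 0 then sex12 = 2;   /* males become 2        */
  /* Any other value (including missing) leaves sex12 missing.  */
run;
```

You would then use **sex12** instead of **female** in your SUDAAN code.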

One other thing that we should point out before we start running some code is the way SUDAAN code looks in the SAS Enhanced Program Editor. The editor colors key words according to SAS syntax, not SUDAAN syntax. Hence, you will see some words of your SUDAAN code in blue and others in red. The coloring is no indication of the correctness of your SUDAAN program. Just ignore it.

For the following examples, we will use the CHIS data set. CHIS is a
publicly available data set that uses replicate weights, rather than PSUs and
strata, to obtain correct standard errors for the estimates. For examples of how to
use PSU and strata variables in SUDAAN, please see our page based on textbook examples.
We have run the SAS code shown at the end of this seminar to create a temporary
SAS data set, which we have called **chis**.
We will start with **proc descript**, which is used to get basic descriptive
statistics.

    proc descript data=chis filetype=sas design = jackknife;
      weight rakedw0;
      jackwgts rakedw1--rakedw80 / adjjack=1;
      var ad5;
    run;

Let’s look at the code above before we look at the output. The **proc
descript** command is a SUDAAN command. The **data =** option is just
like the SAS **data =** option on a proc statement. Note that you
cannot use a pathname here (unlike SAS). You must use either a temporary
data set, as we do here, or a libname. The **filetype =** option is
needed to tell SUDAAN what type of data file you are using. Note that
SUDAAN can also read SAS export files (sasxport), ascii and SPSS files. The **design
=** option is also necessary. You will get this information from the
codebook. In this case, we specify the design as jackknife because we have
jackknife replicate weights. Jackknife is one of several ways to create
replicate weights, so even though you see replicate weights in the data file,
you need to know how they were created so that you can specify them correctly in
your SUDAAN program. Again, this is information that you will get from the
documentation. There is no way of knowing this information just from
looking at the data set. On the **weight** statement we indicate the pweight,
sometimes called the final pweight. On the **jackwgts** statement we
indicate which variables are the replicate weights. The double dash is
used to indicate positionally consecutive variables in the data set. The
**adjjack =** option after the slash is necessary. While this is often
set to 1, you will need to check the documentation for the correct value.
Although these three lines of code seem to be complex, once you have them
correct, they do not need to be modified again. Personally, I just
cut-and-paste them from one proc to the next. Because they specify the
sampling design used in the collection of the survey data, this is information
that will not change during the course of data analysis. All of this information is specified in exactly the
same way for every proc in SUDAAN. The **var** statement is the same as
the **var** statement in SAS. Then you give a **run** statement and you are
finished! Finally, let’s look at the output.

    S U D A A N
    Software for the Statistical Analysis of Correlated Data
    Copyright Research Triangle Institute    January 2003
    Release 8.0.2

    Number of observations read    : 55428    Weighted count : 23847415
    Denominator degrees of freedom : 80

    Date: 02-26-2004    Research Triangle Institute    Page  : 1
    Time: 13:20:25      The DESCRIPT Procedure         Table : 1

    Variance Estimation Method: Replicate Weight Jackknife
    by: Variable, One.

    ----------------------------------------------
    | Variable     |               | One         |
    |              |               | 1           |
    ----------------------------------------------
    | How many Pap | Sample Size   |       30530 |
    | smear tests  | Weighted Size | 11141052.70 |
    | last 6 years | Total         | 51425352.66 |
    |              | Mean          |        4.62 |
    |              | SE Mean       |        0.02 |
    ----------------------------------------------

We see at the top that 55,428 observations were read from the data set, and when
weighted with the pweight, the count is 23,847,415. The first line of the output
in the table shows the sample size. This is the number
of cases used in the analysis. The second line indicates the number of
individuals in the population the sample size represents. Note that these
numbers differ from those at the top of the output because of the variable we
are looking at (men don’t get Pap smears). The third line
gives the total, the fourth line the mean and the fifth line the standard
error of the mean of the variable specified on the **var** statement. You can use options on the **proc descript**
statement to add other statistics to this output, as well as adding an **output**
or **print** statement.

Now let’s try an example in which the totals and means are found for
different groups, such as race. We add the categorical variable on the
**tables** statement. In SUDAAN version 8, you will also need to use a
**subgroup** statement, on which you list the variables just as they appear on the
**tables** statement, and a **levels** statement, on which you specify the number of
levels of the variable(s) on the **subgroup** statement. We will look at seven
categories of race. If we only wanted to look at the first three
categories, we could type 3 on the **levels** statement. If you had two
variables listed on the **subgroup** statement, you would list the number of
categories for each variable on the **levels** statement in the order in which
the variables are listed on the **subgroup** statement. Like the
variable names, the numbers of levels are separated by spaces.

    proc descript data=chis filetype=sas design = jackknife;
      weight rakedw0;
      jackwgts rakedw1--rakedw80 / adjjack=1;
      var ad5;
      tables racehpra;
      subgroup racehpra;
      levels 7;
    run;

    Variance Estimation Method: Replicate Weight Jackknife
    by: Variable, Race - UCLA CHPR Definition.

    --------------------------------------------------------------------------
    | Variable     |               | Race - UCLA CHPR Definition             |
    |              |               | Total       | LATINO      | PACIFIC     |
    |              |               |             |             | ISLANDER    |
    --------------------------------------------------------------------------
    | How many Pap | Sample Size   |       30530 |        5027 |         106 |
    | smear tests  | Weighted Size | 11141052.70 |  2504107.61 |    24165.49 |
    | last 6 years | Total         | 51425352.66 | 11440176.40 |   111606.38 |
    |              | Mean          |        4.62 |        4.57 |        4.62 |
    |              | SE Mean       |        0.02 |        0.05 |        0.43 |
    --------------------------------------------------------------------------

    --------------------------------------------------------------------------
    | Variable     |               | Race - UCLA CHPR Definition             |
    |              |               | AIAN        | ASIAN       | AFRICAN     |
    |              |               |             |             | AMERICAN    |
    --------------------------------------------------------------------------
    | How many Pap | Sample Size   |         418 |        1814 |        1657 |
    | smear tests  | Weighted Size |    39588.04 |  1002420.57 |   727651.50 |
    | last 6 years | Total         |   177162.48 |  4112383.47 |  3629231.58 |
    |              | Mean          |        4.48 |        4.10 |        4.99 |
    |              | SE Mean       |        0.20 |        0.11 |        0.11 |
    --------------------------------------------------------------------------

    ----------------------------------------------------------------
    | Variable     |               | Race - UCLA CHPR Definition   |
    |              |               | WHITE         | OTH           |
    |              |               |               | SINGL/MULTI   |
    |              |               |               | RACE          |
    ----------------------------------------------------------------
    | How many Pap | Sample Size   |         20692 |           816 |
    | smear tests  | Weighted Size |    6501687.43 |     341432.05 |
    | last 6 years | Total         |   30299302.56 |    1655489.78 |
    |              | Mean          |          4.66 |          4.85 |
    |              | SE Mean       |          0.03 |          0.12 |
    ----------------------------------------------------------------

In the third column in the table on the top, we see the results for the total number of cases involved in the analysis. In the following columns we see the results broken out for each of the races. Notice the differences in the sample sizes.
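As a sketch of the two-variable case described earlier, the code below breaks the results out by both gender and race. It assumes that **srsex** has two levels and that **racehpra** has seven, which is what the outputs in this seminar suggest; its output is not shown here.

```sas
/* Sketch only: two variables on the subgroup statement.        */
/* The levels are listed in the same order as the variables:    */
/* 2 levels for srsex, 7 levels for racehpra.                   */
proc descript data=chis filetype=sas design=jackknife;
  weight rakedw0;
  jackwgts rakedw1--rakedw80 / adjjack=1;
  var ad5;
  tables srsex*racehpra;
  subgroup srsex racehpra;
  levels 2 7;
run;
```

Note that only the **tables**, **subgroup** and **levels** statements differ from the previous example; the design statements are unchanged.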

Now let’s try using **proc crosstab**. We will make a crosstab of
gender (**srsex**) and race (**racehpra**). We are only going to use two levels of race, just to
make the output shorter.

    proc crosstab data=chis filetype=sas design = jackknife;
      weight rakedw0;
      jackwgts rakedw1--rakedw80 / adjjack=1;
      tables srsex*racehpra;
      subgroup srsex racehpra;
      levels 2 2;
    run;

    Variance Estimation Method: Replicate Weight Jackknife
    by: SRSEX, RACEHPRA.

    ------------------------------------------------------------------
    | SRSEX  |                | RACEHPRA                             |
    |        |                | Total      | LATINO     | PACIFIC    |
    |        |                |            |            | ISLANDER   |
    ------------------------------------------------------------------
    | Total  | Sample Size    |       9677 |       9458 |        219 |
    |        | Weighted Size  | 5705917.88 | 5643945.79 |   61972.10 |
    |        | SE Weighted    |   28246.94 |   28469.00 |    4755.06 |
    |        | Row Percent    |     100.00 |      98.91 |       1.09 |
    |        | Col Percent    |     100.00 |     100.00 |     100.00 |
    |        | Tot Percent    |     100.00 |      98.91 |       1.09 |
    |        | SE Row Percent |       0.00 |       0.08 |       0.08 |
    |        | SE Col Percent |       0.00 |       0.00 |       0.00 |
    |        | SE Tot Percent |       0.00 |       0.08 |       0.08 |
    ------------------------------------------------------------------
    | MALE   | Sample Size    |       4084 |       3983 |        101 |
    |        | Weighted Size  | 2866894.01 | 2836612.17 |   30281.84 |
    |        | SE Weighted    |   30195.55 |   29750.97 |    3293.96 |
    |        | Row Percent    |     100.00 |      98.94 |       1.06 |
    |        | Col Percent    |      50.24 |      50.26 |      48.86 |
    |        | Tot Percent    |      50.24 |      49.71 |       0.53 |
    |        | SE Row Percent |       0.00 |       0.11 |       0.11 |
    |        | SE Col Percent |       0.51 |       0.51 |       4.44 |
    |        | SE Tot Percent |       0.51 |       0.50 |       0.06 |
    ------------------------------------------------------------------
    | FEMALE | Sample Size    |       5593 |       5475 |        118 |
    |        | Weighted Size  | 2839023.87 | 2807333.62 |   31690.25 |
    |        | SE Weighted    |   34030.51 |   34193.28 |    3931.19 |
    |        | Row Percent    |     100.00 |      98.88 |       1.12 |
    |        | Col Percent    |      49.76 |      49.74 |      51.14 |
    |        | Tot Percent    |      49.76 |      49.20 |       0.56 |
    |        | SE Row Percent |       0.00 |       0.14 |       0.14 |
    |        | SE Col Percent |       0.51 |       0.51 |       4.44 |
    |        | SE Tot Percent |       0.51 |       0.51 |       0.07 |
    ------------------------------------------------------------------

In the row labeled "Total", we see the totals collapsed across **gender**.
In the column labeled "Total", we see the totals collapsed across **race**.
You can make multilayered tables by adding more variables to the **tables**
statement (with * in between each variable).

Although we did not have this happen in the above examples, it often happens that you see stars in your output instead of numbers. Don’t worry – this does not mean that there was an error in your SUDAAN code or trouble calculating things. Rather, it means that SUDAAN did not have enough space in the column to print the number (remember that numbers in this kind of output can get to be really big). We have a FAQ on how to change the stars to numbers that will show you how to fix this problem.

Let’s move from basic descriptive statistics to some analyses. We will
start with a regression. Remember that the output from the regression (as
well as from any other analysis in SUDAAN) is read in the same way as output
from an analysis of non-survey data: the interpretation does not
change just because we are using survey data or a special statistical package
for the analysis. We will use **ae13**, which is the number of
drinks on the days on which one drinks alcohol, as the dependent variable, and
**ae14**, number of times having five or more drinks in past month, as the
independent variable. We do not claim that this model is sensible or that
we are testing any specific hypothesis. Rather, we just selected two
continuous variables for this example.

    proc regress data=chis filetype=sas design = jackknife;
      weight rakedw0;
      jackwgts rakedw1--rakedw80 / adjjack=1;
      model ae13 = ae14;
    run;

    Number of observations read       : 55428    Weighted count: 23847415
    Observations used in the analysis : 32538    Weighted count: 13783845
    Denominator degrees of freedom    : 80

    Maximum number of estimable parameters for the model is 2
    Weighted mean response is 2.188590

    Multiple R-Square for the dependent variable AE13: 0.241897

    Variance Estimation Method: Replicate Weight Jackknife
    Working Correlations: Independent
    Link Function: Identity
    Response variable AE13: Number of drinks on the days drinking alcohol

    ----------------------------------------------------------------------
    Independent                                                  P-value
      Variables and          Beta                                T-Test
      Effects                Coeff.     SE Beta    T-Test B=0    B=0
    ----------------------------------------------------------------------
    Intercept                  1.88        0.01        152.15    0.0000
    Number of times having
      5 or more drinks in
      past month               0.34        0.01         25.47    0.0000
    ----------------------------------------------------------------------

    -------------------------------------------------------
    Contrast                 Degrees of               P-value
                             Freedom       Wald F     Wald F
    -------------------------------------------------------
    OVERALL MODEL                 2      12818.28     0.0000
    MODEL MINUS INTERCEPT         1        648.71     0.0000
    INTERCEPT                     1      23150.59     0.0000
    AE14                          1        648.71     0.0000
    -------------------------------------------------------

You will notice from the first two lines of the output that thousands more observations were read than were used in the analysis. This is because of missing data. We are also given the weighted mean of the dependent variable and the multiple R-squared. Next we see the coefficients and significance tests. The degrees of freedom are given in the last table. For effects with one degree of freedom, the t-tests shown in the first table are equivalent to the Wald F tests shown in the second table: the square of the t statistic equals the Wald F within rounding error (for **ae14**, 25.47 squared is about 648.7, and the Wald F is 648.71).

You can use categorical variables in your regression by using the **subgroup**
and **levels** statements.

    proc regress data=chis filetype=sas design = jackknife;
      weight rakedw0;
      jackwgts rakedw1--rakedw80 / adjjack=1;
      model ae13 = ae14 srsex;
      subgroup srsex;
      levels 2;
    run;

    Number of observations read       : 55428    Weighted count: 23847415
    Observations used in the analysis : 32538    Weighted count: 13783845
    Denominator degrees of freedom    : 80

    Maximum number of estimable parameters for the model is 3
    Weighted mean response is 2.188590

    Multiple R-Square for the dependent variable AE13: 0.259603

    Variance Estimation Method: Replicate Weight Jackknife
    Working Correlations: Independent
    Link Function: Identity
    Response variable AE13: Number of drinks on the days drinking alcohol

    ----------------------------------------------------------------------
    Independent                                                  P-value
      Variables and          Beta                                T-Test
      Effects                Coeff.     SE Beta    T-Test B=0    B=0
    ----------------------------------------------------------------------
    Intercept                  1.61        0.01        116.51    0.0000
    Number of times having
      5 or more drinks in
      past month               0.32        0.01         24.90    0.0000
    Self-reported gender
      MALE                     0.52        0.03         19.77    0.0000
      FEMALE                   0.00        0.00           .       .
    ----------------------------------------------------------------------

    -------------------------------------------------------
    Contrast                 Degrees of               P-value
                             Freedom       Wald F     Wald F
    -------------------------------------------------------
    OVERALL MODEL                 3      10053.93     0.0000
    MODEL MINUS INTERCEPT         2        528.93     0.0000
    INTERCEPT                     .           .        .
    AE14                          1        619.78     0.0000
    SRSEX                         1        390.70     0.0000
    -------------------------------------------------------

You can change the reference level of the categorical variable by using the
**reflevel** statement, as shown below. All you need to do is list the
variable and indicate which value should be used as the reference level.
You can specify more than one variable on this statement if you need to set the
reference values for more than one variable. Please see our FAQ on
using categorical variables in regression analyses for more details and for
examples that use only some of the categories of a variable.

    proc regress data=chis filetype=sas design = jackknife;
      weight rakedw0;
      jackwgts rakedw1--rakedw80 / adjjack=1;
      model ae13 = ae14 srsex;
      subgroup srsex;
      reflevel srsex = 1;
      levels 2;
    run;

    Number of observations read       : 55428    Weighted count: 23847415
    Observations used in the analysis : 32538    Weighted count: 13783845
    Denominator degrees of freedom    : 80

    Maximum number of estimable parameters for the model is 3
    Weighted mean response is 2.188590

    Multiple R-Square for the dependent variable AE13: 0.259603

    Variance Estimation Method: Replicate Weight Jackknife
    Working Correlations: Independent
    Link Function: Identity
    Response variable AE13: Number of drinks on the days drinking alcohol

    ----------------------------------------------------------------------
    Independent                                                  P-value
      Variables and          Beta                                T-Test
      Effects                Coeff.     SE Beta    T-Test B=0    B=0
    ----------------------------------------------------------------------
    Intercept                  2.13        0.02        101.59    0.0000
    Number of times having
      5 or more drinks in
      past month               0.32        0.01         24.90    0.0000
    Self-reported gender
      MALE                     0.00        0.00           .       .
      FEMALE                  -0.52        0.03        -19.77    0.0000
    ----------------------------------------------------------------------

    -------------------------------------------------------
    Contrast                 Degrees of               P-value
                             Freedom       Wald F     Wald F
    -------------------------------------------------------
    OVERALL MODEL                 3      10053.93     0.0000
    MODEL MINUS INTERCEPT         2        528.93     0.0000
    INTERCEPT                     .           .        .
    AE14                          1        619.78     0.0000
    SRSEX                         1        390.70     0.0000
    -------------------------------------------------------
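To illustrate setting the reference level for more than one variable, here is a sketch that adds **racehpra** to the model. It has not been run for this seminar, and it rests on two assumptions: that reference levels for several variables are listed on one **reflevel** statement separated by spaces, and that **racehpra** = 6 is the WHITE category (inferred from the order of the columns in the **proc descript** output above; check this with **proc freq** before relying on it).

```sas
/* Sketch only: reference levels for two categorical variables.        */
/* Assumptions: srsex = 1 is MALE (as in the output above), and        */
/* racehpra = 6 is WHITE (inferred from the column order, unverified). */
proc regress data=chis filetype=sas design=jackknife;
  weight rakedw0;
  jackwgts rakedw1--rakedw80 / adjjack=1;
  model ae13 = ae14 srsex racehpra;
  subgroup srsex racehpra;
  levels 2 7;
  reflevel srsex=1 racehpra=6;
run;
```

As before, the numbers on the **levels** statement follow the order of the variables on the **subgroup** statement.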

Perhaps now is a good time to talk about subpopulations. Many times
researchers are interested only in a certain subpopulation, or group, of
respondents. For example, you may be interested only in females, or only in
people who call themselves white, or in white females. In order to limit your
analysis to just these folks, you might be tempted to use a SAS data step with a
subsetting **if** statement to create a smaller data set with just the individuals
of interest. DO NOT DO THIS!!!!! Dropping cases removes information about the
sample design that is needed to compute correct standard errors. Instead, use the **subpopn**
statement with the intact (complete, whole) data set. We have a FAQ on
how to use the subpopn statement,
where we list references documenting the concerns with incorrectly subsetting
your data set and the problems that this can cause. Happily, using the
**subpopn** statement is really easy, and it requires much less effort than
creating a new data set. Let’s suppose that we wanted to run our first
regression, but only with female respondents. First, let’s be very clear
about how the data are coded. We will use two **proc freq**s to do
this. In the first one, we see the labels for gender, and in the second
one we see the numbers with which the levels are actually coded.

    proc freq data = chis;
      tables srsex;
    run;

    The FREQ Procedure

                                       Cumulative    Cumulative
    SRSEX     Frequency     Percent     Frequency      Percent
    -----------------------------------------------------------
    MALE          23002       41.50         23002        41.50
    FEMALE        32426       58.50         55428       100.00

    proc freq data = chis;
      tables srsex;
      format srsex;
    run;

    The FREQ Procedure

                                       Cumulative    Cumulative
    SRSEX     Frequency     Percent     Frequency      Percent
    ----------------------------------------------------------
        1         23002       41.50         23002        41.50
        2         32426       58.50         55428       100.00

Now that we are certain that the females are coded as 2, we can use that
value on our **subpopn** statement. Also note the line in the output
("For Subpopulation: SRSEX = 2") that tells you for which subpopulation the analysis was done.

    proc regress data=chis filetype=sas design = jackknife;
      weight rakedw0;
      jackwgts rakedw1--rakedw80 / adjjack=1;
      model ae13 = ae14;
      subpopn srsex = 2;
    run;

    Number of observations read       : 55428    Weighted count: 23847415
    Observations in subpopulation     : 32426    Weighted count: 12215687
    Observations used in the analysis : 17097    Weighted count:  6104293
    Denominator degrees of freedom    : 80

    Maximum number of estimable parameters for the model is 2
    Weighted mean response is 1.720202

    Multiple R-Square for the dependent variable AE13: 0.162257

    Variance Estimation Method: Replicate Weight Jackknife
    Working Correlations: Independent
    Link Function: Identity
    Response variable AE13: AE13
    For Subpopulation: SRSEX = 2

    ----------------------------------------------------------------------
    Independent                                                  P-value
      Variables and          Beta                                T-Test
      Effects                Coeff.     SE Beta    T-Test B=0    B=0
    ----------------------------------------------------------------------
    Intercept                  1.59        0.01        109.96    0.0000
    AE14                       0.37        0.03         13.80    0.0000
    ----------------------------------------------------------------------

    -------------------------------------------------------
    Contrast                 Degrees of               P-value
                             Freedom       Wald F     Wald F
    -------------------------------------------------------
    OVERALL MODEL                 2       7841.08     0.0000
    MODEL MINUS INTERCEPT         1        190.51     0.0000
    INTERCEPT                     1      12091.73     0.0000
    AE14                          1        190.51     0.0000
    -------------------------------------------------------
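For a subpopulation defined by more than one variable, such as the white females mentioned earlier, a sketch might look like the following. It has not been run for this seminar. It assumes that the **subpopn** statement accepts a compound condition joined with **and**, and that **racehpra** = 6 is the WHITE category (inferred from the column order in the **proc descript** output; verify it with **proc freq** as we did for **srsex**).

```sas
/* Sketch only: restrict the analysis to white females without        */
/* subsetting the data set.                                            */
/* Assumptions: srsex = 2 is FEMALE (confirmed by proc freq above);    */
/* racehpra = 6 is WHITE (inferred, unverified); compound conditions   */
/* with "and" are accepted on the subpopn statement.                   */
proc regress data=chis filetype=sas design=jackknife;
  weight rakedw0;
  jackwgts rakedw1--rakedw80 / adjjack=1;
  model ae13 = ae14;
  subpopn srsex = 2 and racehpra = 6;
run;
```

The full data set is still read, so the design information for all respondents remains available for variance estimation.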

While we will not cover interactions here, you can see our FAQ on how to create interaction terms in SUDAAN. Of course, you can always create the terms that you need in a SAS data step and then use them in your SUDAAN code.

Finally, let’s try a logistic regression. Because we are using
SAS-callable SUDAAN, we need to use an alias for **proc logistic** (so that SAS
knows to call the SUDAAN version of the command instead of the SAS version).
The alias is **proc rlogist**. Remember that the dependent variable in
**proc rlogist** must be coded 0/1, but two-level categorical independent
variables cannot be coded 0/1 (use 1/2 instead). We are going to use **ae9** as
our dependent variable, which indicates if the respondent has taken a vitamin or
dietary supplement in the past month. We will use **proc freq** to see
how this variable is coded. From the first output, we see that the values
are labeled "yes" and "no". In the second **proc freq**, we will use
the **format** statement to suppress these labels, showing us that the variable is
coded 1/2.

    proc freq data = chis;
      tables ae9;
    run;

    The FREQ Procedure

                                     Cumulative    Cumulative
    AE9     Frequency     Percent     Frequency      Percent
    --------------------------------------------------------------------
    YES         34110       61.59         34110        61.59
    NO          21271       38.41         55381       100.00

    Frequency Missing = 47

    proc freq data = chis;
      tables ae9;
      format ae9;
    run;

    The FREQ Procedure

                                     Cumulative    Cumulative
    AE9     Frequency     Percent     Frequency      Percent
    --------------------------------------------------------
      1         34110       61.59         34110        61.59
      2         21271       38.41         55381       100.00

    Frequency Missing = 47

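Now that we are certain how the data are coded, we can write a little data step to change the coding to 0/1. Note that subtracting 1 turns "yes" (1) into 0 and "no" (2) into 1, so a value of 1 on the new variable means that the respondent did not take a vitamin or supplement. After that, we are ready to run the logistic regression.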

    data chis1;
      set chis;
      ae91 = ae9 - 1;
    run;

    proc rlogist data=chis1 filetype=sas design = jackknife;
      weight rakedw0;
      jackwgts rakedw1--rakedw80 / adjjack=1;
      model ae91 = ae14;
    run;

    Number of zero responses     : 21202
    Number of non-zero responses : 11843

    Independence parameters have converged in 5 iterations

    Number of observations read       : 55428    Weighted count: 23847415
    Observations used in the analysis : 33045    Weighted count: 13995933
    Denominator degrees of freedom    : 80

    Maximum number of estimable parameters for the model is 2

    Sample and Population Counts for Response Variable AE91
    0: Sample Count 21202    Population Count 8418183
    1: Sample Count 11843    Population Count 5577750

    R-Square for dependent variable AE91 (Cox & Snell, 1989): 0.003526

    -2 * Normalized Log-Likelihood with Intercepts Only : 44439.56
    -2 * Normalized Log-Likelihood Full Model           : 44322.82
    Approximate Chi-Square (-2 * Log-L Ratio)           : 116.73
    Degrees of Freedom                                  : 1

    Note: The approximate Chi-Square is not adjusted for clustering.
          Refer to hypothesis test table for adjusted test.

    Variance Estimation Method: Replicate Weight Jackknife
    Working Correlations: Independent
    Link Function: Logit
    Response variable AE91: AE91

    ----------------------------------------------------------------------
    Independent                                                  P-value
      Variables and          Beta                                T-Test
      Effects                Coeff.     SE Beta    T-Test B=0    B=0
    ----------------------------------------------------------------------
    Intercept                 -0.45        0.02        -29.76    0.0000
    AE14                       0.04        0.01          7.07    0.0000
    ----------------------------------------------------------------------

    Variance Estimation Method: Replicate Weight Jackknife
    Working Correlations: Independent
    Link Function: Logit
    Response variable AE91: AE91

    -------------------------------------------------------
    Contrast                 Degrees of               P-value
                             Freedom       Wald F     Wald F
    -------------------------------------------------------
    OVERALL MODEL                 2        442.74     0.0000
    MODEL MINUS INTERCEPT         1         49.97     0.0000
    INTERCEPT                     1        885.38     0.0000
    AE14                          1         49.97     0.0000
    -------------------------------------------------------

    -----------------------------------------------------------
    Independent
      Variables and                     Lower 95%    Upper 95%
      Effects            Odds Ratio     Limit OR     Limit OR
    -----------------------------------------------------------
    Intercept                  0.64         0.62         0.66
    AE14                       1.04         1.03         1.06
    -----------------------------------------------------------

As we can see from the output, we are given both the coefficients and the odds ratios by default. We also get all of the standard output that you expect when you run a logistic regression.
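Because the dependent variable above was created by subtracting 1 from **ae9**, a value of 1 corresponds to "no". If you would rather have 1 mean "yes", one possible recoding is sketched below. The data set name **chis2** and the variable name **ae9yes** are hypothetical, and this version has not been run for this seminar.

```sas
/* Sketch only: chis2 and ae9yes are illustrative names.                   */
/* ae9 is coded 1 = YES, 2 = NO (see the proc freq output above).          */
data chis2;
  set chis;
  if ae9 = 1 then ae9yes = 1;        /* YES -> 1                           */
  else if ae9 = 2 then ae9yes = 0;   /* NO  -> 0                           */
  /* Any other value (including missing) leaves ae9yes missing.            */
run;

proc rlogist data=chis2 filetype=sas design=jackknife;
  weight rakedw0;
  jackwgts rakedw1--rakedw80 / adjjack=1;
  model ae9yes = ae14;
run;
```

With this coding the coefficients should simply change sign relative to the model above, since the two outcomes have swapped roles.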

Lastly, let’s talk about missing values in data sets. Below is some SAS code that we have written to change all of the missing-value codes in the numerical variables in the data set into missing values that SUDAAN can understand. Remember that when we got the CHIS data set, missing values were coded as -9, -8, etc., which SUDAAN (and SAS) interpret as valid values. These need to be changed so that the program understands that they are really missing values. For more information on missing values in SAS (and an explanation of .a, .b, etc.), please see our FAQ on how to code missing data in SAS.

    data chis;
      set "D:CHIS DataCHIS2001_PUFA2_082802";
      /* put every numeric variable in the data set into one array */
      array allnum(*) _numeric_;
      /* replace each negative missing-value code with a SAS special missing value */
      do i = 1 to dim(allnum);
        if allnum(i) = -1 then allnum(i) = .a;
        else if allnum(i) = -2 then allnum(i) = .b;
        else if allnum(i) = -3 then allnum(i) = .c;
        else if allnum(i) = -4 then allnum(i) = .d;
        else if allnum(i) = -5 then allnum(i) = .e;
        else if allnum(i) = -6 then allnum(i) = .f;
        else if allnum(i) = -7 then allnum(i) = .g;
        else if allnum(i) = -8 then allnum(i) = .h;
        else if allnum(i) = -9 then allnum(i) = .i;
        else if allnum(i) = -10 then allnum(i) = .j;
        else if allnum(i) = -11 then allnum(i) = .k;
      end;
      drop i;
    run;
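After running a recoding step like the one above, it can be worth checking that no missing-value codes remain. The following is a minimal sketch of such a check using **proc means**; it assumes that the recoded data set is the temporary data set **chis** and that negative values in this file only ever stand for missing data.

```sas
/* Sketch only: list, for every numeric variable, the number of        */
/* non-missing and missing values and the minimum and maximum.         */
/* If the recoding worked, no minimum should be one of the codes       */
/* -1 through -11.                                                     */
proc means data=chis n nmiss min max;
  var _numeric_;
run;
```

A variable whose minimum is still negative deserves a look in the codebook before you use it in SUDAAN.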