-------------------------------------------------------------------------------
help for cxi
-------------------------------------------------------------------------------
Centered interaction expansion
cxi [, prefix(string)] term(s)
cxi [, prefix(string)] : any_stata_command varlist_with_terms ...
where a term is of the form:
    i.varname               or  I.varname
    i.varname1*i.varname2       I.varname1*I.varname2
    i.varname1*varname3         I.varname1*varname3
    i.varname1|varname3         I.varname1|varname3
varname, varname1, and varname2 denote categorical variables and may be numeric or string. varname3 denotes a continuous, numeric variable.
Description
cxi is a modified version of xi in which the dummy coded variables are centered. cxi expands terms containing categorical variables into centered indicator (also called centered dummy) variable sets by creating new variables, and, in the second syntax, executes the specified command with the expanded terms.
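As a rough illustration of what a centered indicator looks like, the sketch below builds ordinary dummies with official xi and then centers them by hand. It assumes that "centered" means each dummy has its sample mean subtracted, which this help file does not spell out, so cxi itself may differ in detail. The c_I* variable names are made up for the example, and auto.dta is the example dataset shipped with Stata.

```stata
* Sketch only: assumes centering = dummy minus its sample mean.
sysuse auto, clear

* Official xi creates the uncentered dummies _Irep78_2 ... _Irep78_5.
xi i.rep78

* Hand-center each dummy into a new double variable (hypothetical names).
foreach v of varlist _Irep78_* {
    quietly summarize `v'
    generate double c`v' = `v' - r(mean)
}

* Each centered copy should now have a mean of (approximately) zero.
summarize c_Irep78_*
```

Under this assumption, the centered variables carry the same group contrasts as the original dummies, only shifted to have mean zero.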
Terms using the i.varname syntax are interpreted as follows:
i.varname creates dummies for categorical variable varname.
i.varname1*i.varname2 creates dummies for categorical variables varname1 and varname2: main effects and interactions.
i.varname1*varname3 creates dummies for categorical variable varname1 and includes continuous variable varname3: all interactions and main effects.
i.varname1|varname3 creates dummies for categorical variable varname1 and includes continuous variable varname3: all interactions and main effect of varname3, but not main effect of varname1.
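A minimal sketch of the difference between the * and | operators, using cxi as a stand-alone command. It assumes cxi is installed and uses auto.dta (shipped with Stata) purely as convenient example data; the choice of foreign and weight is illustrative.

```stata
* Assumes cxi is installed; auto.dta ships with Stata.
sysuse auto, clear

* The * form: creates the dummy for foreign and its interaction with weight.
cxi i.foreign*weight
describe _I*

* The | form: creates only the interaction with weight, no dummy for foreign.
* (This call first drops the _I* variables created by the previous call.)
cxi i.foreign|weight
describe _I*
```

Comparing the two describe listings shows which variables each form leaves in the dataset.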
Option
prefix(string) allows users to choose a prefix other than _I for the newly created interaction variables. The length of the prefix is restricted to 4 characters. By default, cxi will create interaction variables starting with the prefix _I.
When you use cxi, it first drops all previously created interaction variables starting with the prefix specified in the prefix() option (which is _I by default). Only those variables originally created by cxi are deleted, even if there are other variables, created in some other way, that start with the same prefix.
If you want to keep the variables created from a previous use of cxi, specify a different prefix in the prefix() option of subsequent uses of cxi.
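For instance, two sets of expanded variables can be kept side by side by giving the second call its own prefix. This is a sketch that assumes cxi is installed; the prefix _J and the use of auto.dta are arbitrary choices for illustration.

```stata
* Assumes cxi is installed; auto.dta ships with Stata.
sysuse auto, clear

* First call uses the default prefix and creates _Irep78_*.
cxi i.rep78

* Second call uses prefix _J, so only old _J* variables would be dropped.
cxi, prefix(_J) i.foreign

* Both sets should now be present.
describe _I* _J*
```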
Examples
. cxi: logistic outcome weight i.agegrp bp
. cxi: logistic outcome weight bp i.agegrp i.race
. cxi: logistic outcome weight bp i.agegrp*i.race
. cxi: logistic outcome bp i.agegrp*weight i.race
Summary of i.varname
1. varname may be string or numeric.
2. Indicator (dummy) variables are created automatically.
3. By default, the dummy-variable set is identified by dropping the dummy corresponding to the smallest value of the variable (how to specify otherwise is discussed below).
4. The new dummy variables are left in your dataset. By default, the names of the new dummy variables start with _I, so you can drop them by typing "drop _I*". You do not have to do this: each time you use cxi, any previously created dummies with the same prefix as the one specified in the prefix() option (_I by default) are dropped and new ones are created.
5. The new dummy variables have variable labels so you can determine to what they correspond by typing "describe" or "describe _I*"; see help describe.
6. cxi may be used with any Stata command (not just logistic).
Summary of controlling the omitted dummy
i.varname omits the first group by default but if you define
char _dta[omit] "prevalent"
then the default behavior changes to that of dropping the most prevalent group. You can restore the default behavior by typing
char _dta[omit]
Either way, if you define a variable characteristic of the form
char varname[omit] #
or, if varname is a string,
char varname[omit] "string_literal"
then the specified value will be omitted.
Examples:
. char agegrp[omit] 1
. char race[omit] "White"      (for race a string variable)
. char agegrp[omit]            (to restore default)
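The following sketch shows the omit characteristic in use from start to finish. It assumes cxi is installed and uses rep78 from auto.dta (shipped with Stata) as a stand-in categorical variable; the omitted value 3 is an arbitrary choice.

```stata
* Assumes cxi is installed; auto.dta ships with Stata.
sysuse auto, clear

* Ask for the group rep78 == 3 to be the omitted category.
char rep78[omit] 3
cxi i.rep78

* The listing should show no variable for group 3.
describe _Irep78*

* Clear the characteristic to restore the default behavior.
char rep78[omit]
```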
How cxi names variables
The names cxi assigns to the centered dummy variables it creates are of the form:
<prefix><stub>_<groupid>
By default, the prefix is _I:
_I<stub>_<groupid>
You may subsequently refer to the entire set of variables by <prefix><stub>*.
For example:
        name          =  _I  +  <stub>   +  _  +  <groupid>    Entire set
        --------------------------------------------------------------
        _Iagegrp_1       _I     agegrp      _     1            _Iagegrp*
        _Iagegrp_2       _I     agegrp      _     2            _Iagegrp*
        _IageXwgt_1      _I     ageXwgt     _     1            _IageXwgt*
        _IageXrac_1_2    _I     ageXrac     _     1_2          _IageXrac*
        _IageXrac_2_1    _I     ageXrac     _     2_1          _IageXrac*
cxi as a command rather than a command prefix
cxi can be used as a command prefix or as a command by itself. In the latter form, cxi merely creates the indicator and interaction variables. Typing
. cxi: regress y i.agegrp*wgt
is equivalent to typing
. cxi i.agegrp*wgt
i.agegrp          _Iagegrp_1-4        (naturally coded; _Iagegrp_1 omitted)
i.agegrp*wgt      _IageXwgt_1-4       (coded as above)
. regress y _Iagegrp* _IageXwgt*
Warnings
- cxi creates new variables in your data; most are doubles but interactions with continuous variables will have the storage type of the underlying continuous variable; see help datatypes.
- when using cxi with an estimation command, you may get the message "matsize too small". If so, see help matsize.
Acknowledgements
With the exception of about three lines of code, the cxi ado program is identical to Stata's xi command. I gratefully acknowledge the work of Stata's wonderful programmers.
Author
Philip B. Ender
UCLA Department of Education
UCLA Academic Technology Services
enderatucla.edu