Types of weights
There are several types of weights that you might find or create in a data set.
- probability weights – Probability weights are perhaps the most common type of weight. They reflect the probability that a case (or subject) was selected into the sample from a population, and they are calculated as the inverse of the sampling fraction, so each weight can be read as the number of population units that the sampled case represents. For example, if you have a population of 10 widgets and you select 3 into your sample, your sampling fraction would be 3/10 and your probability weight would be 10/3 ≈ 3.33. You often find this type of weight in complex survey data.
- frequency weights – Frequency weights are whole numbers (i.e., integers) that tell the software how many identical cases each row of data represents. They are a kind of shortcut: if you have five rows of data that are identical, you can enter the row once with a frequency weight of 5 and spare yourself having to input the same row five times.
- analytic weights – Analytic weights are used when each case is actually an average. If the averages are based on different numbers of observations (for example, some averages are based on three observations and others on 30), some cases are measured with more precision than others: the more measurements that go into an average, the more precise that average will be. Analytic weights are proportional to the inverse of the variance of each average, so the more precisely measured cases receive higher weights than the less precisely measured cases.
- importance weights – Importance weights are just what you think they should be – they are weights that indicate how “important” a case is. There is no standard way of calculating this type of weight.
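The arithmetic behind the first three weight types can be sketched in a few lines of Python. This is only an illustration, not how any particular statistics package implements weighting: the function names (`probability_weight`, `weighted_mean`) and all data values are hypothetical, and the analytic-weight example assumes every underlying observation has the same variance, so that the variance of an average is inversely proportional to the number of observations behind it.

```python
from fractions import Fraction
from statistics import mean


def probability_weight(sample_size, population_size):
    """Inverse of the sampling fraction (hypothetical helper)."""
    sampling_fraction = Fraction(sample_size, population_size)
    return 1 / sampling_fraction


def weighted_mean(values, weights):
    """Weighted mean: sum of value * weight divided by sum of weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Probability weight: 3 widgets sampled from a population of 10.
pw = probability_weight(3, 10)
print(pw, round(float(pw), 2))

# Frequency weight: one row with weight 5 stands in for five identical rows.
expanded = [2.0, 2.0, 2.0, 2.0, 2.0, 7.0]   # every row typed out
compact_values = [2.0, 7.0]                  # distinct rows only
compact_freqs = [5, 1]                       # how many rows each represents
print(mean(expanded), weighted_mean(compact_values, compact_freqs))

# Analytic weight: each case is an average of n observations.
# Assumption: equal per-observation variance, so weight is proportional to n.
group_means = [4.0, 6.0]
group_sizes = [3, 30]
print(weighted_mean(group_means, group_sizes))
```

In the frequency-weight part, the weighted mean of the compact data matches the plain mean of the expanded data. In the analytic-weight part, the average based on 30 observations pulls the result much closer to 6 than to 4, which is the intended effect of giving more precise cases more weight.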
In SAS
You need to read the documentation for the procedure (proc) that you are using to determine what kind of weight should be used with the weight statement. The weight statement used in one proc might assume frequency weights while another assumes probability weights. If you cannot tell from the documentation which type of weight will be used, you will either need to do some experimenting or contact SAS Technical Support.
In SPSS
All of the SPSS modules recognize only frequency weights, except the Complex Samples module, which recognizes sampling weights (also known as probability weights). If you weight your data with a different type of weight, SPSS may not issue an error message; however, you should be sure that you really want your weights to be treated as frequency weights. Note that if you specify probability weights with a weight command, in some procedures SPSS will round the values of the weights to the nearest whole number and use them as frequency weights. In some cases, the nearest whole number may be zero, in which case you may get a message about this in your output. One exception is the crosstabs procedure: if you click on the “Cells” button or use the count subcommand, you can choose between having the values of the weights rounded or truncated. You can learn more about weights in SPSS by reading the section on the WEIGHT command in the SPSS Command Syntax Reference.
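To see why treating fractional weights as frequency weights matters, the following Python sketch converts a few made-up weights to whole numbers by rounding and by truncating. It is not SPSS code and does not claim to reproduce SPSS's internal behavior: the function names are hypothetical, and the tie rule (values ending in .5 round up) is an assumption made for the illustration.

```python
import math


def round_half_up(weight):
    """Round to the nearest whole number; ties go up (assumed tie rule)."""
    return math.floor(weight + 0.5)


def as_frequency_weights(weights, mode="round"):
    """Convert fractional weights to whole numbers by rounding or truncating."""
    if mode == "round":
        return [round_half_up(w) for w in weights]
    if mode == "truncate":
        return [math.trunc(w) for w in weights]
    raise ValueError("mode must be 'round' or 'truncate'")


# Made-up fractional (probability-style) weights.
weights = [3.33, 0.4, 1.5, 2.7]

rounded = as_frequency_weights(weights, mode="round")
truncated = as_frequency_weights(weights, mode="truncate")

print(rounded, sum(rounded))      # whole-number weights after rounding
print(truncated, sum(truncated))  # whole-number weights after truncating
print(sum(weights))               # total of the original weights
```

The case with weight 0.4 ends up with a weight of zero under either rule, so it would contribute nothing to the analysis, and the two rules give different weighted totals for the same data.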
In Stata
Stata recognizes all four types of weights mentioned above. You specify which type of weight you have by adding a weight expression in square brackets to a command, for example [pweight=varname], [fweight=varname], [aweight=varname] or [iweight=varname]. Note that not all commands recognize all types of weights. If you use the svyset command, the weight that you specify must be a probability weight. You can find out more about using weights in Stata by typing help weight.