This module will show how to create labels for your data. Stata allows you to label your data file (data label), to label the variables within your data file (variable labels), and to label the values for your variables (value labels). Let’s use a file called autolab that does not have any labels.
use https://stats.idre.ucla.edu/stat/stata/modules/autolab.dta, clear
Let’s use the describe command to verify that indeed this file does not have any labels.
describe

Contains data from /stata/modules/autolab.dta
  obs:            74                          1978 Automobile Data
 vars:            12                          23 Oct 2008 13:36
 size:         3,478 (99.9% of memory free)   (_dta has notes)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
make            str18  %-18s
price           int    %8.0gc
mpg             int    %8.0g
rep78           int    %8.0g
headroom        float  %6.1f
trunk           int    %8.0g
weight          int    %8.0gc
length          int    %8.0g
turn            int    %8.0g
displacement    int    %8.0g
gear_ratio      float  %6.2f
foreign         byte   %8.0g
-------------------------------------------------------------------------------
Sorted by:
Let’s use the label data command to add a label describing the data file. This label can be up to 80 characters long.
label data "This file contains auto data for the year 1978"
The describe command shows that this label has been applied to the version that is currently in memory.
describe

Contains data from /stata/modules/autolab.dta
  obs:            74                          This file contains auto data for the year 1978
 vars:            12                          23 Oct 2008 13:36
 size:         3,478 (99.9% of memory free)   (_dta has notes)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
make            str18  %-18s
price           int    %8.0gc
mpg             int    %8.0g
rep78           int    %8.0g
headroom        float  %6.1f
trunk           int    %8.0g
weight          int    %8.0gc
length          int    %8.0g
turn            int    %8.0g
displacement    int    %8.0g
gear_ratio      float  %6.2f
foreign         byte   %8.0g
-------------------------------------------------------------------------------
Sorted by:
Let’s use the label variable command to assign labels to the variables rep78, price, mpg and foreign.
label variable rep78 "the repair record from 1978"
label variable price "the price of the car in 1978"
label variable mpg "the miles per gallon for the car"
label variable foreign "the origin of the car, foreign or domestic"
The describe command shows these labels have been applied to the variables.
describe

Contains data from /stata/modules/autolab.dta
  obs:            74                          This file contains auto data for the year 1978
 vars:            12                          23 Oct 2008 13:36
 size:         3,478 (99.9% of memory free)   (_dta has notes)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
make            str18  %-18s
price           int    %8.0gc                 the price of the car in 1978
mpg             int    %8.0g                  the miles per gallon for the car
rep78           int    %8.0g                  the repair record from 1978
headroom        float  %6.1f
trunk           int    %8.0g
weight          int    %8.0gc
length          int    %8.0g
turn            int    %8.0g
displacement    int    %8.0g
gear_ratio      float  %6.2f
foreign         byte   %8.0g                  the origin of the car, foreign or domestic
-------------------------------------------------------------------------------
Sorted by:
Let’s make a value label called foreignl to label the values of the variable foreign. This is a two-step process: first you define the label, and then you assign the label to the variable. The label define command below creates the value label called foreignl that associates 0 with domestic car and 1 with foreign car.
label define foreignl 0 "domestic car" 1 "foreign car"
The label values command below associates the variable foreign with the label foreignl.
label values foreign foreignl
If we use the describe command, we can see that the variable foreign has a value label called foreignl assigned to it.
describe

Contains data from /stata/modules/autolab.dta
  obs:            74                          This file contains auto data for the year 1978
 vars:            12                          23 Oct 2008 13:36
 size:         3,478 (99.9% of memory free)   (_dta has notes)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
make            str18  %-18s
price           int    %8.0gc                 the price of the car in 1978
mpg             int    %8.0g                  the miles per gallon for the car
rep78           int    %8.0g                  the repair record from 1978
headroom        float  %6.1f
trunk           int    %8.0g
weight          int    %8.0gc
length          int    %8.0g
turn            int    %8.0g
displacement    int    %8.0g
gear_ratio      float  %6.2f
foreign         byte   %12.0g      foreignl   the origin of the car, foreign or domestic
-------------------------------------------------------------------------------
Sorted by:
Now when we use the table foreign command, the output shows the labels domestic car and foreign car instead of just 0 and 1.
table foreign

-------------+-----------
the origin   |
of the car,  |
foreign or   |
domestic     |      Freq.
-------------+-----------
domestic car |         52
 foreign car |         22
-------------+-----------
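Once a value label is attached, it can be handy to check which numbers sit behind the text. The lines below are a minimal sketch, assuming the value label foreignl has been defined and assigned to foreign as described above; label list and the nolabel option of tabulate are standard Stata, and no output is shown here.

```stata
* Show the number-to-text mapping stored in the value label foreignl
label list foreignl

* Tabulate foreign with the underlying codes (0 and 1) instead of the labels
tabulate foreign, nolabel
```

The value label changes only how the values are displayed; the variable foreign still stores 0 and 1.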
Value labels are used in other commands as well. For example, below we issue the ttest mpg, by(foreign) command, and the output labels the groups as domestic and foreign (instead of 0 and 1).
ttest mpg, by(foreign)

Two-sample t test with equal variances

------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
domestic |      52    19.82692     .657777    4.743297    18.50638    21.14747
 foreign |      22    24.77273     1.40951    6.611187    21.84149    27.70396
---------+--------------------------------------------------------------------
combined |      74     21.2973    .6725511    5.785503     19.9569    22.63769
---------+--------------------------------------------------------------------
    diff |           -4.945804    1.362162               -7.661225   -2.230384
------------------------------------------------------------------------------
Degrees of freedom: 72

                  Ho: mean(domestic) - mean(foreign) = diff = 0

     Ha: diff < 0               Ha: diff ~= 0              Ha: diff > 0
       t =  -3.6308               t =  -3.6308              t =  -3.6308
   P < t =   0.0003         P > |t| =   0.0005          P > t =   0.9997
One very important note: These labels are assigned to the data that is currently in memory. To make these changes permanent, you need to save the data. When you save the data, all of the labels (data labels, variable labels, value labels) will be saved with the data file.
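As a minimal sketch of that last step, the lines below save the labeled data and read it back in. The file name autolab_labeled is only an illustrative assumption, not a file from this module; it is written to the current working directory.

```stata
* Save the data in memory, labels included, under a hypothetical file name
* (replace overwrites the file if it already exists)
save autolab_labeled, replace

* Read the saved file back in and confirm that the labels are still there
use autolab_labeled, clear
describe
```

Saving under a new name leaves the original unlabeled file untouched, which is useful if you want to repeat the steps in this module.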
Summary
Assign a label to the data file currently in memory.
label data "1978 auto data"
Assign a label to the variable foreign.
label variable foreign "the origin of the car, foreign or domestic"
Create the value label foreignl and assign it to the variable foreign.
label define foreignl 0 "domestic car" 1 "foreign car"
label values foreign foreignl