The use command reads a Stata data file from disk and places it in memory so you can analyze and/or modify it. A data file must be read into memory before you can analyze it, much as a Word document must be opened in Word before you can work with it. Since Stata data files end with .dta, you need only type use auto and Stata knows to read in the file called auto.dta from the current directory. The example below uses sysuse, a variant of use that loads the example datasets shipped with Stata, so it finds auto.dta no matter what your working directory is.
sysuse auto
The describe command tells you information about the data that is currently sitting in memory.
describe

Contains data from auto.dta
  obs:            74
 vars:            12                          17 Feb 1999 10:49
 size:         3,108 (99.6% of memory free)
-------------------------------------------------------------------------------
  1. make      str17  %17s
  2. price     int    %9.0g
  3. mpg       byte   %9.0g
  4. rep78     byte   %9.0g
  5. hdroom    float  %9.0g
  6. trunk     byte   %9.0g
  7. weight    int    %9.0g
  8. length    int    %9.0g
  9. turn      byte   %9.0g
 10. displ     int    %9.0g
 11. gratio    float  %9.0g
 12. foreign   byte   %9.0g
-------------------------------------------------------------------------------
Sorted by:
Now that the data is in memory, we can analyze it. For example, the summarize command gives summary statistics for the data currently in memory.
summarize

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
    make |       0
   price |      74    6165.257   2949.496       3291      15906
     mpg |      74     21.2973   5.785503         12         41
   rep78 |      69    3.405797   .9899323          1          5
  hdroom |      74    2.993243   .8459948        1.5          5
   trunk |      74    13.75676   4.277404          5         23
  weight |      74    3019.459   777.1936       1760       4840
  length |      74    187.9324   22.26634        142        233
    turn |      74    39.64865   4.399354         31         51
   displ |      74    197.2973   91.83722         79        425
  gratio |      74    3.014865   .4562871       2.19       3.89
 foreign |      74    .2972973   .4601885          0          1
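As a small illustration, summarize can also be restricted to particular variables or asked for more detail. The sketch below assumes auto.dta is already in memory and uses only variables that appear in the output above.

```stata
* Summary statistics for two variables only
summarize price mpg

* Additional statistics (percentiles, skewness, kurtosis) for price
summarize price, detail
```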
Let’s make a change to the data in memory. We will compute a variable called price2 which will be double the value of price.
generate price2 = 2*price
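One way to check the new variable, sketched here on the assumption that auto.dta is in memory and price2 has just been created as above, is to list a few observations and have Stata confirm the relationship for every row.

```stata
* Show the first five cars with the original and doubled price
list make price price2 in 1/5

* assert stops with an error if the condition fails for any observation
assert price2 == 2*price
```

If assert prints nothing, the condition held for every observation.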
If we use the describe command again, we see that the variable we just created is part of the data in memory. We also see a note from Stata saying dataset has changed since last saved. Stata knows that the data in memory has changed and must be saved to avoid losing the changes. It is like editing a Word document: if you don’t save the document, for example because the computer is shut off first, any changes you made are lost.
describe

Contains data from auto.dta
  obs:            74
 vars:            13                          17 Feb 1999 10:49
 size:         3,404 (99.6% of memory free)
-------------------------------------------------------------------------------
  1. make      str17  %17s
  2. price     int    %9.0g
  3. mpg       byte   %9.0g
  4. rep78     byte   %9.0g
  5. hdroom    float  %9.0g
  6. trunk     byte   %9.0g
  7. weight    int    %9.0g
  8. length    int    %9.0g
  9. turn      byte   %9.0g
 10. displ     int    %9.0g
 11. gratio    float  %9.0g
 12. foreign   byte   %9.0g
 13. price2    float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
     Note:  dataset has changed since last saved
The save command is used to save the data in memory permanently on disk. Let’s save this data and call it auto2 (Stata will save it as auto2.dta).
save auto2
file auto2.dta saved
Let’s make another change to the dataset. We will compute a variable called price3 which will be three times the value of price.
generate price3 = 3*price
Let’s try to save this data again to auto2.
save auto2
file auto2.dta already exists
r(602);
Did you see how Stata said file auto2.dta already exists? Stata is worried that you will accidentally overwrite your data file. You need to use the replace option to tell Stata that you know that the file exists and you want to replace it.
save auto2, replace
file auto2.dta saved
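In a do-file you may want to decide explicitly whether to overwrite. The following is a minimal sketch, not the only way to do it: it assumes auto2.dta would live in the current working directory and uses confirm file to test whether it exists before choosing between save and save with replace.

```stata
* capture suppresses the error; _rc is 0 when the file exists
capture confirm file "auto2.dta"
if _rc == 0 {
    display "auto2.dta already exists; overwriting it"
    save auto2, replace
}
else {
    save auto2
}
```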
Let’s make another change to the data in memory by creating a variable called price4 that is four times the price.
generate price4 = price*4
Suppose we want to use the original auto file and we don’t care if we lose the changes we just made in memory (i.e., losing the variable price4). We can try to load the auto file again.
sysuse auto
no; data in memory would be lost
r(4);
See how Stata refused to load the file, saying no; data in memory would be lost? Stata did not want you to lose the changes that you made to the data sitting in memory. If you really want to discard the changes in memory, then you need to add the clear option, which works the same way with use and sysuse, as shown below.
sysuse auto, clear
Stata tries to protect you from losing your data by doing the following:
1. If you want to save a file over an existing file, you need to use the replace option, e.g., save auto, replace.
2. If you try to use a file and the file in memory has unsaved changes, you need to use the clear option to tell Stata that you want to discard the changes, e.g., use auto, clear.
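The two safeguards can be combined in a do-file. The sketch below is an illustration under assumptions: it relies on the c(changed) system value, which newer versions of Stata set to 1 when the data in memory have unsaved changes, and it reuses the file name auto2 from the examples above.

```stata
* Save pending changes before replacing the data in memory
if c(changed) {
    display "unsaved changes in memory; saving to auto2.dta first"
    save auto2, replace
}

* Now it is safe to discard whatever is in memory
sysuse auto, clear
```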
Before we move on to the next topic, let’s clear out the data in memory.
clear
Using files larger than 1 megabyte
When you use a data file, Stata reads the entire file into memory. In the version shown here (PC version 6.0 Intercooled), Stata by default limits the size of data in memory to 1 megabyte. Stata 12 and later allocate memory automatically, so this limit and the set memory command matter only in older versions. You can view the amount of memory that Stata has reserved for data with the memory command.
memory

Total memory                        1,048,576 bytes   100.00%
  overhead (pointers)                       0           0.00%
  data                                      0           0.00%
                                 ------------
  data + overhead                           0           0.00%
  programs, saved results, etc.         1,152           0.11%
                                 ------------
  Total                                 1,152           0.11%
  Free                              1,047,424          99.89%
If you try to use a file that exceeds the amount of memory Stata has allocated for data, Stata will give you an error message like this:
no room to add more observations
r(901);
You can increase the amount of memory that Stata has allocated to data using the set memory command. For example, if you had a data file that was 1.5 megabytes, you could set the memory to, say, 2 megabytes, as shown below.
set memory 2m
(2048k)
Once you have increased the memory, you should be able to use the data file if you have allocated enough memory for it.
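To judge how much memory a file needs before loading it, you can describe the file on disk. This is a sketch under assumptions: it uses auto2.dta saved earlier as a stand-in for a large file, and it assumes an older Stata version in which set memory takes effect and no data are in memory when it is issued.

```stata
* Report observations, variables, and size of the file without loading it
describe using auto2

* Allocate more memory than the reported size, then read the file
set memory 2m
use auto2
```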
Summary
To use the auto file from disk and read it into memory
sysuse auto
To save the file auto from memory to disk
save auto
To save a file if the file auto already exists
save auto, replace
To use the file auto and clear out the current data in memory
sysuse auto, clear
To clear out the data in memory, discarding any unsaved changes
clear
To allocate 2 megabytes of memory for data
set memory 2m
To view the allocation of memory to data and how much is used
memory