Say that we have a tiny data file that contains just an ID variable, like the one below.
    input id
    123456789
    123456790
    123456791
    123456792
    123456793
    123456794
    123456795
    123456796
    end
If we list the values, they are displayed in scientific notation, so it is hard to tell them apart.
    list

               id
      1. 1.23e+08
      2. 1.23e+08
      3. 1.23e+08
      4. 1.23e+08
      5. 1.23e+08
      6. 1.23e+08
      7. 1.23e+08
      8. 1.23e+08
We can use the format command to tell Stata to display the values in a field 9 characters wide, with no digits after the decimal point, as shown below. This way we can clearly see the values for id, and we can see that the ID values were not stored properly.
    format id %9.0f
    list

                id
      1. 123456792
      2. 123456792
      3. 123456792
      4. 123456792
      5. 123456792
      6. 123456792
      7. 123456792
      8. 123456800
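If we use the describe command, we can see that Stata stored this variable with the type float, which is the default when no storage type is given on the input command. The problem is that a float can only store whole numbers with about 7 digits of accuracy (every whole number is stored exactly only up to 16,777,216), but our id values have 9 digits.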
    describe

    Contains data
      obs:             8
     vars:             1
     size:            64 (99.9% of memory free)
    -------------------------------------------------------------------------------
      1. id        float  %9.0f
    -------------------------------------------------------------------------------
    Sorted by:
         Note:  dataset has changed since last saved
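The rounding can also be checked directly, without reading any data. This is a minimal sketch that uses Stata's float() function, which rounds a number to float precision, the same rounding that happens when a value is stored in a float variable. The %12.0f display format is just a choice made here so that all the digits are shown.

```stata
* float() rounds its argument to float precision
display %12.0f float(123456789)   // shows 123456792, as in the listing above
display %12.0f float(123456796)   // shows 123456800, as in the listing above

* whole numbers are exact in a float only up to 2^24 = 16777216
display %12.0f float(16777216)    // shows 16777216
display %12.0f float(16777217)    // also shows 16777216
```

Near 123,456,789 a float can only represent multiples of 8, which is why several different IDs collapse to the same stored value.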
If you are storing an identification number (like we are), you need the values to be stored with perfect accuracy. If your variable contains just whole numbers (like our id variable) and is up to 9 digits, you can store it as a long integer, or if it can be up to 15 digits, you can store it as a double. (A double stores whole numbers exactly only up to 9,007,199,254,740,992, so some 16-digit values would be rounded.) If your identification variable is longer than that, you can store it as a string variable without any loss of precision (but you will not be able to do any numerical computations with it).
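As an illustration of the double case, here is a minimal sketch that reads a longer ID as a double. The 15-digit values are made up for this example and do not come from the data above. The %15.0f format is chosen to match their length.

```stata
clear
* hypothetical 15-digit IDs, read with the storage type double
input double id
123456789012345
123456789012346
123456789012347
end

* show all 15 digits instead of scientific notation
format id %15.0f
list
```

If the three values are listed as three distinct numbers, they were stored without rounding.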
Here is an example showing how to read the variable id as a long integer.
    input long id
    123456789
    123456790
    123456791
    123456792
    123456793
    123456794
    123456795
    123456796
    end

    format id %9.0f
    list

                id
      1. 123456789
      2. 123456790
      3. 123456791
      4. 123456792
      5. 123456793
      6. 123456794
      7. 123456795
      8. 123456796
Here is an example showing how to read the variable id as a string variable with a length of 9 (since the ID values are 9 digits long).
    input str9 id
    123456789
    123456790
    123456791
    123456792
    123456793
    123456794
    123456795
    123456796
    end

    list

                id
      1. 123456789
      2. 123456790
      3. 123456791
      4. 123456792
      5. 123456793
      6. 123456794
      7. 123456795
      8. 123456796
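If you later need a numeric copy of a string ID, the same precision issue comes back, because generate also creates a float unless told otherwise. Below is a minimal sketch, assuming the str9 variable id from the example above is in memory. The name id_num is a hypothetical new variable introduced only for illustration.

```stata
* real() converts a string to a number
* asking for long keeps all 9 digits (the default, float, would round them)
generate long id_num = real(id)

* display all 9 digits
format id_num %9.0f
list
```

Leaving out the word long here would recreate the rounding problem from the first example.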
For more information, see the Stata manual or Stata Help for datatypes.