In this example, we show how to read a binary file into Stata. The original binary file can be downloaded by following the link. The zip file here contains the data file, the PDF codebook, and the Stata code example, in case the links above are not available.
The data file begins with 3520 bytes of header information in ASCII. Here is the beginning of it.
CCSD3ZF0000100000001CCSD3VS00006PRODUCER
Product_File_Name = JA1_IGD_2PcP243_093;
Producer_Agency_Name = CNES;
Processing_Center = SSALTO;
File_Data_Type = IGDR;
Reference_Document = SMM-ST-M-EA-10879-CN Issue 4.0;
Reference_Software = CMAV9.2_01/G5OS5;
Operating_System = SunOS 5.9;
Product_Creation_Time = 2008-08-20T13:57:26.000000;
CCSD$$MARKERPRODUCERCCSD3KS00006PASSFILE
Mission_Name = Jason-1;
Altimeter_Sensor_Name = POSEIDON-2;
Radiometer_Sensor_Name = JMR;
DORIS_Sensor_Name = DORIS-2 GM;
Acquisition_Station_Name = JTCCS ;
Cycle_Number = 243;
Absolute_Revolution_Number = 30781;
Pass_Number = 93;
Absolute_Pass_Number = 61561;
Equator_Time = 2008-08-14T09:54:37.743000;
Equator_Longitude = +235.99<deg>;
First_Measurement_Time = 2008-08-14T10:00:01.008141;
Last_Measurement_Time = 2008-08-14T10:22:43.766073;
First_Measurement_Latitude = +15.81<deg>;
Last_Measurement_Latitude = +66.15<deg>;
First_Measurement_Longitude = +241.82<deg>;
Last_Measurement_Longitude = +318.85<deg>;
Pass_Data_Count = 765;
Ocean_Pass_Data_Count = 483;
Ocean_PCD = 0<%>;
Time_Epoch = 1958-01-01T00:00:00.000000;
The header includes information such as the operating system used, the number of observations, and the time of the first measurement. The data set contains many variables; in this example, we read only the first eight. This gives us a chance to demonstrate the file seek command, which we use to skip over the rest of each record. Here is the code for reading the data into Stata.
clear

/********************************************************
A couple of numbers here:
The total number of bytes in this file is 340120.
3520 is the number of bytes in the header.
The number of bytes for the scientific records is then
340120 - 3520 = 336600 for this file.
There are a total of 765 records, from the header information:
Pass_Data_Count = 765
leading to 336600 / 765 = 440 bytes per record.
********************************************************/

set obs 765
gen day = .
gen long time_sec = .
gen long time_ms = .
gen long latitude = .
gen long longitude = .
gen byte surface_type = .
gen byte alt_echo_type = .
gen byte rad_surf_type = .

file open t using test1.dat, read binary
* the file stores the most significant byte first
file set t byteorder hilo
* skip the 3520-byte ASCII header
file seek t 3520

tempname word1
quietly foreach i of numlist 1/765 {
    file read t %4bu `word1'
    replace day = `word1' in `i'
    file read t %4bu `word1'
    replace time_sec = `word1' in `i'
    file read t %4bu `word1'
    replace time_ms = `word1' in `i'
    * latitude is read as a signed integer; the other fields are unsigned
    file read t %4b `word1'
    replace latitude = `word1' in `i'
    file read t %4bu `word1'
    replace longitude = `word1' in `i'
    file read t %1bu `word1'
    replace surface_type = `word1' in `i'
    file read t %1bu `word1'
    replace alt_echo_type = `word1' in `i'
    file read t %1bu `word1'
    replace rad_surf_type = `word1' in `i'
    * jump to the start of the next record: header plus i records of 440 bytes
    local a = 440*`i' + 3520
    file seek t `a'
}
file close t

* day counts from the file's epoch (1jan1958); shift it to Stata's epoch (1jan1960)
gen date = day - (td(01jan1960) - td(01jan1958))
format date %td

clist date time_sec time_ms latitude longitude surface_type in 1/10

          date   time_sec   time_ms   latitude   longitude   surfac~e
  1. 14aug2008      36001      8141   15806591   241819737          0
  2. 14aug2008      36002     27717   15856153   241839314          0
  3. 14aug2008      36003     47293   15905712   241858902          0
  4. 14aug2008      36004     66870   15955268   241878502          0
  5. 14aug2008      36005     86444   16004821   241898113          0
  6. 14aug2008      36006    106023   16054371   241917737          0
  7. 14aug2008      36007    125598   16103918   241937372          0
  8. 14aug2008      36008    145174   16153462   241957019          0
  9. 14aug2008      36009    164750   16203004   241976678          0
 10. 14aug2008      36010    184327   16252542   241996348          0
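The raw integers in the listing can be checked against the header. The latitude in the first record, 15806591, lines up with First_Measurement_Latitude = +15.81<deg>, and 36001 seconds together with 8141 lines up with First_Measurement_Time = 2008-08-14T10:00:01.008141. The sketch below assumes, on that basis only and not from the codebook, that latitude and longitude are stored in units of 1e-6 degree and that time_ms holds microseconds within the second. It re-enters the first three listed rows by hand so that it runs on its own; the variable names lat_deg, lon_deg, and sec_of_day are hypothetical.

```stata
* Sketch only: the scale factors are inferred from the header, not from the codebook.
clear
* long is needed here: a float cannot hold a 9-digit integer exactly
input long time_sec long time_ms long latitude long longitude
36001  8141 15806591 241819737
36002 27717 15856153 241839314
36003 47293 15905712 241858902
end

* Assumed units: 1e-6 degree for latitude and longitude
gen double lat_deg = latitude / 1e6
gen double lon_deg = longitude / 1e6

* Assumed units: time_ms is microseconds within the second
gen double sec_of_day = time_sec + time_ms / 1e6

format lat_deg lon_deg %10.6f
format sec_of_day %12.6f
list lat_deg lon_deg sec_of_day, clean
```

For the first row this gives 15.806591 degrees latitude, 241.819737 degrees longitude, and 36001.008141 seconds after midnight, which agrees with the header values. The new variables are stored as double because the default float type would not keep all of the digits.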