Reading EBCDIC data is one of the trickiest and most laborious problems that we see. Please feel free to come by consulting (see https://stats.idre.ucla.edu/stat/ and look under consulting services), and we would be pleased to work with you on this.
The most critical part of solving this is having a codebook that tells you the “record length” of each record, the column positions for each variable, and, ideally, how each variable is stored (e.g., as packed decimal or just as regular data). The file can then be read in SAS using “informats” that indicate the type of data you are reading. Below is a baby example that reads a file that has a “record length” of 448, and we are reading 3 variables:
- zip code, starting at column 1 for a length of 5, stored as regular EBCDIC data (indicated by the $ebcdic5. informat)
- extended zip code, starting at column 6 for a length of 3, stored as packed decimal (indicated by the s370FPD3. informat)
- first name, starting at column 24 for a length of 13, stored as regular EBCDIC data (indicated by the $ebcdic13. informat)
filename in 'small.txt' ;

data test;
  infile in lrecl=448 recfm=f;
  input @1  zip   $ebcdic5.
        @6  zip2  s370FPD3.
        @24 fname $ebcdic13. ;
run;

proc freq data=test;
  tables zip zip2 fname ;
run;
The informats relevant to EBCDIC data are “$ebcdic” and the “S370…” informats.
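To see what these informats do with individual bytes, here is a small sketch that does not need a data file. It is only an illustration: the hex values and the variable names (word, packed, zoned) are made up for this example, and it assumes you are running SAS on an ASCII system (such as Windows or Unix), where the $ebcdic informat converts the characters to ASCII.

```sas
data _null_;
  /* C8 C5 D3 D3 D6 are the EBCDIC codes for the letters H E L L O */
  word = input('C8C5D3D3D6'x, $ebcdic5.);

  /* packed decimal: two digits per byte, and the last half-byte is the sign (C = positive) */
  packed = input('12345C'x, s370fpd3.);

  /* zoned decimal: one digit per byte, so F1 F2 F3 holds the digits 1 2 3 */
  zoned = input('F1F2F3'x, s370fzd3.);

  /* write the converted values to the log */
  put word= packed= zoned=;
run;
```

The log should show word=HELLO packed=12345 zoned=123. Notice that the packed field holds five digits in only 3 bytes, which is why the width in the informat has to be the number of bytes given in the codebook and not the number of digits in the value.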