If you need a list of the files in a given directory, for example so you can loop through them, how can you get a clean list of file names? On a Windows-based PC, you can use DOS commands. You’ll need to start by accessing the command line; directions for doing that in Windows are given below. Note that if you are using Stata, you can access the command line by starting a command with “!”. In other words, to get a list of the files in the current directory you would type “! dir”.
First you’ll need to get to the command prompt. You can do this by going to:
Start -> Run -> Type in “cmd”
This will open the command window. Next I need to move into the correct directory. On my computer, the default directory is on the C: drive, but the folder whose files I want to list is on the D: drive, so the first thing I see is the prompt “C:\>”. The first command below (d:) changes to the D: drive. The second command moves to d:\mydir, which is the directory whose files I want to list. The final command asks for a listing of the directory.
d:
cd d:\mydir
dir
Now I know I’m in the right directory. The basic command to list the files in a directory and place that listing in a text file is shown below. Here dir indicates that I want a directory listing, and >..\myfile.txt indicates that I want to place that listing in a file called myfile.txt one directory above the directory I am listing. (If you do not use the “..” to place the file in the directory above the current directory, myfile.txt will be listed along with your other files, which you probably don’t want.)
dir >..\myfile.txt
If I open myfile.txt in Notepad, I see the following:
 Volume in drive D is New Volume
 Volume Serial Number is 822A-8A09
 Directory of D:\mydir
11/15/2007  03:03 PM    <DIR>          .
11/15/2007  03:03 PM    <DIR>          ..
11/15/2007  01:38 PM                 0 10oct2006.txt
11/08/2007  04:28 PM               368 11nov2007.do
11/15/2007  01:39 PM                 0 5june2007.txt
03/11/2007  10:39 AM         1,869,429 beameruserguide.pdf
08/10/2007  01:24 PM            22,016 blog - jsm 2007.doc
04/25/2007  03:07 PM           199,887 clarify.pdf
11/15/2007  01:40 PM                 0 houseplants.txt
04/25/2007  11:42 AM           371,225 Mardia 1970 - multivar skew and kurt.pdf
03/27/2007  01:18 PM           319,864 multiple imputation a primer by schafer.pdf
11/15/2007  02:49 PM                 0 output 1.txt
11/15/2007  02:49 PM                 0 output 2.txt
11/15/2007  02:49 PM                 0 output 3.txt
11/15/2007  02:49 PM                 0 output 4.txt
11/08/2007  03:59 PM             8,514 results.dta
11/15/2007  01:31 PM    <DIR>          sub1
11/15/2007  01:31 PM    <DIR>          sub2
11/14/2007  04:27 PM               952 test.txt
05/21/2007  03:23 PM         1,430,743 zelig.pdf
              18 File(s)      4,225,738 bytes
               4 Dir(s)  249,471,307,776 bytes free
This is a listing of the directory, but it’s not really what I want because there is too much extra information. I can add options to the command to get just the names of the files. Adding /b to the command causes the list to contain just the file and directory names, without the information on the number of files, their sizes, when they were last modified, etc. Adding /a-d to the command removes the names of the directories, so all we have are the file names. (There are a number of other options as well; typing help dir will list them.) Note that I don’t need to delete myfile.txt to rerun the command, because the old content will automatically be replaced with the output generated by the new command.
dir /a-d /b >..\myfile.txt
Now myfile.txt contains:
10oct2006.txt
11nov2007.do
5june2007.txt
beameruserguide.pdf
blog - jsm 2007.doc
clarify.pdf
houseplants.txt
Mardia 1970 - multivar skew and kurt.pdf
multiple imputation a primer by schafer.pdf
output 1.txt
output 2.txt
output 3.txt
output 4.txt
results.dta
test.txt
zelig.pdf
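Once the bare listing is in myfile.txt, looping through the names only requires reading the file one line at a time. The sketch below shows one way to do that in Python, which is not otherwise used on this page. The helper name read_file_list and the default encoding are assumptions made for illustration. A file written by redirecting dir uses the console's code page, so names with non-ASCII characters may need a different encoding argument.

```python
from pathlib import Path


def read_file_list(listing_path, encoding="utf-8"):
    """Return the file names stored one per line in a bare (dir /b) listing.

    read_file_list is a hypothetical helper, not part of any library.
    The encoding default is an assumption; change it if names look garbled.
    """
    text = Path(listing_path).read_text(encoding=encoding)
    # Names may contain spaces (e.g. "output 1.txt"), so each whole line is
    # treated as one name; only line endings and blank lines are dropped.
    return [line for line in text.splitlines() if line.strip()]
```

In the example above the listing was written one directory above D:\mydir, so a loop over the names might start with: for name in read_file_list(r"D:\myfile.txt"): print(name)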
Now suppose I wanted the list to contain only a certain type of file, for example, only text files with the extension .txt. To do this, I can combine a wildcard (*) with the file extension so that only files ending in .txt are listed. The command below does this.
dir *.txt /a-d /b >..\myfile.txt
myfile.txt contains:
10oct2006.txt
5june2007.txt
houseplants.txt
output 1.txt
output 2.txt
output 3.txt
output 4.txt
test.txt
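If the goal is to loop through the files from a script, the same kind of filtered list can be built without going through a text file. The following is a minimal Python sketch of that idea, not a replacement for the dir command. The helper name list_files is hypothetical, and its matching rule (compare the extension, ignoring case) is an assumption that may differ from dir's wildcard handling in edge cases.

```python
from pathlib import Path


def list_files(directory, extension=None):
    """Return the names of the files (not subdirectories) in directory.

    list_files is a hypothetical helper. If extension is given (for example
    ".txt"), only names ending in that extension are kept, ignoring case.
    """
    names = []
    for entry in Path(directory).iterdir():
        if not entry.is_file():
            continue  # skip subdirectories, much as /a-d does
        if extension is not None and entry.suffix.lower() != extension.lower():
            continue  # skip other file types, much as the *.txt pattern does
        names.append(entry.name)
    # Sort without regard to case so the order is predictable.
    return sorted(names, key=str.lower)
```

For the directory used above, list_files(r"D:\mydir", ".txt") would play the role of dir *.txt /a-d /b, and list_files(r"D:\mydir") the role of dir /a-d /b.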