Inputting data into SPLUS

Opening SPLUS Data Files Using the File Menu
Opening SPLUS Data Files Using the openData Function
Importing Formatted Data Files Using the Import Menu
Importing Formatted Data Files Using the import.data Function
Importing Free Formatted Data Files Using the read.table Function
Importing Data Files Using the scan Function
Importing Fixed Format Files Using the read.table Function
Exporting Data Files Using the write.table Function

Opening SPLUS Data Files Using the File Menu

You can open an SPLUS data file (.sdd extension) by using the File menu. This method opens the data file in a data editor.

File 
 Open Data 
 then browse to the directory which contains the data file

Opening SPLUS Data Files Using the openData Function

You can also open an SPLUS data file with the openData function. Note that this method does not open the data file in a data editor. To see the data in a data editor, use the Data menu, select Data, and then select the data file that you want to view.

data.new

Importing Formatted Data Files Using the Import Menu

There are many ways of inputting data, and one of the easiest is to use the Import Data menu in the File menu.

```
File 
 Import Data 
  From File...
```

If you are trying to import a formatted data file, first change the file type to All Files so that all available files are shown. Then use the Browse button to go to where your data is located and double-click the file that you want to import. SPLUS should automatically select the correct file type in the File Format field. If it does not, select it yourself by clicking the arrow that brings up the file type list and highlighting the correct file type.

SPLUS 2000 will import ASCII files with various types of delimiters, including space, comma and user-specified delimiters, as well as SPSS files (.sav and .por) and SAS files (.sd2, .ssd01, .ssd04, and .tpt). Unfortunately, a number of formats do not seem to be supported through the import menu in SPLUS 2000. STATA data files, for example, do not seem to be importable through the import menu; however, they can be imported with the import.data function, which is discussed in the section “Importing Formatted Data Files Using the import.data Function”. For the complete list of data files that can be imported into SPLUS 2000 via the import menu, please refer to the manual.

SPLUS v6.1 will also import all the same types of ASCII files with various delimiters including space, comma and user specified delimiters; it will also import STATA data files (.dta), SPSS (.sav and .por), SAS files (.sas7bdat, .sd2, .ssd01, .ssd04, .sd7, .xpt, and .tpt) and dBase files. Note that it will not import STATA version 7 SE files since the SE version of STATA was developed later than the SPLUS version 6.1. For the complete list of data files that can be imported into SPLUS v6.1 using the import menu please refer to the manual.

Importing Formatted Data Files Using the import.data Function

The import.data function can import formatted data files of the following types: “ACCESS”, “ASCII”, “DBASE”, “EXCEL”, “FASCII”, “GAUSS”, “LOTUS”, “MATLAB”, “ODBC”, “PARADOX”, “QUATTRO”, “SAS”, “SAS_TPT”, “SPLUS”, “SIGMAPLOT”, “SPSS”, “SPSS_POR”, “STATA”, “SYSTAT”. It will import any STATA data file (version 4, 5, 6 and 7). Beware that SPLUS 2000 may not support the most current versions of the other file types. For example, for SAS it will only import version 6 files (.sd2 extension) and SAS export files (.tpt extension).

Here are examples of importing a SAS data file called test.sd2, a space-delimited ASCII file called test.txt (https://stats.idre.ucla.edu/wp-content/uploads/2016/02/test.txt) and a STATA 7 SE data file called test.dta. If you have saved these files on the D drive of your computer, the syntax looks like the following programs. Note the use of forward slashes in the file paths; a backslash inside a quoted path would have to be doubled. Note also that SPLUS is case sensitive, and that the names of the SPLUS data frames are arbitrary: the data frame test.sas could just as well have been named “potato” or “aunt.moe”.

import.data(DataFrame="test.sas", FileName="d:/test.sd2", FileType = "SAS")
print(test.sas)
   MAKE   MODEL MPG WEIGHT PRICE 
1   AMC Concord  22   2930  4099
2   AMC   Pacer  17   3350  4749
3   AMC  Spirit  22   2640  3799
4 Buick Century  20   3250  4816
5 Buick Electra  15   4080  7827

import.data(DataFrame="test.txt1", FileName="d:/test.txt", FileType="ASCII")
print(test.txt1)
         make mpg weight price 
Concord   AMC  22   2930  4099
  Pacer   AMC  17   3350  4749
 Spirit   AMC  22   2640  3799
Century Buick  20   3250  4816
Electra Buick  15   4080  7827

import.data(DataFrame="test.stata", FileName="d:/test.dta", FileType = "STATA")
print(test.stata)
   make   model mpg weight price 
1   AMC Concord  22   2930  4099
2   AMC   Pacer  17   3350  4749
3   AMC  Spirit  22   2640  3799
4 Buick Century  20   3250  4816
5 Buick Electra  15   4080  7827

Importing Free Formatted (Delimited) Data Files Using the read.table Function

The read.table function is very useful for reading ASCII files that contain rectangular data. When the first line of the file contains the variable names, set the header option to true (header=T). This function can read ASCII files with any type of delimiter as well as fixed format files.

The default delimiter is blank space. Any other delimiter must be specified with the sep option by setting it equal to the delimiter in quotes (e.g., sep=";" for a semicolon-delimited data file).

Here are some examples of data with different types of delimiters. The file testsemicolon.txt (https://stats.idre.ucla.edu/wp-content/uploads/2016/02/testsemicolon.txt) uses semicolons as delimiters and the file testz.txt (https://stats.idre.ucla.edu/wp-content/uploads/2016/02/testz.txt) uses the letter z; both are acceptable delimiters in SPLUS. The option row.names=NULL forces the row names to be the observation numbers.

test.semi <- read.table("d:/testsemicolon.txt", header=T, sep=";", row.names=NULL)
print(test.semi)
   make   model mpg weight price 
1   AMC Concord  22   2930  4099
2   AMC   Pacer  17   3350  4749
3   AMC  Spirit  22   2640  3799
4 Buick Century  20   3250  4816
5 Buick Electra  15   4080  7827

test.z <- read.table("d:/testz.txt", header=T, sep="z", row.names=NULL)
print(test.z)
   make   model mpg weight price 
1   AMC Concord  22   2930  4099
2   AMC   Pacer  17   3350  4749
3   AMC  Spirit  22   2640  3799
4 Buick Century  20   3250  4816
5 Buick Electra  15   4080  7827
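
To try the sep option without downloading the example files, the following sketch writes a small semicolon-delimited file and reads it back. The temporary file, the object names demo.file and demo.semi, and the two data rows are made up for illustration, and the sketch assumes that cat, tempfile and unlink behave in your SPLUS version as they do in other S dialects such as R.

```r
# Hypothetical demo: build a tiny semicolon-delimited file, then read it.
demo.file <- tempfile()
cat("make;model;mpg\n",
    "AMC;Concord;22\n",
    "Buick;Century;20\n",
    file=demo.file, sep="")

# header=T takes the variable names from the first line;
# sep=";" tells read.table how the fields are separated;
# row.names=NULL keeps the observation numbers as row names.
demo.semi <- read.table(demo.file, header=T, sep=";", row.names=NULL)
print(demo.semi)

# Remove the temporary file when done.
unlink(demo.file)
```

Changing both the separator written by cat and the sep value to another character (a comma, for instance) should give the same data frame.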

Importing Data Files Using the scan Function

The scan function is an extremely flexible tool for importing data. It can read almost any type of data (numeric, character or complex) and it can be used for fixed or free formatted files. Moreover, the scan function makes it possible to input data directly from the console.

The scan function reads the fields of data in the file as specified by the what option, with the default being numeric. If the what option is specified as what=character() or what="" then all the fields will be read as strings. If the data is a mix of numeric, string or complex data, then a list can be used in the what option. The default separator for the scan function is any white space (single space, tab, or new line).

However, unlike the read.table function, which returns a data frame, the scan function returns a list or a vector. This makes the scan function less useful for inputting “rectangular” data such as the car data set seen in the previous examples. In the following examples we first input numeric data and then string data directly from the console; then we input the text file scan.txt, where the first variable is a string variable and the second variable is numeric.

#inputting data directly from the console in the Commands window 
x <- scan()
1: 3 5 6 9
5: 2 5 6
8:

x
[1] 3 5 6 9 2 5 6

#inputting string data directly from the console in the Commands window
name.x <- scan(what="")
1: bobby
2: kate dave
4: mia
5:

name.x
[1] "bobby" "kate"  "dave"  "mia"

#inputting a text file and outputting a list
x 
$age
[1] 12 24 35 20

$name
[1] "bobby"   "kate"    "david"   "michael"

#using the same text file and saving only the names as a vector
x  0] 
x
$name
[1] "bobby"   "kate"    "david"   "michael"

is.vector(x)
[1] TRUE
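
Because scan returns a list rather than a data frame, mixed data read with a list in the what option usually needs one more step before it can be used as rectangular data. The sketch below shows one way to do that. The file contents and the object names demo.file, demo.list and demo.df are hypothetical, and passing the list straight to data.frame is an assumption about your SPLUS version (it works this way in R).

```r
# Hypothetical demo file: a name followed by an age on each line.
demo.file <- tempfile()
cat("bobby 12\n", "kate 24\n", "david 35\n", file=demo.file, sep="")

# what is a list with one template element per field:
# "" marks a character field, 0 marks a numeric field.
demo.list <- scan(demo.file, what=list(name="", age=0))

# scan returns a list; data.frame turns it into rectangular data.
demo.df <- data.frame(demo.list)
print(demo.df)

# Remove the temporary file when done.
unlink(demo.file)
```

The order of the elements in the what list has to match the order of the fields on each line of the file.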

Importing Fixed Format Files Using the read.table Function

For fixed format files the variable names are often in a separate file from the data. In this example the variable names are in a file called names.txt and the data are in a file called testfixed.txt (https://stats.idre.ucla.edu/wp-content/uploads/2016/02/testfixed.txt). This is especially convenient when the fixed format file is very large and has many variables, since typing in all the variable names then becomes rather impractical. In this situation the sep option specifies the column number where each variable begins and the col.names option supplies the variable names.

So, first we read in the file of names using the scan function. The what option is where we specify that the values in the file are character values. Printing names shows that it is a vector of character strings. Passing this vector to the col.names option of the read.table function supplies the variable names.

names <- scan("d:/names.txt", what=character() )
print(names)
[1] "model"  "make"   "mph"    "weight" "price" 

test.fixed <- read.table("d:/testfixed.txt", col.names=names, row.names=NULL, sep = c(1, 6, 13, 15, 19))
print(test.fixed)
  model    make mph weight price 
1   AMC Concord  22   2930  4099
2   AMC   Pacer  17   3350  4749
3   AMC  Spirit  22   2640  3799
4 Buick Century  20   3250  4816
5 Buick Electra  15   4080  7827
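
To see what the starting columns in sep = c(1, 6, 13, 15, 19) mean, the sketch below cuts a single record by hand with substring. This is only an illustration of the column arithmetic, not a description of how read.table works internally. The record demo.line is a made-up line laid out to match those starting columns; the real testfixed.txt may be spaced differently.

```r
# Hypothetical fixed-format record laid out to match sep = c(1, 6, 13, 15, 19).
demo.line <- "AMC  Concord2229304099"

# Each field starts at the given column and ends just before the next start;
# the last field runs to the end of the line.
starts <- c(1, 6, 13, 15, 19)
ends <- c(starts[-1] - 1, nchar(demo.line))

# substring is vectorized over the start and end positions.
fields <- substring(demo.line, starts, ends)
print(fields)
```

The numeric fields come back as character strings here and would still need as.numeric, a conversion that read.table performs when it builds the data frame.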

Exporting Files Using the write.table Function

The write.table function outputs data files. The first argument specifies which data frame is to be exported. The next argument specifies the file to be created. The default separator is a blank space but any separator can be specified in the sep option. The default value for both the row.names and col.names options is TRUE. In the example we specify that we do not wish to include row names. The default setting for the quote option is to include quotes around all the character values, i.e. around values in string variables and around the column names. As we have shown in the example it is very common not to want the quotes when creating a text file.

#using the test.csv data frame to write a text file with no row names 
#and without quotes around the character values (both column names and string variables)
write.table(test.csv, "d:/test1.txt", row.names=F, quote=F)
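
A quick way to check an export is to read the file straight back in. The sketch below does this with a semicolon separator. The data frame demo.df, the temporary file and the object demo.back are hypothetical, and the sketch assumes that write.table accepts the sep, row.names and quote options as described above.

```r
# Hypothetical data frame standing in for an imported data set.
demo.df <- data.frame(make=c("AMC", "Buick"), mpg=c(22, 20))

# Write a semicolon-delimited text file without row names or quotes.
demo.file <- tempfile()
write.table(demo.df, demo.file, sep=";", row.names=F, quote=F)

# Read the file back with the matching delimiter to check the export.
demo.back <- read.table(demo.file, header=T, sep=";", row.names=NULL)
print(demo.back)

# Remove the temporary file when done.
unlink(demo.file)
```

Because the column names are written by default, header=T is needed when reading the file back.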