Welcome to the IDRE Introduction to Regression in R Seminar!
This seminar introduces some fundamental topics in regression analysis using R in three parts. The first part gives an overview of the theory of simple regression. The second part runs simple and multiple regression in R, including continuous and categorical predictors, and interprets the results of the regression analysis. The last part introduces regression diagnostics such as checking for normality of residuals, unusual and influential data, homoscedasticity, and multicollinearity. This seminar is based on R version 4.2.2.
In this seminar, we will be using a data file that was created by randomly sampling 400 elementary schools from the California Department of Education’s API (Academic Performance Index) 2000 dataset. This data file contains a measure of school academic performance as well as other attributes of the elementary schools, such as class size, enrollment, socioeconomic status, etc.
The link to the seminar slides is here:
The data file you need for this seminar is elemapi2v2. You can click on the link and save the data file in any folder you choose. You can also load the data file directly in the R environment from the web link.
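Loading the file straight from a web link works because `read.csv()` accepts a URL in place of a local path. The sketch below is only an illustration: it assumes the data file is a CSV with a header row, `load_seminar_data` is a hypothetical helper name, and the small stand-in file uses made-up columns and values so the code runs offline. Replace `demo_path` with the path where you saved elemapi2v2, or with the link to the data file.

```r
# Hypothetical helper: reads the seminar data from a local path or a web URL.
# Assumption: the data file is in CSV format with a header row.
load_seminar_data <- function(location) {
  read.csv(location, header = TRUE, stringsAsFactors = FALSE)
}

# Stand-in file so the sketch runs offline. The column names and values
# below are made up and are NOT taken from elemapi2v2.
demo_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(school_id = 1:3, score = c(10, 20, 30)),
  demo_path,
  row.names = FALSE
)

demo <- load_seminar_data(demo_path)
str(demo)   # inspect column names and types
head(demo)  # look at the first few rows
```

Running `str()` and `head()` right after loading is a quick way to confirm that the file was read with the expected number of rows and columns.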
Click on the following link to download the syntax for all three lessons: Introduction to Regression in R Code. Left-click the link to copy and paste the code directly into the RStudio editor, or right-click to download it.
R is a free software environment for statistical computing and graphics. You can download R from the Comprehensive R Archive Network (CRAN) website, or choose your preferred CRAN mirror and download R from there.
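Once R is installed, you may want to check which version you are running, since this seminar is based on R version 4.2.2. A minimal sketch using base R only; the variable names are illustrative:

```r
# Print the full version string of the running R session.
print(R.version.string)

# The seminar is based on R 4.2.2; other versions may behave slightly differently.
seminar_version <- "4.2.2"
at_least_seminar_version <- getRversion() >= seminar_version

if (at_least_seminar_version) {
  message("R is at version ", seminar_version, " or newer.")
} else {
  message("R is older than ", seminar_version, "; consider updating from CRAN.")
}
```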
As an alternative to Rgui (the default R interface), you can use RStudio. RStudio makes R easier to use. It includes a code editor as well as debugging and visualization tools. You can download the open-source edition of RStudio Desktop for free from the RStudio website.
For an introduction to R, you can visit the Introduction to R seminar.