When the variables in a dataset are measured within geographic areas (state, country, zip code, county), the data can be presented on a map. Several R packages provide functions for going from a table of data to a map.
On this page, we show how to create a map in R with areas defined by an imported shape file. We have a dataset, incidents, in which each observation records incidents within a Los Angeles area zip code. Some zip codes have multiple observations, and other zip codes do not appear in the dataset at all. We wish to calculate the total number of incidents within each zip code and create a map that is color-coded according to this sum.
The Shape Data
We will need to provide R with data defining the zip code areas. Such a dataset can be downloaded from the Census Bureau’s website. First, download zt06_d00_shp.zip and then extract the files. The .shp file will be the one we provide to R.
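The read step could look something like the sketch below. It rests on assumptions that are not part of the original text: that the sf package is installed (one of several R packages that can read shape files), and that the archive extracts to a file named zt06_d00.shp in the working directory. The helper name read_zip_shapes is hypothetical.

```r
# Sketch: read the extracted shape file into R.
# Assumptions (not from the original text): the 'sf' package is installed
# and the archive extracts to "zt06_d00.shp" in the working directory.
read_zip_shapes <- function(shp_path = "zt06_d00.shp") {
  if (!requireNamespace("sf", quietly = TRUE)) {
    stop("This sketch assumes the 'sf' package is installed.")
  }
  if (!file.exists(shp_path)) {
    stop("Shape file not found: ", shp_path)
  }
  # st_read() uses the companion .shx and .dbf files that sit next to the .shp
  sf::st_read(shp_path, quiet = TRUE)
}

# Only attempt the read when both the file and the package are available
if (file.exists("zt06_d00.shp") && requireNamespace("sf", quietly = TRUE)) {
  zip_shapes <- read_zip_shapes()
  print(names(zip_shapes))  # inspect the attribute columns
}
```

The .shp file should stay in the same folder as the other extracted files, since the shape file format spreads geometry and attributes across several files.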
In this dataset, each observation corresponds to a discontinuity in the outline of the area. The X and Y variables indicate where the discontinuity is, SEGMENT indicates how many shapes are outlined as part of the same area, and NAME indicates the zip code.
Combining datasets
Once the shape file has been successfully read in and defined as a map, we can combine our data observed at the zip code level with the map. The zip code variable must match across the two datasets, so we first look at the contents of our map dataset to note the format of its zip codes and convert the zip codes in our dataset to the same format. Then we aggregate our data into a new dataset with one observation for each zip code that appears in our data, holding the sum of the incident variable. Zip codes in the vicinity that do not appear in our dataset may appear as gaps in our map, since we have no observations for them.
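As a minimal sketch of the aggregation and matching described above, the following uses only base R and small made-up data frames. The column names zip and incident, the toy values, and the assumption that the map stores NAME as five-character text are all illustrative; check the format in your own map data.

```r
# Toy stand-in for the incidents data: some zip codes appear more than once
incidents <- data.frame(
  zip      = c(90024, 90024, 90095, 90210, 90210, 90210),
  incident = c(2, 3, 1, 4, 0, 5)
)

# Toy stand-in for the map's attribute table (assumed: NAME is character)
map_zips <- data.frame(
  NAME = c("90024", "90095", "90210", "90401"),
  stringsAsFactors = FALSE
)

# Match the map's format: zip code as five-character text
incidents$NAME <- sprintf("%05d", as.integer(incidents$zip))

# One row per zip code, holding the sum of the incident variable
totals <- aggregate(incident ~ NAME, data = incidents, FUN = sum)

# Keep every zip code on the map; unobserved ones get NA (gaps on the map)
map_data <- merge(map_zips, totals, by = "NAME", all.x = TRUE)
print(map_data)
```

Here all.x = TRUE keeps map areas with no incidents data, so they can be drawn unfilled instead of silently disappearing from the merged table.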
Creating the map
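The color-coding logic can be illustrated with base R graphics on toy data. This is only a sketch under stated assumptions, not the code from the original page: the squares stand in for real zip code outlines, and the totals, class breaks, and colors are made up for illustration. With real shape data, the same fill colors would be passed to the plotting function of whichever package read the shape file.

```r
# Toy totals per zip code; NA marks a zip code absent from the incidents data
totals <- c("90024" = 5, "90095" = 1, "90210" = 9, "90401" = NA)

# Hypothetical unit squares standing in for real zip code outlines
x_left   <- c(0, 1, 0, 1)
y_bottom <- c(0, 0, 1, 1)

# Bin the sums into three classes and pick one color per class
shade_colors <- c("lightyellow", "orange", "red")
bins <- cut(totals, breaks = c(0, 3, 6, Inf), include.lowest = TRUE)
fill <- shade_colors[as.integer(bins)]  # NA total -> NA color -> unfilled area

# Empty canvas with extra room on the right for the legend
plot(NULL, xlim = c(0, 3), ylim = c(0, 2), asp = 1,
     xlab = "", ylab = "", axes = FALSE,
     main = "Total incidents by zip code")
for (i in seq_along(totals)) {
  rect(x_left[i], y_bottom[i], x_left[i] + 1, y_bottom[i] + 1,
       col = fill[i], border = "grey40")
  text(x_left[i] + 0.5, y_bottom[i] + 0.5, names(totals)[i])
}
legend("topright", legend = levels(bins), fill = shade_colors,
       title = "Incidents")
```

The zip code with no data is drawn as an unfilled outline, which corresponds to the gaps mentioned in the previous section.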
References and resources