Moran’s I is a measure of spatial autocorrelation–how related the values of a
variable are based on the locations where they were measured. Using
functions in the **ape** library, we can calculate Moran’s I in R. To download
and load this library, enter **install.packages(“ape”)** and then **
library(ape)**.

Let’s look at an example. Our dataset,
**ozone**, contains ozone measurements from
thirty-two locations in the Los Angeles area aggregated over one month. The dataset
includes the station number (**Station**), the latitude and longitude of the station
(**Lat** and
**Lon**), and the average
of the highest eight hour daily averages (**Av8top**). This data, and other spatial datasets, can be
downloaded from the University of Illinois’ Spatial Analysis Lab. We can look
at a summary of our location variables to see the range of locations under
consideration.

ozone <- read.table("https://stats.idre.ucla.edu/stat/r/faq/ozone.csv", sep=",", header=T) head(ozone, n=10)Station Av8top Lat Lon 1 60 7.225806 34.13583 -117.9236 2 69 5.899194 34.17611 -118.3153 3 72 4.052885 33.82361 -118.1875 4 74 7.181452 34.19944 -118.5347 5 75 6.076613 34.06694 -117.7514 6 84 3.157258 33.92917 -118.2097 7 85 5.201613 34.01500 -118.0597 8 87 4.717742 34.06722 -118.2264 9 88 6.532258 34.08333 -118.1069 10 89 7.540323 34.38750 -118.5347

To calculate Moran’s I, we will need
to generate a matrix of inverse distance weights. In the matrix, entries
for pairs of points that are close together are higher than for pairs of points
that are far apart. For simplicity, we will treat the latitude and
longitude as values on a plane rather than on a sphere–our locations are close
together and far from the poles. When using latitude and longitude coordinates
from more distant locations, it’s wise to calculate distances based on spherical
coordinates (the **geosphere** package can be used).

We can first generate a distance matrix, then take inverse of the matrix values and replace the diagonal entries with zero:

ozone.dists <- as.matrix(dist(cbind(ozone$Lon, ozone$Lat)))ozone.dists.inv <- 1/ozone.distsdiag(ozone.dists.inv) <- 0ozone.dists.inv[1:5, 1:5]1 2 3 4 5 1 0.000000 2.539795 2.446165 1.627570 5.391160 2 2.539795 0.000000 2.667061 4.531428 1.741071 3 2.446165 2.667061 0.000000 1.954357 2.002389 4 1.627570 4.531428 1.954357 0.000000 1.258716 5 5.391160 1.741071 2.002389 1.258716 0.000000

We have created a matrix where each off-diagonal entry [*i*,
*j*] in the
matrix is equal to 1/(distance between point *
i* and point
*j*).
Note that this is just one of several ways in which we can calculate an inverse
distance matrix. This is the formulation used by Stata. In SAS,
inverse distance matrices have entries equal to 1/(1+ distance between point *
i* and point
*j*) and there are numerous scaling options available.

We can now calculate Moran’s I using the command **Moran.I**.

Moran.I(ozone$Av8top, ozone.dists.inv)$observed [1] 0.2265501 $expected [1] -0.03225806 $sd [1] 0.03431138 $p.value [1] 4.596323e-14

Based on these results, we can reject the null hypothesis that there is zero spatial autocorrelation present in the variable
**Av8top **at alpha = .05.

Alternatively, one could use a binary distance matrix. This might be
used in looking at networking problems where two points are either connected or
not connected. In this example, we can choose a distance *d* such
that pairs of points with distance less than *d* are considered connected
and pairs with distance greater than *d* are not and use this weights
matrix to calculate Moran’s I. The code below does this for *d* =
.75.

ozone.dists.bin <- (ozone.dists > 0 & ozone.dists <= .75)Moran.I(ozone$Av8top, ozone.dists.bin)$observed [1] 0.2045738 $expected [1] -0.03225806 $sd [1] 0.04771132 $p.value [1] 6.910876e-07

In this example, we see similar results to those generated using the inverse Euclidean distance matrix
and we can reject the null hypothesis that there is zero spatial autocorrelation present in the variable
**Av8top **at alpha = .05.