Measures of Spatial Correlation

DESCRIPTION:
Computes the Moran, Geary, and other measures of spatial correlation. Also computes a Monte Carlo estimate of the distribution of the spatial correlation statistic.

USAGE:
spatial.cor(x, neighbor, statistic="moran", sampling="nonfree",
            npermutes=0, weight.fun=NULL, cov.fun=NULL)

REQUIRED ARGUMENTS:
x:
numeric vector or matrix containing the spatial observations. If a matrix, then each row in the matrix (or each element in a vector) corresponds to a different spatial region.
neighbor:
an object of class "spatial.neighbor" containing the spatial weights and specifying the spatial connectivity matrix. It is assumed that observations from connected spatial regions are correlated, and that the strength of their correlation is given by the spatial weights specified in the spatial neighbor object.

OPTIONAL ARGUMENTS:
statistic:
a character string to select the statistic to be used in computing the spatial correlation measure. This can be one of "moran", "geary" or "user". Partial matching is allowed.

The choices are:

"moran" - the Moran (1950) measure of spatial association.

"geary" - the Geary (1954) index of spatial association.

"user" - a user specified measure of spatial association.

The statistic="user" option allows the user to define their own correlation measure. In this case, estimates of the variance cannot be computed, though it is still possible to compute a Monte Carlo estimate of the permutation distribution. When statistic="user", the arguments weight.fun and cov.fun must be specified.

sampling:
a character string giving the sampling assumptions to be used when computing the variances. Two sampling assumptions are possible: "nonfree", in which the variance, conditioned upon the observed values of argument x, is computed; and "free", in which each of the observations in argument x is free to vary. See Cliff and Ord (1981, page 12) for discussion. Variances are not computed if statistic="user". Partial matching is allowed.
npermutes:
integer value. The number of permutations to be used in estimating the distribution of the spatial correlation measure. The permutation distribution estimate is computed as follows: npermutes random permutations of the (rows of) data in x are generated, and for each permutation the spatial correlation measure is computed for each column in x. The vector of permutation correlations provides an estimate of the permutation distribution of the spatial correlation for each column in x. The permutation distribution is the distribution one would obtain if a random permutation of the data were used to compute the spatial correlation measure. That is, the permutation distribution is the null distribution of the test statistic, conditional upon the observed data values.

As with many Monte Carlo simulations, npermutes=100 is often satisfactory for estimating p-values, though additional precision is obtained when more permutations are taken. More permutations are commonly used when confidence intervals are to be computed. See, e.g., Good (1994, page 163), for a discussion.

weight.fun:
when argument statistic="user", you must supply an S-PLUS weight function. This weight function must have two arguments, (x, A), where x is a vector representing a single column of the input matrix x, and A is the sum of the weights. Using these two arguments, a normalizing constant is computed, and the spatial correlation measure is computed as the product of this normalizing constant times the measure of association computed by the function given in the argument cov.fun. For the Moran measure of spatial correlation, the weight function is:

    MORAN.weight <- function(x, A) {
        length(x)/(A * (length(x) - 1) * var(x))
    }

cov.fun:
when argument statistic="user", you must provide an S-PLUS function for computing the covariances in your correlation measure. The covariance function has four arguments, (x, row.id, col.id, weights), where x is a vector containing (a single column of) input matrix x, and row.id, col.id, and weights are vectors specified in argument neighbor giving the connections between observations in x (see function spatial.neighbor). Observations row.id[i] and col.id[i] in x are connected with weight given by weights[i]. For example, the Moran covariance function is:

    MORAN.cov <- function(x, row.id, col.id, weights) {
        m <- mean(x)
        sum(weights * (x[row.id] - m) * (x[col.id] - m))
    }
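
As a further illustration, the Geary coefficient given in DETAILS could be expressed with the same pair of functions. This is a sketch only: the names GEARY.weight and GEARY.cov and the toy neighbor vectors are hypothetical, and it assumes that the neighbor object lists each connected pair in both directions, so that the sum runs over all i and j. The last lines check the result against the formula directly, without calling spatial.cor.

```r
# Hypothetical user-defined functions reproducing the Geary coefficient.
# sum(z[i]^2) equals (n - 1) * var(x), so (n - 1) cancels in the constant.
GEARY.weight <- function(x, A) {
    1 / (2 * A * var(x))
}
GEARY.cov <- function(x, row.id, col.id, weights) {
    sum(weights * (x[row.id] - x[col.id])^2)
}

# Toy data (assumed): four regions on a line, each pair listed both ways.
x <- c(2, 4, 3, 7)
row.id <- c(1, 2, 2, 3, 3, 4)
col.id <- c(2, 1, 3, 2, 4, 3)
weights <- rep(1, 6)
A <- sum(weights)

# The measure is the product of the normalizing constant and the association.
geary.user <- GEARY.weight(x, A) * GEARY.cov(x, row.id, col.id, weights)

# Direct evaluation of the Geary formula for comparison.
n <- length(x)
z <- x - mean(x)
geary.direct <- (n - 1) / (2 * A) *
    sum(weights * (x[row.id] - x[col.id])^2) / sum(z^2)
```

Functions of this form would be supplied as weight.fun=GEARY.weight and cov.fun=GEARY.cov together with statistic="user".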

VALUE:
an object of class "spatial.cor" with components:
statistic:
the statistic used in computing the correlation measure. Same as its input value.
sampling:
the sampling assumption used for variance estimation. Same as its input value.
n:
the number of observations or sampling units.
correlation:
the spatial correlation estimate.
variance:
estimate of the variance of the spatial correlation estimates. Variances are not computed if statistic="user".
perm.p.value:
one-sided p-value computed using the permutation distribution. Each p-value gives the probability (as estimated from the permutation distribution) that the corresponding spatial correlation measure is larger than the observed value, under the null hypothesis of no spatial correlation.
perm.corr:
permutation estimates of the correlation measures. Confidence intervals and other quantities of interest can be computed from the (permutation) distribution of these estimates.

The print method, print.spatial.cor, prints out the normal z statistic and its two-sided p-value for the null hypothesis of no spatial correlation when statistic is "moran" or "geary".
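
The z statistic can be sketched as follows. The formula is the usual normal approximation (estimate minus null mean, divided by the standard error) and is assumed here rather than taken from the print.spatial.cor source. The helper name z.test and the input values are hypothetical and purely illustrative.

```r
# Hypothetical helper: normal z statistic and two-sided p-value.
z.test <- function(correlation, variance, null.mean) {
    z <- (correlation - null.mean) / sqrt(variance)
    p <- 2 * (1 - pnorm(abs(z)))
    list(z = z, p.value = p)
}

# Illustrative inputs only (not output from spatial.cor): a Moran estimate
# for n = 100 regions, whose null mean is -1/(n - 1).
n <- 100
res <- z.test(correlation = 0.25, variance = 0.004, null.mean = -1 / (n - 1))
```

For the Geary measure the null mean would be 1 instead of -1/(n - 1).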


DETAILS:
The routine spatial.cor computes one of two built-in measures of spatial correlation. These are:

The Moran coefficient:

M == (n/A)*sum(w[i,j]*z[i]*z[j]) / sum(z[i]*z[i]).

Here the numerator sum is over i and j, while the denominator sum is over i, w[i,j] is the weight for the relationship between observations i and j (zero means no relationship), A is the sum of the weights w[i,j], and z[i]=x[i]-mean(x) is the centered variate obtained from x[i].

The Geary coefficient:

G == (n-1)/(2*A)*sum(w[i,j]*(x[i]-x[j])^2) / sum(z[i]*z[i]).

The measures (except those for statistic="user") are described in Cliff and Ord (1981, Chapter 1).

The Moran measure most resembles a Pearson correlation coefficient, and has mean -1/(n-1) when there is no association. Here n is the number of rows in x (or, for vectors, n is the length of x). The Geary measure has mean 1 in the null case.
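
The two formulas and their null means can be checked on a small example. The sketch below is not the spatial.cor implementation: it takes w[i,j] as the weight in both formulas, stores the weights in a full matrix with a zero diagonal, and averages each coefficient over every permutation of a toy data vector. The function names moran, geary, and all.perms and the data are hypothetical.

```r
# Moran and Geary coefficients from a full weight matrix w (zero diagonal).
moran <- function(x, w) {
    n <- length(x)
    z <- x - mean(x)
    (n / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}
geary <- function(x, w) {
    n <- length(x)
    z <- x - mean(x)
    d2 <- outer(x, x, "-")^2           # (x[i] - x[j])^2 for all i, j
    (n - 1) / (2 * sum(w)) * sum(w * d2) / sum(z^2)
}

# All permutations of v, one per row (hypothetical helper, small inputs only).
all.perms <- function(v) {
    if (length(v) <= 1) return(matrix(v, 1, length(v)))
    out <- NULL
    for (k in 1:length(v)) {
        # put v[k] first, then every permutation of the remaining elements
        out <- rbind(out, cbind(v[k], all.perms(v[-k])))
    }
    out
}

# Toy data (assumed): four regions on a line; 1-2, 2-3, 3-4 are neighbors.
w <- matrix(0, 4, 4)
w[cbind(c(1, 2, 2, 3, 3, 4), c(2, 1, 3, 2, 4, 3))] <- 1
x <- c(2, 4, 3, 7)

perms <- all.perms(1:4)
moran.null <- apply(perms, 1, function(p) moran(x[p], w))
geary.null <- apply(perms, 1, function(p) geary(x[p], w))

mean(moran.null)    # compare with -1/(n - 1)
mean(geary.null)    # compare with 1
```

Enumerating all permutations is practical only for very small n; for realistic data the permutation distribution is sampled instead.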

In addition to these two measures, you can specify other measures of spatial association by providing an S-PLUS function to compute a weighting or scaling factor (the weight.fun function), with a second function to compute a covariance or association measure (the cov.fun function). These functions, whose arguments are obtained from the input arguments to spatial.cor, each return a single value, and the measure of association is computed as the product of these values. Routine call_S is used in computing the permutation distribution for user specified correlations. call_S is somewhat slower (and uses more memory) than the C code used in computing the permutation distribution for the built-in measures.

Permutation distributions are important when computing measures of spatial correlation because the null distribution of the association statistic varies with the spatial lattice size and shape. This variability makes it difficult to provide approximate theoretical distributions, making the distribution of the Monte Carlo estimates all the more valuable. Confidence intervals and tests can be computed from the permutation distribution as they would be from an exact distribution. For example, a two-sided 90 percent confidence interval is given by the 5th and 95th percentiles of the permutation distribution. Notice, however, that different runs of the program with different random number seeds will lead to slightly different results. Use set.seed to set the random number seed.
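
The Monte Carlo procedure can be sketched without spatial.cor. The helper moran.nb, the toy data, and the seed below are hypothetical. The sketch counts ties when forming the one-sided p-value, which may differ from the convention used by spatial.cor.

```r
# Moran coefficient from neighbor-style vectors (hypothetical helper).
moran.nb <- function(x, row.id, col.id, weights) {
    z <- x - mean(x)
    (length(x) / sum(weights)) *
        sum(weights * z[row.id] * z[col.id]) / sum(z^2)
}

# Toy data (assumed): six regions on a line, each pair listed both ways.
x <- c(1, 2, 4, 5, 8, 9)
row.id <- c(1, 2, 2, 3, 3, 4, 4, 5, 5, 6)
col.id <- c(2, 1, 3, 2, 4, 3, 5, 4, 6, 5)
weights <- rep(1, length(row.id))

set.seed(42)                        # fix the seed so the run is repeatable
npermutes <- 999
observed <- moran.nb(x, row.id, col.id, weights)
perm.corr <- numeric(npermutes)
for (b in 1:npermutes) {
    # recompute the statistic on a random permutation of the data
    perm.corr[b] <- moran.nb(sample(x), row.id, col.id, weights)
}

# One-sided p-value: share of permuted values at least as large as observed.
perm.p.value <- mean(perm.corr >= observed)

# 5th and 95th percentiles of the permutation distribution.
limits <- quantile(perm.corr, c(0.05, 0.95))
```

Running the block again with the same seed reproduces perm.corr exactly; changing the seed shifts the p-value and the percentiles slightly.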


REFERENCES:
Cliff, A. D. and Ord, J. K. (1981). Spatial Processes: Models and Applications. Pion Limited, London.

Geary, R. C. (1954). The contiguity ratio and statistical mapping. The Incorporated Statistician. 5, 115-145.

Good, P. (1994). Permutation Tests. Springer-Verlag, New York.

Moran, P. A. P. (1948). The interpretation of statistical maps. Journal of the Royal Statistical Society, Series B. 10, 243-251.

Moran, P. A. P. (1950). Notes on continuous stochastic phenomena. Biometrika. 37, 17-23.


SEE ALSO:
spatial.weights, spatial.neighbor, call_S, set.seed

EXAMPLES:
sids.cor <- spatial.cor(sids$sid, neighbor=sids.neighbor, statistic="geary",
       sampling="free", npermutes=100)
sids.cor