cmdscale(d, k=2, eig=F, add=F)
Otherwise, a list with two or three components named points, plus eig and/or ac.
The additive constant is typically used when the "distances" in d are subjective dissimilarities. The ac constant attempts to make the distances conform to a Euclidean space with as small a dimension as possible. The estimation of ac is done under the assumption that the Euclidean space has only one dimension, an assumption that simplifies computation. A more technical explanation is that the constant attempts to eliminate negative eigenvalues of the doubly centered matrix of the squared distances.
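The eigenvalue explanation can be illustrated with a minimal sketch, written here for R, whose syntax is close to S. The matrix d, the helper names double.center and add.const, and the constant 2 are all made up for illustration. They are not part of cmdscale, and 2 is not the value of ac that cmdscale would estimate. The dissimilarities in d violate the triangle inequality, so the doubly centered matrix has a negative eigenvalue. Adding a large enough constant to the off-diagonal entries removes it.

```r
# Hypothetical 4 x 4 dissimilarity matrix (assumption, not data from this help file).
# d[1,4] = 3 exceeds d[1,2] + d[2,4] = 2, so no Euclidean configuration fits it.
d <- matrix(c(0, 1, 1, 3,
              1, 0, 1, 1,
              1, 1, 0, 1,
              3, 1, 1, 0), nrow = 4, byrow = TRUE)

# Doubly centered matrix of the squared distances: -1/2 * J %*% d^2 %*% J
double.center <- function(d) {
  n <- nrow(d)
  J <- diag(n) - matrix(1 / n, n, n)   # centering matrix
  -0.5 * J %*% (d^2) %*% J
}

# Add a constant to every off-diagonal dissimilarity, keeping the diagonal at zero
add.const <- function(d, const) {
  out <- d + const
  diag(out) <- 0
  out
}

eig.before <- eigen(double.center(d), symmetric = TRUE)$values
eig.after  <- eigen(double.center(add.const(d, 2)), symmetric = TRUE)$values

print(round(eig.before, 6))   # contains a negative eigenvalue
print(round(eig.after, 6))    # no negative eigenvalues, up to rounding
```

The constant here was chosen by hand. cmdscale estimates its own ac under the one-dimensional assumption described above, so its value will generally differ.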
There are various measures of the goodness of fit of a solution in the literature. Two of them are computed by the cmdscale.gof function in the example section below; see Mardia, Kent and Bibby (1979, p. 408).
Results are currently computed to single-precision accuracy only.
Some examples of its use are: anthropologists studying cultural differences based on language, art, etc.; and marketing researchers assessing product similarity. The technique can be used to "serialize" data if the result is close to a curve in two dimensions or a string in three. For example, archeologists might try to place several cultures into a time order.
Johnson, R. A. and Wichern, D. W. (1982). Applied Multivariate Statistical Analysis. Prentice-Hall, Englewood Cliffs, New Jersey.
Mardia, K. V., Kent, J. T. and Bibby, J. M. (1979). Multivariate Analysis. Academic Press, London.
Torgerson, W. S. (1958). Theory and Methods of Scaling. Wiley, New York.
x <- cmdscale(dist.x)                            # default 2-space
coord1 <- x[,1]; coord2 <- x[,2]
par( pty="s" )                                   # set up square plot
r <- range(x)                                    # get overall max, min
plot(coord1, coord2, type="n", xlim=r, ylim=r)   # set up plot
# note units per inch same on x and y axes
text(coord1, coord2, seq(coord1))                # plot integers

# use brush to explore a 3-dimensional scaling
dis.vote <- dist(votes.repub)
vote.scale <- cmdscale(dis.vote, 4)
brush(vote.scale, rowlab=state.abb)

# below is a function that calculates two measures of stress
# it is fairly slow for datasets of more than 50 or so.
cmdscale.gof <- function(dis, k = 4)
{
    amat <- -0.5 * (dist2full(dis))^2            # see dist help file
    bmat <- sweep(amat, 1, apply(amat, 1, mean))
    bmat <- sweep(bmat, 2, apply(bmat, 2, mean))
    eigs <- svd(bmat, 0, 0)
    gof1 <- 1 - (cumsum(abs(eigs$d[1:k]))/sum(abs(eigs$d)))
    gof2 <- 1 - (cumsum(eigs$d[1:k]^2)/sum(eigs$d^2))
    list(gof1 = gof1, gof2 = gof2, eig = eigs$d)
}

vote.scale <- cmdscale(dist(votes.repub))
plot(vote.scale, type="n")
text(vote.scale, state.abb)