kmeans(x, centers, iter.max=10)
It may be necessary to scale the columns of x in order for the clustering to be sensible. Because kmeans minimizes within-cluster sums of squares, the larger a variable's variance, the more weight it carries in the clustering.
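As a minimal sketch of the scaling step, the following standardizes each column with scale() before clustering. The data, the object names (x.scaled, km.raw, km.scaled) and the choice of rows 1 and 21 as initial centers are illustrative assumptions, not part of the original example.

```r
# Hypothetical data: two columns on very different scales.
set.seed(1)
x <- cbind(small = c(rnorm(20, 0, 1), rnorm(20, 5, 1)),
           large = rnorm(40, 0, 1000))

# Standardize each column to mean 0 and variance 1.
x.scaled <- scale(x)

# Cluster the raw and the scaled data, using rows 1 and 21 as
# initial centers (an arbitrary choice for illustration).
km.raw    <- kmeans(x,        x[c(1, 21), ])
km.scaled <- kmeans(x.scaled, x.scaled[c(1, 21), ])
```

On the raw data the column with the large variance is likely to dominate the grouping; after scaling, both columns contribute on an equal footing.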
When deciding on the number of clusters, Hartigan (1975, pp 90-91) suggests the following rough rule of thumb. If fit.k is the result of kmeans with k groups and fit.kplus1 is the result with k+1 groups, then it is justifiable to add the extra group when
(sum(fit.k$withinss)/sum(fit.kplus1$withinss)-1)*(nrow(x)-k-1)
is greater than 10.
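The rule of thumb can be wrapped in a small helper, sketched here under stated assumptions: hartigan.stat is a hypothetical function name, the data are simulated, and the initial centers are simply one row taken from each simulated group.

```r
# Hypothetical helper: Hartigan's statistic for moving from k to k+1 groups.
# fit.k and fit.kplus1 are kmeans results; n is the number of observations.
hartigan.stat <- function(fit.k, fit.kplus1, n, k) {
  (sum(fit.k$withinss) / sum(fit.kplus1$withinss) - 1) * (n - k - 1)
}

# Simulated data: three groups of 30 points in two dimensions.
set.seed(1)
x <- rbind(matrix(rnorm(60, mean = 0),  ncol = 2),
           matrix(rnorm(60, mean = 6),  ncol = 2),
           matrix(rnorm(60, mean = 12), ncol = 2))

# Fit with 2 and with 3 groups, giving the initial centers explicitly.
fit.2 <- kmeans(x, x[c(1, 31), ])
fit.3 <- kmeans(x, x[c(1, 31, 61), ])

# A value above 10 suggests the third group is justified.
stat <- hartigan.stat(fit.2, fit.3, n = nrow(x), k = 2)
add.group <- stat > 10
```

The same helper can be applied to successive values of k, stopping at the first k for which the statistic falls to 10 or below.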
Hartigan, J. A. and Wong, M. A. (1979). A k-means clustering algorithm. Applied Statistics 28, 100-108.
irismean <- t(apply(iris, c(2, 3), 'mean'))
x <- rbind(iris[,,1], iris[,,2], iris[,,3])
km <- kmeans(x, irismean)
wrong <- km$cluster!=rep(1:3, c(50, 50, 50))
spin(x, highlight=wrong)
plot(x[,2], x[,3], type="n")
text(x[!wrong, 2], x[!wrong, 3], km$cluster[!wrong]) # identify cluster membership that is correct
points(x[wrong, 2], x[wrong, 3], pch=15) # boxes for points in error
title(main="K-Means Clustering of the Iris Data")