clara(x, k, metric = "euclidean", stand = F, samples = 5, sampsize = 40 + 2 * k)
Each sub-dataset is partitioned into k clusters using the same algorithm as in the pam function. Once k representative objects (medoids) have been selected from the sub-dataset, each object of the entire dataset is assigned to the nearest medoid. The sum of the dissimilarities of the objects to their closest medoid is used as a measure of the quality of the clustering. The sub-dataset for which this sum is minimal is retained, and a further analysis is carried out on the final partition. Each sub-dataset is forced to contain the medoids obtained from the best sub-dataset found so far; randomly drawn objects are then added to this set until sampsize has been reached.
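The sampling procedure described above can be sketched in plain R. This is only an illustration under stated assumptions, not the code clara itself runs: the names clara_sketch and total_cost are hypothetical, Euclidean distance is assumed, and pam from the cluster package is used to partition each sub-dataset.

```r
library(cluster)  # provides pam()

# Hypothetical helper: sum of Euclidean distances from every row of x
# to its nearest medoid (the quality measure described above).
total_cost <- function(x, medoids) {
  d <- sapply(seq_len(nrow(medoids)), function(j) {
    sqrt(rowSums(sweep(x, 2, medoids[j, ])^2))
  })
  sum(apply(d, 1, min))
}

# Hypothetical sketch of the sampling loop; not the package implementation.
clara_sketch <- function(x, k, samples = 5, sampsize = 40 + 2 * k) {
  n <- nrow(x)
  best <- list(cost = Inf, med_idx = integer(0))
  for (s in seq_len(samples)) {
    # Force the best medoids so far into the sub-dataset,
    # then add randomly drawn objects until sampsize is reached.
    pool <- setdiff(seq_len(n), best$med_idx)
    idx <- c(best$med_idx, sample(pool, sampsize - length(best$med_idx)))
    fit <- pam(x[idx, , drop = FALSE], k)
    med_idx <- idx[fit$id.med]  # medoid row numbers in the full dataset
    # Score the medoids on the entire dataset, not only the sub-dataset.
    cost <- total_cost(x, x[med_idx, , drop = FALSE])
    if (cost < best$cost) best <- list(cost = cost, med_idx = med_idx)
  }
  best
}

set.seed(1)
x <- rbind(cbind(rnorm(200, 0, 8), rnorm(200, 0, 8)),
           cbind(rnorm(300, 50, 8), rnorm(300, 50, 8)))
res <- clara_sketch(x, 2)
res$med_idx  # row numbers of the retained medoids
res$cost     # sum of distances to the nearest medoid
```

Scoring each candidate set of medoids on the whole dataset, rather than on the sub-dataset alone, is what makes the results of different sub-datasets comparable.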
library(cluster)  # clara() and pam() are provided by the cluster package

# Generate 500 objects, divided into 2 clusters.
x <- rbind(cbind(rnorm(200, 0, 8), rnorm(200, 0, 8)),
           cbind(rnorm(300, 50, 8), rnorm(300, 50, 8)))

clarax <- clara(x, 2)
clarax
clarax$clusinfo
plot(clarax)