pam(x, k, diss = FALSE, metric = "euclidean", stand = FALSE)
When x is a matrix or data frame, each row corresponds to an observation and each column to a variable. All variables must be numeric. Missing values (NAs) are allowed.
When x is a dissimilarity matrix, it is typically the output of daisy or dist. A vector of length n*(n-1)/2 (where n is the number of objects) is also allowed and is interpreted in the same way as the output of those functions. Missing values (NAs) are not allowed.
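The n*(n-1)/2 layout can be checked with base R alone. The small sketch below uses made-up data and assumes only dist and as.matrix from the standard distribution; it shows that a dist object holds one entry per pair of objects, stored as the lower triangle of the full matrix read column by column.

```r
# Five objects measured on two variables (made-up data for illustration).
x <- matrix(c(0, 0,
              1, 0,
              0, 2,
              5, 5,
              6, 5), ncol = 2, byrow = TRUE)
n <- nrow(x)

d <- dist(x)                         # Euclidean distances by default
has_pair_length <- length(d) == n * (n - 1) / 2

# The stored values are the lower triangle of the full symmetric matrix,
# taken column by column.
full <- as.matrix(d)
same_order <- isTRUE(all.equal(as.vector(d), full[lower.tri(full)]))

print(has_pair_length)
print(same_order)
```

A plain numeric vector passed as x would therefore need to follow this same ordering.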
The pam-algorithm is based on the search for k representative objects or medoids among the objects of the dataset. These objects should represent the structure of the data. After finding a set of k medoids, k clusters are constructed by assigning each object to the nearest medoid. The goal is to find k representative objects which minimize the sum of the dissimilarities of the objects to their closest representative object. The algorithm first looks for a good initial set of medoids (this is called the BUILD phase). Then it finds a local minimum for the objective function, that is, a solution such that there is no single switch of an object with a medoid that will decrease the objective (this is called the SWAP phase).
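The two phases described above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the implementation used by pam itself: the function names pam_cost and pam_sketch are hypothetical, the BUILD step is written as a plain greedy search, and no attempt is made at efficiency.

```r
# Hypothetical helper: sum of dissimilarities from each object to its
# closest medoid (the objective described in the text).
pam_cost <- function(d, medoids) {
  sum(apply(d[, medoids, drop = FALSE], 1, min))
}

# Hypothetical sketch of BUILD followed by SWAP on a dissimilarity matrix.
pam_sketch <- function(d, k) {
  d <- as.matrix(d)
  n <- nrow(d)

  # BUILD: add, one at a time, the object that gives the lowest objective.
  medoids <- integer(0)
  for (i in seq_len(k)) {
    candidates <- setdiff(seq_len(n), medoids)
    costs <- sapply(candidates, function(j) pam_cost(d, c(medoids, j)))
    medoids <- c(medoids, candidates[which.min(costs)])
  }

  # SWAP: apply the best single medoid/non-medoid exchange until no
  # exchange decreases the objective (a local minimum).
  repeat {
    best_cost <- pam_cost(d, medoids)
    best_swap <- NULL
    for (m in seq_along(medoids)) {
      for (j in setdiff(seq_len(n), medoids)) {
        trial <- medoids
        trial[m] <- j
        trial_cost <- pam_cost(d, trial)
        if (trial_cost < best_cost) {
          best_cost <- trial_cost
          best_swap <- trial
        }
      }
    }
    if (is.null(best_swap)) break
    medoids <- best_swap
  }

  # Assign each object to its nearest medoid.
  clustering <- unname(apply(d[, medoids, drop = FALSE], 1, which.min))
  list(medoids = medoids, clustering = clustering,
       objective = pam_cost(d, medoids))
}

# Usage on data shaped like the example in this page.
set.seed(1)
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))
res <- pam_sketch(dist(x), 2)
print(res$medoids)
print(table(res$clustering))
```

Because each accepted swap strictly lowers the objective and there are finitely many medoid sets, the loop always terminates, though only at a local minimum.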
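# pam and daisy are provided by the cluster package in R
library(cluster)
# generate 25 objects, divided into 2 clusters.
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))
pamx <- pam(x, 2)
pamx
summary(pamx)
plot(pamx)
# cluster the same data from a manhattan dissimilarity matrix
pam(daisy(x, metric = "manhattan"), 2, diss = TRUE)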