Split Data by Groups

DESCRIPTION:
Returns a list in which each component is a vector of values from data that correspond to a unique value in group.

USAGE:
split(data, group)

REQUIRED ARGUMENTS:
data:
vector containing data values to be grouped. Missing values (NAs) are allowed.
group:
vector or category giving the group for each data value. If this is shorter than data, it is replicated to be the same length as data. If it is longer than data, then a warning is issued and some of the components of the result will have length zero. Missing values are not accepted.
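
The recycling rule for group can be sketched as follows. This is a minimal illustration in S syntax, not part of the original help file; the names scores, grp, and parts are hypothetical. The results in the comments follow R's split, which uses the same recycling rule, and S-PLUS may print the list differently.

```r
# Hypothetical data: six values but only two group labels
scores <- c(10, 20, 30, 40, 50, 60)
grp <- c("a", "b")             # shorter than scores, so it is recycled

parts <- split(scores, grp)

# grp is treated as a, b, a, b, a, b
parts$a   # 10 30 50
parts$b   # 20 40 60
```

Because six is a multiple of two, every label is reused the same number of times.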

VALUE:
list in which each component contains all data values associated with a particular value in group. For example, if the third value of group is 12, the third value in data will be placed in a component of the output with all other data values whose group is 12.

Within each group, data values are ordered as they originally appeared in data. The name of the component is the corresponding value in group, or the corresponding category name.
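
The ordering and naming rules above can be seen in a small sketch. The vectors heights and sex and the name by.sex are made up for illustration; the commented results assume R's split, which behaves the same way on this input.

```r
# Hypothetical data, deliberately not sorted
heights <- c(182, 155, 170, 160)
sex <- c("M", "F", "M", "F")

by.sex <- split(heights, sex)

# Component names come from the distinct values of the grouping vector
names(by.sex)   # "F" "M"

# Values keep the order in which they appeared in heights
by.sex$M        # 182 170
by.sex$F        # 155 160
```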


DETAILS:
A common use for split is to create a data structure to give to boxplot. A combination of category and tapply is usually preferred to using split followed by sapply.
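
As a hedged sketch of the two routes mentioned above, the following computes group means both ways on toy data. The vectors pay and yrs and the result names are hypothetical, and the code was reasoned against R; tapply is called with the grouping vector directly rather than through category.

```r
# Hypothetical data: five pay values grouped by years of service
pay <- c(30, 45, 50, 38, 62)
yrs <- c(25, 40, 25, 40, 60)

# Route 1: build the list of groups, then apply mean to each component
via.split <- sapply(split(pay, yrs), mean)

# Route 2: let tapply do the grouping and the summary in one call
via.tapply <- tapply(pay, yrs, mean)

via.split    # group means 40, 41.5, 62 for groups 25, 40, 60
```

Both routes give the same numbers; tapply avoids building the intermediate list.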

SEE ALSO:
boxplot, category, sapply, slice.in, tapply.

EXAMPLES:
split(c("Martin", "Mary", "Matt"), c("M", "F", "M"))

attach(market.frame)
boxplot(split(age, employment), notch = TRUE)

sapply(split(income, age), mean)   # mean income level by age
tapply(income, list(age), mean)    # alternative computation

split(people, age %/% 10)   # by decades
split(ship, cycle(ship))    # component for each month
attach(lung)
split(time, sex)            # survival time by sex

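# x is assumed here to be a 4 x 3 x 2 array holding 101:124,
# which is consistent with the output of this example
x <- array(101:124, c(4, 3, 2))
split(x, group = slice.index(x, 2)) # compare to x[,1,], x[,2,], x[,3,]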

# Produces the following output:
$"1":
[1] 101 102 103 104 113 114 115 116

$"2":
[1] 105 106 107 108 117 118 119 120

$"3":
[1] 109 110 111 112 121 122 123 124