Create Factor Object

DESCRIPTION:
A factor is a character vector with class attribute factor and a levels attribute which determines what character strings may be included in the vector. The function factor creates a factor object out of data and allows one to set the levels attribute. is.factor determines whether a vector is a factor. as.factor coerces a vector into a factor object.

USAGE:
factor(x, levels = <<see below>>, labels = <<see below>>, exclude = NA)
is.factor(x)
as.factor(x)

REQUIRED ARGUMENTS:
x:
data, to be thought of as taking values on a finite set (the levels). Missing values (NAs) are allowed.

OPTIONAL ARGUMENTS:
levels:
optional vector of levels for the factor. Any data value that does not match a value in levels will be NA in the factor. The default value of levels is the sorted list of distinct values of x. (If x is numeric and contains NAs, they will be placed at the end of the default value for levels.) If x is character data or you wish to exclude other values from the levels you may use the exclude argument.
labels:
optional vector of values to use as labels for the levels of the factor. The default is as.character(levels).
exclude:
a vector of values to be excluded from forming levels. Any value that appears in both x and exclude will be NA in the result and it will not appear in the default levels attribute.
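
A minimal sketch of how the three optional arguments interact, assuming the behavior described above. The variable names x, f1, f2 and f3 are illustrative only, and the results noted in the comments are those obtained in R; other S dialects may print them differently.

```r
x <- c("a", "b", "c", "b")

# "c" does not match any value in levels, so it becomes NA in the factor
f1 <- factor(x, levels = c("a", "b"))
is.na(f1)      # only the 3rd element is missing

# labels replaces the level names position by position
f2 <- factor(x, levels = c("a", "b", "c"),
       labels = c("Alpha", "Beta", "Gamma"))
levels(f2)

# exclude drops "c" from the default levels and turns it into NA
f3 <- factor(x, exclude = "c")
levels(f3)
```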


VALUE:
object of class "factor", representing values taken from the finite set given by levels. It is important that this object is not numeric; in particular, comparisons and other operations behave as if they operated on values from the levels set, which is always of mode character. NAs can appear, indicating that the corresponding value is undefined. The expression na.include(f) returns a factor like f, but with NAs made into a level.

is.factor returns TRUE if x is a factor object, FALSE otherwise.

as.factor returns x, if x is a factor, factor(x) otherwise.
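
The return values of is.factor and as.factor can be checked with a short sketch like the following. The names x, f and g are hypothetical, and the results in the comments are those obtained in R.

```r
x <- c("low", "high", "low")
is.factor(x)        # FALSE: x is a plain character vector

f <- as.factor(x)   # x is not a factor, so this is factor(x)
is.factor(f)        # TRUE
levels(f)           # default levels are the sorted distinct values

g <- as.factor(f)   # f is already a factor and is returned unchanged
identical(f, g)
```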


WARNING:
You may need to use unclass in your functions that expect a categorical variable in order for the functions to work with factors.
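
As a hedged illustration of the warning, the sketch below strips the class attribute before handing the data to code that expects plain codes. The names f and codes are illustrative, and the integer coding noted in the comments is what R produces; it is an assumption that other S implementations store factors the same way.

```r
f <- factor(c("b", "a", "b"))

# unclass removes the "factor" class but keeps the levels attribute
codes <- unclass(f)
attr(codes, "levels")   # the level set is still attached

# as.vector drops the remaining attributes, leaving the bare codes
as.vector(codes)        # in R: positions of each value within levels(f)
```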

SEE ALSO:
is.na, ordered, table.

EXAMPLES:
occupation <- c("doctor", "lawyer", "mechanic", "engineer")
income <- c(150000,100000,30000,60000)
factor(occupation)
factor(cut(income, breaks = c(0,30000,70000,200000)),
       labels = c("low","mid","high"))

# make readable labels
occ <- factor(occupation,
       levels = c("doctor","lawyer","mechanic","engineer"),
       labels = c("Doctor","Lawyer","Mechanic","Engineer"))

# turn factor into character vector
as.vector(occ)

color <- c("red", "red", "red", "green", "blue")
colors <- factor(color, c("red","green","blue"))
table(colors) # table counting occurrences of colors

# treat word "Unknown" as a missing value flag
colors <- factor(c("red","green","Unknown","blue"), exclude = "Unknown")
is.na(colors) # 3rd value will be T, the rest F