Matrix of Predictors

DESCRIPTION:
Returns a matrix of predictors from terms.object. The function is primarily used as an internal call in other modeling functions.

USAGE:
model.matrix(object, ...)
model.matrix.default(terms.object, data, contrasts)

REQUIRED ARGUMENTS:
object:
an object from which a model matrix can be inferred. For the default method, it is usually a formula or a terms object constructed by a model-fitting function, based on the model formula. A fitted model can also be given for any class of model inheriting from class lm.

OPTIONAL ARGUMENTS:
terms.object:
usually, the terms object constructed by a model-fitting function, based on the model formula. A fitted model can also be given for any class of model inheriting from class lm.
data:
usually, the model frame constructed by model.frame. In general, can be any data frame or source for the data, or can be missing. If it is not the model frame, it will be turned into one by a call to model.frame. It then supplies the data from which columns of the matrix are computed. In standard use of model.matrix, the variables will be numeric vectors, factors or ordered factors, or numeric matrices. The computations will also handle character or logical vectors (which are turned into factors) or subsidiary data frames (provided their columns are numeric).
contrasts:
an optional list giving contrasts for some or all of the factors appearing in the terms object. The elements of the list should have the same name as the variable and should be either a contrast matrix (specifically, any full-rank matrix with as many rows as there are levels in the factor), or else a function to compute such a matrix given the number of levels. The complete contrast list (anything given as an argument plus any additional contrast matrices computed) will be returned as the "contrasts" attribute of the model matrix, and hence as the "contrasts" component of fitted models returned by lm() and its descendants.

VALUE:
an object of class "model.matrix" which inherits from "matrix". This is a matrix of predictor variables, including contrasts for all factors and ordered factors in the terms object. If the model includes an intercept, the first column will be a vector of 1s. The matrix has several special attributes:
assign:
a list, of length equal to the number of terms in the model. The elements of the list identify which columns of the model matrix encode the corresponding term.
dimnames:
the row labels from the model frame and column labels constructed from the variable names. See below for the latter. The column labels define the names for the coefficients and effects of the fitted model. (In the case of multivariate response models, read "row labels" for "names".) The row labels are the same as the names or row labels for fitted values and residuals, but these generally come directly from the model frame, via model.extract.
formula:
The formula from the terms object.
order, term.labels:
The order (1 for main effects, 2 for second-order interactions, etc.) and the character-string labels for the terms (these are identical to the corresponding attributes of the terms object, plus 0 and "(Intercept)" if there was an intercept). Note that the names attribute of the "assign" attribute is also equal to the term labels. These two attributes basically exist to save functions the trouble of inferring the corresponding information from a terms object. They are somewhat vestigial: they probably won't go away, but using the term labels, in particular, is less common than relying on the names of the "assign" attribute.
contrasts:
a list containing contrast matrices or character vectors. Any contrast matrices used will be returned in an element of the list with the same name as the corresponding variable. See lm.object for further details.
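The role of the "assign" attribute can be illustrated outside S. The following is only a sketch in Python, not the S implementation: the dictionary, the helper names columns_for and term_for, and the example terms are all hypothetical, and it assumes the intercept gets its own entry and that columns are numbered from 1.

```python
# Hypothetical stand-in for the "assign" attribute of a model matrix:
# each term label maps to the (1-based) columns that encode that term.
assign = {
    "(Intercept)": [1],
    "x": [2],          # a numeric variable: one column
    "g": [3, 4],       # a three-level factor coded by two contrast columns
}

def columns_for(term, assign):
    """Return the model-matrix columns that encode the given term."""
    return assign[term]

def term_for(column, assign):
    """Return the label of the term that a given column belongs to."""
    for term, cols in assign.items():
        if column in cols:
            return term
    raise KeyError("no term uses column %d" % column)
```

A function that needs, say, the coefficients belonging to one term can look up the column positions by term label rather than parsing the column labels.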

CONTRASTS:
Factors, including ordered factors, are turned into columns of numeric variables using contrasts or dummy variables according to the instructions coded in the terms object's "factors" attribute. Particular contrasts are chosen using the contrasts argument as supplied (typically as passed down from lm(), etc.), from the "contrasts" attribute of the factor, if any, or from the default choice of contrast functions. In the absence of this attribute, the two character strings in options("contrasts") define the choice of contrast function for factors and ordered factors. Note that the same variable may be used both with contrasts and without. Interaction terms are formed by computing the various main effects and then taking all products of the corresponding columns (in principle; in practice the computations look back at previously computed terms in an attempt to avoid re-computation). See contrasts and C for details of specifying contrast functions as arguments.
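The coding described above can be sketched in a few lines. This is a conceptual illustration in Python, not the code used by model.matrix: the function names helmert, encode and interaction are hypothetical, Helmert contrasts are used only as one example of a contrast function, and the ordering of the product columns is an arbitrary choice of this sketch.

```python
def helmert(k):
    """Helmert-style contrast matrix for k levels: k rows, k - 1 columns.

    Column j (0-based) is -1 in rows 0..j, j + 1 in row j + 1, and 0 below.
    """
    return [[-1 if i <= j else (j + 1 if i == j + 1 else 0)
             for j in range(k - 1)]
            for i in range(k)]

def encode(values, levels, contrast):
    """Replace each factor value by the contrast-matrix row of its level."""
    return [list(contrast[levels.index(v)]) for v in values]

def interaction(cols_a, cols_b):
    """Form interaction columns as all products of the main-effect columns.

    Each argument is a list of rows, one row per observation.
    """
    return [[a * b for a in row_a for b in row_b]
            for row_a, row_b in zip(cols_a, cols_b)]

# A three-level factor and a numeric variable, three observations.
g = encode(["a", "c", "b"], ["a", "b", "c"], helmert(3))
x = [[2], [3], [4]]
xg = interaction(x, g)   # columns for an x-by-g interaction term
```

The contrast matrix has as many rows as the factor has levels, so each observation simply picks up the row for its level; the interaction columns are then elementwise products of a column of x with a column of g.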

LABELS:
The column labels are constructed by the following definition. Numeric variables inherit the corresponding term label. Numeric matrices produce column labels that concatenate the term label with the column labels of the matrix, if any, or with "1", "2", etc. Main effects for factors or ordered factors use the term label concatenated with the column labels of the contrast matrix, again using "1", "2", etc. as default. For both cases, the term label is used alone if there is only one column or one contrast.
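The labeling rule for a single term can be written out as a short sketch. It is in Python and purely illustrative: the function column_labels and its arguments are hypothetical, and it assumes plain concatenation with no separator between the term label and the column label.

```python
def column_labels(term_label, ncol=1, sub_labels=None):
    """Labels for the model-matrix columns contributed by one term.

    term_label: label of the term (numeric variable, matrix, or factor).
    ncol:       number of columns the term contributes.
    sub_labels: column labels of the numeric matrix or contrast matrix,
                or None to fall back on "1", "2", etc.
    """
    if ncol == 1:
        # One column or one contrast: the term label is used alone.
        return [term_label]
    if sub_labels is None:
        sub_labels = [str(i) for i in range(1, ncol + 1)]
    return [term_label + s for s in sub_labels]
```

For example, under these assumptions a numeric variable x keeps the label "x", a factor g with two unlabeled contrasts gives "g1" and "g2", and contrast columns labeled ".L" and ".Q" give "g.L" and "g.Q".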

This is primarily a support routine, called by lm and by other model-fitting functions that call or derive from lm, such as aov, glm, and gam. Note that the model-fitting functions loess and tree do not use model.matrix, chiefly because they do not use contrasts to handle factors.


SEE ALSO:
model.frame, model.extract, terms.object.