Data Frame Objects

DESCRIPTION:
Data frames are objects of class "data.frame".

They combine the behavior of matrices, in that they can be addressed by rows (observations) and columns (variables), with the behavior of lists or frames in S-PLUS, in that the variables can be used like individual objects.


GENERATION:
The functions data.frame and read.table generate data frames.
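
A minimal sketch of both routes, written in R syntax on the assumption that it mirrors S-PLUS closely enough here. The column names and values are hypothetical, and tempfile and writeLines are used only to make the read.table call self-contained.

```r
# Build a data frame directly from vectors of equal length.
direct <- data.frame(id = 1:3, score = c(90, 85, 77))

# Write a small whitespace-delimited table to a temporary file,
# then read it back; header = TRUE takes variable names from line 1.
path <- tempfile()
writeLines(c("id score", "1 90", "2 85", "3 77"), path)
from_file <- read.table(path, header = TRUE)
unlink(path)  # remove the temporary file

class(direct)
class(from_file)
```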

METHODS:
Generic functions that have methods for class "data.frame" include: [, [[, aperm, atan, dbdetach, dim, dimnames, formula, ordered<-, pairs, plot, print, signif, summary, t.

In addition the groups Math, Ops and Summary all have methods for "data.frame".
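
As an illustration of the group methods, the following sketch (R syntax, hypothetical data) applies one function from each of the Math, Ops and Summary groups to a small numeric data frame. Details of the results may differ between S-PLUS and R.

```r
# Hypothetical numeric data frame.
df <- data.frame(a = c(1, 4, 9), b = c(16, 25, 36))

roots   <- sqrt(df)  # Math group: applied to each variable, result is a data frame
shifted <- df + 1    # Ops group: arithmetic applied to each variable
largest <- max(df)   # Summary group: one value computed over all variables
```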


INHERITANCE:
The classes "design" and "pframe" inherit from "data.frame".

STRUCTURE:
Data frames are implemented as lists whose components all have the same length. The following attributes are required and behave as described.
row.names:
a vector of length equal to the number of observations (and therefore equal to either the length or the number of rows of every variable). There must be no duplicate values. Where no explicit row names are supplied in creating the data frame, 1:nrow(x) will be used.
names:
the names must exist, be of full length, and be unique.
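
The structure described above can be inspected directly. This is a sketch in R syntax with hypothetical data; the internal storage of default row names may differ between S-PLUS and R, so only the accessor functions are used.

```r
# Hypothetical data frame with two variables and three observations.
df <- data.frame(x = c(10, 20, 30), y = c("a", "b", "c"))

is.list(df)                  # a data frame is a list underneath
sapply(df, length)           # every component has the same length
names(df)                    # variable names: present, full length, unique
row.names(df)                # default row names, one per observation
any(duplicated(row.names(df)))  # duplicates are not allowed
```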

DETAILS:
Data frames can be used like lists or frames, for example, by attaching the object to the search list, by setting it up as a frame in the evaluator or the browser, or by passing it to a model-fitting function along with a formula using the variable names in the data frame.
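
A short sketch of list-like use, in R syntax with made-up data: variables are extracted as individual objects, and the data frame is passed to a model-fitting function (lm is assumed here) together with a formula that refers to its variable names.

```r
# Hypothetical data; x and y are illustrative variable names.
df <- data.frame(x = c(1, 2, 3, 4), y = c(2.1, 3.9, 6.2, 7.8))

df$x        # extract a variable as an individual object
df[["y"]]   # the same, using list-style indexing

# The formula is resolved against the variables of df.
fit <- lm(y ~ x, data = df)
coef(fit)
```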

Many matrix-like computations are defined as methods for data frames, notably, subsets and the dim and dimnames attributes. However, data frames are not matrices; most importantly, any object can become a variable in the data frame, so long as it is addressable by the observations. In practice, this means that the variables should be one of vectors, matrices, or some other class of objects that can itself be treated as either a vector or matrix (in particular, can be subset like a vector or matrix). If the variable is vector-like, it should have length equal to the number of rows; if matrix-like, it should have the same number of rows as the data frame.
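
To illustrate a matrix-like variable, the sketch below (R syntax, hypothetical names) stores a matrix as a single variable of a data frame and then subsets by observation. It assumes that assignment with $ keeps the matrix intact, which holds in R; S-PLUS behavior may differ.

```r
# Start with one vector-like variable, three observations.
df <- data.frame(id = 1:3)

# Add a matrix with one row per observation as a single variable.
df$m <- matrix(1:6, nrow = 3)

length(df)   # number of variables: the matrix counts as one
dim(df$m)    # the variable keeps its matrix shape

# Subsetting observations subsets the rows of the matrix variable too.
sub <- df[2:3, ]
dim(sub$m)
```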

The dimension and the dimnames of a data frame are defined differently from those of a matrix. Every data frame is required to have an attribute "row.names" whose length is, by definition, the number of rows of the data frame. The number of columns is by definition the number of variables; that is, the length of the data frame as a list. The dimnames list is equivalent to list(row.names(x), names(x)). Both the row names and the names are required to be present and to have no duplicate values.
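
These definitions can be checked on a small example. The sketch uses R syntax and hypothetical row and variable names.

```r
# Two observations, three variables, explicit row names.
df <- data.frame(x = c(1.5, 2.5), y = c(3.5, 4.5), z = c(5.5, 6.5),
                 row.names = c("obs1", "obs2"))

dim(df)                 # rows = number of row names, columns = number of variables
length(row.names(df))   # number of rows by definition
length(df)              # number of columns: length of the data frame as a list

# dimnames is equivalent to the list of row names and variable names.
identical(dimnames(df), list(row.names(df), names(df)))
```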


SEE ALSO:
data.frame, design, design.object, pframe.object, read.table, data.matrix.