Estimate Probability Density Function

DESCRIPTION:
Returns x and y coordinates of a non-parametric estimate of the probability density of the data. Options include the choice of the window to use and the number of points at which to estimate the density.

USAGE:
density(x, n=50, window="g", na.rm=F, width=<<see below>>,
         from=<<see below>>, to=<<see below>>, cut=<<see below>>)

REQUIRED ARGUMENTS:
x:
vector of observations from the distribution whose density is to be estimated. Missing values are allowed if na.rm is TRUE.

OPTIONAL ARGUMENTS:
n:
the number of equally spaced points at which to estimate the density.
window:
character string giving the type of window used in the computations. One of: "cosine", "gaussian", "rectangular", "triangular" (one character is sufficient).
na.rm:
logical flag: should missing values be removed before estimation?
width:
width of the window. The default is the width of one histogram bar when log(length(x), base=2) + 1 bars are used to cover the range of x. For the Gaussian window, the standard deviation is width/4. For the other windows, width is the length of the interval on which the window is non-zero.
from, to:
the n estimated values of density are equally spaced between from and to. The default is the range of the data extended by width*cut.
cut:
the fraction of the window width by which the range of the x values is extended. The default is 0.75 for the Gaussian window and 0.5 for the other windows.
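
The defaults for width, from, to and cut can be sketched as follows. This is an illustration in Python, not the S implementation: the names default_width and default_grid are hypothetical, the bar count log2(n) + 1 is assumed to be used without rounding, and the range is assumed to be extended by width*cut at each end.

```python
import math

def default_width(x):
    # Assumption: the bar count log2(n) + 1 is used unrounded.
    bars = math.log2(len(x)) + 1
    return (max(x) - min(x)) / bars

def default_grid(x, n=50, window="g", width=None, cut=None):
    # Fill in the documented defaults when width or cut is not supplied.
    if width is None:
        width = default_width(x)
    if cut is None:
        cut = 0.75 if window.startswith("g") else 0.5
    # Assumption: the data range is extended by width*cut on both sides.
    lo = min(x) - cut * width
    hi = max(x) + cut * width
    step = (hi - lo) / (n - 1)
    # n equally spaced points from lo to hi inclusive.
    return [lo + i * step for i in range(n)]
```

For eight observations spanning a range of 8, this gives 4 bars and hence a default width of 2.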

VALUE:
list with two components, x and y, suitable for giving as an argument to one of the plotting functions.
x:
vector of n points at which the density is estimated.
y:
density estimate at each x point.

DETAILS:
Missing values are excluded if na.rm is TRUE, and they cause an error otherwise.

These are kernel estimates. For each x value in the output, the window is centered on that x and the heights of the window at the data points are summed. This sum, after a normalization, is the corresponding y value in the output. Results are currently computed to single-precision accuracy only.
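
The computation described above can be sketched as follows. This is a minimal Python illustration, not the S source: the function names are hypothetical, each window is assumed to be scaled so that it integrates to 1, the normalization is assumed to be division by the number of observations, and the exact shape of the cosine window is an assumption (a raised cosine on an interval of length width).

```python
import math

def window_height(u, width, window="g"):
    """Height at offset u of a window scaled to integrate to 1."""
    kind = window[0]
    half = width / 2.0
    if kind == "g":
        # Gaussian window: standard deviation is width/4.
        sd = width / 4.0
        return math.exp(-0.5 * (u / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
    if abs(u) >= half:
        # The other windows are zero outside an interval of length width.
        return 0.0
    if kind == "r":
        return 1.0 / width
    if kind == "t":
        return (1.0 - abs(u) / half) / half
    if kind == "c":
        # Assumed form: raised cosine on (-width/2, width/2).
        return (1.0 + math.cos(2.0 * math.pi * u / width)) / width
    raise ValueError("unknown window: " + window)

def kernel_density(data, grid, width, window="g"):
    # Center the window on each grid point, sum its heights at the
    # data points, then normalize by the number of observations.
    n = len(data)
    return [sum(window_height(g - d, width, window) for d in data) / n
            for g in grid]
```

With this normalization the y values integrate to approximately 1 over a grid that covers the data plus the window width.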


BACKGROUND:
Density estimation is essentially a smoothing operation. Inevitably there is a trade-off between bias in the estimate and the estimate's variability: wide windows produce smooth estimates that may hide local features of the density, while narrow windows follow local features more closely but give more variable estimates.
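
The effect of the window width can be seen on a small made-up data set with two clusters. The sketch below is in Python rather than S, uses hypothetical function names and invented data, and assumes a Gaussian window with standard deviation width/4. With the narrow window the two clusters remain separate peaks; with the wide window they merge into a single peak.

```python
import math

def gaussian_density(data, grid, width):
    # Gaussian window with standard deviation width/4, normalized
    # so that the estimate integrates to 1.
    sd = width / 4.0
    norm = len(data) * sd * math.sqrt(2.0 * math.pi)
    return [sum(math.exp(-0.5 * ((g - d) / sd) ** 2) for d in data) / norm
            for g in grid]

def count_peaks(y):
    # Count interior grid points that rise from the left neighbor
    # and do not rise further to the right neighbor.
    return sum(1 for i in range(1, len(y) - 1) if y[i - 1] < y[i] >= y[i + 1])

# Invented data: two clusters centered at 0 and 6.
data = [-0.5, 0.0, 0.5, 5.5, 6.0, 6.5]
grid = [-6.0 + 0.25 * i for i in range(73)]  # -6 to 12 in steps of 0.25

narrow = gaussian_density(data, grid, width=2.0)
wide = gaussian_density(data, grid, width=24.0)
print(count_peaks(narrow), count_peaks(wide))
```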

REFERENCES:
Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall, London.

Wegman, E. J. (1972). Nonparametric probability density estimation. Technometrics, 14, 533-546.


SEE ALSO:
hist, ksmooth.

EXAMPLES:
plot(density(x), type="b")

den.co2 <- density(co2, width=4)
hist(co2)
# multiply density by length of series and width of histogram bar
den.co2$y <- den.co2$y*length(co2)*2
lines(den.co2)