S-PLUS help

Median Absolute Deviation

DESCRIPTION:: Returns a robust scale estimate of the data. By default the median is taken as the center of the data and the estimate is scaled to be a consistent estimator of the standard deviation at the Gaussian model.

USAGE:

mad(y, center=median(y), constant=1.4826, na.rm=F, low=F)
scale.tau(y, center=median(y), weights=<<see below>>,
  init.scale=<<see below>>, tuning=1.95, na.rm=F)
scale.a(y, center=median(y), weights=<<see below>>,
  init.scale=<<see below>>, tuning=3.85, na.rm=F)

REQUIRED ARGUMENTS:

y:: vector of numeric data.

Missing values (NA) are allowed.

OPTIONAL ARGUMENTS:

center:: location parameter to be subtracted from each element of y before computing the scale estimate.
weights:: vector the same length as y of observation weights. The default is to give equal weight to all observations.
init.scale:: the value used as the initial scale estimate. The default is the Gaussian-consistent MAD computed with low=TRUE.
constant:: number that multiplies the median of the absolute values. The default value makes the estimate consistent for the standard deviation at the Gaussian model.
na.rm:: logical flag: should missing values be removed before computations?
low:: logical flag: if TRUE, then the low median is used. If FALSE, the central median is used. (There is no difference for an odd number of datapoints.)
tuning:: tuning parameter for the tau- and A-estimates. Larger values make the estimate more efficient at the Gaussian distribution, but more susceptible to bias.
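
The low and constant arguments can be illustrated outside S-PLUS. The following is a minimal Python sketch, not S-PLUS code; it assumes that the "low median" is the smaller of the two middle order statistics (which is what Python's median_low returns) and that the default constant is the reciprocal of the 0.75 quantile of the standard Gaussian distribution.

```python
from statistics import NormalDist, median, median_low

# Even number of points: the two middle values are 3 and 5.
y = [1, 3, 5, 100]
central = median(y)      # average of the two middle values
low = median_low(y)      # the smaller of the two middle values

# Odd number of points: both definitions pick the same value.
z = [1, 3, 5]
same = median(z) == median_low(z)

# For Gaussian data the median absolute deviation estimates
# sigma times the 0.75 normal quantile, so dividing by that
# quantile rescales the estimate to sigma.
constant = 1 / NormalDist().inv_cdf(0.75)

print(central, low, same, round(constant, 4))
```

Here central is 4.0, low is 3, and the rounded constant agrees with the default 1.4826 shown under USAGE.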

VALUE:: a number that is a robust estimate of scale. It is consistent for the standard deviation of Gaussian data. mad returns constant * median(abs(y - center)), scale.tau returns a Huber tau-estimate of scale, and scale.a returns a bisquare A-estimate of scale. With the default tuning parameters, the tau- and A-estimates are each 80% efficient at the Gaussian model, whereas the MAD is about 36% efficient.
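
The expression constant * median(abs(y - center)) can be sketched in a few lines. This is a Python illustration, not the S-PLUS source; the name mad_sketch is hypothetical, and the low and na.rm arguments are left out for brevity.

```python
from statistics import median

def mad_sketch(y, center=None, constant=1.4826):
    """constant * median(|y - center|); center defaults to median(y)."""
    if center is None:
        center = median(y)
    return constant * median(abs(v - center) for v in y)

y = [2, 4, 4, 5, 7, 9, 50]
raw = mad_sketch(y, constant=1)   # unscaled median absolute deviation
scaled = mad_sketch(y)            # rescaled for Gaussian consistency
print(raw, scaled)
```

The median of y is 5 and the absolute deviations are 3, 1, 1, 0, 2, 4 and 45, so the unscaled value is 2. The single large observation (50) does not affect it.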

DETAILS:

If na.rm is FALSE, then any missing values will cause the result to be NA. Missing values will be removed before computations are performed when na.rm is TRUE.
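
The missing-value rule above can be mimicked in Python, where NaN plays the role of NA. This is only a sketch of the described behaviour under that assumption; mad_na and na_rm are hypothetical names.

```python
import math
from statistics import median

def mad_na(y, constant=1.4826, na_rm=False):
    """MAD that propagates missing values unless na_rm is true."""
    has_missing = any(math.isnan(v) for v in y)
    if has_missing and not na_rm:
        return math.nan                      # any NA gives an NA result
    kept = [v for v in y if not math.isnan(v)]
    center = median(kept)
    return constant * median(abs(v - center) for v in kept)

y = [1.0, 2.0, math.nan, 4.0, 7.0]
propagated = mad_na(y)               # missing value propagates
removed = mad_na(y, na_rm=True)      # computed from 1, 2, 4, 7
print(propagated, removed)
```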

The MAD scale estimate has a 50% breakdown point and generally has very small bias compared with other scale estimators when there is "contamination" in the data. Tau-estimates and A-estimates also have a 50% breakdown point, but are more efficient for Gaussian data. The A-estimate that scale.a computes is redescending, so it is inappropriate if the scale estimate must never decrease as a datapoint moves farther from the center. However, the A-estimate is very good if all of the contamination is far from the "good" data.
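
The redescending behaviour can be seen in a small Python sketch. It uses the commonly cited textbook form of the bisquare (biweight) A-estimate; whether scale.a uses exactly this formula is an assumption, as is applying the tuning value 3.85 to the Gaussian-consistent MAD. For simplicity the initial scale uses the central median rather than the low median, and the name a_estimate_sketch is hypothetical.

```python
import math
from statistics import median, stdev

def a_estimate_sketch(y, tuning=3.85, constant=1.4826):
    """Bisquare A-estimate of scale (textbook form, assumed here)."""
    n = len(y)
    center = median(y)
    # Initial scale: Gaussian-consistent MAD (central median).
    init_scale = constant * median(abs(v - center) for v in y)
    num = 0.0
    den = 0.0
    for v in y:
        u = (v - center) / (tuning * init_scale)
        if abs(u) < 1:  # points with |u| >= 1 receive zero weight
            num += (v - center) ** 2 * (1 - u * u) ** 4
            den += (1 - u * u) * (1 - 5 * u * u)
    return math.sqrt(n * num) / abs(den)

good = [9.1, 9.6, 9.8, 10.0, 10.1, 10.3, 10.9]
near = a_estimate_sketch(good + [11.9])        # mild outlier
far = a_estimate_sketch(good + [100.0])        # gross outlier
very_far = a_estimate_sketch(good + [1000.0])  # even grosser outlier
print(near, far, very_far)
print(stdev(good + [100.0]), stdev(good + [1000.0]))
```

In this sketch the estimate with the mild outlier is larger than the estimate with the gross outlier, and moving the gross outlier from 100 to 1000 leaves the estimate unchanged because the point receives zero weight. The ordinary standard deviation, by contrast, grows without bound.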

Burns and Martin (1992) compare tau-estimates and A-estimates. A-estimates are also discussed in Hoaglin, Mosteller and Tukey (1983). Code for another class of scale estimate can be found in Croux and Rousseeuw (1992).

REFERENCES:

Burns, P. J. and Martin, R. D. (1992). One-sample robust scale estimation in contamination models. Submitted.

Croux, C. and Rousseeuw, P. J. (1992). Time-efficient algorithms for two highly robust estimators of scale. To appear in COMPSTAT 1992.

Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J. and Stahel, W. A. (1986). Robust Statistics: The Approach Based on Influence Functions. Wiley, New York.

Hoaglin, D. C., Mosteller, F. and Tukey, J. W., editors (1983). Understanding Robust and Exploratory Data Analysis. Wiley, New York.

SEE ALSO:: var for the square of the standard deviation.

EXAMPLES:

mad(corn.yield, constant=1)

mad(rnorm(200)) # approximately 1.
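
How close to 1 the last example comes varies from sample to sample. The following Python sketch (not S-PLUS code) repeats the experiment many times and compares the sampling variance of the MAD with that of the ordinary standard deviation, which gives a rough simulated relative efficiency. The seed, the number of replicates and the name mad_sketch are arbitrary choices made for illustration.

```python
import random
from statistics import median, stdev, variance

random.seed(0)  # arbitrary seed, only for reproducibility

def mad_sketch(y, constant=1.4826):
    """Gaussian-consistent MAD about the sample median."""
    center = median(y)
    return constant * median(abs(v - center) for v in y)

n, reps = 200, 1000
mads, sds = [], []
for _ in range(reps):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    mads.append(mad_sketch(sample))
    sds.append(stdev(sample))

mean_mad = sum(mads) / reps
# Relative efficiency: variance of the standard deviation
# divided by the variance of the MAD across replicates.
efficiency = variance(sds) / variance(mads)
print(round(mean_mad, 2), round(efficiency, 2))
```

The average MAD should be close to 1, and the simulated efficiency should land in the neighbourhood of the figure quoted under VALUE, subject to simulation error.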