lmsreg(x, y, samples=<<see below>>, intercept=T, wt=T, diagnostic=F, yname=NULL)
Since specifying samples="all" can easily be a request for millions of samples, the maximum number of non-singular samples is limited to 30,000. This limit can be changed by editing the function.
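To see how quickly samples="all" grows, the following minimal sketch counts the distinct subsamples for a few problem sizes. It assumes that each subsample contains as many observations as there are coefficients to estimate; the name n.subsamples is hypothetical and is not part of lmsreg.

```r
# Number of distinct subsamples of size p that can be drawn from n observations.
# ASSUMPTION: one subsample has p observations, p = number of coefficients.
n.subsamples <- function(n, p) choose(n, p)

n.subsamples(21, 4)    # a small problem: a few thousand subsamples
n.subsamples(50, 5)    # already more than two million
n.subsamples(100, 6)   # far beyond the 30,000 limit
```

Only the smallest of these cases falls under the 30,000 limit.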
The lmsreg function has a built-in random number generator that starts with the same seed on each call to lmsreg. Thus identical calls draw the same subsamples and hence return the same answer. The default value of 3000 random samples will give greater than 99% probability of a 50% breakdown point for problems with 9 or fewer explanatory variables. The probability of a high breakdown drops sharply as the number of explanatory variables grows beyond 10.
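The effect of the number of random samples can be illustrated with the usual subsampling argument: if a fraction eps of the data is contaminated, the chance that at least one of m random subsamples of size p contains only good points is 1 - (1 - (1 - eps)^p)^m. The sketch below evaluates this expression. The function name prob.clean is hypothetical, and treating p as the subsample size used by lmsreg is an assumption, not a description of its internals.

```r
# Probability that at least one of m random subsamples of size p is free of
# contaminated observations when a fraction eps of the data is contaminated.
# ASSUMPTION: p is the subsample size; m = 3000 mirrors the stated default.
prob.clean <- function(p, m = 3000, eps = 0.5) {
    1 - (1 - (1 - eps)^p)^m
}

p <- c(3, 6, 9, 12, 15)
probs <- prob.clean(p)
names(probs) <- p
round(probs, 4)
```

With eps = 0.5 the value stays above 0.99 at p = 9 and falls off quickly for larger p.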
Least median of squares regression has a very high breakdown point of almost 50%. That is, almost half of the data can be corrupted in an arbitrary fashion and the least median of squares estimates continue to follow the majority of the data. At the present time this property is virtually unique among the robust regression methods that are publicly available, including the methods in rreg.
There are, however, two disadvantages of least median of squares. First, there is no known feasible algorithm to compute the exact least median of squares estimate in most problems. The lmsreg function therefore works from a limited number of subsamples and, for all but the smallest problems, can only yield a high probability of a 50% breakdown point. Second, least median of squares is statistically very inefficient. One remedy is to use lsfit with the weights returned from lmsreg in a weighted least squares regression. This procedure will give high breakdown estimates that are also quite efficient. The test statistics derived from the least squares regression will not be strictly correct, but can be used informally.
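The weighted least squares remedy might look like the following sketch. It assumes that lmsreg, called with wt=T, returns its weights in a component named wt; check the returned object before relying on that name. The object name stackwls is hypothetical.

```r
# Step 1: high breakdown fit; wt=T requests observation weights.
stackfit <- lmsreg(stack.x, stack.loss, wt=T)

# Step 2: weighted least squares using the weights from step 1.
# ASSUMPTION: the weights are stored in the component "wt".
stackwls <- lsfit(stack.x, stack.loss, wt=stackfit$wt)

# Coefficients of the reweighted fit.
stackwls$coef
```

Observations that receive zero weight have no influence on the second fit.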
Rousseeuw, P. J. and Leroy, A. M. (1987). Robust Regression and Outlier Detection. New York: Wiley.
stacklms <- lmsreg(stack.x, stack.loss, samples="all")