predict.loess(object, newdata, se.fit = FALSE)
For one predictor, newdata can be a vector rather than a data frame. For two or more predictors, the names of newdata must include the names of the predictors used in formula, as they appear in the database from which they come. For example, if the right side of formula is log(E)*C, then there must be names C and E in newdata. Note that E in this example is supplied on its original scale, not on the log scale; predict.loess applies the transformation itself.
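The point about original-scale names can be sketched in open-source R, whose stats package provides a loess and predict method modeled on the S version. This is a minimal sketch under assumptions that are not part of this page: the built-in cars data set stands in for the user's data, and speed plays the role of E.

```r
# Assumption: R's stats::loess, with the built-in 'cars' data as a stand-in.
# The formula uses log(speed), but newdata supplies plain 'speed'.
fit <- loess(dist ~ log(speed), data = cars, span = 0.75, degree = 2)

# Values are given on the original scale; the log is applied internally
# when the model frame for newdata is built.
new.points <- data.frame(speed = c(10, 20))
pred <- predict(fit, new.points)
pred
```

Supplying a column named log(speed), or pre-transformed values under the name speed, would not give the intended evaluation points.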
For two or more predictors, there are two data structures that can be given to newdata. The first is a plain data frame; the result is a vector whose length is equal to the number of rows of newdata, and the element of the vector in position i is the evaluation of the surface at row i of newdata. A second data structure can be used when the evaluation points form a grid. In this case, newdata is the result of the function expand.grid. If se.fit=FALSE, the result of predict.loess is a numeric array whose number of dimensions is equal to the number of predictors, with one dimension for each grid margin; if se.fit=TRUE, then the components fit and se.fit are both such arrays.
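The difference between the two data structures can be illustrated as follows. This is a sketch in R's stats::loess rather than S-PLUS itself; the mtcars data set and the names wt.marginal, hp.marginal, on.grid and flat are illustrative assumptions, not part of this page.

```r
# Assumption: R's stats::loess, with the built-in 'mtcars' data as a stand-in.
fit <- loess(mpg ~ wt + hp, data = mtcars, span = 0.75, degree = 2)

wt.marginal <- seq(2, 5, length.out = 4)
hp.marginal <- seq(80, 200, length.out = 3)

# Grid form: newdata comes from expand.grid, so the result is an array
# with one dimension per predictor (here 4 by 3).
grid <- expand.grid(wt = wt.marginal, hp = hp.marginal)
on.grid <- predict(fit, grid)
dim(on.grid)

# Plain data frame form: the same 12 points, but the result is a vector
# with one element per row of newdata.
plain <- data.frame(wt = grid$wt, hp = grid$hp)
flat <- predict(fit, plain)
length(flat)
```

The values are the same in both cases; only the shape of the returned object differs.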
The computations of predict.loess that produce the component se.fit are much more costly than those that produce fit, so the number of points at which standard errors are computed should be modest compared to those at which we do evaluations. Often this means calling predict.loess twice, once at a large number of points with se.fit equal to FALSE to get a thorough description of the surface, and once at a small number of points to get standard-error information.
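The two-call pattern described above might look like the following sketch. It is written for R's stats::loess, where, as far as we are aware, the argument is spelled se rather than se.fit (the returned component is still named se.fit). The cars data set and the point counts 200 and 10 are arbitrary illustrative choices.

```r
# Assumption: R's stats::loess; in R the standard-error switch is 'se',
# whereas this page documents it as 'se.fit'.
fit <- loess(dist ~ speed, data = cars, span = 0.75, degree = 2)
speed.range <- range(cars$speed)

# Call 1: many points, no standard errors, to describe the surface.
dense <- data.frame(speed = seq(speed.range[1], speed.range[2], length.out = 200))
surface <- predict(fit, dense)

# Call 2: few points, with standard errors.
sparse <- data.frame(speed = seq(speed.range[1], speed.range[2], length.out = 10))
with.se <- predict(fit, sparse, se = TRUE)

length(surface)        # number of surface evaluations
length(with.se$se.fit) # number of standard errors computed
```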
Suppose the computation method for loess surfaces is interpolate, the default for the argument surface. Then the evaluation values of a numeric predictor must lie within the range of the values of that predictor used in the fit. Each evaluation value of a predictor that is a factor must be one of the levels of the factor. For any evaluation point for which these conditions are not met, an NA is returned.
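The range restriction can be seen in a small sketch. It again assumes R's stats::loess, where the surface choice is passed through loess.control rather than as a direct argument; the cars data set (speed from 4 to 25) is an illustrative stand-in.

```r
# Assumption: R's stats::loess with the built-in 'cars' data.
# speed = 10 lies inside the fitted range; speed = 30 lies outside it.
new.points <- data.frame(speed = c(10, 30))

# Default surface ("interpolate"): the out-of-range point yields NA.
fit.interp <- loess(dist ~ speed, data = cars)
p.interp <- predict(fit.interp, new.points)
is.na(p.interp)

# surface = "direct": the fit is computed exactly at each point,
# so a value is also returned outside the range.
fit.direct <- loess(dist ~ speed, data = cars,
                    control = loess.control(surface = "direct"))
p.direct <- predict(fit.direct, new.points)
is.na(p.direct)
```

A value returned by the direct method outside the data range is an extrapolation and should be read with corresponding caution.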
If se.fit is TRUE or the computation method for loess surfaces is direct, then predict.loess must use the data originally used to fit the loess model to compute the predictions. If you fit the loess model using the data argument, then the data set given by data should not be changed between the fit and the prediction. If you attached a data frame to supply data for the model, then that same data frame must be attached to compute the predicted values.
attach(ethanol)
ethanol.cp <- loess(formula = NOx ~ C * E, span = 1/2, degree = 2,
                    parametric = "C", drop.square = "C")

# Example 1 - evaluation at 5 points
# newdata is a data frame with variables C and E
predict(ethanol.cp, newdata)
[1] 0.2815825 2.5971411 3.0667178 3.2555778 1.0637788

# Example 2 - evaluation at 9 grid points
C.marginal <- seq(min(C), max(C), length = 3)
E.marginal <- seq(min(E), max(E), length = 3)
CE.grid <- expand.grid(list(C = C.marginal, E = E.marginal))
predict(ethanol.cp, CE.grid)
# Gives the following output:
          E=0.5350 E=0.8835  E=1.2320
C= 7.50 -0.1039991 3.399360 0.6823181
C=12.75  0.2057837 3.850801 0.6481270
C=18.00  0.5155665 4.302243 0.6139359

# Example 3 - evaluate and compute estimates of standard errors
gas.m <- loess(formula = NOx ~ E, span = 2/3, degree = 2)
predict(gas.m, newdata = seq(min(E), max(E), length = 5), se.fit = TRUE)$se.fit
# Gives the following output:
[1] 0.18423300 0.07098989 0.076443326 0.07503963 0.11885234