Compute a Survival Curve for Censored Data

DESCRIPTION:
Computes an estimate of a survival curve for censored data using either the Kaplan-Meier or the Fleming-Harrington method. This function is deprecated; use survfit instead.

USAGE:
surv.fit(time, status, strata=rep(1, length(time)), na.strata=F, type=
         "kaplan-meier", error="greenwood", conf.level=.95, conf.type=
         "log", wt=rep(1, length(time)), coxreg.list, x, predict.at=
         <<see below>>)

REQUIRED ARGUMENTS:
time:
vector of time values; all values must be greater than or equal to zero.
Missing values (NA) are allowed.
status:
vector of status values. Typically, the values are 0 or 1, in which case 0 means censored and 1 means uncensored (dead). The values can also be 1 and 2, in which case 1 is subtracted from all of the values. If the only value in status is 1, then this is interpreted as meaning that all values are uncensored.
Missing values (NA) are allowed. This must have the same length as time.
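
The status coding rules above can be sketched as a small helper. This is only an illustration of the described behavior, not the code surv.fit itself runs; the name normalize.status is hypothetical, and the error raised for any other coding is an assumption.

```r
# Hypothetical helper: map a status vector to 0 = censored, 1 = dead.
normalize.status <- function(status) {
    vals <- unique(status[!is.na(status)])
    if (all(vals %in% c(0, 1))) {
        status            # already 0/1; a lone 1 means all uncensored
    } else if (all(vals %in% c(1, 2))) {
        status - 1        # 1/2 coding: subtract 1 from every value
    } else {
        stop("status must be coded 0/1 or 1/2")   # assumed behavior
    }
}

s1 <- normalize.status(c(1, 2, 2, NA, 1))   # 1/2 coding, NA is kept
s2 <- normalize.status(c(1, 1, 1))          # only 1s: all uncensored
```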

OPTIONAL ARGUMENTS:
strata:
an optional vector that will be used to divide the subjects into disjoint groups. Each group generates a survival curve.
Missing values (NA) are allowed.
na.strata:
if TRUE, then missing values in the strata variable are counted as a separate group. If FALSE, these subjects are ignored.
type:
either "kaplan-meier" or "fleming-harrington" (only the first character is necessary). The default is "fleming-harrington" if coxreg.list is given, and "kaplan-meier" otherwise.
error:
either "greenwood" for the Greenwood formula, "tsiatis" for the Tsiatis formula, or "cox", in which case coxreg.list and x must be given (only the first character is necessary). The default is "cox" when coxreg.list is given, and "greenwood" otherwise.
conf.level:
the level for a two-sided confidence interval on the survival curve(s) based on a Normal approximation. This must be a number between 0 and 1.
conf.type:
one of "none", "plain", "log", or "log-log". Only enough of the string to uniquely identify it is necessary. The first option causes confidence intervals not to be generated. The second causes the standard intervals "curve +- k*se(curve)", where k is determined from conf.level. The "log" option causes intervals based on the cumulative hazard or log(survival). The last option, "log-log", bases intervals on the log hazard or log(-log(survival)); intervals of this last type will never extend past 0 or 1.
wt:
vector of risk weights (relative to 1).
Missing values (NA) are allowed. This must have the same length as time.
coxreg.list:
a list such as that returned by the coxreg function. Arguments x and predict.at are only used when this is given. In this case, type is always set to "fleming-harrington" and error is always "cox".
x:
the vector or matrix of explanatory variables used in the Cox model's fit.
predict.at:
a vector as long as the number of columns in x. The curve produced will be representative of a cohort whose covariate values are equal to this vector. The default is to use the column means of x.
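
As an illustration of how predict.at enters the Cox-model case, the sketch below computes the subject weights exp(sum(coef*(x-predict.at))) given under DETAILS, using the default predict.at (the column means of x). The coefficient and covariate values are made up for the example; in practice coef would come from the fitted Cox model.

```r
# Made-up values for illustration only.
coef <- c(0.5, -0.2)                       # one coefficient per column of x
x <- matrix(c(1, 0, 1, 0,                  # column 1: a 0/1 covariate
              60, 50, 70, 40), ncol = 2)   # column 2: a continuous covariate

predict.at <- apply(x, 2, mean)            # default: column means of x
centered <- sweep(x, 2, predict.at)        # subtract predict.at from each row
risk.wt <- exp(drop(centered %*% coef))    # exp(sum(coef * (x - predict.at)))
```

A subject whose covariates equal predict.at gets weight exp(0) = 1, which is why the resulting curve is representative of that cohort.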

VALUE:
a list of class "surv.fit" representing the survival curve, with the following components:
time:
the unique time values, in sorted order within strata. Note that this may be shorter than the input time if there are duplicates.
n.risk:
vector of the number at risk for each timepoint. If weights are used, it will be the sum of the weights.
n.event:
vector of the number of events for each timepoint.
surv:
vector of the estimates of survival for each timepoint.
std.err:
vector of the estimated standard errors of the cumulative hazard, or -log(survival), for each timepoint. According to Link (1984), this is the most appropriate scale for confidence intervals.
strata:
a vector of strata values. This is present only if strata was specified in the input. If present, the output will be sorted by strata, and it will consist of multiple curves "end to end".
upper:
upper confidence limit. Truncated to be <=1, if necessary.
lower:
lower confidence limit. Truncated to be >=0, if necessary.
conf.type:
same as its input value.
conf.level:
same as its input value.
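
The relation between surv, std.err, and the confidence limits can be sketched as follows for the three interval types. This is a hedged reconstruction rather than the function's actual code: the surv and std.err values are made up, the "plain" case assumes the delta-method approximation se(curve) = surv * std.err, and the "log-log" case assumes se(log(-log(surv))) = std.err / (-log(surv)).

```r
surv <- c(0.9, 0.7, 0.5)           # made-up survival estimates
std.err <- c(0.10, 0.18, 0.26)     # made-up standard errors of -log(surv)
conf.level <- 0.95
k <- qnorm(1 - (1 - conf.level) / 2)   # Normal quantile for a two-sided interval

# "plain": curve +- k * se(curve), with se(curve) taken as surv * std.err
plain.lower <- surv - k * surv * std.err
plain.upper <- surv + k * surv * std.err

# "log": symmetric on the log(survival) scale
log.lower <- exp(log(surv) - k * std.err)
log.upper <- exp(log(surv) + k * std.err)

# "log-log": symmetric on the log(-log(survival)) scale
H <- -log(surv)
ll.lower <- exp(-exp(log(H) + k * std.err / H))
ll.upper <- exp(-exp(log(H) - k * std.err / H))

# Truncation to [0, 1]; only "plain" and "log" can actually need it
upper <- pmin(plain.upper, 1)
lower <- pmax(plain.lower, 0)
```

With these numbers the untruncated "plain" and "log" upper limits at the first timepoint exceed 1, while the "log-log" limits stay strictly inside (0, 1).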

DETAILS:
The estimates actually used are the Kalbfleisch-Prentice estimate (Kalbfleisch and Prentice, 1980, p.86) and the Tsiatis/Link/Breslow estimate, which reduce to the Kaplan-Meier and Fleming-Harrington estimates, respectively, when the weights are unity. When curves are fit by a Cox model, subject weights of exp(sum(coef*(x-predict.at))) are used (ignoring any value for wt input by the user), and a correction is also made to the variance based on the variance of coef.
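
For the unit-weight case, the two estimates can be computed by hand as in the sketch below. The data are made up, and the Fleming-Harrington curve is shown in its simplest form, exp(-cumsum(d/n)), with tied deaths entering as a single increment; the weighted Kalbfleisch-Prentice and Tsiatis/Link/Breslow versions are not reproduced here.

```r
time   <- c(2, 3, 3, 5, 7, 8, 8, 9)    # made-up follow-up times
status <- c(1, 1, 0, 1, 0, 1, 1, 0)    # 1 = dead, 0 = censored

utime   <- sort(unique(time))                                   # distinct times
n.risk  <- sapply(utime, function(u) sum(time >= u))            # at risk at u
n.event <- sapply(utime, function(u) sum(status[time == u]))    # deaths at u

km <- cumprod(1 - n.event / n.risk)      # Kaplan-Meier product-limit estimate
fh <- exp(-cumsum(n.event / n.risk))     # Fleming-Harrington estimate
```

Because exp(-x) >= 1 - x, the Fleming-Harrington curve is never below the Kaplan-Meier curve at any timepoint.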

The Greenwood formula for the variance is a sum of terms d/(n*(n-m)), where d is the number of deaths at a given time point, n is the sum of wt for all individuals still at risk at that time, and m is the sum of weights for the deaths at that time. The justification is based on a binomial argument when weights are all equal to one; extension to the weighted case is ad hoc.
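
A worked instance of the sum described above, for unit weights (so that m equals d), might look like the following. The counts are made up, and the sketch assumes n > m at every timepoint, since a term with n equal to m would be infinite.

```r
n <- c(8, 7, 5, 4, 3, 1)    # made-up numbers at risk at each distinct time
d <- c(1, 1, 1, 0, 2, 0)    # deaths at each distinct time
m <- d                      # with unit weights, the weight of the deaths is d

var.cumhaz <- cumsum(d / (n * (n - m)))   # running Greenwood sum
std.err <- sqrt(var.cumhaz)               # standard error of -log(survival)
```

For example, after the second timepoint the sum is 1/(8*7) + 1/(7*6) = 1/24.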


REFERENCES:
Kalbfleisch, J. D. and Prentice, R. L. (1980). The Statistical Analysis of Failure Time Data. Wiley, New York.

Link, C. L. (1984). Confidence intervals for the survival function using Cox's proportional hazards model with covariates. Biometrics 40, 601-610.

Tsiatis, A. (1981). A large sample study of the estimate for the integrated hazard function in Cox's regression model for survival data. Annals of Statistics 9, 93-108.


SEE ALSO:
coxreg, print.surv.fit, plot.surv.fit, surv.diff.

EXAMPLES:
fit <- surv.fit(cancer$time, cancer$status, type="fleming")
print(fit)                             # print the results in matrix form
plot(fit)                              # plot of the survival function(s)
                                       # including 95% C.I.