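This function applies the types of bootstrap resampling that have been suggested for right-censored data. It can also do model-based resampling using a Cox regression model.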
censboot(data, statistic, R, F.surv, G.surv, strata = matrix(1,n,2),
sim = "ordinary", cox = NULL, index = c(1, 2), ...,
parallel = c("no", "multicore", "snow"),
ncpus = getOption("boot.ncpus", 1L), cl = NULL)
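data: The data frame or matrix containing the data. It must have at least two columns, one containing the times and the other the censoring indicators. It may have as many other columns as desired (although efficiency is reduced for large numbers of columns), except when sim = "weird", in which case it should have only two columns: the times and the censoring indicators.
statistic: A function that operates on the data frame and returns the required statistic. Its first argument must be the data. Any other arguments it requires can be passed using the ... argument.
R: The number of bootstrap replicates.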
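F.surv: An object returned from a call to survfit giving the survivor function for the data.
G.surv: Another object returned from a call to survfit, but with the censoring indicators reversed so that it estimates the censoring distribution.
strata: The strata used in the calls to survfit.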
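sim: The simulation type. Possible types are "ordinary" (case resampling), "model" (model-based resampling), "weird" (the weird bootstrap) and "cond" (the conditional bootstrap).
cox: An object returned from coxph.
index: A vector of length two giving the positions of the columns in data that hold the times and the censoring indicators, respectively.
...: Other named arguments, which are passed unchanged to statistic each time it is called.
parallel, ncpus, cl: See the help for boot.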
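Sections 3.5 and 7.3 of Davison and Hinkley (1997) describe the various types of resampling. The simplest is case resampling, which simply resamples with replacement from the observations.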
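Case resampling can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the code that censboot runs internally: the helper case_resample and the toy data frame toy are hypothetical names introduced here, and the sketch resamples whole rows within each stratum so that every time stays paired with its censoring indicator.

```r
# Hypothetical helper (not part of boot): resample whole rows with
# replacement, separately within each stratum.
case_resample <- function(data, strata = rep(1, nrow(data))) {
  rows_by_stratum <- split(seq_len(nrow(data)), strata)
  idx <- unlist(lapply(rows_by_stratum, function(i) {
    # Index through sample.int() so a stratum of size one is handled correctly
    i[sample.int(length(i), length(i), replace = TRUE)]
  }), use.names = FALSE)
  data[idx, , drop = FALSE]
}

# Toy right-censored data: cens = 1 is an observed failure, cens = 0 is censored
toy <- data.frame(time  = c(5, 8, 12, 3, 9, 15),
                  cens  = c(1, 0, 1, 1, 0, 1),
                  group = c(1, 1, 1, 2, 2, 2))

set.seed(1)
boot_sample <- case_resample(toy, strata = toy$group)
```

Each resample has the same number of rows per stratum as the original data, which is why the stratum sizes are passed to sample.int() rather than the overall sample size.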
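The conditional bootstrap simulates failure times from the estimate of the survival distribution. Then, for each observation, the simulated censoring time is equal to the observed censoring time if the observation was censored, and is generated from the estimated censoring distribution, conditional on being greater than the observed failure time, if the observation was uncensored. If the largest value is censored, it is given a nominal failure time of Inf; conversely, if it is uncensored, it is given a nominal censoring time of Inf.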
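If a Cox regression model has been fitted to the data and supplied, then the failure times are generated from the survival distribution using that model. In this case the censoring times can be simulated either from the estimated censoring distribution (sim = "model") or from the conditional censoring distribution described in the previous paragraph (sim = "cond").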
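The weird bootstrap holds the censored observations, and also the observed failure times, fixed. It then generates the number of events at each failure time using a binomial distribution with mean 1 and denominator equal to the number of failures that could have occurred at that time in the original data set. In our implementation we insist that there is at least one simulated event in each stratum for every bootstrap data set.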
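The event-count step of the weird bootstrap might be sketched as below. This is an illustration only and makes several assumptions: the helper weird_event_counts is a hypothetical name, at_risk is assumed to hold the number of subjects who could still have failed at each observed failure time within one stratum, and "mean 1" is read as a success probability of 1 / at_risk. It is not the internal code of censboot.

```r
# Hypothetical helper (not part of boot): simulate the number of events at
# each observed failure time for a single stratum.
weird_event_counts <- function(at_risk) {
  repeat {
    # Binomial with denominator at_risk and probability 1 / at_risk has mean 1
    events <- rbinom(length(at_risk), size = at_risk, prob = 1 / at_risk)
    # Redraw until the stratum contains at least one simulated event
    if (sum(events) > 0) return(events)
  }
}

set.seed(2)
at_risk <- c(10, 8, 5, 3, 1)  # toy risk-set sizes at successive failure times
events <- weird_event_counts(at_risk)
```

Because the counts at each failure time are random, the total number of simulated events need not equal the number of observed failures.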
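When strata are involved and sim is either "model" or "cond", the situation becomes more difficult. Because the strata for the survival distribution and the censoring distribution differ, it is possible for both the simulated failure time and the simulated censoring time of some observations to be infinite. To see this, consider an observation in stratum 1F for the survival distribution and stratum 1G for the censoring distribution. If the largest value in stratum 1F is censored, it is given a nominal failure time of Inf; likewise, if the largest value in stratum 1G is uncensored, it is given a nominal censoring time of Inf, so both simulated times for that observation could be infinite.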
When parallel = "snow" and cl is not supplied, library(survival) is run in each of the worker processes.
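t0: The value of statistic when applied to the original data.
R: The number of bootstrap replicates performed.
sim: The simulation type used. This will usually be the input value of sim.
data: The data used for the bootstrap. This will generally be the input value of data.
strata: The strata used in the resampling.
call: The original call to censboot.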
# Example 3.9 of Davison and Hinkley (1997) does a bootstrap on some
# remission times for patients with a type of leukaemia. The patients
# were divided into those who received maintenance chemotherapy and
# those who did not. Here we are interested in the median remission
# time for the two groups.
library(boot)
library(survival) # for Surv, survfit and coxph
data(aml, package = "boot") # not the version in survival.
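aml.fun <- function(data) {
     surv <- survfit(Surv(time, cens) ~ group, data = data)
     out <- NULL
     st <- 1
     # surv$strata holds the number of time points in each group
     for (s in 1:length(surv$strata)) {
          inds <- st:(st + surv$strata[s]-1)
          # first time at which the estimated survival drops to 0.5 or below
          md <- min(surv$time[inds[1-surv$surv[inds] >= 0.5]])
          st <- st + surv$strata[s]
          out <- c(out, md)
     }
     out
}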
aml.case <- censboot(aml, aml.fun, R = 499, strata = aml$group)
# Now we will look at the same statistic using the conditional
# bootstrap and the weird bootstrap. For the conditional bootstrap
# the survival distribution is stratified but the censoring
# distribution is not.
aml.s1 <- survfit(Surv(time, cens) ~ group, data = aml)
aml.s2 <- survfit(Surv(time-0.001*cens, 1-cens) ~ 1, data = aml)
aml.cond <- censboot(aml, aml.fun, R = 499, strata = aml$group,
     F.surv = aml.s1, G.surv = aml.s2, sim = "cond")
# For the weird bootstrap we must redefine our function slightly since
# the data will not contain the group number.
aml.fun1 <- function(data, str) {
     # column 1 holds the times, column 2 the censoring indicators
     surv <- survfit(Surv(data[, 1], data[, 2]) ~ str)
     out <- NULL
     st <- 1
     for (s in 1:length(surv$strata)) {
          inds <- st:(st + surv$strata[s] - 1)
          md <- min(surv$time[inds[1-surv$surv[inds] >= 0.5]])
          st <- st + surv$strata[s]
          out <- c(out, md)
     }
     out
}
aml.wei <- censboot(cbind(aml$time, aml$cens), aml.fun1, R = 499,
     strata = aml$group, F.surv = aml.s1, sim = "weird")
# Now for an example where a Cox regression model has been fitted to
# the data. We will look at the melanoma data of Example 7.6 from
# Davison and Hinkley (1997). The fitted model assumes that there
# is a different survival distribution for the ulcerated and
# non-ulcerated groups but that the thickness of the tumour has a
# common effect. We will also assume that the censoring distribution
# is different in different age groups. The statistic of interest
# is the linear predictor. This is returned as the values at a
# number of equally spaced points in the range of interest.
data(melanoma, package = "boot")
library(splines) # for ns
mel.cox <- coxph(Surv(time, status == 1) ~ ns(thickness, df=4) + strata(ulcer),
                 data = melanoma)
mel.surv <- survfit(mel.cox)
agec <- cut(melanoma$age, c(0, 39, 49, 59, 69, 100))
mel.cens <- survfit(Surv(time - 0.001*(status == 1), status != 1) ~
                    strata(agec), data = melanoma)
mel.fun <- function(d) {
     t1 <- ns(d$thickness, df=4)
     cox <- coxph(Surv(d$time, d$status == 1) ~ t1+strata(d$ulcer))
     ind <- !duplicated(d$thickness)
     u <- d$thickness[!ind]
     eta <- cox$linear.predictors[!ind]
     sp <- smooth.spline(u, eta, df=20)
     th <- seq(from = 0.25, to = 10, by = 0.25)
     predict(sp, th)$y
}
mel.str <- cbind(melanoma$ulcer, agec)
# this is slow!
mel.mod <- censboot(melanoma, mel.fun, R = 499, F.surv = mel.surv,
     G.surv = mel.cens, cox = mel.cox, strata = mel.str, sim = "model")
# To plot the original predictor and a 95% pointwise envelope for it
mel.env <- envelope(mel.mod)$point
th <- seq(0.25, 10, by = 0.25)
plot(th, mel.env[1, ], ylim = c(-2, 2),
     xlab = "thickness (mm)", ylab = "linear predictor", type = "n")
lines(th, mel.mod$t0, lty = 1)
matlines(th, t(mel.env), lty = 2)
Angelo J. Canty. Parallel extensions by Brian Ripley
Andersen, P.K., Borgan, O., Gill, R.D. and Keiding, N. (1993) Statistical Models Based on Counting Processes. Springer-Verlag.
Burr, D. (1994) A comparison of certain bootstrap confidence intervals in the Cox model. Journal of the American Statistical Association, 89, 1290-1302.
Davison, A.C. and Hinkley, D.V. (1997) Bootstrap Methods and Their Application. Cambridge University Press.
Efron, B. (1981) Censored data and the bootstrap. Journal of the American Statistical Association, 76, 312-319.
Hjort, N.L. (1985) Bootstrapping Cox's regression model. Technical report NSF-241, Dept. of Statistics, Stanford University.
