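R smooth.f: Smooth Distributions on Data Points

The R function smooth.f is part of the boot package.

Description

This function uses the method of frequency smoothing to find a distribution on a data set that has a required value, theta, of the statistic of interest. The method yields distributions that vary smoothly with theta.

Usage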

smooth.f(theta, boot.out, index = 1, t = boot.out$t[, index],
         width = 0.5)

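Arguments

`theta`	The required value for the statistic of interest. If `theta` is a vector, a separate distribution is found for each element of `theta`.
`boot.out`	A bootstrap output object returned by a call to `boot`.
`index`	The index of the variable of interest in the output of `boot.out$statistic`. This argument is ignored if `t` is supplied. `index` must be a scalar.
`t`	The bootstrap values of the statistic of interest. This must be a vector of length `boot.out$R`, and the values must be in the same order as the bootstrap replicates in `boot.out`.
`width`	The standardized width for the kernel smoothing. The smoothing uses a value of `width*s` for epsilon, where `s` is the bootstrap estimate of the standard error of the statistic of interest. `width` should take a value in the range (0.2, 1) to produce a reasonable smoothed distribution. If `width` is too large, the distribution becomes closer to uniform.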
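
The effect of `width` can be explored with a small experiment. The following is only a sketch on simulated data: the data `x`, the statistic `mean.fun`, the target `theta`, and the helper `dev_from_uniform` are all illustrative assumptions, not part of the boot package. It measures how far the smoothed probabilities are from the uniform value 1/n for several widths, including one deliberately outside the recommended range.

```r
library(boot)

set.seed(7)
x <- rexp(30)                            # illustrative data
mean.fun <- function(d, i) mean(d[i])    # illustrative statistic (sample mean)
b <- boot(x, mean.fun, R = 499)

theta <- b$t0 + sd(b$t)                  # illustrative target value

# Hypothetical helper: largest departure of the smoothed probabilities
# from the uniform value 1/n for a given width
dev_from_uniform <- function(width) {
  p <- smooth.f(theta, b, width = width)
  max(abs(p - 1 / length(x)))
}

# Widths 0.2, 0.5 and 1 span the recommended range; 5 is far too large
devs <- sapply(c(0.2, 0.5, 1, 5), dev_from_uniform)
print(devs)
```

With a very large width the kernel is almost flat over the bootstrap replicates, so the deviation from uniform should shrink, in line with the description of `width` above.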

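Details

The new distributional weights are found by applying a normal kernel smoother to the observed values of t, weighted by the frequencies observed in the bootstrap simulation. The resulting distribution may not have a parameter value exactly equal to the required value theta, but it will typically have a value close to theta. For details of how the method works, see Davison, Hinkley and Worton (1995) and Section 3.9.2 of Davison and Hinkley (1997).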
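
The kernel-weighting idea described above can be sketched by hand for the simplest case: an unstratified, equally weighted bootstrap. This is a minimal illustration, not the package's implementation. The helper `manual_smooth`, the simulated data, and the use of `sd(t)` as the standard-error estimate are assumptions made for the example, so its output need not match `smooth.f` exactly.

```r
library(boot)

# Hypothetical helper (not part of boot): frequency smoothing by hand
manual_smooth <- function(theta, boot.out, index = 1, width = 0.5) {
  t <- boot.out$t[, index]
  eps <- width * sd(t)               # assumption: sd(t) as the standard-error estimate
  k <- dnorm((theta - t) / eps)      # normal kernel weight for each replicate
  f <- boot.array(boot.out)          # R x n matrix of resampling frequencies
  p <- as.vector(crossprod(f, k))    # for each data point j: sum over r of f[r, j] * k[r]
  p / sum(p)                         # normalise to a probability distribution
}

set.seed(1)
x <- rexp(30)                            # illustrative data
mean.fun <- function(d, i) mean(d[i])    # illustrative statistic (sample mean)
b <- boot(x, mean.fun, R = 499)

theta <- b$t0 + 0.5 * sd(b$t)            # target slightly above the observed mean
p <- manual_smooth(theta, b)

sum(p)        # the probabilities sum to 1
sum(p * x)    # mean under the smoothed distribution, pulled towards theta
```

Replicates whose statistic lies near theta receive the largest kernel weight, so data points that appear often in those replicates end up with higher probability.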

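Value

If length(theta) is 1, a vector with the same length as the data set boot.out$data is returned. The value in position i is the probability to be given to the data point in position i so that the distribution has a parameter value approximately equal to theta. If length(theta) is greater than 1, the returned value is a matrix with length(theta) rows, each of which corresponds to a distribution whose parameter value is approximately equal to the corresponding value of theta.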
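
The two return shapes can be checked on a small unstratified example. This is a sketch on simulated data; `x`, `mean.fun`, and the chosen `thetas` are illustrative assumptions.

```r
library(boot)

set.seed(42)
x <- rexp(25)                            # illustrative data
mean.fun <- function(d, i) mean(d[i])    # illustrative statistic (sample mean)
b <- boot(x, mean.fun, R = 499)

# Scalar theta: one probability per data point
p1 <- smooth.f(b$t0, b)
length(p1)
sum(p1)

# Vector theta: one row of probabilities per requested value
thetas <- quantile(b$t, c(0.25, 0.5, 0.75))
pm <- smooth.f(thetas, b)
dim(pm)
rowSums(pm)
```

Either result can be passed as the `weights` argument of a further call to `boot`, as the gravity example in the Examples section does.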

Examples

# Example 9.8 of Davison and Hinkley (1997) requires tilting the resampling
# distribution of the studentized statistic to be centred at the observed
# value of the test statistic 1.84.  In the book exponential tilting was used
# but it is also possible to use smooth.f.
library(boot)  # provides boot, smooth.f, imp.prob and the gravity data
grav1 <- gravity[as.numeric(gravity[, 2]) >= 7, ]
grav.fun <- function(dat, w, orig) {
     strata <- tapply(dat[, 2], as.numeric(dat[, 2]))
     d <- dat[, 1]
     ns <- tabulate(strata)
     w <- w/tapply(w, strata, sum)[strata]
     mns <- as.vector(tapply(d * w, strata, sum)) # drop names
     mn2 <- tapply(d * d * w, strata, sum)
     s2hat <- sum((mn2 - mns^2)/ns)
     c(mns[2] - mns[1], s2hat, (mns[2]-mns[1]-orig)/sqrt(s2hat))
}
grav.z0 <- grav.fun(grav1, rep(1, 26), 0)
grav.boot <- boot(grav1, grav.fun, R = 499, stype = "w", 
                  strata = grav1[, 2], orig = grav.z0[1])
grav.sm <- smooth.f(grav.z0[3], grav.boot, index = 3)

# Now we can run another bootstrap using these weights
grav.boot2 <- boot(grav1, grav.fun, R = 499, stype = "w", 
                   strata = grav1[, 2], orig = grav.z0[1],
                   weights = grav.sm)

# Estimated p-values can be found from these as follows
mean(grav.boot$t[, 3] >= grav.z0[3])
imp.prob(grav.boot2, t0 = -grav.z0[3], t = -grav.boot2$t[, 3])


# Note that for the importance sampling probability we must 
# multiply everything by -1 to ensure that we find the correct
# probability.  Raw resampling is not reliable for probabilities
# greater than 0.5. Thus
1 - imp.prob(grav.boot2, index = 3, t0 = grav.z0[3])$raw
# can give very strange results (negative probabilities).

References

Davison, A.C. and Hinkley, D.V. (1997) Bootstrap Methods and Their Application. Cambridge University Press.

Davison, A.C., Hinkley, D.V. and Worton, B.J. (1995) Accurate and efficient construction of bootstrap likelihoods. Statistics and Computing, 5, 257-264.

See Also

boot, exp.tilt, tilt.boot

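Note: this article was selected and compiled by 純淨天空 from the original English work "Smooth Distributions on Data Points" by R-devel. Unless otherwise stated, copyright in the original code belongs to the original authors. Do not reproduce or copy this translation without permission or authorization.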