R rsample int_pctl Bootstrap confidence intervals

Calculate bootstrap confidence intervals using various methods.

Usage

int_pctl(.data, ...)

# S3 method for bootstraps
int_pctl(.data, statistics, alpha = 0.05, ...)

int_t(.data, ...)

# S3 method for bootstraps
int_t(.data, statistics, alpha = 0.05, ...)

int_bca(.data, ...)

# S3 method for bootstraps
int_bca(.data, statistics, alpha = 0.05, .fn, ...)

Arguments

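.data: A data frame containing the bootstrap resamples created using bootstraps(). For t- and BCa-intervals, the apparent argument should be set to TRUE. For the percentile method, the apparent data is never used to calculate the confidence interval, even if apparent is set to TRUE.
...: Arguments to pass to .fn (int_bca() only).
statistics: An unquoted column name or dplyr selector that identifies a single column in the data set containing the individual bootstrap estimates. This must be a list column of tidy tibbles (with columns term and estimate). For t-intervals, a standard tidy column (usually called std.err) is also required. See the examples below.
alpha: Level of significance.
.fn: A function to calculate the statistic of interest. The function should take an rsplit as the first argument, and the ... are required.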

Value

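Each function returns a tibble with columns .lower, .estimate, .upper, .alpha, .method, and term. .method is the type of interval (e.g. "percentile", "student-t", or "BCa"). term is the name of the estimate. Note that the .estimate returned from int_pctl() is the mean of the estimates from the bootstrap resamples and not the estimate from the apparent model.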
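
As a hedged illustration of the note on .estimate, the sketch below compares the value returned by int_pctl() with the plain average of the per-resample estimates. The helper mean_mpg() and the column name stats are illustrative names, not part of rsample, and the sketch assumes the apparent row is labeled "Apparent" in the id column.

```r
library(rsample)
library(dplyr)
library(purrr)
library(tibble)

# Illustrative statistic (hypothetical helper): the mean of mpg in one resample
mean_mpg <- function(split, ...) {
  tibble(
    term = "mean_mpg",
    estimate = mean(analysis(split)$mpg),
    # no analytical standard error is supplied, so no t-intervals
    std.err = NA_real_
  )
}

set.seed(2024)
mpg_rs <-
  bootstraps(mtcars, 1000, apparent = TRUE) %>%
  mutate(stats = map(splits, mean_mpg))

pctl <- int_pctl(mpg_rs, stats)

# Average of the bootstrap estimates, leaving out the apparent row
# (assumption: that row has id == "Apparent")
boot_rows <- mpg_rs$id != "Apparent"
boot_mean <- mean(map_dbl(mpg_rs$stats[boot_rows], ~ .x$estimate))

# Estimate computed on the original (apparent) data, for comparison
apparent_est <- mean(mtcars$mpg)

c(int_pctl = pctl$.estimate, boot_mean = boot_mean, apparent = apparent_est)
```

If these assumptions hold, the first two values should agree, while the apparent estimate will generally differ from them slightly.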

Details

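Percentile intervals are the standard method of obtaining confidence intervals but require thousands of resamples to be accurate. T-intervals may need fewer resamples but require a corresponding variance estimate. Bias-corrected and accelerated intervals require the original function that was used to create the statistics of interest and are computationally taxing.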
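
To make the percentile idea concrete, here is a minimal base-R sketch. It is not rsample's implementation: pctl_sketch() is a hypothetical helper that takes a numeric vector of bootstrap estimates and reads the interval bounds off its empirical quantiles.

```r
# Hypothetical helper (not part of rsample): percentile interval computed
# from a numeric vector of bootstrap estimates.
pctl_sketch <- function(estimates, alpha = 0.05) {
  estimates <- estimates[!is.na(estimates)]
  # Lower and upper bounds are the alpha/2 and 1 - alpha/2 quantiles
  bounds <- stats::quantile(
    estimates,
    probs = c(alpha / 2, 1 - alpha / 2),
    names = FALSE
  )
  data.frame(
    .lower = bounds[1],
    .estimate = mean(estimates),
    .upper = bounds[2],
    .alpha = alpha,
    .method = "percentile (sketch)"
  )
}

set.seed(1)
# Bootstrap the mean of mtcars$mpg by resampling values with replacement
boot_means <- replicate(2000, mean(sample(mtcars$mpg, replace = TRUE)))
res <- pctl_sketch(boot_means)
res
```

Because the bounds come from the tails of the empirical distribution, they stabilize only once enough resamples fall in those tails, which is one way to read the recommendation for a large number of resamples.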

References

https://rsample.tidymodels.org/articles/Applications/Intervals.html

Davison, A., & Hinkley, D. (1997). Bootstrap Methods and their Application. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511802843

See also

reg_intervals()

Examples

# \donttest{
library(rsample)
library(broom)
library(dplyr)
library(purrr)
library(tibble)

lm_est <- function(split, ...) {
  lm(mpg ~ disp + hp, data = analysis(split)) %>%
    tidy()
}

set.seed(52156)
car_rs <-
  bootstraps(mtcars, 500, apparent = TRUE) %>%
  mutate(results = map(splits, lm_est))

int_pctl(car_rs, results)
#> Warning: Recommend at least 1000 non-missing bootstrap resamples for terms: `(Intercept)`, `disp`, `hp`.
#> # A tibble: 3 × 6
#>   term         .lower .estimate   .upper .alpha .method   
#>   <chr>         <dbl>     <dbl>    <dbl>  <dbl> <chr>     
#> 1 (Intercept) 27.5      30.7    33.6       0.05 percentile
#> 2 disp        -0.0440   -0.0300 -0.0162    0.05 percentile
#> 3 hp          -0.0572   -0.0260 -0.00840   0.05 percentile
int_t(car_rs, results)
#> # A tibble: 3 × 6
#>   term         .lower .estimate   .upper .alpha .method  
#>   <chr>         <dbl>     <dbl>    <dbl>  <dbl> <chr>    
#> 1 (Intercept) 28.1      30.7    34.6       0.05 student-t
#> 2 disp        -0.0446   -0.0300 -0.0170    0.05 student-t
#> 3 hp          -0.0449   -0.0260 -0.00337   0.05 student-t
int_bca(car_rs, results, .fn = lm_est)
#> Warning: Recommend at least 1000 non-missing bootstrap resamples for terms: `(Intercept)`, `disp`, `hp`.
#> # A tibble: 3 × 6
#>   term         .lower .estimate   .upper .alpha .method
#>   <chr>         <dbl>     <dbl>    <dbl>  <dbl> <chr>  
#> 1 (Intercept) 27.7      30.7    33.7       0.05 BCa    
#> 2 disp        -0.0446   -0.0300 -0.0172    0.05 BCa    
#> 3 hp          -0.0576   -0.0260 -0.00843   0.05 BCa    

# putting results into a tidy format
rank_corr <- function(split) {
  dat <- analysis(split)
  tibble(
    term = "corr",
    estimate = cor(dat$sqft, dat$price, method = "spearman"),
    # don't know the analytical std.err so no t-intervals
    std.err = NA_real_
  )
}

set.seed(69325)
data(Sacramento, package = "modeldata")
bootstraps(Sacramento, 1000, apparent = TRUE) %>%
  mutate(correlations = map(splits, rank_corr)) %>%
  int_pctl(correlations)
#> # A tibble: 1 × 6
#>   term  .lower .estimate .upper .alpha .method   
#>   <chr>  <dbl>     <dbl>  <dbl>  <dbl> <chr>     
#> 1 corr   0.737     0.768  0.796   0.05 percentile
# }

Source: R/bootci.R

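Note: This article was selected and compiled by 純淨天空 from the original English work Bootstrap confidence intervals by Hannah Frick et al. Unless otherwise stated, copyright of the original code belongs to the original authors. Do not reproduce or copy this translation without permission or authorization.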