R rsample bootstraps: Bootstrap Sampling

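A bootstrap sample is a sample of the same size as the original data set, drawn with replacement. As a result, the analysis sample contains multiple copies of some of the original rows. The assessment set is defined as the rows of the original data that were not included in the bootstrap sample. This is often referred to as the "out-of-bag" (OOB) sample.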
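The definition above can be sketched by hand in base R. This is only an illustration of the idea, not rsample's internal implementation; the variable names (`analysis_idx`, `oob_idx`, and so on) are made up for this example.

```r
# Hand-rolled bootstrap resample in base R (illustrative only; not rsample internals).
set.seed(1)
n <- nrow(mtcars)

# Draw n row indices with replacement: this is the analysis (bootstrap) sample.
analysis_idx <- sample(n, size = n, replace = TRUE)

# Rows that were never drawn form the out-of-bag (OOB) assessment set.
oob_idx <- setdiff(seq_len(n), analysis_idx)

analysis_set <- mtcars[analysis_idx, ]
assessment_set <- mtcars[oob_idx, ]

# Check whether some rows appear more than once in the analysis sample.
any(duplicated(analysis_idx))
```

The analysis set always has as many rows as the original data, while the size of the OOB set varies from one resample to the next.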

Usage

bootstraps(
  data,
  times = 25,
  strata = NULL,
  breaks = 4,
  pool = 0.1,
  apparent = FALSE,
  ...
)

Arguments

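data: A data frame.
times: The number of bootstrap samples.
strata: A variable in data (single character or name) used to conduct stratified sampling. When not NULL, each resample is created within the stratification variable. Numeric strata are binned into quartiles.
breaks: A single number giving the number of bins used to stratify a numeric stratification variable.
pool: A proportion of the data used to determine whether a particular group is too small and should be pooled into another group. Decreasing this argument below its default of 0.1 is not recommended, because stratifying on groups that are too small is risky.
apparent: A logical. Should an extra resample be added in which the analysis and holdout subsets are both the entire data set? This is required by some estimators used by the summary function that need the apparent error rate.
...: These dots are for future extensions and must be empty.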

Value

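A tibble with classes bootstraps, rset, tbl_df, tbl, and data.frame. The results include a column for the data split objects and a column called id that holds a character string with the resample identifier.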
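As a minimal sketch of working with the returned object, the split objects can be unpacked with rsample's analysis() and assessment() accessors. The object names (`boots`, `first_split`, `in_bag`, `out_of_bag`) and the seed are chosen only for this example.

```r
library(rsample)

set.seed(123)
boots <- bootstraps(mtcars, times = 3)

# Each element of the splits column is one resample.
first_split <- boots$splits[[1]]

# analysis() returns the bootstrap sample; assessment() returns the OOB rows.
in_bag <- analysis(first_split)
out_of_bag <- assessment(first_split)

nrow(in_bag)       # same number of rows as mtcars
nrow(out_of_bag)   # varies between resamples

# The id column labels each resample.
boots$id
```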

Details

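The apparent argument enables the option of an additional "resample" in which the analysis and assessment data sets are both the same as the original data set. This can be required for some types of analysis of the bootstrap results.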
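A short sketch of what the extra "resample" looks like, assuming the additional row is labelled "Apparent" in the id column, as the printed output in the examples shows. The names `boots` and `apparent_split` are illustrative.

```r
library(rsample)

set.seed(42)
boots <- bootstraps(mtcars, times = 2, apparent = TRUE)

# Pick out the extra row added by apparent = TRUE.
apparent_split <- boots$splits[[which(boots$id == "Apparent")]]

# Both subsets should contain every row of the original data.
nrow(analysis(apparent_split))
nrow(assessment(apparent_split))
```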

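With a strata argument, the random sampling is conducted within the stratification variable. This helps ensure that the resamples have proportions equivalent to those of the original data set. For a categorical variable, sampling is conducted separately within each class. For a numeric stratification variable, strata is binned into quartiles, which are then used to stratify. Strata below 10% of the total are pooled together; see make_strata() for more details.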
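The idea of stratified resampling can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not what make_strata() actually does: it skips the pooling of small strata, and the choice of `cyl` and `mpg` as stratification variables is arbitrary.

```r
# Illustrative stratified bootstrap in base R (a sketch, not rsample's code).
set.seed(7)

# Categorical stratum: resample row indices with replacement inside each class.
cyl_groups <- split(seq_len(nrow(mtcars)), mtcars$cyl)
strat_idx <- unlist(lapply(cyl_groups, function(idx) {
  # sample.int() on the group size avoids sample()'s special case for length-1 input.
  idx[sample.int(length(idx), replace = TRUE)]
}), use.names = FALSE)

# Class counts in the resample match those of the original data.
table(mtcars$cyl[strat_idx])
table(mtcars$cyl)

# Numeric stratum: bin into quartiles first, then stratify on the bins.
mpg_bins <- cut(
  mtcars$mpg,
  breaks = quantile(mtcars$mpg, probs = seq(0, 1, by = 0.25)),
  include.lowest = TRUE
)
table(mpg_bins)
```

Because each class is resampled to its original size, the class proportions are preserved in every resample, which is consistent with the identical churn rates shown for the stratified example below.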

Examples

library(rsample)

bootstraps(mtcars, times = 2)
#> # Bootstrap sampling 
#> # A tibble: 2 × 2
#>   splits          id        
#>   <list>          <chr>     
#> 1 <split [32/10]> Bootstrap1
#> 2 <split [32/15]> Bootstrap2
bootstraps(mtcars, times = 2, apparent = TRUE)
#> # Bootstrap sampling with apparent sample 
#> # A tibble: 3 × 2
#>   splits          id        
#>   <list>          <chr>     
#> 1 <split [32/11]> Bootstrap1
#> 2 <split [32/13]> Bootstrap2
#> 3 <split [32/32]> Apparent  

library(purrr)
library(modeldata)
data(wa_churn)

set.seed(13)
resample1 <- bootstraps(wa_churn, times = 3)
map_dbl(
  resample1$splits,
  function(x) {
    dat <- as.data.frame(x)$churn
    mean(dat == "Yes")
  }
)
#> [1] 0.2798523 0.2639500 0.2648019

set.seed(13)
resample2 <- bootstraps(wa_churn, strata = churn, times = 3)
map_dbl(
  resample2$splits,
  function(x) {
    dat <- as.data.frame(x)$churn
    mean(dat == "Yes")
  }
)
#> [1] 0.2653699 0.2653699 0.2653699

set.seed(13)
resample3 <- bootstraps(wa_churn, strata = tenure, breaks = 6, times = 3)
map_dbl(
  resample3$splits,
  function(x) {
    dat <- as.data.frame(x)$churn
    mean(dat == "Yes")
  }
)
#> [1] 0.2625302 0.2659378 0.2696294

Source: R/boot.R

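Note: This article was compiled by 純淨天空 from the original English work Bootstrap Sampling by Hannah Frick and others. Unless otherwise stated, the copyright of the original code belongs to the original authors; do not reproduce or copy this translation without permission or authorization.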