R stacks add_candidates: Add model definitions to a data stack

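add_candidates() collates the assessment set predictions and additional attributes from the supplied model definition (i.e. a set of "candidates") into a data stack.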

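Behind the scenes, a data stack object is just a tibble::tbl_df in which the first column gives the true response values and the remaining columns give the assessment set predictions for each candidate. In the regression setting, there is only one column per ensemble member. In classification settings, there are as many columns per candidate ensemble member as there are levels of the outcome variable.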
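The column layout described above can be pictured with two hand-built tibbles. This is only an illustrative sketch: the numeric values are made up, the three outcome levels are assumed for the example, and the column names simply imitate the `.pred_<level>_<name>_<config>` pattern visible in the blend_predictions() output in the examples. A real data stack is created by stacks() and add_candidates(), not by hand.

```r
library(tibble)

# Hypothetical regression layout: the true outcome first,
# then exactly one prediction column per candidate member.
reg_layout <- tibble(
  latency         = c(142, 79, 50),
  reg_res_lr_1_1  = c(114, 78, 81),
  reg_res_svm_1_1 = c(120, 80, 65)
)

# Hypothetical classification layout with an assumed three-level outcome:
# one class-probability column per outcome level for the single candidate.
outcome_levels <- c("low", "mid", "full")

class_layout <- tibble(
  reflex = factor(c("full", "low", "mid"), levels = outcome_levels),
  .pred_low_class_res_nn_1_1  = c(0.1, 0.7, 0.2),
  .pred_mid_class_res_nn_1_1  = c(0.2, 0.2, 0.5),
  .pred_full_class_res_nn_1_1 = c(0.7, 0.1, 0.3)
)

# Number of prediction columns contributed by each candidate
n_reg_cols_per_candidate   <- (ncol(reg_layout) - 1) / 2
n_class_cols_per_candidate <- (ncol(class_layout) - 1) / 1
```

Here each regression candidate contributes a single column, while the one classification candidate contributes as many columns as the outcome has levels.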

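To initialize a data stack, use the stacks() function. Model definitions are appended to a data stack iteratively using several calls to add_candidates(). Data stacks are evaluated using the blend_predictions() function.

Usage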

add_candidates(
  data_stack,
  candidates,
  name = deparse(substitute(candidates)),
  ...
)

Arguments

data_stack

A data_stack object.

candidates

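A model definition, or a set of model definitions, defining candidate model stack members. Should inherit from tune_results or workflow_set.

tune_results: An object outputted from tune::tune_grid(), tune::tune_bayes(), or tune::fit_resamples().
workflow_set: An object outputted from workflowsets::workflow_map(). This approach allows for supplying multiple sets of candidate members with only one call to add_candidates. See the "Stacking With Workflow Sets" article on the package website for example code!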
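As a rough sketch of the workflow_set route, the code below builds a small workflow set on the built-in mtcars data and passes the mapped results to add_candidates() in a single call. The data set, formulas, and object names (folds, wf_set, wf_res, wf_st) are assumptions chosen for illustration and are not part of the package's example data; it also assumes a stacks version recent enough to accept workflow sets.

```r
library(parsnip)
library(rsample)
library(workflowsets)
library(stacks)

set.seed(1)

# Shared resamples: every candidate must be evaluated on the same folds.
folds <- vfold_cv(mtcars, v = 5)

# Two preprocessors crossed with one model gives two workflows.
wf_set <- workflow_set(
  preproc = list(small = mpg ~ wt, large = mpg ~ wt + hp + disp),
  models  = list(lm = linear_reg())
)

# Resample every workflow, saving predictions and workflows
# via the stacks control helper.
wf_res <- workflow_map(
  wf_set,
  fn        = "fit_resamples",
  resamples = folds,
  control   = control_stack_resamples()
)

# One call adds every workflow in the set as a model definition.
wf_st <- add_candidates(stacks(), wf_res)
wf_st
```

Because the name argument is ignored for workflow sets, the model definitions are labeled by the workflow set itself rather than by the name of the wf_res object.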

Regardless, these results must have been fitted with the control settings save_pred = TRUE, save_workflow = TRUE; see the control_stack_grid(), control_stack_bayes(), and control_stack_resamples() documentation for helper functions.
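A minimal sketch of the two ways to satisfy this requirement for grid tuning: the stacks helper, or the equivalent settings passed to tune::control_grid() directly. The object names ctrl_helper and ctrl_manual are illustrative only.

```r
library(stacks)
library(tune)

# Helper from stacks: a tune control object preset for stacking.
ctrl_helper <- control_stack_grid()

# Manual alternative: set the two required options on a tune control object.
ctrl_manual <- control_grid(save_pred = TRUE, save_workflow = TRUE)

# Inspect the two settings that add_candidates() requires.
c(save_pred = ctrl_helper$save_pred, save_workflow = ctrl_helper$save_workflow)
c(save_pred = ctrl_manual$save_pred, save_workflow = ctrl_manual$save_workflow)
```

Either object can then be supplied as the control argument of tune::tune_grid(); control_stack_bayes() and control_stack_resamples() play the same role for tune::tune_bayes() and tune::fit_resamples().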

name

The label for the model definition; defaults to the name of the candidates object. Ignored if candidates inherits from workflow_set.

...

Additional arguments. Currently ignored.

Value

A data_stack object; see stacks() for more details!

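Example Data

This package provides some resampling objects and datasets for use in examples and vignettes, derived from a study on 1212 red-eyed tree frog embryos!

Red-eyed tree frog (RETF) embryos can hatch earlier than their normal 7 days if they detect a potential predator threat. Researchers wanted to determine how, and when, these tree frog embryos were able to detect stimulus from their environment. To do so, they subjected the embryos at varying developmental stages to a "predator stimulus" by jiggling the embryos with a blunt probe. Beforehand, though, some of the embryos were treated with gentamicin, a compound that knocks out their lateral line (a sensory organ). Researcher Julie Jung and her crew found that these factors inform whether an embryo hatches prematurely or not!

Note that the data included with the stacks package is not necessarily a representative or unbiased subset of the complete dataset, and is only for demonstrative purposes.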

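reg_folds and class_folds are rset cross-validation objects from rsample, which split the training data for the regression and classification model objects, respectively. tree_frogs_reg_test and tree_frogs_class_test are the analogous testing sets.

reg_res_lr, reg_res_svm, and reg_res_sp contain regression tuning results for a linear regression, support vector machine, and spline model, respectively, fitting latency (i.e. how long the embryos took to hatch in response to the jiggle) in the tree_frogs data, using most of the other variables as predictors. Note that the data underlying these models is filtered to include only embryos that hatched in response to the stimulus.

class_res_rf and class_res_nn contain multiclass classification tuning results for a random forest and neural network classification model, respectively, fitting reflex (a measure of ear function) in the data, using most of the other variables as predictors.

log_res_rf and log_res_nn contain binary classification tuning results for a random forest and neural network classification model, respectively, fitting hatched (whether or not the embryos hatched in response to the stimulus), using most of the other variables as predictors.

See ?example_data to learn more about these objects, as well as to browse the source code that generated them.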

See also

Other core verbs: blend_predictions(), fit_members(), stacks()

Examples

# see the "Example Data" section above for
# clarification on the objects used in these examples!

# put together a data stack using
# tuning results for regression models
reg_st <- 
  stacks() %>%
  add_candidates(reg_res_lr) %>%
  add_candidates(reg_res_svm) %>%
  add_candidates(reg_res_sp)
  
reg_st
#> # A data stack with 3 model definitions and 16 candidate members:
#> #   reg_res_lr: 1 model configuration
#> #   reg_res_svm: 5 model configurations
#> #   reg_res_sp: 10 model configurations
#> # Outcome: latency (numeric)
  
# do the same with multinomial classification models
class_st <-
  stacks() %>%
  add_candidates(class_res_nn) %>%
  add_candidates(class_res_rf)
#> Warning: Predictions from 1 candidate were identical to those from existing
#> candidates and were removed from the data stack.
  
class_st
#> # A data stack with 2 model definitions and 10.6666666666667 candidate members:
#> #   class_res_nn: 1 model configuration
#> #   class_res_rf: 9.66666666666667 model configurations
#> # Outcome: reflex (factor)
  
# ...or binomial classification models
log_st <-
  stacks() %>%
  add_candidates(log_res_nn) %>%
  add_candidates(log_res_rf)
  
log_st
#> # A data stack with 2 model definitions and 11 candidate members:
#> #   log_res_nn: 1 model configuration
#> #   log_res_rf: 10 model configurations
#> # Outcome: hatched (factor)
  
# use custom names for each model:
log_st2 <-
  stacks() %>%
  add_candidates(log_res_nn, name = "neural_network") %>%
  add_candidates(log_res_rf, name = "random_forest")
  
log_st2
#> # A data stack with 2 model definitions and 11 candidate members:
#> #   neural_network: 1 model configuration
#> #   random_forest: 10 model configurations
#> # Outcome: hatched (factor)
  
# these objects would likely then be
# passed to blend_predictions():
log_st2 %>% blend_predictions()
#> ── A stacked ensemble model ─────────────────────────────────────
#> 
#> Out of 11 possible candidate members, the ensemble retained 2.
#> Penalty: 0.1.
#> Mixture: 1.
#> 
#> The 2 highest weighted member classes are:
#> # A tibble: 2 × 3
#>   member                      type        weight
#>   <chr>                       <chr>        <dbl>
#> 1 .pred_no_neural_network_1_1 mlp           4.99
#> 2 .pred_no_random_forest_1_05 rand_forest   1.35
#> 
#> Members have not yet been fitted with `fit_members()`.

Source code: R/add_candidates.R

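Note: This article was selected and compiled by 純淨天空 from the original English work "Add model definitions to a data stack" by Max Kuhn and others. Unless otherwise stated, copyright in the original code belongs to the original authors; please do not reproduce or copy this translation without permission or authorization.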