R stacks add_candidates: Add model definitions to a data stack

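add_candidates() collates the assessment set predictions and additional attributes from the supplied model definition (i.e. a set of "candidates") into a data stack.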

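Behind the scenes, a data stack object is just a tibble::tbl_df, where the first column gives the true response values and the remaining columns give the assessment set predictions for each candidate. In the regression setting, there is only one column per candidate ensemble member. In the classification setting, there are as many columns per candidate ensemble member as there are levels of the outcome variable.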
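The column layout described above can be pictured with a small mock. This is only an illustrative sketch in base R, not output from stacks: the column names, values, and outcome levels below are made up for the example, and a real data stack is a tibble with additional attributes.

```r
# Illustrative mock only: a plain data.frame laid out like a data stack.
# All names and numbers here are hypothetical.

# Regression: true outcome first, then one prediction column per candidate.
reg_layout <- data.frame(
  latency     = c(142, 79, 50),
  lin_reg_1_1 = c(120.5, 81.2, 64.0),
  svm_1_1     = c(138.1, 75.9, 55.3),
  svm_1_2     = c(131.4, 90.0, 48.7)
)

# Classification: one prediction column per candidate *per outcome level*.
outcome_levels <- c("low", "mid", "full")
candidates     <- c("rf_1_1", "rf_1_2")
pred_cols <- as.vector(outer(
  outcome_levels, candidates,
  function(lvl, cand) paste0(".pred_", lvl, "_", cand)
))

ncol(reg_layout) - 1  # number of regression candidates
length(pred_cols)     # levels x candidates prediction columns
```

With three outcome levels and two candidates, the classification layout carries six prediction columns, which is why classification data stacks are wider than regression ones for the same number of candidates.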

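To initialize a data stack, use the stacks() function. Use multiple calls to add_candidates() to iteratively append model definitions to the data stack. Evaluate the data stack with the blend_predictions() function.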

Usage

add_candidates(
  data_stack,
  candidates,
  name = deparse(substitute(candidates)),
  ...
)

Arguments

data_stack

A data_stack object.

candidates

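A (set of) model definition(s) defining candidate model stack members. Should inherit from tune_results or workflow_set.

tune_results: An object outputted from tune::tune_grid(), tune::tune_bayes(), or tune::fit_resamples().
workflow_set: An object outputted from workflowsets::workflow_map(). This approach allows for supplying multiple sets of candidate members with only one call to add_candidates(). See the "Stacking With Workflow Sets" article on the package website for example code!

Regardless, these results must be fitted with the control settings save_pred = TRUE, save_workflow = TRUE; see the control_stack_grid(), control_stack_bayes(), and control_stack_resamples() documentation for helper functions.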
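As a quick check of the control requirement, the helper functions can be inspected directly. This is a minimal sketch that assumes the stacks package is installed; it only confirms that the helpers return control objects with both settings switched on.

```r
library(stacks)

# Each helper wraps the corresponding tune control function with
# save_pred = TRUE and save_workflow = TRUE already set.
ctrl_grid <- control_stack_grid()
ctrl_res  <- control_stack_resamples()

ctrl_grid$save_pred
ctrl_grid$save_workflow
ctrl_res$save_pred
ctrl_res$save_workflow
```

The resulting object would then be passed as the control argument of tune::tune_grid() or tune::fit_resamples() when generating candidates.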

name

The label for the model definition; defaults to the name of the candidates object. Ignored if candidates inherits from workflow_set.

...

Additional arguments. Currently ignored.
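The default for name in the usage above, deparse(substitute(candidates)), is a base R idiom that captures the expression the caller typed for an argument. The sketch below shows the idiom in isolation; label_of() is a hypothetical helper written for this illustration and is not part of stacks.

```r
# Hypothetical helper mirroring the default shown in the usage section.
label_of <- function(candidates, name = deparse(substitute(candidates))) {
  name
}

# Stand-in object; its contents do not matter for the illustration.
reg_res_lr <- list()

default_label <- label_of(reg_res_lr)
custom_label  <- label_of(reg_res_lr, name = "linear_reg")

default_label
custom_label
```

Because the label comes from the expression supplied by the caller, passing an unnamed expression (for example, the result of a function call) yields a label built from that expression's text, which is one reason to set name explicitly.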

Value

A data_stack object. See stacks() for more details!

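Example Data

This package provides some resampling objects and datasets, for use in examples and vignettes, derived from a study on 1212 red-eyed tree frog embryos!

Red-eyed tree frog (RETF) embryos can hatch earlier than their normal 7 or so days if they detect a potential predator threat. Researchers wanted to determine how, and when, these tree frog embryos were able to detect stimulus from their environment. To do so, they subjected embryos at varying developmental stages to a "predator stimulus" by jiggling them with a blunt probe. Beforehand, though, some of the embryos were treated with gentamicin, a compound that knocks out their lateral line (a sensory organ). Researcher Julie Jung and her crew found that these factors inform whether an embryo hatches prematurely or not!

Note that the data included with the stacks package is not necessarily a representative or unbiased subset of the complete dataset, and is only for demonstrative purposes.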

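reg_folds and class_folds are rset cross-fold validation objects from rsample, splitting the training data for the regression and classification model objects, respectively. tree_frogs_reg_test and tree_frogs_class_test are the analogous testing sets.

reg_res_lr, reg_res_svm, and reg_res_sp contain regression tuning results for a linear regression, support vector machine, and spline model, respectively, fitting latency (i.e. how long the embryos took to hatch in response to the jiggle) in the tree_frogs data, using most of the other variables as predictors. Note that the data underlying these models is filtered to include only embryos that hatched in response to the stimulus.

class_res_rf and class_res_nn contain multiclass classification tuning results for a random forest and neural network classification model, respectively, fitting reflex (a measure of ear function) in the data, using most of the other variables as predictors.

log_res_rf and log_res_nn contain binary classification tuning results for a random forest and neural network classification model, respectively, fitting hatched (whether or not the embryos hatched in response to the stimulus), using most of the other variables as predictors.

See ?example_data to learn more about these objects, as well as to browse the source code that generated them.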
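Before stacking, it can be useful to confirm that the example objects have the classes that add_candidates() expects. The following is a minimal sketch assuming the stacks package is installed and its example data are available after attaching it.

```r
library(stacks)

# Collect the example tuning results described above.
tuned <- list(
  reg_res_lr   = reg_res_lr,
  reg_res_svm  = reg_res_svm,
  reg_res_sp   = reg_res_sp,
  class_res_rf = class_res_rf,
  class_res_nn = class_res_nn,
  log_res_rf   = log_res_rf,
  log_res_nn   = log_res_nn
)

# add_candidates() expects objects inheriting from "tune_results".
is_tune_results <- vapply(tuned, inherits, logical(1), what = "tune_results")
is_tune_results

# The fold objects are rsample resampling sets.
folds_are_rset <- inherits(reg_folds, "rset") && inherits(class_folds, "rset")
folds_are_rset
```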

See also

Other core verbs: blend_predictions(), fit_members(), stacks()

Examples

# see the "Example Data" section above for
# clarification on the objects used in these examples!
library(stacks)

# put together a data stack using
# tuning results for regression models
reg_st <- 
  stacks() %>%
  add_candidates(reg_res_lr) %>%
  add_candidates(reg_res_svm) %>%
  add_candidates(reg_res_sp)
  
reg_st
#> # A data stack with 3 model definitions and 16 candidate members:
#> #   reg_res_lr: 1 model configuration
#> #   reg_res_svm: 5 model configurations
#> #   reg_res_sp: 10 model configurations
#> # Outcome: latency (numeric)
  
# do the same with multinomial classification models
class_st <-
  stacks() %>%
  add_candidates(class_res_nn) %>%
  add_candidates(class_res_rf)
#> Warning: Predictions from 1 candidate were identical to those from existing
#> candidates and were removed from the data stack.
  
class_st
#> # A data stack with 2 model definitions and 10.6666666666667 candidate members:
#> #   class_res_nn: 1 model configuration
#> #   class_res_rf: 9.66666666666667 model configurations
#> # Outcome: reflex (factor)
  
# ...or binomial classification models
log_st <-
  stacks() %>%
  add_candidates(log_res_nn) %>%
  add_candidates(log_res_rf)
  
log_st
#> # A data stack with 2 model definitions and 11 candidate members:
#> #   log_res_nn: 1 model configuration
#> #   log_res_rf: 10 model configurations
#> # Outcome: hatched (factor)
  
# use custom names for each model:
log_st2 <-
  stacks() %>%
  add_candidates(log_res_nn, name = "neural_network") %>%
  add_candidates(log_res_rf, name = "random_forest")
  
log_st2
#> # A data stack with 2 model definitions and 11 candidate members:
#> #   neural_network: 1 model configuration
#> #   random_forest: 10 model configurations
#> # Outcome: hatched (factor)
  
# these objects would likely then be
# passed to blend_predictions():
log_st2 %>% blend_predictions()
#> ── A stacked ensemble model ─────────────────────────────────────
#> 
#> Out of 11 possible candidate members, the ensemble retained 2.
#> Penalty: 0.1.
#> Mixture: 1.
#> 
#> The 2 highest weighted member classes are:
#> # A tibble: 2 × 3
#>   member                      type        weight
#>   <chr>                       <chr>        <dbl>
#> 1 .pred_no_neural_network_1_1 mlp           4.99
#> 2 .pred_no_random_forest_1_05 rand_forest   1.35
#> 
#> Members have not yet been fitted with `fit_members()`.

Source: R/add_candidates.R

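Note: This article was curated by 纯净天空 from the original English work "Add model definitions to a data stack" by Max Kuhn et al. Unless otherwise stated, copyright in the original code belongs to the original authors; do not reproduce or copy this translation without permission.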