R workflows extract-workflow: Extract elements of a workflow

These functions extract various elements from a workflow object. If the requested element does not exist yet, an error is thrown.
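As a minimal sketch of the error behaviour described above, the following builds a workflow that has a model specification but has not been fit. The object names (`wf`, `spec`, `result`) are illustrative only, and `tryCatch()` is used here simply to capture the error rather than stop the script.

```r
library(workflows)
library(parsnip)
library(magrittr)

# An unfit workflow: it holds a formula and a model spec, but no fitted model
wf <- workflow() %>%
  add_formula(mpg ~ cyl) %>%
  add_model(linear_reg() %>% set_engine("lm"))

# The spec already exists, so this extraction succeeds
spec <- extract_spec_parsnip(wf)

# The fit does not exist yet, so this signals an error; capture it to inspect
result <- tryCatch(extract_fit_parsnip(wf), error = function(e) e)
inherits(result, "error")
conditionMessage(result)
```

Calling `fit()` on the workflow first makes the second extraction succeed.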

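extract_preprocessor() returns the formula, recipe, or variable expressions used for preprocessing.
extract_spec_parsnip() returns the parsnip model specification.
extract_fit_parsnip() returns the parsnip model fit object.
extract_fit_engine() returns the engine-specific fit embedded within a parsnip model fit. For example, when using parsnip::linear_reg() with the "lm" engine, this returns the underlying lm object.
extract_mold() returns the preprocessed "mold" object returned from hardhat::mold(). It contains information about the preprocessing, including the prepped recipe, the formula terms object, or the variable selectors.
extract_recipe() returns the recipe. The estimated argument specifies whether the fitted recipe or the original recipe is returned.
extract_parameter_dials() returns a single dials parameter object.
extract_parameter_set_dials() returns a set of dials parameter objects.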
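The two dials extractors are not covered by the examples on this page, so here is a hedged sketch of how they might be used. It assumes the dials package is installed, uses `hardhat::tune()` to mark `penalty` for tuning, and names the "glmnet" engine only because `penalty` is tunable there; nothing is fit, so the workflow does not need to be trained. The object names are illustrative.

```r
library(workflows)
library(parsnip)
library(magrittr)

# Mark `penalty` as a tuning parameter; `mixture` is fixed at 1
spec <- linear_reg(penalty = hardhat::tune(), mixture = 1) %>%
  set_engine("glmnet")

wf <- workflow() %>%
  add_formula(mpg ~ cyl + disp) %>%
  add_model(spec)

# The full set of parameters marked for tuning (one row per parameter)
params <- extract_parameter_set_dials(wf)
params

# A single dials parameter object, looked up by its ID
penalty <- extract_parameter_dials(wf, "penalty")
penalty
```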

Usage

# S3 method for workflow
extract_spec_parsnip(x, ...)

# S3 method for workflow
extract_recipe(x, ..., estimated = TRUE)

# S3 method for workflow
extract_fit_parsnip(x, ...)

# S3 method for workflow
extract_fit_engine(x, ...)

# S3 method for workflow
extract_mold(x, ...)

# S3 method for workflow
extract_preprocessor(x, ...)

# S3 method for workflow
extract_parameter_set_dials(x, ...)

# S3 method for workflow
extract_parameter_dials(x, parameter, ...)

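Arguments

x: A workflow.
...: Not currently used.
estimated: A logical for whether the original (unfit) recipe or the fitted recipe should be returned. This argument should be named.
parameter: A single string for the parameter ID.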
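To illustrate the estimated argument, the sketch below extracts both versions of a recipe from a fitted workflow and checks their training status with `recipes::fully_trained()`. The object names (`rec`, `wf_fit`, `trained`, `original`) are illustrative, and `linear_reg()` is used with its default engine.

```r
library(workflows)
library(parsnip)
library(recipes)
library(magrittr)

rec <- recipe(mpg ~ cyl + disp, mtcars) %>%
  step_log(disp)

wf_fit <- workflow() %>%
  add_recipe(rec) %>%
  add_model(linear_reg()) %>%
  fit(mtcars)

# Default: the recipe as estimated on the training data
trained <- extract_recipe(wf_fit)

# The original recipe, as supplied to add_recipe(); note the named argument
original <- extract_recipe(wf_fit, estimated = FALSE)

# Compare the training status of the two recipes
fully_trained(trained)
fully_trained(original)
```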

Value

The extracted value from the object x, as described in the description section.

Details

Extracting the underlying engine fit can be helpful for describing the model (via print(), summary(), plot(), etc.) or for variable importance/explainers.
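For instance, a small sketch of inspecting the engine fit might look like the following. The workflow and object names are illustrative; the "lm" engine is assumed so that the extracted object is a plain lm fit that base R's `summary()` understands.

```r
library(workflows)
library(parsnip)
library(magrittr)

wf_fit <- workflow() %>%
  add_formula(mpg ~ cyl + disp) %>%
  add_model(linear_reg() %>% set_engine("lm")) %>%
  fit(mtcars)

# With the "lm" engine, the engine fit is the underlying lm object
engine_fit <- extract_fit_engine(wf_fit)
class(engine_fit)

# Standard lm tooling works on it, e.g. the coefficient table
coef_table <- coef(summary(engine_fit))
coef_table
```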

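However, users should not invoke the predict() method on an extracted model. workflows may have executed preprocessing operations on the data before passing it to the model. Bypassing these can lead to errors or to silently generating incorrect predictions.

Good: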

workflow_fit %>% predict(new_data)

Bad:

workflow_fit %>% extract_fit_engine()  %>% predict(new_data)
# or
workflow_fit %>% extract_fit_parsnip() %>% predict(new_data)
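The following sketch shows one way the "bad" pattern can go wrong without raising an error. It assumes a recipe that log-transforms disp; predicting through the extracted parsnip fit skips that step, so the model sees raw disp values. The object names (`rec`, `wf_fit`, `good`, `bad`) are illustrative.

```r
library(workflows)
library(parsnip)
library(recipes)
library(magrittr)

rec <- recipe(mpg ~ cyl + disp, mtcars) %>%
  step_log(disp)

wf_fit <- workflow() %>%
  add_recipe(rec) %>%
  add_model(linear_reg() %>% set_engine("lm")) %>%
  fit(mtcars)

# Good: the workflow applies the log transformation before predicting
good <- predict(wf_fit, mtcars)

# Bad: the extracted fit receives raw `disp`, so step_log() is bypassed
bad <- predict(extract_fit_parsnip(wf_fit), mtcars)

# Both calls run, but the predictions do not agree
isTRUE(all.equal(good$.pred, bad$.pred))
```

Because neither call fails, the mismatch is easy to miss, which is why predictions should always go through the workflow itself.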

Examples

library(workflows)
library(parsnip)
library(recipes)
library(magrittr)

model <- linear_reg() %>%
  set_engine("lm")

recipe <- recipe(mpg ~ cyl + disp, mtcars) %>%
  step_log(disp)

base_wf <- workflow() %>%
  add_model(model)

recipe_wf <- add_recipe(base_wf, recipe)
formula_wf <- add_formula(base_wf, mpg ~ cyl + log(disp))
variable_wf <- add_variables(base_wf, mpg, c(cyl, disp))

fit_recipe_wf <- fit(recipe_wf, mtcars)
fit_formula_wf <- fit(formula_wf, mtcars)

# The preprocessor is a recipe, formula, or a list holding the
# tidyselect expressions identifying the outcomes/predictors
extract_preprocessor(recipe_wf)
#> 
#> ── Recipe ────────────────────────────────────────────────────────────────
#> 
#> ── Inputs 
#> Number of variables by role
#> outcome:   1
#> predictor: 2
#> 
#> ── Operations 
#> • Log transformation on: disp
extract_preprocessor(formula_wf)
#> mpg ~ cyl + log(disp)
#> <environment: 0x5603e9f1eed0>
extract_preprocessor(variable_wf)
#> $outcomes
#> <quosure>
#> expr: ^mpg
#> env:  0x5603e9f1eed0
#> 
#> $predictors
#> <quosure>
#> expr: ^c(cyl, disp)
#> env:  0x5603e9f1eed0
#> 
#> attr(,"class")
#> [1] "workflow_variables"

# The `spec` is the parsnip spec before it has been fit.
# The `fit` is the fitted parsnip model.
extract_spec_parsnip(fit_formula_wf)
#> Linear Regression Model Specification (regression)
#> 
#> Computational engine: lm 
#> 
extract_fit_parsnip(fit_formula_wf)
#> parsnip model object
#> 
#> 
#> Call:
#> stats::lm(formula = ..y ~ ., data = data)
#> 
#> Coefficients:
#> (Intercept)          cyl  `log(disp)`  
#>     67.6674      -0.1755      -8.7971  
#> 
extract_fit_engine(fit_formula_wf)
#> 
#> Call:
#> stats::lm(formula = ..y ~ ., data = data)
#> 
#> Coefficients:
#> (Intercept)          cyl  `log(disp)`  
#>     67.6674      -0.1755      -8.7971  
#> 

# The mold is returned from `hardhat::mold()`, and contains the
# predictors, outcomes, and information about the preprocessing
# for use on new data at `predict()` time.
extract_mold(fit_recipe_wf)
#> $predictors
#> # A tibble: 32 × 2
#>      cyl  disp
#>    <dbl> <dbl>
#>  1     6  5.08
#>  2     6  5.08
#>  3     4  4.68
#>  4     6  5.55
#>  5     8  5.89
#>  6     6  5.42
#>  7     8  5.89
#>  8     4  4.99
#>  9     4  4.95
#> 10     6  5.12
#> # … with 22 more rows
#> 
#> $outcomes
#> # A tibble: 32 × 1
#>      mpg
#>    <dbl>
#>  1  21  
#>  2  21  
#>  3  22.8
#>  4  21.4
#>  5  18.7
#>  6  18.1
#>  7  14.3
#>  8  24.4
#>  9  22.8
#> 10  19.2
#> # … with 22 more rows
#> 
#> $blueprint
#> Recipe blueprint: 
#>  
#> # Predictors: 2 
#>   # Outcomes: 1 
#>    Intercept: FALSE 
#> Novel Levels: FALSE 
#>  Composition: tibble 
#> 
#> $extras
#> $extras$roles
#> NULL
#> 
#> 

# A useful shortcut is to extract the fitted recipe from the workflow
extract_recipe(fit_recipe_wf)
#> 
#> ── Recipe ────────────────────────────────────────────────────────────────
#> 
#> ── Inputs 
#> Number of variables by role
#> outcome:   1
#> predictor: 2
#> 
#> ── Training information 
#> Training data contained 32 data points and no incomplete rows.
#> 
#> ── Operations 
#> • Log transformation on: disp | Trained

# That is identical to
identical(
  extract_mold(fit_recipe_wf)$blueprint$recipe,
  extract_recipe(fit_recipe_wf)
)
#> [1] TRUE

Source: R/extract.R

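Note: this page was compiled by 纯净天空 from the original English documentation "Extract elements of a workflow" by Davis Vaughan and others. Unless otherwise stated, copyright in the original code belongs to the original authors.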