fit_best()
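fit_best() takes the results from tuning many models and fits the workflow configuration associated with the best performance to the training set.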
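Arguments
- x
- A workflow_set object that has been evaluated with workflow_map(). Note that the workflow set must have been fitted with the control option save_workflow = TRUE.
- metric
- A character string giving the metric to rank results by.
- ...
- Additional options to pass to tune::fit_best.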
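As a rough illustration of these arguments, the sketch below is not part of the package documentation. The mtcars data, the two formulas, and the object names (folds, wf_set, wf_res, best_fit) are assumptions chosen to keep the example small. It sets save_workflow = TRUE before calling workflow_map(), then names the ranking metric explicitly. The verbose = TRUE option is assumed to be forwarded through ... to tune::fit_best().

```r
library(workflowsets)
library(parsnip)
library(rsample)
library(tune)

set.seed(1)
# Hypothetical resamples on a built-in data set (illustration only)
folds <- vfold_cv(mtcars, v = 3)

# Two candidate workflows: same linear model, different formulas
wf_set <- workflow_set(
  preproc = list(small = mpg ~ wt, large = mpg ~ wt + hp + disp),
  models = list(lm = linear_reg())
)

# save_workflow = TRUE must be set before the set is evaluated
wf_res <- wf_set |>
  option_add(control = control_grid(save_workflow = TRUE)) |>
  workflow_map(resamples = folds, seed = 1)

# Rank by R-squared; verbose is assumed to reach tune::fit_best() via `...`
best_fit <- fit_best(wf_res, metric = "rsq", verbose = TRUE)
best_fit
```

If metric is omitted, the results are ranked by a default metric, as in the first fit_best() call of the Examples section.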
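Details
This function is a shortcut for the steps needed to fit the numerically optimal configuration in a fitted workflow set. It ranks the results, extracts the tuning result that contains the best configuration, and then calls fit_best() (itself a wrapper) again on that tuning result.
In pseudocode: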
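# rank all configurations, keeping only the best one per workflow
rankings <- rank_results(wf_set, rank_metric = metric, select_best = TRUE)
# pull the tuning result of the top-ranked workflow
tune_res <- extract_workflow_set_result(wf_set, rankings$wflow_id[1])
# fit that workflow's best configuration to the training set
fit_best(tune_res, metric = metric)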
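The pseudocode above can be spelled out as a runnable sketch. Everything here other than the workflowsets and tune functions is an assumption made for illustration: the mtcars data, the formulas, and the object names are not from the package documentation.

```r
library(workflowsets)
library(parsnip)
library(rsample)
library(tune)

set.seed(1)
folds <- vfold_cv(mtcars, v = 3)  # hypothetical resamples

wf_set <- workflow_set(
  preproc = list(small = mpg ~ wt, large = mpg ~ wt + hp + disp),
  models = list(lm = linear_reg())
)

wf_res <- wf_set |>
  option_add(control = control_grid(save_workflow = TRUE)) |>
  workflow_map(resamples = folds, seed = 1)

# Step 1: rank results, keeping the best configuration of each workflow
rankings <- rank_results(wf_res, rank_metric = "rmse", select_best = TRUE)
best_id <- rankings$wflow_id[1]

# Step 2: extract the tuning result for the top-ranked workflow
tune_res <- extract_workflow_set_result(wf_res, id = best_id)

# Step 3: fit that configuration to the training set
manual_fit <- fit_best(tune_res, metric = "rmse")

# The shortcut should select the same workflow in a single call
shortcut_fit <- fit_best(wf_res, metric = "rmse")
```

Both manual_fit and shortcut_fit are trained workflows. The manual route is mainly useful when the intermediate rankings or tuning result need to be inspected.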
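Note
The package supplies two pre-generated workflow sets, two_class_set and chi_features_set, and associated sets of model fits, two_class_res and chi_features_res.
The two_class_* objects are based on a binary classification problem using the two_class_dat data from the modeldata package. The six models use either a bare formula or a basic recipe with recipes::step_YeoJohnson() as the preprocessor, combined with a decision tree, logistic regression, or MARS model specification. See ?two_class_set for the source code.
The chi_features_* objects are based on a regression problem using the Chicago data from the modeldata package. Each of the three models uses a linear regression model specification, with three different recipes of varying complexity. The objects are meant to approximate the sequence of models built in Section 1.3 of Kuhn and Johnson (2019). See ?chi_features_set for the source code.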
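A quick way to get oriented with these shipped objects is to list their workflow identifiers and rank the stored results. This is a minimal sketch that uses only the objects named above; choosing "rmse" as the ranking metric is an assumption based on the regression output shown in the Examples section.

```r
library(workflowsets)

# Workflow identifiers in each pre-generated set
two_class_set$wflow_id
chi_features_set$wflow_id

# Best configuration per workflow in the stored regression results
rank_results(chi_features_res, rank_metric = "rmse", select_best = TRUE)
```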
Examples
library(workflowsets)
library(tune)
library(modeldata)
library(rsample)

data(Chicago)
Chicago <- Chicago[1:1195, ]

time_val_split <-
  sliding_period(
    Chicago,
    date,
    "month",
    lookback = 38,
    assess_stop = 1
  )

chi_features_set
#> # A workflow set/tibble: 3 × 4
#> wflow_id info option result
#> <chr> <list> <list> <list>
#> 1 date_lm <tibble [1 × 4]> <opts[0]> <list [0]>
#> 2 plus_holidays_lm <tibble [1 × 4]> <opts[0]> <list [0]>
#> 3 plus_pca_lm <tibble [1 × 4]> <opts[0]> <list [0]>
chi_features_res_new <-
  chi_features_set %>%
  # note: must set `save_workflow = TRUE` to use `fit_best()`
  option_add(control = control_grid(save_workflow = TRUE)) %>%
  # evaluate with resamples
  workflow_map(resamples = time_val_split, grid = 21, seed = 1, verbose = TRUE)
#> i No tuning parameters. `fit_resamples()` will be attempted
#> i 1 of 3 resampling: date_lm
#> → A | warning: prediction from a rank-deficient fit may be misleading
#> There were issues with some computations A: x1
#>
#> ✔ 1 of 3 resampling: date_lm (662ms)
#> i No tuning parameters. `fit_resamples()` will be attempted
#> i 2 of 3 resampling: plus_holidays_lm
#> → A | warning: prediction from a rank-deficient fit may be misleading
#> There were issues with some computations A: x1
#>
#> ✔ 2 of 3 resampling: plus_holidays_lm (693ms)
#> i 3 of 3 tuning: plus_pca_lm
#> → A | warning: prediction from a rank-deficient fit may be misleading
#> There were issues with some computations A: x4
#>
#> ✔ 3 of 3 tuning: plus_pca_lm (2.3s)
chi_features_res_new
#> # A workflow set/tibble: 3 × 4
#> wflow_id info option result
#> <chr> <list> <list> <list>
#> 1 date_lm <tibble [1 × 4]> <opts[3]> <rsmp[+]>
#> 2 plus_holidays_lm <tibble [1 × 4]> <opts[3]> <rsmp[+]>
#> 3 plus_pca_lm <tibble [1 × 4]> <opts[3]> <tune[+]>
# sort models by performance metrics
rank_results(chi_features_res_new)
#> # A tibble: 12 × 9
#> wflow_id .config .metric mean std_err n preprocessor model rank
#> <chr> <chr> <chr> <dbl> <dbl> <int> <chr> <chr> <int>
#> 1 plus_pca_… Prepro… rmse 0.586 NA 1 recipe line… 1
#> 2 plus_pca_… Prepro… rsq 0.989 NA 1 recipe line… 1
#> 3 plus_pca_… Prepro… rmse 0.590 NA 1 recipe line… 2
#> 4 plus_pca_… Prepro… rsq 0.988 NA 1 recipe line… 2
#> 5 plus_pca_… Prepro… rmse 0.591 NA 1 recipe line… 3
#> 6 plus_pca_… Prepro… rsq 0.988 NA 1 recipe line… 3
#> 7 plus_pca_… Prepro… rmse 0.594 NA 1 recipe line… 4
#> 8 plus_pca_… Prepro… rsq 0.989 NA 1 recipe line… 4
#> 9 plus_holi… Prepro… rmse 0.646 NA 1 recipe line… 5
#> 10 plus_holi… Prepro… rsq 0.986 NA 1 recipe line… 5
#> 11 date_lm Prepro… rmse 0.733 NA 1 recipe line… 6
#> 12 date_lm Prepro… rsq 0.982 NA 1 recipe line… 6
# fit the numerically optimal configuration to the training set
chi_features_wf <- fit_best(chi_features_res_new)
chi_features_wf
#> ══ Workflow [trained] ════════════════════════════════════════════════════
#> Preprocessor: Recipe
#> Model: linear_reg()
#>
#> ── Preprocessor ──────────────────────────────────────────────────────────
#> 5 Recipe Steps
#>
#> • step_date()
#> • step_holiday()
#> • step_dummy()
#> • step_zv()
#> • step_pca()
#>
#> ── Model ─────────────────────────────────────────────────────────────────
#>
#> Call:
#> stats::lm(formula = ..y ~ ., data = data)
#>
#> Coefficients:
#> (Intercept) temp_min temp
#> 5.067e+02 -4.811e-04 6.885e-02
#> temp_max temp_change dew
#> 9.511e-04 NA -5.110e-02
#> humidity pressure pressure_change
#> 2.516e-02 6.921e-01 2.230e-02
#> wind wind_max gust
#> -1.642e-02 1.409e-04 3.146e-03
#> gust_max percip percip_max
#> 7.870e-03 -7.111e+00 2.199e-01
#> weather_rain weather_snow weather_cloud
#> -6.168e-01 -2.689e-01 -9.951e-02
#> weather_storm Blackhawks_Away Blackhawks_Home
#> 2.603e-01 -1.245e-01 -1.114e-01
#> Bulls_Away Bulls_Home Bears_Away
#> 9.407e-02 1.833e-01 3.306e-01
#> Bears_Home WhiteSox_Away WhiteSox_Home
#> 3.531e-01 -5.198e-01 NA
#> Cubs_Away Cubs_Home date_year
#> NA NA -2.638e-01
#> date_LaborDay date_NewYearsDay date_ChristmasDay
#> 5.166e-01 -1.275e+01 -1.308e+01
#> date_dow_Mon date_dow_Tue date_dow_Wed
#> 1.232e+01 1.345e+01 1.348e+01
#> date_dow_Thu date_dow_Fri date_dow_Sat
#> 1.325e+01 1.281e+01 9.855e-01
#> date_month_Feb date_month_Mar date_month_Apr
#> 4.218e-02 3.897e-01 5.472e-01
#> date_month_May date_month_Jun date_month_Jul
#> 2.842e-01 9.032e-01 3.897e-01
#> date_month_Aug date_month_Sep date_month_Oct
#> 4.855e-01 1.588e-01 6.197e-01
#> date_month_Nov date_month_Dec PC1
#> -4.350e-01 -8.359e-01 2.979e-02
#> PC2 PC3
#> 1.225e-01 -1.722e-01
#>
# to select optimal value based on a specific metric:
fit_best(chi_features_res_new, metric = "rmse")
#> ══ Workflow [trained] ════════════════════════════════════════════════════
#> Preprocessor: Recipe
#> Model: linear_reg()
#>
#> ── Preprocessor ──────────────────────────────────────────────────────────
#> 5 Recipe Steps
#>
#> • step_date()
#> • step_holiday()
#> • step_dummy()
#> • step_zv()
#> • step_pca()
#>
#> ── Model ─────────────────────────────────────────────────────────────────
#>
#> Call:
#> stats::lm(formula = ..y ~ ., data = data)
#>
#> Coefficients:
#> (Intercept) temp_min temp
#> 5.067e+02 -4.811e-04 6.885e-02
#> temp_max temp_change dew
#> 9.511e-04 NA -5.110e-02
#> humidity pressure pressure_change
#> 2.516e-02 6.921e-01 2.230e-02
#> wind wind_max gust
#> -1.642e-02 1.409e-04 3.146e-03
#> gust_max percip percip_max
#> 7.870e-03 -7.111e+00 2.199e-01
#> weather_rain weather_snow weather_cloud
#> -6.168e-01 -2.689e-01 -9.951e-02
#> weather_storm Blackhawks_Away Blackhawks_Home
#> 2.603e-01 -1.245e-01 -1.114e-01
#> Bulls_Away Bulls_Home Bears_Away
#> 9.407e-02 1.833e-01 3.306e-01
#> Bears_Home WhiteSox_Away WhiteSox_Home
#> 3.531e-01 -5.198e-01 NA
#> Cubs_Away Cubs_Home date_year
#> NA NA -2.638e-01
#> date_LaborDay date_NewYearsDay date_ChristmasDay
#> 5.166e-01 -1.275e+01 -1.308e+01
#> date_dow_Mon date_dow_Tue date_dow_Wed
#> 1.232e+01 1.345e+01 1.348e+01
#> date_dow_Thu date_dow_Fri date_dow_Sat
#> 1.325e+01 1.281e+01 9.855e-01
#> date_month_Feb date_month_Mar date_month_Apr
#> 4.218e-02 3.897e-01 5.472e-01
#> date_month_May date_month_Jun date_month_Jul
#> 2.842e-01 9.032e-01 3.897e-01
#> date_month_Aug date_month_Sep date_month_Oct
#> 4.855e-01 1.588e-01 6.197e-01
#> date_month_Nov date_month_Dec PC1
#> -4.350e-01 -8.359e-01 2.979e-02
#> PC2 PC3
#> 1.225e-01 -1.722e-01
#>
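Related usage
- R workflowsets extract_workflow_set_result Extract elements of workflow sets
- R workflowsets comment_add Add annotations and comments for workflows
- R workflowsets option_add Add and edit options saved in a workflow set
- R workflowsets leave_var_out_formulas Create formulas without each predictor
- R workflowsets collect_metrics.workflow_set Obtain and format results produced by tuning functions for workflow sets
- R workflowsets workflow_map Process a series of workflows
- R workflowsets as_workflow_set Convert existing objects to a workflow set
- R workflowsets option_list Make a classed list of options
- R workflowsets rank_results Rank the results by a metric
- R workflowsets workflow_set Generate a set of workflow objects from preprocessing and model objects
- R workflowsets pull_workflow_set_result Extract elements from a workflow set
- R workflowsets autoplot.workflow_set Plot the results of a workflow set
- R workflowsets update_workflow_model Update components of a workflow within a workflow set
- R workflows add_model Add a model to a workflow
- R workflows workflow Create a workflow
- R workflows extract-workflow Extract elements of a workflow
- R workflows add_variables Add variables to a workflow
- R workflows add_formula Add formula terms to a workflow
- R workflows predict-workflow Predict from a workflow
- R workflows augment.workflow Augment data with predictions
- R workflows add_recipe Add a recipe to a workflow
- R workflows glance.workflow Glance at a workflow model
- R workflows is_trained_workflow Determine if a workflow has been trained
- R workflows fit-workflow Fit a workflow object
- R workflows add_case_weights Add case weights to a workflow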
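Note: This article was compiled by 純淨天空 from the original English work by Max Kuhn et al., "Fit a model to the numerically optimal configuration". Unless otherwise stated, copyright in the original code belongs to the original authors. Do not reproduce or copy this translation without permission.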