metric_set()
Combines multiple metric functions into a single new function that computes all of them at once.
Details
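All functions must be one of:
- only numeric metrics
- a mix of class metrics or class probability metrics
- a mix of dynamic, integrated, and static survival metrics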
For example, rmse() can be used with mae() because both are numeric metrics, but not with accuracy() because it is a classification metric. However, accuracy() can be used with roc_auc().
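As a small illustration of this compatibility rule, the sketch below builds two valid metric sets and then checks that mixing a numeric metric with a class metric is rejected. It assumes the yardstick package is attached; the variable names are illustrative only, and the exact wording of the error may differ between versions.

```r
library(yardstick)

# Two numeric metrics combine without complaint
numeric_metrics <- metric_set(rmse, mae)

# A class metric and a class probability metric can also be combined
class_prob_metrics <- metric_set(accuracy, roc_auc)

# Mixing a numeric metric with a class metric is expected to fail,
# so the error is captured instead of stopping the script
mixed <- tryCatch(
  metric_set(rmse, accuracy),
  error = function(e) e
)
inherits(mixed, "error")
```

Capturing the condition with tryCatch() keeps the example runnable from top to bottom while still showing that the invalid combination fails.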
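The returned metric function has a different argument list depending on whether numeric metrics, a mix of class and class probability metrics, or survival metrics were passed in.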
# Numeric metric set signature:
fn(
  data,
  truth,
  estimate,
  na_rm = TRUE,
  case_weights = NULL,
  ...
)

# Class / prob metric set signature:
fn(
  data,
  truth,
  ...,
  estimate,
  estimator = NULL,
  na_rm = TRUE,
  event_level = yardstick_event_level(),
  case_weights = NULL
)

# Dynamic / integrated / static survival metric set signature:
fn(
  data,
  truth,
  ...,
  estimate,
  na_rm = TRUE,
  case_weights = NULL
)
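One way to confirm which signature a given metric set received is to inspect its formal arguments. This is only a sketch, assuming yardstick is attached and that the returned object is an ordinary R closure; `numeric_set` and `class_prob_set` are illustrative names.

```r
library(yardstick)

numeric_set <- metric_set(rmse, mae)
class_prob_set <- metric_set(accuracy, roc_auc)

# List the argument names of each returned function
names(formals(numeric_set))
names(formals(class_prob_set))
```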
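When mixing class and class probability metrics, pass the hard predictions (the factor column) as the named argument estimate, and pass the soft predictions (the class probability columns) as bare column names or tidyselect selectors to `...`.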
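A minimal two-class sketch of this calling convention follows. It assumes the `two_class_example` data set shipped with yardstick, with columns `truth`, `Class1`, and `predicted`; `class_and_prob` is an illustrative name.

```r
library(yardstick)

class_and_prob <- metric_set(accuracy, roc_auc)

# `predicted` holds the hard class predictions and must be named `estimate`;
# `Class1` holds the probability of the first factor level and goes to `...`
res <- class_and_prob(
  two_class_example,
  truth = truth,
  Class1,
  estimate = predicted
)
res
```

For a binary outcome only the probability column of the event level is supplied, whereas a multiclass outcome needs one probability column per class.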
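When mixing dynamic, integrated, and static survival metrics, pass the time predictions as the named argument estimate, and pass the survival predictions as bare column names or tidyselect selectors to `...`.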
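The survival case might look like the sketch below. It rests on several assumptions: a yardstick version that includes the survival metrics, the bundled `lung_surv` data set with columns `surv_obj`, `.pred_time`, and `.pred`, and an installed survival package. `survival_metrics` is an illustrative name.

```r
library(yardstick)

# One static metric and one integrated metric in the same set
survival_metrics <- metric_set(concordance_survival, brier_survival_integrated)

# Assumed columns: `surv_obj` is the observed Surv object, `.pred_time` the
# predicted event time, and `.pred` a list column of survival probabilities
survival_metrics(
  lung_surv,
  truth = surv_obj,
  .pred,
  estimate = .pred_time
)
```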
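If metric_tweak() has been used to "tweak" one of these arguments, such as estimator or event_level, the tweaked value wins. This lets you set the estimator on a metric-by-metric basis and still use the metric in metric_set().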
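As a sketch of this behaviour, the example below tweaks the estimator of precision() and places the tweaked metric next to the untweaked one. It assumes a yardstick version in which tweaked `estimator` values are honoured by metric_set(), plus the bundled `hpc_cv` data; the name `precision_micro` is hypothetical.

```r
library(yardstick)

# Hypothetical name for a micro-averaged variant of precision()
precision_micro <- metric_tweak("precision_micro", precision, estimator = "micro")

tweaked_set <- metric_set(precision, precision_micro)

# precision() falls back to its default multiclass estimator, while
# precision_micro is expected to keep its tweaked `estimator = "micro"`
res <- tweaked_set(hpc_cv, truth = obs, estimate = pred)
res
```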
Examples
library(yardstick)
library(dplyr)
# Multiple regression metrics
multi_metric <- metric_set(rmse, rsq, ccc)
# The returned function has arguments:
# fn(data, truth, estimate, na_rm = TRUE, ...)
multi_metric(solubility_test, truth = solubility, estimate = prediction)
#> # A tibble: 3 × 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 rmse    standard       0.722
#> 2 rsq     standard       0.879
#> 3 ccc     standard       0.937
# Groups are respected on the new metric function
class_metrics <- metric_set(accuracy, kap)
hpc_cv %>%
  group_by(Resample) %>%
  class_metrics(obs, estimate = pred)
#> # A tibble: 20 × 4
#>    Resample .metric  .estimator .estimate
#>    <chr>    <chr>    <chr>          <dbl>
#>  1 Fold01   accuracy multiclass     0.726
#>  2 Fold02   accuracy multiclass     0.712
#>  3 Fold03   accuracy multiclass     0.758
#>  4 Fold04   accuracy multiclass     0.712
#>  5 Fold05   accuracy multiclass     0.712
#>  6 Fold06   accuracy multiclass     0.697
#>  7 Fold07   accuracy multiclass     0.675
#>  8 Fold08   accuracy multiclass     0.721
#>  9 Fold09   accuracy multiclass     0.673
#> 10 Fold10   accuracy multiclass     0.699
#> 11 Fold01   kap      multiclass     0.533
#> 12 Fold02   kap      multiclass     0.512
#> 13 Fold03   kap      multiclass     0.594
#> 14 Fold04   kap      multiclass     0.511
#> 15 Fold05   kap      multiclass     0.514
#> 16 Fold06   kap      multiclass     0.486
#> 17 Fold07   kap      multiclass     0.454
#> 18 Fold08   kap      multiclass     0.531
#> 19 Fold09   kap      multiclass     0.454
#> 20 Fold10   kap      multiclass     0.492
# ---------------------------------------------------------------------------
# If you need to set options for certain metrics,
# do so by wrapping the metric and setting the options inside the wrapper,
# passing along truth and estimate as quoted arguments.
# Then add on the function class of the underlying wrapped function,
# and the direction of optimization.
ccc_with_bias <- function(data, truth, estimate, na_rm = TRUE, ...) {
  ccc(
    data = data,
    truth = !!rlang::enquo(truth),
    estimate = !!rlang::enquo(estimate),
    # set bias = TRUE
    bias = TRUE,
    na_rm = na_rm,
    ...
  )
}
# Use `new_numeric_metric()` to formalize this new metric function
ccc_with_bias <- new_numeric_metric(ccc_with_bias, "maximize")
multi_metric2 <- metric_set(rmse, rsq, ccc_with_bias)
multi_metric2(solubility_test, truth = solubility, estimate = prediction)
#> # A tibble: 3 × 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 rmse    standard       0.722
#> 2 rsq     standard       0.879
#> 3 ccc     standard       0.937
# ---------------------------------------------------------------------------
# A class probability example:
# Note that, when given class or class prob functions,
# metric_set() returns a function with signature:
# fn(data, truth, ..., estimate)
# to be able to mix class and class prob metrics.
# You must provide the `estimate` column by explicitly naming
# the argument
class_and_probs_metrics <- metric_set(roc_auc, pr_auc, accuracy)
hpc_cv %>%
  group_by(Resample) %>%
  class_and_probs_metrics(obs, VF:L, estimate = pred)
#> # A tibble: 30 × 4
#>    Resample .metric  .estimator .estimate
#>    <chr>    <chr>    <chr>          <dbl>
#>  1 Fold01   accuracy multiclass     0.726
#>  2 Fold02   accuracy multiclass     0.712
#>  3 Fold03   accuracy multiclass     0.758
#>  4 Fold04   accuracy multiclass     0.712
#>  5 Fold05   accuracy multiclass     0.712
#>  6 Fold06   accuracy multiclass     0.697
#>  7 Fold07   accuracy multiclass     0.675
#>  8 Fold08   accuracy multiclass     0.721
#>  9 Fold09   accuracy multiclass     0.673
#> 10 Fold10   accuracy multiclass     0.699
#> # ℹ 20 more rows
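Note: This article was selected and compiled by 純淨天空 from the original English work "Combine metric functions" by Max Kuhn et al. Unless otherwise stated, copyright in the original code belongs to the original authors; do not reproduce or copy this translation without permission or authorization.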