metric_set()
Combines multiple metric functions into a single new function that computes all of them at once.
Details
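All of the functions must be one of the following:
- Only numeric metrics
- A mix of class metrics and class probability metrics
- A mix of dynamic, integrated, and static survival metrics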
For example, rmse() can be used with mae() because both are numeric metrics, but not with accuracy() because that is a classification metric. accuracy(), however, can be used with roc_auc().
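The compatibility rule above can be checked directly. The following is a minimal sketch, assuming yardstick is installed; the variable names are illustrative only, and the exact wording of the error message may differ between versions.

```r
library(yardstick)

# Two numeric metrics: a valid combination.
numeric_metrics <- metric_set(rmse, mae)

# A class metric together with a class probability metric: also valid.
class_prob_metrics <- metric_set(accuracy, roc_auc)

# A numeric metric mixed with a class metric is rejected when the set is built.
mixed <- tryCatch(
  metric_set(rmse, accuracy),
  error = function(e) e
)

# TRUE when the incompatible combination raised an error.
inherits(mixed, "error")
```

The check happens when the set is constructed, so an invalid combination fails before any data is passed in.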
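The returned metric function has a different argument list depending on whether numeric metrics, a mix of class and class probability metrics, or survival metrics were passed in.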
# Numeric metric set signature:
fn(
  data,
  truth,
  estimate,
  na_rm = TRUE,
  case_weights = NULL,
  ...
)
# Class / prob metric set signature:
fn(
  data,
  truth,
  ...,
  estimate,
  estimator = NULL,
  na_rm = TRUE,
  event_level = yardstick_event_level(),
  case_weights = NULL
)
# Dynamic / integrated / static survival metric set signature:
fn(
  data,
  truth,
  ...,
  estimate,
  na_rm = TRUE,
  case_weights = NULL
)
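One way to see which signature a particular metric set received is to inspect its formal arguments. This is a small sketch, assuming a yardstick version that matches the signatures listed above; older releases may lack some arguments such as case_weights, so no expected output is shown.

```r
library(yardstick)

numeric_set <- metric_set(rmse, mae)
class_prob_set <- metric_set(accuracy, roc_auc)

# Argument names of the numeric metric set function.
names(formals(numeric_set))

# Argument names of the class / prob metric set function.
# Note that `...` comes before `estimate`, so `estimate` must be named.
names(formals(class_prob_set))
```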
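When mixing class and class probability metrics, pass the hard predictions (the factor column) as the named argument estimate, and pass the soft predictions (the class probability columns) to ... as bare column names or tidyselect selectors.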
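As an illustration of this calling convention, the sketch below uses the two_class_example data set that ships with yardstick, where predicted holds the hard predictions and Class1 holds the probability of the first class. The object names are illustrative.

```r
library(yardstick)

class_prob_set <- metric_set(accuracy, roc_auc)

res <- class_prob_set(
  two_class_example,
  truth = truth,
  Class1,               # soft predictions go to `...` as bare column names
  estimate = predicted  # hard predictions must be passed by name
)

res
```

The result contains one row per metric, with accuracy computed from predicted and roc_auc computed from Class1.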
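When mixing dynamic, integrated, and static survival metrics, pass the time predictions as the named argument estimate, and pass the survival predictions to ... as bare column names or tidyselect selectors.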
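A survival metric set follows the same pattern. The sketch below assumes a yardstick release that includes the survival metrics and the lung_surv example data (with columns surv_obj, .pred, and .pred_time), and that the survival package is installed; treat it as an outline rather than a guaranteed recipe for every version.

```r
library(yardstick)

# One static metric (concordance) and one dynamic metric (Brier score).
surv_set <- metric_set(concordance_survival, brier_survival)

res <- surv_set(
  lung_surv,
  truth = surv_obj,
  .pred,                 # survival predictions go to `...`
  estimate = .pred_time  # time predictions must be passed by name
)

res
```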
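If metric_tweak() has been used to "tweak" one of these arguments, such as estimator or event_level, the tweaked value takes precedence. This lets you set the estimator on a metric-by-metric basis and still use the metric in metric_set().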
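The precedence of tweaked arguments can be sketched as follows, assuming yardstick is installed. The name precision_mw is a hypothetical label chosen for this illustration; hpc_cv is a data set shipped with yardstick.

```r
library(yardstick)

# Fix the estimator for precision only; accuracy keeps its default.
precision_mw <- metric_tweak(
  "precision_mw",
  precision,
  estimator = "macro_weighted"
)

class_set <- metric_set(accuracy, precision_mw)

res <- class_set(hpc_cv, truth = obs, estimate = pred)

# The .estimator column shows which estimator each metric actually used.
res
```

Because the estimator is baked into precision_mw, it applies to that metric alone, and the other metrics in the set are unaffected.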
Examples
library(yardstick)
library(dplyr)
# Multiple regression metrics
multi_metric <- metric_set(rmse, rsq, ccc)
# The returned function has arguments:
# fn(data, truth, estimate, na_rm = TRUE, ...)
multi_metric(solubility_test, truth = solubility, estimate = prediction)
#> # A tibble: 3 × 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 rmse    standard       0.722
#> 2 rsq     standard       0.879
#> 3 ccc     standard       0.937
# Groups are respected on the new metric function
class_metrics <- metric_set(accuracy, kap)
hpc_cv %>%
  group_by(Resample) %>%
  class_metrics(obs, estimate = pred)
#> # A tibble: 20 × 4
#>    Resample .metric  .estimator .estimate
#>    <chr>    <chr>    <chr>          <dbl>
#>  1 Fold01   accuracy multiclass     0.726
#>  2 Fold02   accuracy multiclass     0.712
#>  3 Fold03   accuracy multiclass     0.758
#>  4 Fold04   accuracy multiclass     0.712
#>  5 Fold05   accuracy multiclass     0.712
#>  6 Fold06   accuracy multiclass     0.697
#>  7 Fold07   accuracy multiclass     0.675
#>  8 Fold08   accuracy multiclass     0.721
#>  9 Fold09   accuracy multiclass     0.673
#> 10 Fold10   accuracy multiclass     0.699
#> 11 Fold01   kap      multiclass     0.533
#> 12 Fold02   kap      multiclass     0.512
#> 13 Fold03   kap      multiclass     0.594
#> 14 Fold04   kap      multiclass     0.511
#> 15 Fold05   kap      multiclass     0.514
#> 16 Fold06   kap      multiclass     0.486
#> 17 Fold07   kap      multiclass     0.454
#> 18 Fold08   kap      multiclass     0.531
#> 19 Fold09   kap      multiclass     0.454
#> 20 Fold10   kap      multiclass     0.492
# ---------------------------------------------------------------------------
# If you need to set options for certain metrics,
# do so by wrapping the metric and setting the options inside the wrapper,
# passing along truth and estimate as quoted arguments.
# Then add on the function class of the underlying wrapped function,
# and the direction of optimization.
ccc_with_bias <- function(data, truth, estimate, na_rm = TRUE, ...) {
  ccc(
    data = data,
    truth = !!rlang::enquo(truth),
    estimate = !!rlang::enquo(estimate),
    # set bias = TRUE
    bias = TRUE,
    na_rm = na_rm,
    ...
  )
}
# Use `new_numeric_metric()` to formalize this new metric function
ccc_with_bias <- new_numeric_metric(ccc_with_bias, "maximize")
multi_metric2 <- metric_set(rmse, rsq, ccc_with_bias)
multi_metric2(solubility_test, truth = solubility, estimate = prediction)
#> # A tibble: 3 × 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 rmse    standard       0.722
#> 2 rsq     standard       0.879
#> 3 ccc     standard       0.937
# ---------------------------------------------------------------------------
# A class probability example:
# Note that, when given class or class prob functions,
# metric_set() returns a function with signature:
# fn(data, truth, ..., estimate)
# to be able to mix class and class prob metrics.
# You must provide the `estimate` column by explicitly naming
# the argument
class_and_probs_metrics <- metric_set(roc_auc, pr_auc, accuracy)
hpc_cv %>%
  group_by(Resample) %>%
  class_and_probs_metrics(obs, VF:L, estimate = pred)
#> # A tibble: 30 × 4
#>    Resample .metric  .estimator .estimate
#>    <chr>    <chr>    <chr>          <dbl>
#>  1 Fold01   accuracy multiclass     0.726
#>  2 Fold02   accuracy multiclass     0.712
#>  3 Fold03   accuracy multiclass     0.758
#>  4 Fold04   accuracy multiclass     0.712
#>  5 Fold05   accuracy multiclass     0.712
#>  6 Fold06   accuracy multiclass     0.697
#>  7 Fold07   accuracy multiclass     0.675
#>  8 Fold08   accuracy multiclass     0.721
#>  9 Fold09   accuracy multiclass     0.673
#> 10 Fold10   accuracy multiclass     0.699
#> # ℹ 20 more rows
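Note: This article was selected and compiled by 纯净天空 from the original English work "Combine metric functions" by Max Kuhn et al. Unless otherwise stated, copyright in the original code belongs to the original authors; please do not reproduce or copy this translation without permission or authorization.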