R hardhat validate_prediction_size: Ensure that predictions have the correct number of rows

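validate - asserts the following:

The size of pred must be the same as the size of new_data.

check - returns the following:

ok: A logical. Does the check pass?
size_new_data: A single numeric. The size of new_data.
size_pred: A single numeric. The size of pred.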

Usage

validate_prediction_size(pred, new_data)

check_prediction_size(pred, new_data)

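Arguments

pred: A tibble. The predictions to return from any prediction type. This is often created using one of the spruce functions, such as spruce_numeric().
new_data: A data frame of new predictors and possibly outcomes.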

Value

validate_prediction_size() returns pred invisibly.

check_prediction_size() returns a named list of three components: ok, size_new_data, and size_pred.

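Details

This validation function is aimed at developers rather than end users. It is a final check to run right before a value is returned from your specific predict() method, and is mainly a "good practice" sanity check to ensure that your prediction blueprint always returns the same number of rows as new_data, which is one of the modeling conventions this package tries to promote.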
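As a minimal sketch of where this check might sit, the helper below computes predictions, tidies them with spruce_numeric(), and validates their size just before returning. The function predict_linear() and its coefs argument are hypothetical names used only for illustration; they are not part of hardhat.

```r
library(hardhat)

# Hypothetical prediction helper (not part of hardhat).
# `coefs` is assumed to hold one slope per numeric column of `new_data`.
predict_linear <- function(coefs, new_data) {
  # Compute raw predictions and drop the matrix attributes
  pred_vec <- as.vector(as.matrix(new_data) %*% coefs)

  # Standardize the output into a tibble with a `.pred` column
  pred <- spruce_numeric(pred_vec)

  # Final sanity check: one prediction per row of `new_data`
  validate_prediction_size(pred, new_data)

  pred
}

new_data <- mtcars[1:5, c("cyl", "disp")]
out <- predict_linear(c(0.5, 0.01), new_data)
nrow(out)
```

Because the check runs last, a bug earlier in the function that drops or duplicates rows is reported as an error instead of silently reaching the caller.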

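Validation

hardhat provides validation functions at two levels.

check_*(): Checks a condition and returns a list. The list always contains at least one element, ok, a logical that specifies whether the check passed. Each check also returns check-specific elements in the list, which can be used to construct meaningful error messages.
validate_*(): Checks a condition and throws an error if it does not pass. These functions call the corresponding check function and then supply a default error message. If you, as a developer, want a different error message, call the check_*() function yourself and provide your own validation function.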
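The two levels described above can be combined along the following lines. This is only a sketch: validate_my_prediction_size() is a hypothetical name, and the error wording is an arbitrary choice rather than anything hardhat prescribes.

```r
library(hardhat)

# Hypothetical custom validator built on top of check_prediction_size()
validate_my_prediction_size <- function(pred, new_data) {
  check <- check_prediction_size(pred, new_data)

  if (!check$ok) {
    # Use the check-specific elements to build a custom message
    stop(
      "Expected ", check$size_new_data,
      " predictions (one per row of `new_data`), but got ",
      check$size_pred, ".",
      call. = FALSE
    )
  }

  # Mirror validate_prediction_size() by returning `pred` invisibly
  invisible(pred)
}

new_data <- mtcars[1:5, ]

# Passes: 5 predictions for 5 rows
ok_result <- validate_my_prediction_size(spruce_numeric(1:5), new_data)

# Fails: 4 predictions for 5 rows; try() captures the error
bad_result <- try(
  validate_my_prediction_size(spruce_numeric(1:4), new_data),
  silent = TRUE
)
```

Keeping the condition in check_prediction_size() and only the message in the wrapper means the custom validator stays consistent with the default one if the underlying check changes.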

See Also

Other validation functions: validate_column_names(), validate_no_formula_duplication(), validate_outcomes_are_binary(), validate_outcomes_are_factors(), validate_outcomes_are_numeric(), validate_outcomes_are_univariate(), validate_predictors_are_numeric()

Examples

library(hardhat)

# Say new_data has 5 rows
new_data <- mtcars[1:5, ]

# And somehow you generate predictions
# for those 5 rows
pred_vec <- 1:5

# Then you use `spruce_numeric()` to clean
# up these numeric predictions
pred <- spruce_numeric(pred_vec)

pred
#> # A tibble: 5 × 1
#>   .pred
#>   <int>
#> 1     1
#> 2     2
#> 3     3
#> 4     4
#> 5     5

# Use this check to ensure that
# the number of rows of pred matches new_data
check_prediction_size(pred, new_data)
#> $ok
#> [1] TRUE
#> 
#> $size_new_data
#> [1] 5
#> 
#> $size_pred
#> [1] 5
#> 

# An informative error message is thrown
# if the rows are different
try(validate_prediction_size(spruce_numeric(1:4), new_data))
#> Error in validate_prediction_size(spruce_numeric(1:4), new_data) : 
#>   The size of `new_data` (5) must match the size of `pred` (4).

Source: R/validation.R

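Note: This article was selected and compiled by 纯净天空 from the original English work "Ensure that predictions have the correct number of rows" by Davis Vaughan and others. Unless otherwise stated, copyright of the original code belongs to the original authors. This translation may not be reproduced or copied without permission or authorization.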