R hardhat mold: Mold data for modeling

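mold() applies the appropriate processing steps required to get training data ready to be fed into a model. It does this through the use of various blueprints that understand how to preprocess data that comes in various forms, such as a formula or a recipe.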

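All blueprints have consistent return values with the others, but each is unique enough to have its own help page. See the pages below to learn how to use each one in conjunction with mold().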

XY method - default_xy_blueprint()
Formula method - default_formula_blueprint()
Recipes method - default_recipe_blueprint()
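As a rough illustration of how the method is chosen, the sketch below passes two different kinds of `x` to mold() and inspects the blueprint that comes back. It assumes the hardhat package is installed; the variable names `xy` and `fm` are made up for this example. The recipes method works the same way but additionally needs the recipes package, so it is left out here.

```r
library(hardhat)

# A data frame as `x` (with the outcome as a second argument)
# is handled by the XY method.
xy <- mold(iris["Sepal.Width"], iris$Species)

# A formula as `x` (with the data as a second argument)
# is handled by the formula method.
fm <- mold(Species ~ Sepal.Width, iris)

# Each result carries a blueprint specific to the method that was used.
class(xy$blueprint)
class(fm$blueprint)

# Both are "hardhat_blueprint" objects.
inherits(xy$blueprint, "hardhat_blueprint")
inherits(fm$blueprint, "hardhat_blueprint")
```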

Usage

mold(x, ...)

Arguments

x: An object. See the method-specific implementations linked in the Description for more information.
...: Not used.

Value

A named list containing 4 elements:

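predictors: A tibble containing the molded predictors to be used in the model.
outcomes: A tibble containing the molded outcomes to be used in the model.
blueprint: A method-specific "hardhat_blueprint" object for use when making predictions.
extras: Either NULL if the blueprint returns no extra information, or a named list containing the extra information.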
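The sketch below shows one way the returned list might be used: inspecting its element names, then reusing the stored blueprint on new data. It assumes hardhat is installed and relies on forge(), a hardhat function that is not described on this page; `processed`, `new_data`, and `forged` are illustrative names only.

```r
library(hardhat)

# Mold the training data with the formula method.
processed <- mold(Species ~ Sepal.Width, iris)

# The four named elements of the result.
names(processed)

# Reuse the blueprint at prediction time so that new data gets the
# same preprocessing. Here the "new" data is just the first rows of iris.
new_data <- head(iris)
forged <- forge(new_data, processed$blueprint)

# Predictors for the new rows, in the same layout as the molded ones.
forged$predictors
```

Keeping the blueprint alongside a fitted model is what lets prediction code apply the same preprocessing without repeating the formula or column selection by hand.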

Examples

# See the method specific documentation linked in Description
# for the details of each blueprint, and more examples.

# XY
mold(iris["Sepal.Width"], iris$Species)
#> $predictors
#> # A tibble: 150 × 1
#>    Sepal.Width
#>          <dbl>
#>  1         3.5
#>  2         3  
#>  3         3.2
#>  4         3.1
#>  5         3.6
#>  6         3.9
#>  7         3.4
#>  8         3.4
#>  9         2.9
#> 10         3.1
#> # ℹ 140 more rows
#> 
#> $outcomes
#> # A tibble: 150 × 1
#>    .outcome
#>    <fct>   
#>  1 setosa  
#>  2 setosa  
#>  3 setosa  
#>  4 setosa  
#>  5 setosa  
#>  6 setosa  
#>  7 setosa  
#>  8 setosa  
#>  9 setosa  
#> 10 setosa  
#> # ℹ 140 more rows
#> 
#> $blueprint
#> XY blueprint: 
#>  
#> # Predictors: 1 
#>   # Outcomes: 1 
#>    Intercept: FALSE 
#> Novel Levels: FALSE 
#>  Composition: tibble 
#> 
#> $extras
#> NULL
#> 

# Formula
mold(Species ~ Sepal.Width, iris)
#> $predictors
#> # A tibble: 150 × 1
#>    Sepal.Width
#>          <dbl>
#>  1         3.5
#>  2         3  
#>  3         3.2
#>  4         3.1
#>  5         3.6
#>  6         3.9
#>  7         3.4
#>  8         3.4
#>  9         2.9
#> 10         3.1
#> # ℹ 140 more rows
#> 
#> $outcomes
#> # A tibble: 150 × 1
#>    Species
#>    <fct>  
#>  1 setosa 
#>  2 setosa 
#>  3 setosa 
#>  4 setosa 
#>  5 setosa 
#>  6 setosa 
#>  7 setosa 
#>  8 setosa 
#>  9 setosa 
#> 10 setosa 
#> # ℹ 140 more rows
#> 
#> $blueprint
#> Formula blueprint: 
#>  
#> # Predictors: 1 
#>   # Outcomes: 1 
#>    Intercept: FALSE 
#> Novel Levels: FALSE 
#>  Composition: tibble 
#>   Indicators: traditional 
#> 
#> $extras
#> $extras$offset
#> NULL
#> 
#> 

# Recipe
library(recipes)
mold(recipe(Species ~ Sepal.Width, iris), iris)
#> $predictors
#> # A tibble: 150 × 1
#>    Sepal.Width
#>          <dbl>
#>  1         3.5
#>  2         3  
#>  3         3.2
#>  4         3.1
#>  5         3.6
#>  6         3.9
#>  7         3.4
#>  8         3.4
#>  9         2.9
#> 10         3.1
#> # ℹ 140 more rows
#> 
#> $outcomes
#> # A tibble: 150 × 1
#>    Species
#>    <fct>  
#>  1 setosa 
#>  2 setosa 
#>  3 setosa 
#>  4 setosa 
#>  5 setosa 
#>  6 setosa 
#>  7 setosa 
#>  8 setosa 
#>  9 setosa 
#> 10 setosa 
#> # ℹ 140 more rows
#> 
#> $blueprint
#> Recipe blueprint: 
#>  
#> # Predictors: 1 
#>   # Outcomes: 1 
#>    Intercept: FALSE 
#> Novel Levels: FALSE 
#>  Composition: tibble 
#> 
#> $extras
#> $extras$roles
#> NULL
#> 
#>
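The examples above all use the default blueprint for each method. As a hedged sketch of customizing one, the code below builds an XY blueprint with `intercept = TRUE` and passes it through the `blueprint` argument of the data frame method. It assumes hardhat is installed; `bp` and `with_intercept` are names chosen for this illustration, and the method-specific help pages remain the reference for the available options.

```r
library(hardhat)

# Ask the XY blueprint to add an intercept column to the predictors.
bp <- default_xy_blueprint(intercept = TRUE)

with_intercept <- mold(
  iris["Sepal.Width"],
  iris$Species,
  blueprint = bp
)

# The molded predictors now contain an intercept column
# in addition to Sepal.Width.
colnames(with_intercept$predictors)
```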

Source code: R/mold.R

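Note: This article was selected and compiled by 純淨天空 from the original English work "Mold data for modeling" by Davis Vaughan and others. Unless otherwise stated, copyright of the original code belongs to the original authors. This translation may not be reproduced or copied without permission or authorization.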