R dplyr case_when 通用向量化 if-else

此函数允许您对多个if_else() 语句进行向量化。按顺序评估每种情况，每个元素的第一个匹配确定输出向量中的相应值。如果没有匹配的情况，则 .default 将用作最终的 "else" 语句。

case_when() 是 SQL "searched" CASE WHEN 语句的 R 等效项。

用法

case_when(..., .default = NULL, .ptype = NULL, .size = NULL)

参数

...

< dynamic-dots > 一系列双边公式。左侧 (LHS) 确定哪些值与此情况匹配。右侧 (RHS) 提供重置值。

LHS 输入必须计算为逻辑向量。

RHS 输入将被强制为其通用类型。

所有输入将被回收到其常见大小。也就是说，我们鼓励所有 LHS 输入的大小相同。回收主要对 RHS 输入有用，您可以提供大小为 1 的输入，该输入将被回收到 LHS 输入的大小。

NULL 输入被忽略。

.default

当所有 LHS 输入返回 FALSE 或 NA 时使用的值。

.default 的大小必须为 1 或与从 ... 计算出的通用大小相同。

.default 参与具有 RHS 输入的公共类型的计算。

LHS 条件中的 NA 值的处理方式与 FALSE 类似，这意味着这些位置的结果将被分配为 .default 值。要以不同方式处理条件中的缺失值，您必须在它们落入 .default 之前使用另一个条件显式捕获它们。这通常涉及 is.na(x) ~ value 的一些变体，适合您对 case_when() 的使用。

如果是NULL(默认值)，则将使用缺失值。

.ptype

声明所需输出类型的可选原型。如果提供，它将覆盖 RHS 输入的通用类型。

.size

声明所需输出大小的可选大小。如果提供，这将覆盖从 ... 计算出的通用大小。

值

向量的大小与根据 ... 中的输入计算出的通用大小相同，且类型与 ... 中的 RHS 输入的通用类型相同。

也可以看看

case_match()

例子

x <- 1:70
case_when(
  x %% 35 == 0 ~ "fizz buzz",
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz",
  .default = as.character(x)
)
#>  [1] "1"         "2"         "3"         "4"         "fizz"     
#>  [6] "6"         "buzz"      "8"         "9"         "fizz"     
#> [11] "11"        "12"        "13"        "buzz"      "fizz"     
#> [16] "16"        "17"        "18"        "19"        "fizz"     
#> [21] "buzz"      "22"        "23"        "24"        "fizz"     
#> [26] "26"        "27"        "buzz"      "29"        "fizz"     
#> [31] "31"        "32"        "33"        "34"        "fizz buzz"
#> [36] "36"        "37"        "38"        "39"        "fizz"     
#> [41] "41"        "buzz"      "43"        "44"        "fizz"     
#> [46] "46"        "47"        "48"        "buzz"      "fizz"     
#> [51] "51"        "52"        "53"        "54"        "fizz"     
#> [56] "buzz"      "57"        "58"        "59"        "fizz"     
#> [61] "61"        "62"        "buzz"      "64"        "fizz"     
#> [66] "66"        "67"        "68"        "69"        "fizz buzz"

# Like an if statement, the arguments are evaluated in order, so you must
# proceed from the most specific to the most general. This won't work:
case_when(
  x %%  5 == 0 ~ "fizz",
  x %%  7 == 0 ~ "buzz",
  x %% 35 == 0 ~ "fizz buzz",
  .default = as.character(x)
)
#>  [1] "1"    "2"    "3"    "4"    "fizz" "6"    "buzz" "8"    "9"    "fizz"
#> [11] "11"   "12"   "13"   "buzz" "fizz" "16"   "17"   "18"   "19"   "fizz"
#> [21] "buzz" "22"   "23"   "24"   "fizz" "26"   "27"   "buzz" "29"   "fizz"
#> [31] "31"   "32"   "33"   "34"   "fizz" "36"   "37"   "38"   "39"   "fizz"
#> [41] "41"   "buzz" "43"   "44"   "fizz" "46"   "47"   "48"   "buzz" "fizz"
#> [51] "51"   "52"   "53"   "54"   "fizz" "buzz" "57"   "58"   "59"   "fizz"
#> [61] "61"   "62"   "buzz" "64"   "fizz" "66"   "67"   "68"   "69"   "fizz"

# If none of the cases match and no `.default` is supplied, NA is used:
case_when(
  x %% 35 == 0 ~ "fizz buzz",
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz",
)
#>  [1] NA          NA          NA          NA          "fizz"     
#>  [6] NA          "buzz"      NA          NA          "fizz"     
#> [11] NA          NA          NA          "buzz"      "fizz"     
#> [16] NA          NA          NA          NA          "fizz"     
#> [21] "buzz"      NA          NA          NA          "fizz"     
#> [26] NA          NA          "buzz"      NA          "fizz"     
#> [31] NA          NA          NA          NA          "fizz buzz"
#> [36] NA          NA          NA          NA          "fizz"     
#> [41] NA          "buzz"      NA          NA          "fizz"     
#> [46] NA          NA          NA          "buzz"      "fizz"     
#> [51] NA          NA          NA          NA          "fizz"     
#> [56] "buzz"      NA          NA          NA          "fizz"     
#> [61] NA          NA          "buzz"      NA          "fizz"     
#> [66] NA          NA          NA          NA          "fizz buzz"

# Note that `NA` values on the LHS are treated like `FALSE` and will be
# assigned the `.default` value. You must handle them explicitly if you
# want to use a different value. The exact way to handle missing values is
# dependent on the set of LHS conditions you use.
x[2:4] <- NA_real_
case_when(
  x %% 35 == 0 ~ "fizz buzz",
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz",
  is.na(x) ~ "nope",
  .default = as.character(x)
)
#>  [1] "1"         "nope"      "nope"      "nope"      "fizz"     
#>  [6] "6"         "buzz"      "8"         "9"         "fizz"     
#> [11] "11"        "12"        "13"        "buzz"      "fizz"     
#> [16] "16"        "17"        "18"        "19"        "fizz"     
#> [21] "buzz"      "22"        "23"        "24"        "fizz"     
#> [26] "26"        "27"        "buzz"      "29"        "fizz"     
#> [31] "31"        "32"        "33"        "34"        "fizz buzz"
#> [36] "36"        "37"        "38"        "39"        "fizz"     
#> [41] "41"        "buzz"      "43"        "44"        "fizz"     
#> [46] "46"        "47"        "48"        "buzz"      "fizz"     
#> [51] "51"        "52"        "53"        "54"        "fizz"     
#> [56] "buzz"      "57"        "58"        "59"        "fizz"     
#> [61] "61"        "62"        "buzz"      "64"        "fizz"     
#> [66] "66"        "67"        "68"        "69"        "fizz buzz"

# `case_when()` evaluates all RHS expressions, and then constructs its
# result by extracting the selected (via the LHS expressions) parts.
# In particular `NaN`s are produced in this case:
y <- seq(-2, 2, by = .5)
case_when(
  y >= 0 ~ sqrt(y),
  .default = y
)
#> Warning: NaNs produced
#> [1] -2.0000000 -1.5000000 -1.0000000 -0.5000000  0.0000000  0.7071068
#> [7]  1.0000000  1.2247449  1.4142136

# `case_when()` is particularly useful inside `mutate()` when you want to
# create a new variable that relies on a complex combination of existing
# variables
starwars %>%
  select(name:mass, gender, species) %>%
  mutate(
    type = case_when(
      height > 200 | mass > 200 ~ "large",
      species == "Droid" ~ "robot",
      .default = "other"
    )
  )
#> # A tibble: 87 × 6
#>    name               height  mass gender    species type 
#>    <chr>               <int> <dbl> <chr>     <chr>   <chr>
#>  1 Luke Skywalker        172    77 masculine Human   other
#>  2 C-3PO                 167    75 masculine Droid   robot
#>  3 R2-D2                  96    32 masculine Droid   robot
#>  4 Darth Vader           202   136 masculine Human   large
#>  5 Leia Organa           150    49 feminine  Human   other
#>  6 Owen Lars             178   120 masculine Human   other
#>  7 Beru Whitesun lars    165    75 feminine  Human   other
#>  8 R5-D4                  97    32 masculine Droid   robot
#>  9 Biggs Darklighter     183    84 masculine Human   other
#> 10 Obi-Wan Kenobi        182    77 masculine Human   other
#> # ℹ 77 more rows


# `case_when()` is not a tidy eval function. If you'd like to reuse
# the same patterns, extract the `case_when()` call in a normal
# function:
case_character_type <- function(height, mass, species) {
  case_when(
    height > 200 | mass > 200 ~ "large",
    species == "Droid" ~ "robot",
    .default = "other"
  )
}

case_character_type(150, 250, "Droid")
#> [1] "large"
case_character_type(150, 150, "Droid")
#> [1] "robot"

# Such functions can be used inside `mutate()` as well:
starwars %>%
  mutate(type = case_character_type(height, mass, species)) %>%
  pull(type)
#>  [1] "other" "robot" "robot" "large" "other" "other" "other" "robot"
#>  [9] "other" "other" "other" "other" "large" "other" "other" "large"
#> [17] "other" "other" "other" "other" "other" "robot" "other" "other"
#> [25] "other" "other" "other" "other" "other" "other" "other" "other"
#> [33] "other" "other" "large" "large" "other" "other" "other" "other"
#> [41] "other" "other" "other" "other" "other" "other" "other" "other"
#> [49] "other" "other" "other" "other" "other" "large" "other" "other"
#> [57] "other" "other" "other" "other" "other" "other" "other" "other"
#> [65] "other" "other" "other" "other" "large" "large" "other" "other"
#> [73] "robot" "other" "other" "other" "large" "large" "other" "other"
#> [81] "large" "other" "other" "other" "robot" "other" "other"

# `case_when()` ignores `NULL` inputs. This is useful when you'd
# like to use a pattern only under certain conditions. Here we'll
# take advantage of the fact that `if` returns `NULL` when there is
# no `else` clause:
case_character_type <- function(height, mass, species, robots = TRUE) {
  case_when(
    height > 200 | mass > 200 ~ "large",
    if (robots) species == "Droid" ~ "robot",
    .default = "other"
  )
}

starwars %>%
  mutate(type = case_character_type(height, mass, species, robots = FALSE)) %>%
  pull(type)
#>  [1] "other" "other" "other" "large" "other" "other" "other" "other"
#>  [9] "other" "other" "other" "other" "large" "other" "other" "large"
#> [17] "other" "other" "other" "other" "other" "other" "other" "other"
#> [25] "other" "other" "other" "other" "other" "other" "other" "other"
#> [33] "other" "other" "large" "large" "other" "other" "other" "other"
#> [41] "other" "other" "other" "other" "other" "other" "other" "other"
#> [49] "other" "other" "other" "other" "other" "large" "other" "other"
#> [57] "other" "other" "other" "other" "other" "other" "other" "other"
#> [65] "other" "other" "other" "other" "large" "large" "other" "other"
#> [73] "other" "other" "other" "other" "large" "large" "other" "other"
#> [81] "large" "other" "other" "other" "other" "other" "other"

源代码：R/case-when.R

相关用法

注：本文由纯净天空筛选整理自Hadley Wickham等大神的英文原创作品 A general vectorised if-else。非经特殊声明，原始代码版权归原作者所有，本译文未经允许或授权，请勿转载或复制。