當前位置: 首頁>>代碼示例 >>用法及示例精選 >>正文


R dplyr pick 選擇列的子集


pick() 提供了一種在 "data-masking" 函數(如 mutate()summarise() )內使用 select() 語義輕鬆從數據中選擇列子集的方法。 pick() 返回一個 DataFrame ,其中包含當前組的選定列。

pick()across() 互補:

  • 使用 pick() ,您通常將函數應用於完整數據幀。

  • 對於 across() ,您通常會對每一列應用一個函數。

用法

pick(...)

參數

...

<tidy-select>

可供選擇的列。

您無法選擇分組列,因為它們已經由動詞自動處理(即 summarise()mutate() )。

包含當前組的選定列的小標題。

細節

理論上, pick() 旨在可替換為對 tibble() 的等效調用。例如,pick(a, c) 可以替換為 tibble(a = a, c = c) ,並且帶有列 abc 的數據幀上的 pick(everything()) 可以替換為 tibble(a = a, b = b, c = c)pick() 通過返回 1 行、0 列小標題專門處理空選擇的情況,因此精確替換更像是:

size <- vctrs::vec_size_common(..., .absent = 1L)
out <- vctrs::vec_recycle_common(..., .size = size)
tibble::new_tibble(out, nrow = size)

也可以看看

例子

df <- tibble(
  x = c(3, 2, 2, 2, 1),
  y = c(0, 2, 1, 1, 4),
  z1 = c("a", "a", "a", "b", "a"),
  z2 = c("c", "d", "d", "a", "c")
)
df
#> # A tibble: 5 × 4
#>       x     y z1    z2   
#>   <dbl> <dbl> <chr> <chr>
#> 1     3     0 a     c    
#> 2     2     2 a     d    
#> 3     2     1 a     d    
#> 4     2     1 b     a    
#> 5     1     4 a     c    

# `pick()` provides a way to select a subset of your columns using
# tidyselect. It returns a data frame.
df %>% mutate(cols = pick(x, y))
#> # A tibble: 5 × 5
#>       x     y z1    z2    cols$x    $y
#>   <dbl> <dbl> <chr> <chr>  <dbl> <dbl>
#> 1     3     0 a     c          3     0
#> 2     2     2 a     d          2     2
#> 3     2     1 a     d          2     1
#> 4     2     1 b     a          2     1
#> 5     1     4 a     c          1     4

# This is useful for functions that take data frames as inputs.
# For example, you can compute a joint rank between `x` and `y`.
df %>% mutate(rank = dense_rank(pick(x, y)))
#> # A tibble: 5 × 5
#>       x     y z1    z2     rank
#>   <dbl> <dbl> <chr> <chr> <int>
#> 1     3     0 a     c         4
#> 2     2     2 a     d         3
#> 3     2     1 a     d         2
#> 4     2     1 b     a         2
#> 5     1     4 a     c         1

# `pick()` is also useful as a bridge between data-masking functions (like
# `mutate()` or `group_by()`) and functions with tidy-select behavior (like
# `select()`). For example, you can use `pick()` to create a wrapper around
# `group_by()` that takes a tidy-selection of columns to group on. For more
# bridge patterns, see
# https://rlang.r-lib.org/reference/topic-data-mask-programming.html#bridge-patterns.
my_group_by <- function(data, cols) {
  group_by(data, pick({{ cols }}))
}

df %>% my_group_by(c(x, starts_with("z")))
#> # A tibble: 5 × 4
#> # Groups:   x, z1, z2 [4]
#>       x     y z1    z2   
#>   <dbl> <dbl> <chr> <chr>
#> 1     3     0 a     c    
#> 2     2     2 a     d    
#> 3     2     1 a     d    
#> 4     2     1 b     a    
#> 5     1     4 a     c    

# Or you can use it to dynamically select columns to `count()` by
df %>% count(pick(starts_with("z")))
#> # A tibble: 3 × 3
#>   z1    z2        n
#>   <chr> <chr> <int>
#> 1 a     c         2
#> 2 a     d         2
#> 3 b     a         1
源代碼:R/pick.R

相關用法


注:本文由純淨天空篩選整理自Hadley Wickham等大神的英文原創作品 Select a subset of columns。非經特殊聲明,原始代碼版權歸原作者所有,本譯文未經允許或授權,請勿轉載或複製。