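R dplyr rowwise: Group input by rows

rowwise() lets you compute on a data frame one row at a time. It is most useful when no vectorised function exists for the operation you need.

Most dplyr verbs preserve row-wise grouping. The exception is summarise(), which returns a grouped_df. You can explicitly ungroup with ungroup() or as_tibble(), or convert to a grouped_df with group_by().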
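The grouping behaviour described above can be checked with a small sketch. The data frame and its column names (`id`, `x`, `y`) are made up for illustration; `id` is passed to rowwise() so that summarise() has a variable to keep.

```r
library(dplyr)

# Hypothetical example data; `id` uniquely identifies each row
df <- tibble(id = 1:3, x = c(1, 4, 9), y = c(2, 5, 8))

rw <- df %>% rowwise(id)

# mutate() keeps the row-wise grouping
rw_mutated <- rw %>% mutate(total = sum(c(x, y)))
class(rw_mutated)

# summarise() returns a grouped_df keyed on the preserved variable
rw_summary <- rw %>% summarise(total = sum(c(x, y)), .groups = "keep")
class(rw_summary)

# ungroup() removes the row-wise grouping
rw_plain <- rw_mutated %>% ungroup()
class(rw_plain)
```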

Usage

rowwise(data, ...)

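Arguments

data

Input data frame.

...

<tidy-select> Variables to be preserved when calling summarise(). This is typically a set of variables whose combination uniquely identifies each row.

NB: unlike group_by(), you cannot create new variables here, but you can select multiple variables with (e.g.) everything().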
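As a rough illustration of the `...` argument, the sketch below preserves one variable and then all variables, and inspects the result with group_vars(). The data frame and its columns (`id`, `grp`, `x`) are hypothetical.

```r
library(dplyr)

# Hypothetical example data
df <- tibble(id = 1:3, grp = c("a", "a", "b"), x = c(10, 20, 30))

# Preserve a single identifying variable
by_id <- df %>% rowwise(id)
group_vars(by_id)

# Preserve several variables at once with a tidy-select helper
by_all <- df %>% rowwise(everything())
group_vars(by_all)
```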

Value

A row-wise data frame with class rowwise_df. Note that a rowwise_df is implicitly grouped by row, but is not a grouped_df.
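A short sketch of what this means in practice, using a made-up one-column tibble: the result carries the rowwise_df class, is not a grouped_df, and yet behaves as if each row were its own group.

```r
library(dplyr)

# Hypothetical example data
df <- tibble(x = c(1, 2, 3))
rw <- rowwise(df)

# The class vector includes rowwise_df but not grouped_df
class(rw)
is_grouped_df(rw)

# Each row is its own group, so n() is 1 in every row
counted <- rw %>% mutate(rows_in_group = n())
counted
```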

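List-columns

Because a rowwise data frame has exactly one row per group, it offers a small convenience for working with list-columns. Normally, summarise() and mutate() extract a group's worth of data with [. But when you index a list this way, you get back another list. When you are working with a rowwise tibble, dplyr uses [[ instead of [ to make your life a little easier.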
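The difference between [ and [[ is easiest to see with length(). In this sketch the list-column `x` is made up for illustration; without rowwise() the expression sees the whole list, and with rowwise() it sees one element at a time.

```r
library(dplyr)

# Hypothetical list-column with elements of length 1, 2, and 3
df <- tibble(x = list(1, 2:3, 4:6))

# Without rowwise(), `x` is the whole list, so length(x) is 3 in every row
plain <- df %>% mutate(len = length(x))

# With rowwise(), `x` is a single list element in each row
by_row <- df %>% rowwise() %>% mutate(len = length(x))

plain$len
by_row$len
```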

See also

nest_by() for a convenient way of creating rowwise data frames with nested data.
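A minimal sketch of nest_by(), using the built-in mtcars data set as an assumed example: each distinct value of `cyl` becomes one row, with the remaining columns stored in a list-column named `data`.

```r
library(dplyr)

# One row per value of cyl; the other columns are nested in `data`
nested <- mtcars %>% nest_by(cyl)
class(nested)

# Because the result is row-wise, `data` is a single data frame in each row
sizes <- nested %>% mutate(n_rows = nrow(data))
sizes
```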

Examples

library(dplyr)

df <- tibble(x = runif(6), y = runif(6), z = runif(6))
# Compute the mean of x, y, z in each row
df %>% rowwise() %>% mutate(m = mean(c(x, y, z)))
#> # A tibble: 6 × 4
#> # Rowwise: 
#>       x      y      z     m
#>   <dbl>  <dbl>  <dbl> <dbl>
#> 1 0.922 0.476  0.211  0.536
#> 2 0.139 0.552  0.0723 0.254
#> 3 0.197 0.879  0.611  0.563
#> 4 0.228 0.778  0.251  0.419
#> 5 0.960 0.0823 0.401  0.481
#> 6 0.283 0.968  0.551  0.601
# use c_across() to more easily select many variables
df %>% rowwise() %>% mutate(m = mean(c_across(x:z)))
#> # A tibble: 6 × 4
#> # Rowwise: 
#>       x      y      z     m
#>   <dbl>  <dbl>  <dbl> <dbl>
#> 1 0.922 0.476  0.211  0.536
#> 2 0.139 0.552  0.0723 0.254
#> 3 0.197 0.879  0.611  0.563
#> 4 0.228 0.778  0.251  0.419
#> 5 0.960 0.0823 0.401  0.481
#> 6 0.283 0.968  0.551  0.601

# Compute the minimum of x, y, and z in each row
df %>% rowwise() %>% mutate(m = min(c(x, y, z)))
#> # A tibble: 6 × 4
#> # Rowwise: 
#>       x      y      z      m
#>   <dbl>  <dbl>  <dbl>  <dbl>
#> 1 0.922 0.476  0.211  0.211 
#> 2 0.139 0.552  0.0723 0.0723
#> 3 0.197 0.879  0.611  0.197 
#> 4 0.228 0.778  0.251  0.228 
#> 5 0.960 0.0823 0.401  0.0823
#> 6 0.283 0.968  0.551  0.283 
# In this case you can use an existing vectorised function:
df %>% mutate(m = pmin(x, y, z))
#> # A tibble: 6 × 4
#>       x      y      z      m
#>   <dbl>  <dbl>  <dbl>  <dbl>
#> 1 0.922 0.476  0.211  0.211 
#> 2 0.139 0.552  0.0723 0.0723
#> 3 0.197 0.879  0.611  0.197 
#> 4 0.228 0.778  0.251  0.228 
#> 5 0.960 0.0823 0.401  0.0823
#> 6 0.283 0.968  0.551  0.283 
# Where these functions exist they'll be much faster than rowwise
# so be on the lookout for them.

# rowwise() is also useful when doing simulations
params <- tribble(
 ~sim, ~n, ~mean, ~sd,
    1,  1,     1,   1,
    2,  2,     2,   4,
    3,  3,    -1,   2
)
# Here I supply variables to preserve after the computation
params %>%
  rowwise(sim) %>%
  reframe(z = rnorm(n, mean, sd))
#> # A tibble: 6 × 2
#>     sim      z
#>   <dbl>  <dbl>
#> 1     1  2.34 
#> 2     2 -1.41 
#> 3     2 -2.60 
#> 4     3  0.983
#> 5     3  2.00 
#> 6     3  0.394

# If you want one row per simulation, put the results in a list()
params %>%
  rowwise(sim) %>%
  summarise(z = list(rnorm(n, mean, sd)), .groups = "keep")
#> # A tibble: 3 × 2
#> # Groups:   sim [3]
#>     sim z        
#>   <dbl> <list>   
#> 1     1 <dbl [1]>
#> 2     2 <dbl [2]>
#> 3     3 <dbl [3]>

Source code: R/rowwise.R


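Note: This article was selected and compiled by 純淨天空 from the original English work "Group input by rows" by Hadley Wickham and others. Unless otherwise stated, copyright in the original code belongs to the original authors. Do not reproduce or copy this translation without permission or authorization.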