R dplyr consecutive_id: Generate a unique identifier for consecutive combinations

consecutive_id() generates a unique identifier that increments every time a variable (or combination of variables) changes. Inspired by data.table::rleid().
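To make the "increments every time the value changes" rule concrete, the following is a minimal base R sketch of the same idea for a single vector. It is only an illustration, not dplyr's implementation, and it assumes a non-empty vector with no missing values (the name manual_id is made up for this example).

```r
library(dplyr)

x <- c("a", "a", "b", "b", "a")

# A new run starts at the first element and wherever a value differs
# from the one before it. Assumes x is non-empty and has no NA.
starts_new_run <- c(TRUE, x[-1] != x[-length(x)])

# Counting the run starts so far gives the run identifier.
manual_id <- cumsum(starts_new_run)

manual_id
consecutive_id(x)

# Both approaches agree on this input.
all(manual_id == consecutive_id(x))
```

Note that the second "a" run gets a new identifier (3) rather than reusing 1, which is what distinguishes a run identifier from an ordinary group identifier.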

Usage

consecutive_id(...)

Arguments

...: Unnamed vectors. If multiple vectors are supplied, they should all have the same length.
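As a small illustration of passing several vectors, the sketch below uses two made-up vectors of equal length, x and y. The identifier advances whenever the combination of values at a position differs from the combination at the previous position, even if only one of the vectors changed.

```r
library(dplyr)

# Two illustrative vectors of the same length.
x <- c(1, 1, 1, 2, 2)
y <- c("a", "a", "b", "b", "b")

# Position 3 changes only y, position 4 changes only x;
# each change starts a new run.
ids <- consecutive_id(x, y)
ids
#> [1] 1 1 2 3 3
```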

Value

A numeric vector the same length as the longest element of "...".

Examples

library(dplyr)

consecutive_id(c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, NA, NA))
#> [1] 1 1 2 2 3 4 5 5
consecutive_id(c(1, 1, 1, 2, 1, 1, 2, 2))
#> [1] 1 1 1 2 3 3 4 4

df <- data.frame(x = c(0, 0, 1, 0), y = c(2, 2, 2, 2))
df %>% group_by(x, y) %>% summarise(n = n())
#> `summarise()` has grouped output by 'x'. You can override using the
#> `.groups` argument.
#> # A tibble: 2 × 3
#> # Groups:   x [2]
#>       x     y     n
#>   <dbl> <dbl> <int>
#> 1     0     2     3
#> 2     1     2     1
df %>% group_by(id = consecutive_id(x, y), x, y) %>% summarise(n = n())
#> `summarise()` has grouped output by 'id', 'x'. You can override using the
#> `.groups` argument.
#> # A tibble: 3 × 4
#> # Groups:   id, x [3]
#>      id     x     y     n
#>   <int> <dbl> <dbl> <int>
#> 1     1     0     2     2
#> 2     2     1     2     1
#> 3     3     0     2     1
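A common use of this pattern is summarising each run separately. The sketch below assumes a hypothetical data frame named readings with columns time and status (neither appears in the original examples) and reports where each run of a status starts and ends.

```r
library(dplyr)

# Hypothetical log: `status` switches between states over time.
readings <- data.frame(
  time = 1:7,
  status = c("ok", "ok", "fail", "fail", "fail", "ok", "ok")
)

runs <- readings %>%
  # Label each stretch of identical consecutive statuses.
  mutate(run = consecutive_id(status)) %>%
  # Group by the run label so repeated statuses stay separate.
  group_by(run, status) %>%
  summarise(
    start = min(time),
    end = max(time),
    n = n(),
    .groups = "drop"
  )

runs
```

Grouping by status alone would merge the two "ok" stretches into a single row; including the run identifier keeps them apart, giving three rows with lengths 2, 3, and 2.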

Source code: R/consecutive-id.R


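Note: This article was selected and compiled by 純淨天空 from the original English work by Hadley Wickham et al., "Generate a unique identifier for consecutive combinations". Unless otherwise stated, copyright in the original code belongs to the original authors; please do not reproduce or copy this translation without permission or authorization.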