R dplyr rows: Manipulate individual rows

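These functions provide a framework for modifying rows in a table using a second table of data. The two tables are matched by a set of key variables (the by argument) whose values typically uniquely identify each row. The functions are inspired by SQL's INSERT, UPDATE, and DELETE, and can optionally modify the table in place (in_place) for selected backends.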

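rows_insert() adds new rows (like INSERT). By default, key values in y must not exist in x.
rows_append() works like rows_insert() but ignores keys.
rows_update() modifies existing rows (like UPDATE). Key values in y must be unique, and, by default, key values in y must exist in x.
rows_patch() works like rows_update() but only overwrites NA values.
rows_upsert() inserts or updates depending on whether or not the key value in y already exists in x. Key values in y must be unique.
rows_delete() deletes rows (like DELETE). By default, key values in y must exist in x.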
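
The uniqueness requirement on the keys of y can be sketched as follows. This is a minimal illustration, assuming dplyr 1.1.0 or later (the version that has the unmatched and conflict arguments shown under Usage); the tables x and dup are made up for the example.

```r
library(dplyr)

# `x` deliberately contains a repeated key (id == 1 appears twice).
x <- tibble(id = c(1L, 1L, 2L), value = c("a", "b", "c"))

# Unique keys in `y`: every row of `x` whose id is 1 receives the new value.
ok <- rows_update(x, tibble(id = 1L, value = "z"), by = "id")

# Duplicate keys in `y`: it would be ambiguous which value should win,
# so rows_update() refuses. try() keeps the script running.
dup <- tibble(id = c(1L, 1L), value = c("p", "q"))
res <- try(rows_update(x, dup, by = "id"), silent = TRUE)
inherits(res, "try-error")
```

Uniqueness is enforced on y rather than on x because each row of y describes one intended change; repeated keys in x simply mean the same change applies to several rows.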

Usage

rows_insert(
  x,
  y,
  by = NULL,
  ...,
  conflict = c("error", "ignore"),
  copy = FALSE,
  in_place = FALSE
)

rows_append(x, y, ..., copy = FALSE, in_place = FALSE)

rows_update(
  x,
  y,
  by = NULL,
  ...,
  unmatched = c("error", "ignore"),
  copy = FALSE,
  in_place = FALSE
)

rows_patch(
  x,
  y,
  by = NULL,
  ...,
  unmatched = c("error", "ignore"),
  copy = FALSE,
  in_place = FALSE
)

rows_upsert(x, y, by = NULL, ..., copy = FALSE, in_place = FALSE)

rows_delete(
  x,
  y,
  by = NULL,
  ...,
  unmatched = c("error", "ignore"),
  copy = FALSE,
  in_place = FALSE
)

Arguments

x, y

A pair of data frames or data frame extensions (e.g. a tibble). y must have the same columns as x, or a subset of them.

by

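An unnamed character vector giving the key columns. The key columns must exist in both x and y. Keys typically uniquely identify each row, but this is only enforced for the key values of y when rows_update(), rows_patch(), or rows_upsert() are used.

By default, the first column of y is used, since the first column is a reasonable place to put an identifier variable.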
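
Because by is a character vector, a key can span several columns. The sketch below is an illustration only; the sales and fix tables and their column names are hypothetical.

```r
library(dplyr)

# A row is identified by the combination of `store` and `year`.
sales <- tibble(
  store = c("A", "A", "B"),
  year  = c(2022L, 2023L, 2023L),
  total = c(10, 20, 30)
)

# Correct the total for store A in 2023 only.
fix <- tibble(store = "A", year = 2023L, total = 25)

# Passing `by` explicitly also suppresses the "Matching, by = ..." message.
updated <- rows_update(sales, fix, by = c("store", "year"))
updated
```

Relying on the default here would match on store alone (the first column of y), which is not the intended key for this data, so naming both columns explicitly is the safer choice.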

...

Other parameters passed on to methods.

conflict

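For rows_insert(), how should keys in y that conflict with keys in x be handled? A conflict arises if there is a key in y that already exists in x.

One of:

"error" (the default) will error if any keys in y conflict with keys in x.
"ignore" will ignore rows in y whose keys conflict with keys in x.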

copy

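If x and y are not from the same data source and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation, so you must opt into it.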

in_place

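Should x be modified in place? This argument is only relevant for mutable backends (e.g. databases, data.tables).

When TRUE, a modified version of x is returned invisibly; when FALSE, a new object representing the resulting changes is returned.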
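
For ordinary data frames and tibbles, which are not mutable backends, the default in_place = FALSE returns a new object and leaves x untouched. The sketch below illustrates this; the claim that the data frame method rejects in_place = TRUE reflects the behaviour of the dplyr data frame methods as I understand them, and should be treated as an assumption to check against your installed version.

```r
library(dplyr)

df <- tibble(id = 1:2, value = c("a", "b"))

# Default: a new tibble is returned; `df` itself still has two rows.
new_df <- rows_insert(df, tibble(id = 3L, value = "c"), by = "id")
c(nrow(df), nrow(new_df))

# Assumption: data frames cannot be modified in place, so requesting it
# is expected to raise an error rather than silently copy.
res <- try(
  rows_insert(df, tibble(id = 4L, value = "d"), by = "id", in_place = TRUE),
  silent = TRUE
)
inherits(res, "try-error")
```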

unmatched

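For rows_update(), rows_patch(), and rows_delete(), how should keys in y that are unmatched by the keys in x be handled?

One of:

"error" (the default) will error if any keys in y are unmatched by the keys in x.
"ignore" will ignore rows in y whose keys are unmatched by the keys in x.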

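Value

An object of the same type as x. The order of the rows and columns of x is preserved as much as possible. The output has the following properties:

rows_update() and rows_patch() preserve the number of rows; rows_insert(), rows_append(), and rows_upsert() return all existing rows and potentially new rows; rows_delete() returns a subset of the rows.
Columns are not added, removed, or relocated, though the data may be updated.
Groups are taken from x.
Data frame attributes are taken from x.

If in_place = TRUE, the result is returned invisibly.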
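
The properties above can be checked on a small grouped tibble. This is a minimal sketch with made-up data (the columns g, id, and val are hypothetical), not an exhaustive test of every backend.

```r
library(dplyr)

x <- group_by(
  tibble(g = c("u", "u", "v"), id = 1:3, val = c(1, 2, 3)),
  g
)
y <- tibble(id = 2:4, val = c(20, 30, 40))

# id 4 does not exist in `x`, so it is skipped instead of raising an error.
updated <- rows_update(x, y, by = "id", unmatched = "ignore")
deleted <- rows_delete(x, tibble(id = 1L), by = "id")

group_vars(updated)   # grouping of `x` is carried over
names(updated)        # same columns, same order
c(nrow(x), nrow(updated), nrow(deleted))
```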

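Methods

These functions are generics, which means that packages can provide implementations (methods) for other classes. See the documentation of individual methods for extra arguments and differences in behaviour.

Methods available in currently loaded packages:

rows_insert(): dbplyr (tbl_lazy), dplyr (data.frame).
rows_append(): dbplyr (tbl_lazy), dplyr (data.frame).
rows_update(): dbplyr (tbl_lazy), dplyr (data.frame).
rows_patch(): dbplyr (tbl_lazy), dplyr (data.frame).
rows_upsert(): dbplyr (tbl_lazy), dplyr (data.frame).
rows_delete(): dbplyr (tbl_lazy), dplyr (data.frame).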

Examples

library(dplyr)

data <- tibble(a = 1:3, b = letters[c(1:2, NA)], c = 0.5 + 0:2)
data
#> # A tibble: 3 × 3
#>       a b         c
#>   <int> <chr> <dbl>
#> 1     1 a       0.5
#> 2     2 b       1.5
#> 3     3 NA      2.5

# Insert
rows_insert(data, tibble(a = 4, b = "z"))
#> Matching, by = "a"
#> # A tibble: 4 × 3
#>       a b         c
#>   <int> <chr> <dbl>
#> 1     1 a       0.5
#> 2     2 b       1.5
#> 3     3 NA      2.5
#> 4     4 z      NA  

# By default, if a key in `y` matches a key in `x`, then it can't be inserted
# and will throw an error. Alternatively, you can ignore rows in `y`
# containing keys that conflict with keys in `x` with `conflict = "ignore"`,
# or you can use `rows_append()` to ignore keys entirely.
try(rows_insert(data, tibble(a = 3, b = "z")))
#> Matching, by = "a"
#> Error in rows_insert(data, tibble(a = 3, b = "z")) : 
#>   `y` can't contain keys that already exist in `x`.
#> ℹ The following rows in `y` have keys that already exist in `x`: `c(1)`.
#> ℹ Use `conflict = "ignore"` if you want to ignore these `y` rows.
rows_insert(data, tibble(a = 3, b = "z"), conflict = "ignore")
#> Matching, by = "a"
#> # A tibble: 3 × 3
#>       a b         c
#>   <int> <chr> <dbl>
#> 1     1 a       0.5
#> 2     2 b       1.5
#> 3     3 NA      2.5
rows_append(data, tibble(a = 3, b = "z"))
#> # A tibble: 4 × 3
#>       a b         c
#>   <int> <chr> <dbl>
#> 1     1 a       0.5
#> 2     2 b       1.5
#> 3     3 NA      2.5
#> 4     3 z      NA  

# Update
rows_update(data, tibble(a = 2:3, b = "z"))
#> Matching, by = "a"
#> # A tibble: 3 × 3
#>       a b         c
#>   <int> <chr> <dbl>
#> 1     1 a       0.5
#> 2     2 z       1.5
#> 3     3 z       2.5
rows_update(data, tibble(b = "z", a = 2:3), by = "a")
#> # A tibble: 3 × 3
#>       a b         c
#>   <int> <chr> <dbl>
#> 1     1 a       0.5
#> 2     2 z       1.5
#> 3     3 z       2.5

# Variants: patch and upsert
rows_patch(data, tibble(a = 2:3, b = "z"))
#> Matching, by = "a"
#> # A tibble: 3 × 3
#>       a b         c
#>   <int> <chr> <dbl>
#> 1     1 a       0.5
#> 2     2 b       1.5
#> 3     3 z       2.5
rows_upsert(data, tibble(a = 2:4, b = "z"))
#> Matching, by = "a"
#> # A tibble: 4 × 3
#>       a b         c
#>   <int> <chr> <dbl>
#> 1     1 a       0.5
#> 2     2 z       1.5
#> 3     3 z       2.5
#> 4     4 z      NA  

# Delete and truncate
rows_delete(data, tibble(a = 2:3))
#> Matching, by = "a"
#> # A tibble: 1 × 3
#>       a b         c
#>   <int> <chr> <dbl>
#> 1     1 a       0.5
rows_delete(data, tibble(a = 2:3, b = "b"))
#> Matching, by = "a"
#> Ignoring extra `y` columns: b
#> # A tibble: 1 × 3
#>       a b         c
#>   <int> <chr> <dbl>
#> 1     1 a       0.5

# By default, for update, patch, and delete it is an error if a key in `y`
# doesn't exist in `x`. You can ignore rows in `y` that have unmatched keys
# with `unmatched = "ignore"`.
y <- tibble(a = 3:4, b = "z")
try(rows_update(data, y, by = "a"))
#> Error in rows_update(data, y, by = "a") : 
#>   `y` must contain keys that already exist in `x`.
#> ℹ The following rows in `y` have keys that don't exist in `x`: `c(2)`.
#> ℹ Use `unmatched = "ignore"` if you want to ignore these `y` rows.
rows_update(data, y, by = "a", unmatched = "ignore")
#> # A tibble: 3 × 3
#>       a b         c
#>   <int> <chr> <dbl>
#> 1     1 a       0.5
#> 2     2 b       1.5
#> 3     3 z       2.5
rows_patch(data, y, by = "a", unmatched = "ignore")
#> # A tibble: 3 × 3
#>       a b         c
#>   <int> <chr> <dbl>
#> 1     1 a       0.5
#> 2     2 b       1.5
#> 3     3 z       2.5
rows_delete(data, y, by = "a", unmatched = "ignore")
#> Ignoring extra `y` columns: b
#> # A tibble: 2 × 3
#>       a b         c
#>   <int> <chr> <dbl>
#> 1     1 a       0.5
#> 2     2 b       1.5

Source code: R/rows.R

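Note: This article was selected and compiled by 纯净天空 from the original English work "Manipulate individual rows" by Hadley Wickham and others. Unless otherwise stated, copyright in the original code belongs to the original authors; please do not reproduce or copy this translation without permission or authorization.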