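Fills missing values in selected columns using the next or previous entry. This is useful in the common output format where values are not repeated and are only recorded when they change.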
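Arguments
- data
  A data frame.
- ...
  <tidy-select> Columns to fill.
- .direction
  Direction in which to fill missing values. Currently either "down" (the default), "up", "downup" (first down and then up), or "updown" (first up and then down).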
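The four `.direction` values can be compared side by side on a single column. The following is a minimal sketch on a made-up data frame (`df` and its values are illustrative assumptions, not part of the original examples):

```r
library(tidyr)

# Hypothetical toy data: two known values with gaps around them
df <- tibble::tibble(x = c(NA, 1, NA, NA, 2, NA))

down   <- fill(df, x, .direction = "down")$x    # NA  1  1  1  2  2
up     <- fill(df, x, .direction = "up")$x      #  1  1  2  2  2 NA
downup <- fill(df, x, .direction = "downup")$x  #  1  1  1  1  2  2
updown <- fill(df, x, .direction = "updown")$x  #  1  1  2  2  2  2
```

Note that "downup" and "updown" differ for the gap between the two known values: whichever direction runs first decides which neighbor is copied into it.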
Grouped data frames
With grouped data frames created by dplyr::group_by(), fill() is applied within each group, meaning that it won't fill across group boundaries.
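To make the group-boundary behavior concrete, here is a small sketch contrasting an ungrouped and a grouped fill. The data frame `df` and its columns `g` and `x` are hypothetical and chosen only for illustration:

```r
library(tidyr)

# Hypothetical data: group "b" starts with a missing value
df <- tibble::tibble(
  g = c("a", "a", "b", "b"),
  x = c(1, NA, NA, 2)
)

# Ungrouped: the 1 from group "a" carries down into the first row of "b"
ungrouped <- fill(df, x)
ungrouped$x  # 1 1 1 2

# Grouped: filling stops at the group boundary, so the first row of "b" stays NA
grouped <- df %>%
  dplyr::group_by(g) %>%
  fill(x) %>%
  dplyr::ungroup()
grouped$x    # 1 1 NA 2
```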
Examples
library(tidyr)

# direction = "down" --------------------------------------------------------
# Value (year) is recorded only when it changes
sales <- tibble::tribble(
  ~quarter, ~year, ~sales,
  "Q1",     2000,  66013,
  "Q2",     NA,    69182,
  "Q3",     NA,    53175,
  "Q4",     NA,    21001,
  "Q1",     2001,  46036,
  "Q2",     NA,    58842,
  "Q3",     NA,    44568,
  "Q4",     NA,    50197,
  "Q1",     2002,  39113,
  "Q2",     NA,    41668,
  "Q3",     NA,    30144,
  "Q4",     NA,    52897,
  "Q1",     2004,  32129,
  "Q2",     NA,    67686,
  "Q3",     NA,    31768,
  "Q4",     NA,    49094
)
# `fill()` defaults to replacing missing data from top to bottom
sales %>% fill(year)
#> # A tibble: 16 × 3
#>    quarter  year sales
#>    <chr>   <dbl> <dbl>
#>  1 Q1       2000 66013
#>  2 Q2       2000 69182
#>  3 Q3       2000 53175
#>  4 Q4       2000 21001
#>  5 Q1       2001 46036
#>  6 Q2       2001 58842
#>  7 Q3       2001 44568
#>  8 Q4       2001 50197
#>  9 Q1       2002 39113
#> 10 Q2       2002 41668
#> 11 Q3       2002 30144
#> 12 Q4       2002 52897
#> 13 Q1       2004 32129
#> 14 Q2       2004 67686
#> 15 Q3       2004 31768
#> 16 Q4       2004 49094
# direction = "up" ----------------------------------------------------------
# Value (pet_type) is missing above
tidy_pets <- tibble::tribble(
  ~rank, ~pet_type, ~breed,
  1L,    NA,        "Boston Terrier",
  2L,    NA,        "Retrievers (Labrador)",
  3L,    NA,        "Retrievers (Golden)",
  4L,    NA,        "French Bulldogs",
  5L,    NA,        "Bulldogs",
  6L,    "Dog",     "Beagles",
  1L,    NA,        "Persian",
  2L,    NA,        "Maine Coon",
  3L,    NA,        "Ragdoll",
  4L,    NA,        "Exotic",
  5L,    NA,        "Siamese",
  6L,    "Cat",     "American Short"
)
# For values that are missing above you can use `.direction = "up"`
tidy_pets %>%
  fill(pet_type, .direction = "up")
#> # A tibble: 12 × 3
#>     rank pet_type breed
#>    <int> <chr>    <chr>
#>  1     1 Dog      Boston Terrier
#>  2     2 Dog      Retrievers (Labrador)
#>  3     3 Dog      Retrievers (Golden)
#>  4     4 Dog      French Bulldogs
#>  5     5 Dog      Bulldogs
#>  6     6 Dog      Beagles
#>  7     1 Cat      Persian
#>  8     2 Cat      Maine Coon
#>  9     3 Cat      Ragdoll
#> 10     4 Cat      Exotic
#> 11     5 Cat      Siamese
#> 12     6 Cat      American Short
# direction = "downup" ------------------------------------------------------
# Value (n_squirrels) is missing above and below within a group
squirrels <- tibble::tribble(
  ~group, ~name,      ~role,         ~n_squirrels,
  1,      "Sam",      "Observer",    NA,
  1,      "Mara",     "Scorekeeper", 8,
  1,      "Jesse",    "Observer",    NA,
  1,      "Tom",      "Observer",    NA,
  2,      "Mike",     "Observer",    NA,
  2,      "Rachael",  "Observer",    NA,
  2,      "Sydekea",  "Scorekeeper", 14,
  2,      "Gabriela", "Observer",    NA,
  3,      "Derrick",  "Observer",    NA,
  3,      "Kara",     "Scorekeeper", 9,
  3,      "Emily",    "Observer",    NA,
  3,      "Danielle", "Observer",    NA
)
# The values are inconsistently missing by position within the group
# Use .direction = "downup" to fill missing values in both directions
squirrels %>%
  dplyr::group_by(group) %>%
  fill(n_squirrels, .direction = "downup") %>%
  dplyr::ungroup()
#> # A tibble: 12 × 4
#>    group name     role        n_squirrels
#>    <dbl> <chr>    <chr>             <dbl>
#>  1     1 Sam      Observer              8
#>  2     1 Mara     Scorekeeper           8
#>  3     1 Jesse    Observer              8
#>  4     1 Tom      Observer              8
#>  5     2 Mike     Observer             14
#>  6     2 Rachael  Observer             14
#>  7     2 Sydekea  Scorekeeper          14
#>  8     2 Gabriela Observer             14
#>  9     3 Derrick  Observer              9
#> 10     3 Kara     Scorekeeper           9
#> 11     3 Emily    Observer              9
#> 12     3 Danielle Observer              9
# Using `.direction = "updown"` accomplishes the same goal in this example
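The closing comment can be checked directly. The sketch below uses a trimmed, hypothetical copy of the squirrels data (`squirrels_small`) and a helper `fill_by_group()` that is introduced here only for illustration. Because each group holds exactly one non-missing value, both directions end up copying that same value to every row of the group:

```r
library(tidyr)

# Trimmed, hypothetical copy of the squirrels data (two groups only)
squirrels_small <- tibble::tribble(
  ~group, ~name,      ~n_squirrels,
  1,      "Sam",      NA,
  1,      "Mara",     8,
  1,      "Jesse",    NA,
  2,      "Mike",     NA,
  2,      "Sydekea",  14,
  2,      "Gabriela", NA
)

# Illustrative helper: fill within each group in the given direction
fill_by_group <- function(direction) {
  squirrels_small %>%
    dplyr::group_by(group) %>%
    fill(n_squirrels, .direction = direction) %>%
    dplyr::ungroup()
}

downup <- fill_by_group("downup")
updown <- fill_by_group("updown")

# Both directions give the same filled column for this data
identical(downup$n_squirrels, updown$n_squirrels)
```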
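Note: This article was selected and compiled by 纯净天空 from the original English work "Fill in missing values with previous or next value" by Hadley Wickham and others. Unless otherwise stated, copyright in the original code belongs to the original authors; this translation may not be reproduced or copied without permission or authorization.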