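R tidyr fill: Fill in missing values with previous or next value

Fills missing values in selected columns using the next or previous entry. This is useful in the common output format where values are not repeated and are only recorded when they change.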

Usage

fill(data, ..., .direction = c("down", "up", "downup", "updown"))

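Arguments

data: A data frame.
...: <tidy-select> Columns to fill.
.direction: Direction in which to fill missing values. Currently one of "down" (the default), "up", "downup" (first down, then up), or "updown" (first up, then down).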

Details

Missing values are replaced in atomic vectors; NULLs are replaced in lists.
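
As a minimal sketch of the rule above, the following fills both an atomic column and a list column in a single call. The data frame `df` and its column names are hypothetical and chosen only for illustration; the example assumes the tidyr and tibble packages are installed.

```r
library(tidyr)
library(tibble)

# Hypothetical data: `score` is an atomic (double) column with NA gaps,
# `tags` is a list column whose gaps are NULL elements.
df <- tibble(
  id    = 1:4,
  score = c(10, NA, NA, 7),
  tags  = list("a", NULL, NULL, "b")
)

# Default .direction = "down": NA in `score` and NULL in `tags`
# are both replaced by the last non-missing entry above them.
filled <- fill(df, score, tags)

filled$score        # 10 10 10 7
unlist(filled$tags) # "a" "a" "a" "b"
```

Note that `NA` inside a list element is not the same as a `NULL` element; only the latter is treated as the missing marker for list columns here.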

Grouped data frames

With grouped data frames created by dplyr::group_by(), fill() is applied within each group, meaning that it won't fill across group boundaries.
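
To make the group-boundary behavior concrete, here is a small sketch comparing an ungrouped and a grouped fill on the same data. The `readings` tibble and its columns are hypothetical; the example assumes tidyr, dplyr, and tibble are installed.

```r
library(tidyr)
library(dplyr)
library(tibble)

# Hypothetical data: the first row of site "B" is missing.
readings <- tibble(
  site  = c("A", "A", "B", "B"),
  value = c(1, NA, NA, 4)
)

# Ungrouped: the gap at the start of site "B" is filled from site "A".
ungrouped <- fill(readings, value)
ungrouped$value # 1 1 1 4

# Grouped: filling restarts in each group, so the leading NA in "B" stays.
grouped <- readings %>%
  group_by(site) %>%
  fill(value) %>%
  ungroup()
grouped$value   # 1 1 NA 4
```

If leading gaps within a group should also be filled, `.direction = "downup"` is one option, at the cost of pulling values upward from later rows in the same group.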

Examples

library(tidyr)

# direction = "down" --------------------------------------------------------
# Value (year) is recorded only when it changes
sales <- tibble::tribble(
  ~quarter, ~year, ~sales,
  "Q1",    2000,    66013,
  "Q2",      NA,    69182,
  "Q3",      NA,    53175,
  "Q4",      NA,    21001,
  "Q1",    2001,    46036,
  "Q2",      NA,    58842,
  "Q3",      NA,    44568,
  "Q4",      NA,    50197,
  "Q1",    2002,    39113,
  "Q2",      NA,    41668,
  "Q3",      NA,    30144,
  "Q4",      NA,    52897,
  "Q1",    2004,    32129,
  "Q2",      NA,    67686,
  "Q3",      NA,    31768,
  "Q4",      NA,    49094
)
# `fill()` defaults to replacing missing data from top to bottom
sales %>% fill(year)
#> # A tibble: 16 × 3
#>    quarter  year sales
#>    <chr>   <dbl> <dbl>
#>  1 Q1       2000 66013
#>  2 Q2       2000 69182
#>  3 Q3       2000 53175
#>  4 Q4       2000 21001
#>  5 Q1       2001 46036
#>  6 Q2       2001 58842
#>  7 Q3       2001 44568
#>  8 Q4       2001 50197
#>  9 Q1       2002 39113
#> 10 Q2       2002 41668
#> 11 Q3       2002 30144
#> 12 Q4       2002 52897
#> 13 Q1       2004 32129
#> 14 Q2       2004 67686
#> 15 Q3       2004 31768
#> 16 Q4       2004 49094

# direction = "up" ----------------------------------------------------------
# Value (pet_type) is missing above
tidy_pets <- tibble::tribble(
  ~rank, ~pet_type, ~breed,
  1L,        NA,    "Boston Terrier",
  2L,        NA,    "Retrievers (Labrador)",
  3L,        NA,    "Retrievers (Golden)",
  4L,        NA,    "French Bulldogs",
  5L,        NA,    "Bulldogs",
  6L,     "Dog",    "Beagles",
  1L,        NA,    "Persian",
  2L,        NA,    "Maine Coon",
  3L,        NA,    "Ragdoll",
  4L,        NA,    "Exotic",
  5L,        NA,    "Siamese",
  6L,     "Cat",    "American Short"
)

# For values that are missing above you can use `.direction = "up"`
tidy_pets %>%
  fill(pet_type, .direction = "up")
#> # A tibble: 12 × 3
#>     rank pet_type breed                
#>    <int> <chr>    <chr>                
#>  1     1 Dog      Boston Terrier       
#>  2     2 Dog      Retrievers (Labrador)
#>  3     3 Dog      Retrievers (Golden)  
#>  4     4 Dog      French Bulldogs      
#>  5     5 Dog      Bulldogs             
#>  6     6 Dog      Beagles              
#>  7     1 Cat      Persian              
#>  8     2 Cat      Maine Coon           
#>  9     3 Cat      Ragdoll              
#> 10     4 Cat      Exotic               
#> 11     5 Cat      Siamese              
#> 12     6 Cat      American Short       

# direction = "downup" ------------------------------------------------------
# Value (n_squirrels) is missing above and below within a group
squirrels <- tibble::tribble(
  ~group,    ~name,     ~role,     ~n_squirrels,
  1,      "Sam",    "Observer",   NA,
  1,     "Mara", "Scorekeeper",    8,
  1,    "Jesse",    "Observer",   NA,
  1,      "Tom",    "Observer",   NA,
  2,     "Mike",    "Observer",   NA,
  2,  "Rachael",    "Observer",   NA,
  2,  "Sydekea", "Scorekeeper",   14,
  2, "Gabriela",    "Observer",   NA,
  3,  "Derrick",    "Observer",   NA,
  3,     "Kara", "Scorekeeper",    9,
  3,    "Emily",    "Observer",   NA,
  3, "Danielle",    "Observer",   NA
)

# The values are inconsistently missing by position within the group
# Use .direction = "downup" to fill missing values in both directions
squirrels %>%
  dplyr::group_by(group) %>%
  fill(n_squirrels, .direction = "downup") %>%
  dplyr::ungroup()
#> # A tibble: 12 × 4
#>    group name     role        n_squirrels
#>    <dbl> <chr>    <chr>             <dbl>
#>  1     1 Sam      Observer              8
#>  2     1 Mara     Scorekeeper           8
#>  3     1 Jesse    Observer              8
#>  4     1 Tom      Observer              8
#>  5     2 Mike     Observer             14
#>  6     2 Rachael  Observer             14
#>  7     2 Sydekea  Scorekeeper          14
#>  8     2 Gabriela Observer             14
#>  9     3 Derrick  Observer              9
#> 10     3 Kara     Scorekeeper           9
#> 11     3 Emily    Observer              9
#> 12     3 Danielle Observer              9

# Using `.direction = "updown"` accomplishes the same goal in this example
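
The two combined directions agree in the squirrels example only because each group has a single non-missing value. As a hedged illustration with a hypothetical vector `x`, the sketch below shows a case where "downup" and "updown" give different results; it assumes tidyr and tibble are installed.

```r
library(tidyr)
library(tibble)

# Hypothetical data: a gap sits between two different known values.
gaps <- tibble(x = c(NA, 1, NA, 3, NA))

# "downup": fill down first (NA 1 1 3 3), then fill up the leading NA.
down_up <- fill(gaps, x, .direction = "downup")$x
down_up # 1 1 1 3 3

# "updown": fill up first (1 1 3 3 NA), then fill down the trailing NA.
up_down <- fill(gaps, x, .direction = "updown")$x
up_down # 1 1 3 3 3
```

The interior gap takes the earlier value under "downup" and the later value under "updown", so the choice matters whenever a gap lies between two differing values.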

Source: R/fill.R

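Note: This article was selected and compiled by 纯净天空 from the original English work "Fill in missing values with previous or next value" by Hadley Wickham et al. Unless otherwise stated, copyright in the original code belongs to the original authors. Please do not reproduce or copy this translation without permission or authorization.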