

R stringr str_extract: Extract the complete match


str_extract() extracts the first complete match from each string; str_extract_all() extracts all matches from each string.

Usage

str_extract(string, pattern, group = NULL)

str_extract_all(string, pattern, simplify = FALSE)

Arguments

string

Input vector. Either a character vector, or something coercible to one.

pattern

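Pattern to look for.

The default interpretation is a regular expression, as described in vignette("regular-expressions"). Use regex() for finer control of the matching behaviour.

Match a fixed string (i.e. by comparing only bytes) using fixed(). This is fast, but approximate. Generally, for matching human text, you'll want coll(), which respects character matching rules for the specified locale.

Match character, word, line and sentence boundaries with boundary(). An empty pattern, "", is equivalent to boundary("character").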
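The pattern modifiers described above can be compared side by side. This is a minimal sketch, assuming stringr is installed; the vector x and the result names are made up for illustration and are not part of the original documentation.

```r
library(stringr)

# Hypothetical input, chosen only to show how the modifiers differ
x <- c("Apple pie", "a.b", "abc")

# regex() with ignore_case = TRUE matches regardless of letter case
ci <- str_extract(x, regex("apple", ignore_case = TRUE))
ci
#> [1] "Apple" NA      NA

# As a regular expression, "." matches any character, so the first
# character of each string is returned
dot_regex <- str_extract(x, ".")
dot_regex
#> [1] "A" "a" "a"

# fixed() treats "." as a literal dot
dot_fixed <- str_extract(x, fixed("."))
dot_fixed
#> [1] NA  "." NA

# boundary("character") splits a string into individual characters
chars <- str_extract_all("abc", boundary("character"))[[1]]
chars
#> [1] "a" "b" "c"
```

coll() is left out here because its results depend on the locale in use.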

group

If supplied, instead of returning the complete match, returns the matched text from the specified capturing group.

simplify

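A boolean.

  • FALSE (the default): returns a list of character vectors.

  • TRUE: returns a character matrix.

Value

  • str_extract(): a character vector the same length as string/pattern.

  • str_extract_all(): a list of character vectors the same length as string/pattern.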
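The return shapes described above can be checked directly. The following is a small sketch, assuming stringr is installed; the variable names (words, n_words, word_matrix, paired) are illustrative only.

```r
library(stringr)

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")

# With simplify = FALSE (the default) the result is a list with one
# element per input string; lengths() counts the matches in each
words <- str_extract_all(shopping_list, "\\b[a-z]+\\b")
n_words <- lengths(words)
n_words
#> [1] 1 3 3 1

# With simplify = TRUE the result is a character matrix: one row per
# input string, padded with "" up to the largest number of matches
word_matrix <- str_extract_all(shopping_list, "\\b[a-z]+\\b", simplify = TRUE)
dim(word_matrix)
#> [1] 4 3

# string and pattern are vectorised together: each string is matched
# against the pattern in the same position
paired <- str_extract(c("a1", "b2"), c("[a-z]", "\\d"))
paired
#> [1] "a" "2"
```

The list form keeps "no match" distinct as character(0), whereas the matrix form fills missing cells with empty strings, so the list is usually the safer choice when the number of matches matters.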

See also

str_match() to extract matched groups; stringi::stri_extract() for the underlying implementation.

Examples
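To illustrate how str_match() relates to str_extract(), the sketch below compares the two on the same pattern. It assumes a stringr version in which the group argument is available (1.5.0 or later, as far as I know); the names pattern and m are illustrative.

```r
library(stringr)

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")
pattern <- "([a-z]+) of ([a-z]+)"

# str_match() returns a character matrix: column 1 holds the complete
# match and the following columns hold the capturing groups
m <- str_match(shopping_list, pattern)

# Column 1 corresponds to what str_extract() returns
full_match <- str_extract(shopping_list, pattern)
identical(m[, 1], full_match)
#> [1] TRUE

# Column 3 corresponds to the second capturing group
second_group <- str_extract(shopping_list, pattern, group = 2)
identical(m[, 3], second_group)
#> [1] TRUE
```

When several groups are needed at once, a single call to str_match() avoids repeating the match for each group.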

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")
str_extract(shopping_list, "\\d")
#> [1] "4" NA  NA  "2"
str_extract(shopping_list, "[a-z]+")
#> [1] "apples" "bag"    "bag"    "milk"  
str_extract(shopping_list, "[a-z]{1,4}")
#> [1] "appl" "bag"  "bag"  "milk"
str_extract(shopping_list, "\\b[a-z]{1,4}\\b")
#> [1] NA     "bag"  "bag"  "milk"

str_extract(shopping_list, "([a-z]+) of ([a-z]+)")
#> [1] NA             "bag of flour" "bag of sugar" NA            
str_extract(shopping_list, "([a-z]+) of ([a-z]+)", group = 1)
#> [1] NA    "bag" "bag" NA   
str_extract(shopping_list, "([a-z]+) of ([a-z]+)", group = 2)
#> [1] NA      "flour" "sugar" NA     

# Extract all matches
str_extract_all(shopping_list, "[a-z]+")
#> [[1]]
#> [1] "apples" "x"     
#> 
#> [[2]]
#> [1] "bag"   "of"    "flour"
#> 
#> [[3]]
#> [1] "bag"   "of"    "sugar"
#> 
#> [[4]]
#> [1] "milk" "x"   
#> 
str_extract_all(shopping_list, "\\b[a-z]+\\b")
#> [[1]]
#> [1] "apples"
#> 
#> [[2]]
#> [1] "bag"   "of"    "flour"
#> 
#> [[3]]
#> [1] "bag"   "of"    "sugar"
#> 
#> [[4]]
#> [1] "milk"
#> 
str_extract_all(shopping_list, "\\d")
#> [[1]]
#> [1] "4"
#> 
#> [[2]]
#> character(0)
#> 
#> [[3]]
#> character(0)
#> 
#> [[4]]
#> [1] "2"
#> 

# Simplify results into character matrix
str_extract_all(shopping_list, "\\b[a-z]+\\b", simplify = TRUE)
#>      [,1]     [,2] [,3]   
#> [1,] "apples" ""   ""     
#> [2,] "bag"    "of" "flour"
#> [3,] "bag"    "of" "sugar"
#> [4,] "milk"   ""   ""     
str_extract_all(shopping_list, "\\d", simplify = TRUE)
#>      [,1]
#> [1,] "4" 
#> [2,] ""  
#> [3,] ""  
#> [4,] "2" 

# Extract all words
str_extract_all("This is, surprisingly, a sentence.", boundary("word"))
#> [[1]]
#> [1] "This"         "is"           "surprisingly" "a"            "sentence"    
#> 
Source: R/extract.R

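Note: This article was selected and compiled by 纯净天空 from the original English work "Extract the complete match" by Hadley Wickham and others. Unless otherwise stated, copyright in the original code belongs to the original authors; please do not reproduce or copy this translation without permission or authorization.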