R stringr str_extract: Extract the complete match

str_extract() extracts the first complete match from each string; str_extract_all() extracts all matches from each string.

Usage

str_extract(string, pattern, group = NULL)

str_extract_all(string, pattern, simplify = FALSE)

Arguments

string

Input vector. Either a character vector, or something coercible to one.

pattern

Pattern to look for.

The default interpretation is a regular expression, as described in vignette("regular-expressions"). Use regex() for finer control of the matching behaviour.
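As a small illustration of controlling the matching behaviour with regex(), the sketch below contrasts the default case-sensitive match with a case-insensitive one. The `fruit` vector is made up for this example.

```r
library(stringr)

# Hypothetical input, not from the original documentation
fruit <- c("Apple pie", "banana split", "APPLE tart")

# Default: the pattern is treated as a case-sensitive regular expression
str_extract(fruit, "apple")
#> [1] NA NA NA

# regex() with ignore_case = TRUE relaxes case matching
str_extract(fruit, regex("apple", ignore_case = TRUE))
#> [1] "Apple" NA      "APPLE"
```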

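Match a fixed string (i.e. by comparing only bytes), using fixed(). This is fast, but approximate. Generally, for matching human text, you'll want coll(), which respects the character matching rules of the specified locale.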
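The difference between the default regular-expression interpretation and fixed() shows up most clearly with characters that are special in a regex. A minimal sketch, using a made-up `versions` vector:

```r
library(stringr)

# Hypothetical input, not from the original documentation
versions <- c("v1.2", "v1x2")

# As a regular expression, "." matches any character, so the first
# match in both strings is the leading "v"
str_extract(versions, ".")
#> [1] "v" "v"

# Wrapped in fixed(), "." is a literal dot, so only the first string matches
str_extract(versions, fixed("."))
#> [1] "." NA
```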

Match character, word, line and sentence boundaries with boundary(). An empty pattern, "", is equivalent to boundary("character").
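The following sketch compares boundary("character"), the empty pattern, and boundary("word") on one short string. The input `x` is chosen only for illustration.

```r
library(stringr)

# Hypothetical input, not from the original documentation
x <- "to be"

# Split into individual characters (the space counts as a character)
str_extract_all(x, boundary("character"))[[1]]
#> [1] "t" "o" " " "b" "e"

# The empty pattern behaves like boundary("character")
str_extract_all(x, "")[[1]]
#> [1] "t" "o" " " "b" "e"

# Word boundaries skip the whitespace between words
str_extract_all(x, boundary("word"))[[1]]
#> [1] "to" "be"
```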

group

If supplied, instead of returning the complete match, returns the matched text from the specified capturing group.
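A short sketch of how group selects one capturing group rather than the whole match. The `settings` vector and its key=value layout are assumptions made for this example.

```r
library(stringr)

# Hypothetical key=value strings; the last one has no "=" and will not match
settings <- c("width=80", "height=24", "verbose")
pattern <- "(\\w+)=(\\d+)"

# Without group: the complete match
str_extract(settings, pattern)
#> [1] "width=80"  "height=24" NA

# group = 1: only the text captured by the first parenthesised group (the key)
str_extract(settings, pattern, group = 1)
#> [1] "width"  "height" NA

# group = 2: only the second group (the value)
str_extract(settings, pattern, group = 2)
#> [1] "80" "24" NA
```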

simplify

A boolean.

FALSE (the default): returns a list of character vectors.
TRUE: returns a character matrix.

Value

str_extract(): a character vector the same length as string/pattern.
str_extract_all(): a list of character vectors the same length as string/pattern.
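To make the shape of the return values concrete, here is a minimal sketch with made-up inputs. It shows that string and pattern are matched pairwise when both have length greater than one, and that lengths() can be used to count matches in the list returned by str_extract_all().

```r
library(stringr)

# Pairwise vectorisation: element i of the pattern is applied to element i
# of the string (hypothetical inputs)
str_extract(c("a1", "b2"), c("[a-z]", "\\d"))
#> [1] "a" "2"

# str_extract_all() returns one list element per input string;
# strings with no match yield character(0)
digits <- str_extract_all(c("a1b2", "xyz"), "\\d")
lengths(digits)
#> [1] 2 0
```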

See also

str_match() to extract matched groups; stringi::stri_extract() for the underlying implementation.
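For comparison, a rough sketch of how str_match() relates to the group argument: str_match() returns a character matrix whose first column is the complete match and whose remaining columns are the capturing groups, so column group + 1 should line up with str_extract(..., group = group).

```r
library(stringr)

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")
pattern <- "([a-z]+) of ([a-z]+)"

# One row per input string; column 1 is the full match,
# columns 2 and 3 are the two capturing groups
m <- str_match(shopping_list, pattern)
dim(m)
#> [1] 4 3

# The second capturing group, obtained two ways
m[, 3]
str_extract(shopping_list, pattern, group = 2)
```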

Examples

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")
str_extract(shopping_list, "\\d")
#> [1] "4" NA  NA  "2"
str_extract(shopping_list, "[a-z]+")
#> [1] "apples" "bag"    "bag"    "milk"  
str_extract(shopping_list, "[a-z]{1,4}")
#> [1] "appl" "bag"  "bag"  "milk"
str_extract(shopping_list, "\\b[a-z]{1,4}\\b")
#> [1] NA     "bag"  "bag"  "milk"

str_extract(shopping_list, "([a-z]+) of ([a-z]+)")
#> [1] NA             "bag of flour" "bag of sugar" NA            
str_extract(shopping_list, "([a-z]+) of ([a-z]+)", group = 1)
#> [1] NA    "bag" "bag" NA   
str_extract(shopping_list, "([a-z]+) of ([a-z]+)", group = 2)
#> [1] NA      "flour" "sugar" NA     

# Extract all matches
str_extract_all(shopping_list, "[a-z]+")
#> [[1]]
#> [1] "apples" "x"     
#> 
#> [[2]]
#> [1] "bag"   "of"    "flour"
#> 
#> [[3]]
#> [1] "bag"   "of"    "sugar"
#> 
#> [[4]]
#> [1] "milk" "x"   
#> 
str_extract_all(shopping_list, "\\b[a-z]+\\b")
#> [[1]]
#> [1] "apples"
#> 
#> [[2]]
#> [1] "bag"   "of"    "flour"
#> 
#> [[3]]
#> [1] "bag"   "of"    "sugar"
#> 
#> [[4]]
#> [1] "milk"
#> 
str_extract_all(shopping_list, "\\d")
#> [[1]]
#> [1] "4"
#> 
#> [[2]]
#> character(0)
#> 
#> [[3]]
#> character(0)
#> 
#> [[4]]
#> [1] "2"
#> 

# Simplify results into character matrix
str_extract_all(shopping_list, "\\b[a-z]+\\b", simplify = TRUE)
#>      [,1]     [,2] [,3]   
#> [1,] "apples" ""   ""     
#> [2,] "bag"    "of" "flour"
#> [3,] "bag"    "of" "sugar"
#> [4,] "milk"   ""   ""     
str_extract_all(shopping_list, "\\d", simplify = TRUE)
#>      [,1]
#> [1,] "4" 
#> [2,] ""  
#> [3,] ""  
#> [4,] "2" 

# Extract all words
str_extract_all("This is, surprisingly, a sentence.", boundary("word"))
#> [[1]]
#> [1] "This"         "is"           "surprisingly" "a"            "sentence"    
#>
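
The extracted values are always character strings, even when the pattern matches digits. A minimal sketch of one common follow-up step, converting them to integers, is shown below. The variable name `quantities` is introduced here for illustration only.

```r
library(stringr)

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")

# str_extract() returns character values; NA marks strings with no match
quantities <- as.integer(str_extract(shopping_list, "\\d+"))
quantities
#> [1]  4 NA NA  2
```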

Source code: R/extract.R


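Note: This article was compiled by 純淨天空 from the original English work "Extract the complete match" by Hadley Wickham and others. Unless otherwise stated, copyright of the original code belongs to the original authors; do not reproduce or copy this translation without permission or authorization.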