

R stringr str_split: Split up a string into pieces


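These functions differ primarily in their input and output types:

  • str_split() takes a character vector and returns a list.

  • str_split_1() takes a single string and returns a character vector.

  • str_split_fixed() takes a character vector and returns a matrix.

  • str_split_i() takes a character vector and returns a character vector.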
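As a minimal sketch of these differences, the snippet below calls all four functions on the same input and inspects what comes back. The vector `x` and the variable names are made up for illustration only.

```r
library(stringr)

# Hypothetical input used only for this illustration
x <- c("a-b-c", "d-e")

pieces_list   <- str_split(x, "-")           # list: one character vector per input string
pieces_single <- str_split_1(x[[1]], "-")    # character vector from a single string
pieces_matrix <- str_split_fixed(x, "-", 3)  # character matrix with 3 columns
pieces_first  <- str_split_i(x, "-", 1)      # character vector: first piece of each string

is.list(pieces_list)
is.character(pieces_single)
is.matrix(pieces_matrix)
pieces_first
```

str_split_1() is the one to reach for when there is exactly one input string, since it saves unwrapping a length-one list.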

Usage

str_split(string, pattern, n = Inf, simplify = FALSE)

str_split_1(string, pattern)

str_split_fixed(string, pattern, n)

str_split_i(string, pattern, i)

Arguments

string

Input vector. Either a character vector, or something coercible to one.

pattern

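Pattern to look for.

The default interpretation is a regular expression, as described in vignette("regular-expressions"). Use regex() for finer control of the matching behaviour.

Match a fixed string (i.e. by comparing only bytes) using fixed(). This is fast, but approximate. Generally, for matching human text, you'll want coll(), which respects the character matching rules of the specified locale.

Match character, word, line and sentence boundaries with boundary(). An empty pattern, "", is equivalent to boundary("character").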
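The choice of pattern type matters most when the separator is a regex metacharacter. A small sketch, using a made-up `version` string: a literal dot can be matched either with fixed() or by escaping it in a regular expression, and an empty pattern splits into individual characters.

```r
library(stringr)

# Hypothetical input used only for this illustration
version <- "1.5.0"

by_fixed <- str_split_1(version, fixed("."))  # "." treated as a literal dot
by_regex <- str_split_1(version, "\\.")       # same result: dot escaped in a regex
chars    <- str_split_1("abc", "")            # empty pattern: split at character boundaries

by_fixed
by_regex
chars
```

An unescaped "." would be read as the regex wildcard and match every character, which is rarely what is intended when splitting.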

n

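Maximum number of pieces to return. The default (Inf) uses all possible split positions.

For str_split(), this determines the maximum length of each element of the output. For str_split_fixed(), this determines the number of columns in the output; if an input is too short, the result will be padded with "".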

simplify

A boolean.

  • FALSE (the default): returns a list of character vectors.

  • TRUE: returns a character matrix.

i

Element to return. Use a negative value to count from the right-hand side.

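Value

  • str_split_1(): a character vector.

  • str_split(): a list the same length as string/pattern, containing character vectors.

  • str_split_fixed(): a character matrix with n columns and the same number of rows as the length of string/pattern.

  • str_split_i(): a character vector the same length as string/pattern.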
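The sizes described above can be checked directly. The following is a small sketch with a made-up input `x` of three strings; note how a string with too few pieces yields "" in the matrix but NA from str_split_i().

```r
library(stringr)

# Hypothetical input used only for this illustration
x <- c("a b", "c d e", "f")

as_list   <- str_split(x, " ")           # list with one element per input string
as_matrix <- str_split_fixed(x, " ", 2)  # length(x) rows, n = 2 columns
second    <- str_split_i(x, " ", 2)      # one piece per input string, NA if missing

length(as_list)
dim(as_matrix)
second
```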

See also

stri_split() for the underlying implementation.

Examples

library(stringr)

fruits <- c(
  "apples and oranges and pears and bananas",
  "pineapples and mangos and guavas"
)

str_split(fruits, " and ")
#> [[1]]
#> [1] "apples"  "oranges" "pears"   "bananas"
#> 
#> [[2]]
#> [1] "pineapples" "mangos"     "guavas"    
#> 
str_split(fruits, " and ", simplify = TRUE)
#>      [,1]         [,2]      [,3]     [,4]     
#> [1,] "apples"     "oranges" "pears"  "bananas"
#> [2,] "pineapples" "mangos"  "guavas" ""       

# If you want to split a single string, use `str_split_1`
str_split_1(fruits[[1]], " and ")
#> [1] "apples"  "oranges" "pears"   "bananas"

# Specify n to restrict the number of possible matches
str_split(fruits, " and ", n = 3)
#> [[1]]
#> [1] "apples"            "oranges"           "pears and bananas"
#> 
#> [[2]]
#> [1] "pineapples" "mangos"     "guavas"    
#> 
str_split(fruits, " and ", n = 2)
#> [[1]]
#> [1] "apples"                        "oranges and pears and bananas"
#> 
#> [[2]]
#> [1] "pineapples"        "mangos and guavas"
#> 
# If n greater than number of pieces, no padding occurs
str_split(fruits, " and ", n = 5)
#> [[1]]
#> [1] "apples"  "oranges" "pears"   "bananas"
#> 
#> [[2]]
#> [1] "pineapples" "mangos"     "guavas"    
#> 

# Use fixed to return a character matrix
str_split_fixed(fruits, " and ", 3)
#>      [,1]         [,2]      [,3]               
#> [1,] "apples"     "oranges" "pears and bananas"
#> [2,] "pineapples" "mangos"  "guavas"           
str_split_fixed(fruits, " and ", 4)
#>      [,1]         [,2]      [,3]     [,4]     
#> [1,] "apples"     "oranges" "pears"  "bananas"
#> [2,] "pineapples" "mangos"  "guavas" ""       

# str_split_i extracts only a single piece from a string
str_split_i(fruits, " and ", 1)
#> [1] "apples"     "pineapples"
str_split_i(fruits, " and ", 4)
#> [1] "bananas" NA       
# use a negative number to select from the end
str_split_i(fruits, " and ", -1)
#> [1] "bananas" "guavas" 
Source: R/split.R

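Note: This article was selected and compiled by 純淨天空 from the original English work "Split up a string into pieces" by Hadley Wickham et al. Unless otherwise stated, copyright in the original code belongs to the original authors. Do not reproduce or copy this translation without permission or authorization.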