Python tf.keras.preprocessing.text.text_to_word_sequence usage and code examples

Converts a text string into a sequence of words (or tokens).

Usage

tf.keras.preprocessing.text.text_to_word_sequence(
    input_text,
    filters='!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n',
    lower=True, split=' '
)

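input_text The input text (a string).
filters A list (or concatenation) of characters to filter out, such as punctuation. Default: '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n', which covers basic punctuation, tabs, and newlines.
lower Boolean. Whether to convert the input to lowercase.
split str. The separator used to split the text into words.

This function converts a text string into a list of words, ignoring the characters in filters, which by default include punctuation.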

import tensorflow as tf

sample_text = 'This is a sample sentence.'
tf.keras.preprocessing.text.text_to_word_sequence(sample_text)
# Returns: ['this', 'is', 'a', 'sample', 'sentence']
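
To make the roles of filters, lower, and split more concrete, the following is a minimal pure-Python sketch that approximates the behavior described above. It is not TensorFlow's source code: the name sketch_text_to_word_sequence is hypothetical, and the sketch assumes that each filtered character is replaced by the split string before splitting and that empty tokens are dropped.

```python
# Default filter characters, copied from the signature shown above.
DEFAULT_FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'


def sketch_text_to_word_sequence(input_text, filters=DEFAULT_FILTERS,
                                 lower=True, split=' '):
    """Approximate the documented behavior in plain Python (illustrative only)."""
    if lower:
        input_text = input_text.lower()
    # Assumption: every filtered character is replaced by the separator,
    # so punctuation acts as a word boundary instead of simply vanishing.
    table = str.maketrans({ch: split for ch in filters})
    input_text = input_text.translate(table)
    # Split on the separator and drop the empty strings left by
    # consecutive separators.
    return [token for token in input_text.split(split) if token]


default_result = sketch_text_to_word_sequence('This is a sample sentence.')
cased_result = sketch_text_to_word_sequence('Hello, World!', lower=False)
csv_result = sketch_text_to_word_sequence('a,b,c', filters='', split=',')
quote_result = sketch_text_to_word_sequence("Don't stop")

print(default_result)  # ['this', 'is', 'a', 'sample', 'sentence']
print(cased_result)    # ['Hello', 'World']
print(csv_result)      # ['a', 'b', 'c']
print(quote_result)    # ["don't", 'stop']
```

The last call shows that the apostrophe is not among the default filters, so contractions stay in one token under this sketch.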

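Note: This article was selected and compiled by 純淨天空 from the original English documentation for tf.keras.preprocessing.text.text_to_word_sequence on tensorflow.org. Unless otherwise stated, copyright in the original code belongs to the original authors; please do not reproduce or copy this translation without permission or authorization.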