Python text.ENGLISH_STOP_WORDS attribute code examples

This article collects typical usage examples of the sklearn.feature_extraction.text.ENGLISH_STOP_WORDS attribute in Python. If you are wondering what text.ENGLISH_STOP_WORDS is for or how to use it, the selected examples here may help. You can also look further into usage examples of the module that contains it, sklearn.feature_extraction.text.


Five code examples of the text.ENGLISH_STOP_WORDS attribute are shown below, sorted by popularity by default.
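Before the project excerpts, a minimal standalone sketch may help show what the attribute is: a frozenset of lowercase English words that can be used for membership tests. The helper name remove_stop_words and the sample sentence are illustrative assumptions, not taken from any of the projects cited in this article.

```python
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS


def remove_stop_words(tokens):
    """Return the tokens that are not in scikit-learn's English stop list.

    Hypothetical helper for illustration only.
    """
    # The stop list holds lowercase words, so compare on the lowercased token.
    return [tok for tok in tokens if tok.lower() not in ENGLISH_STOP_WORDS]


tokens = "The quick brown fox jumps over the lazy dog".split()
filtered = remove_stop_words(tokens)
print(type(ENGLISH_STOP_WORDS).__name__)  # the stop list is immutable
print(filtered)
```

Because the stop list is a frozenset, it cannot be modified in place; code that needs extra words has to build a new set from it.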

Example 1: test_get_stoplisted_unigram_corpus

# Required import: from sklearn.feature_extraction import text [as alias]
# Or: from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS [as alias]
def test_get_stoplisted_unigram_corpus(self):
    tdm = make_a_test_term_doc_matrix()
    uni_tdm = tdm.get_stoplisted_unigram_corpus()
    term_df = tdm.get_term_freq_df()
    uni_term_df = uni_tdm.get_term_freq_df()
    # Expected terms: unigrams without apostrophes that are not stop words.
    self.assertEqual(set(term for term in term_df.index
                         if ' ' not in term
                         and "'" not in term
                         and term not in ENGLISH_STOP_WORDS),
                     set(uni_term_df.index))
Author: JasonKessler, Project: scattertext, Lines: 12, Source: test_TermDocMat.py

Example 2: test_allow_single_quotes_in_unigrams

# Required import: from sklearn.feature_extraction import text [as alias]
# Or: from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS [as alias]
def test_allow_single_quotes_in_unigrams(self):
    tdm = make_a_test_term_doc_matrix()
    self.assertEqual(type(tdm.allow_single_quotes_in_unigrams()), type(tdm))
    uni_tdm = tdm.get_stoplisted_unigram_corpus()
    term_df = tdm.get_term_freq_df()
    uni_term_df = uni_tdm.get_term_freq_df()
    # Expected terms: unigrams (apostrophes allowed) that are not stop words.
    self.assertEqual(set(term for term in term_df.index
                         if ' ' not in term
                         and term not in ENGLISH_STOP_WORDS),
                     set(uni_term_df.index))
Author: JasonKessler, Project: scattertext, Lines: 12, Source: test_TermDocMat.py
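The first two examples differ only in whether terms containing an apostrophe are excluded from the expected set. The sketch below reproduces that filter on a plain list of terms so it can be run without scattertext. The function expected_stoplisted_unigrams, its allow_single_quotes flag, and the sample terms are assumptions made for illustration; they are not part of the scattertext API.

```python
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS


def expected_stoplisted_unigrams(terms, allow_single_quotes=False):
    """Keep single-word terms that are not English stop words.

    Hypothetical helper mirroring the set comprehension used in the tests.
    """
    kept = set()
    for term in terms:
        if " " in term:  # n-grams contain a space, so skip them
            continue
        if "'" in term and not allow_single_quotes:
            continue  # drop terms with an apostrophe unless explicitly allowed
        if term in ENGLISH_STOP_WORDS:
            continue
        kept.add(term)
    return kept


terms = ["the", "speech", "good speech", "o'clock", "economy"]
strict = expected_stoplisted_unigrams(terms)
relaxed = expected_stoplisted_unigrams(terms, allow_single_quotes=True)
print(sorted(strict))
print(sorted(relaxed))
```

With the flag left at its default the filter matches the condition in Example 1; setting it to True matches the condition in Example 2.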

Example 3: _assert_stoplisted_minus_joe

# Required import: from sklearn.feature_extraction import text [as alias]
# Or: from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS [as alias]
def _assert_stoplisted_minus_joe(self, tdm, uni_tdm):
    term_df = tdm.get_term_freq_df()
    uni_term_df = uni_tdm.get_term_freq_df()
    # Expected terms: unigrams without apostrophes, excluding stop words
    # and the custom stop word 'joe' (compared case-insensitively).
    self.assertEqual(set(term for term in term_df.index
                         if ' ' not in term
                         and 'joe' != term.lower()
                         and "'" not in term
                         and term not in ENGLISH_STOP_WORDS),
                     set(uni_term_df.index))
Author: JasonKessler, Project: scattertext, Lines: 11, Source: test_TermDocMat.py

Example 4: test_countvectorizer_stop_words

# Required import: from sklearn.feature_extraction import text [as alias]
# Or: from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS [as alias]
import pytest
from sklearn.feature_extraction.text import CountVectorizer


def test_countvectorizer_stop_words():
    cv = CountVectorizer()
    # The string 'english' selects the built-in stop list.
    cv.set_params(stop_words='english')
    assert cv.get_stop_words() == ENGLISH_STOP_WORDS
    # Any other string is rejected when the stop list is resolved.
    cv.set_params(stop_words='_bad_str_stop_')
    with pytest.raises(ValueError):
        cv.get_stop_words()
    cv.set_params(stop_words='_bad_unicode_stop_')
    with pytest.raises(ValueError):
        cv.get_stop_words()
    # A custom list is returned as a set of its entries.
    stoplist = ['some', 'other', 'words']
    cv.set_params(stop_words=stoplist)
    assert cv.get_stop_words() == set(stoplist)
Author: PacktPublishing, Project: Mastering-Elasticsearch-7.0, Lines: 13, Source: test_text.py
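The test above only checks which stop list the vectorizer resolves. As a complementary sketch, the snippet below fits a CountVectorizer with stop_words='english' on two made-up sentences and inspects the learned vocabulary; the sample documents are assumptions for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer, ENGLISH_STOP_WORDS

# Two toy documents, invented for this sketch.
docs = ["The cat sat on the mat", "A dog chased the cat"]

# 'english' tells the vectorizer to drop words from the built-in stop list.
vectorizer = CountVectorizer(stop_words="english")
vectorizer.fit(docs)

# vocabulary_ maps each retained term to its column index.
vocabulary = set(vectorizer.vocabulary_)
print(sorted(vocabulary))
```

None of the retained terms should appear in ENGLISH_STOP_WORDS, which gives a quick sanity check when trying out a different stop list.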

Example 5: _build_stop_words

# Required import: from sklearn.feature_extraction import text [as alias]
# Or: from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS [as alias]
from typing import Set


def _build_stop_words(self) -> Set[str]:
    additional_stop_words = self.field.get_vectorizer_stop_words()
    if additional_stop_words:
        # ENGLISH_STOP_WORDS is a frozenset, so copy it into a mutable set
        # before adding the field-specific words.
        stop_words = set(ENGLISH_STOP_WORDS)
        stop_words.update(additional_stop_words)
        return stop_words
    else:
        return ENGLISH_STOP_WORDS
Author: LexPredict, Project: lexpredict-contraxsuite, Lines: 10, Source: field_types.py
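The same idea can be sketched as a standalone function, without the surrounding class. The name build_stop_words, its extra parameter, and the sample words are hypothetical; the sketch also returns a frozenset in both branches, which is a design choice of this illustration rather than the behavior of the project code above.

```python
from typing import FrozenSet, Iterable, Optional

from sklearn.feature_extraction.text import CountVectorizer, ENGLISH_STOP_WORDS


def build_stop_words(extra: Optional[Iterable[str]] = None) -> FrozenSet[str]:
    """Return the English stop list, optionally extended with extra words.

    Hypothetical helper for illustration only.
    """
    if not extra:
        return ENGLISH_STOP_WORDS
    # frozenset.union accepts any iterable and returns a new frozenset,
    # leaving the built-in list untouched.
    return ENGLISH_STOP_WORDS.union(extra)


stop_words = build_stop_words(["contract", "agreement"])

# Passing the words as a list, since that is the documented collection type
# for the stop_words parameter.
vectorizer = CountVectorizer(stop_words=sorted(stop_words))
vectorizer.fit(["The contract covers the lease of the building"])
print(sorted(vectorizer.vocabulary_))
```

Returning an immutable set in both branches avoids the case where one caller receives a mutable copy and another receives the shared built-in list.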


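Note: The sklearn.feature_extraction.text.ENGLISH_STOP_WORDS examples in this article were compiled by 纯净天空 from GitHub, MSDocs, and other open-source code and documentation platforms. The snippets were selected from open-source projects, and copyright of the source code remains with the original authors; refer to each project's License for distribution and use. Do not reproduce without permission.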