

Python Phrases.export_phrases Method Code Examples

This article collects typical usage examples of the gensim.models.phrases.Phrases.export_phrases method in Python. If you are wondering how Phrases.export_phrases works, how to call it, or what real usage looks like, the selected examples below may help. You can also look further into usage examples of the class that contains the method, gensim.models.phrases.Phrases.


Seven code examples of the Phrases.export_phrases method are shown below, sorted by popularity by default.

Example 1: testScoringDefault

# Required import: from gensim.models.phrases import Phrases
# export_phrases is a method of Phrases, so it is called on an instance rather than imported.
    def testScoringDefault(self):
        """ test the default scoring, from the mikolov word2vec paper """
        bigram = Phrases(self.sentences, min_count=1, threshold=1, common_terms=self.common_terms)

        seen_scores = set()

        test_sentences = [['data', 'and', 'graph', 'survey', 'for', 'human', 'interface']]
        for phrase, score in bigram.export_phrases(test_sentences):
            seen_scores.add(round(score, 3))

        min_count = float(bigram.min_count)
        len_vocab = float(len(bigram.vocab))
        graph = float(bigram.vocab[b"graph"])
        data = float(bigram.vocab[b"data"])
        data_and_graph = float(bigram.vocab[b"data_and_graph"])
        human = float(bigram.vocab[b"human"])
        interface = float(bigram.vocab[b"interface"])
        human_interface = float(bigram.vocab[b"human_interface"])

        assert seen_scores == set([
            # score for data and graph
            round((data_and_graph - min_count) / data / graph * len_vocab, 3),
            # score for human interface
            round((human_interface - min_count) / human / interface * len_vocab, 3),
        ])
Developer: lopusz, Project: gensim, Lines of code: 27, Source: test_phrases.py
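
The expected values asserted in Example 1 follow one formula: the bigram count minus min_count, divided by each word's count, multiplied by the vocabulary size. A minimal standalone sketch of that arithmetic follows. The function name `default_score` and the sample counts are illustrative assumptions, not part of gensim's API.

```python
def default_score(worda_count, wordb_count, bigram_count, len_vocab, min_count):
    """Score a candidate bigram with the formula asserted in Example 1.

    Hypothetical helper for illustration: it mirrors the test's arithmetic
    and does not call gensim.
    """
    return (bigram_count - min_count) / worda_count / wordb_count * len_vocab


# Made-up counts: each word appears a few times, the pair appears three times.
score = default_score(worda_count=4, wordb_count=5, bigram_count=3,
                      len_vocab=20, min_count=1)
print(round(score, 3))
```

A pair is reported as a phrase only when its score clears the `threshold` passed to `Phrases`, which is presumably why the tests use low thresholds on their small corpus.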

Example 2: testMultipleBigramsSingleEntry

# Required import: from gensim.models.phrases import Phrases
# export_phrases is a method of Phrases, so it is called on an instance rather than imported.
    def testMultipleBigramsSingleEntry(self):
        """ a single entry should produce multiple bigrams. """
        bigram = Phrases(self.sentences, min_count=1, threshold=1)
        seen_bigrams = set()

        test_sentences = [['graph', 'minors', 'survey', 'human', 'interface']]
        for phrase, score in bigram.export_phrases(test_sentences):
            seen_bigrams.add(phrase)

        assert seen_bigrams == {b'graph minors', b'human interface'}
Developer: lopusz, Project: gensim, Lines of code: 12, Source: test_phrases.py

Example 3: testExportPhrases

# Required import: from gensim.models.phrases import Phrases
# export_phrases is a method of Phrases, so it is called on an instance rather than imported.
    def testExportPhrases(self):
        """Test Phrases bigram export_phrases functionality."""
        bigram = Phrases(sentences, min_count=1, threshold=1)

        seen_bigrams = set()

        for phrase, score in bigram.export_phrases(sentences):
            seen_bigrams.add(phrase)

        assert seen_bigrams == {b'response time', b'graph minors', b'human interface'}
Developer: rmalouf, Project: gensim, Lines of code: 12, Source: test_phrases.py
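
The examples on this page iterate over `(phrase, score)` pairs whose phrases are bytes, and the same pair can be yielded more than once, which is why the tests collect results into sets. Newer gensim releases (4.x) are understood to return a dict mapping str phrases to scores instead; treat that as an assumption to verify against the installed version. The sketch below, with a hypothetical helper name `normalize_exported` and stand-in data, shows one way to handle both shapes without depending on gensim.

```python
def normalize_exported(exported, encoding="utf-8"):
    """Return {phrase_str: score} from either output shape.

    Hypothetical helper: accepts an iterable of (phrase, score) pairs
    (the shape used in these examples) or a dict of phrase -> score
    (the shape assumed for gensim 4.x).
    """
    pairs = exported.items() if isinstance(exported, dict) else exported
    result = {}
    for phrase, score in pairs:
        if isinstance(phrase, bytes):
            phrase = phrase.decode(encoding)
        result[phrase] = score  # repeated phrases collapse to one entry
    return result


# Stand-in data, not real gensim output.
old_style = [(b"graph minors", 2.5), (b"graph minors", 2.5), (b"human interface", 1.5)]
new_style = {"graph_minors": 2.5, "human_interface": 1.5}
print(normalize_exported(old_style))
print(normalize_exported(new_style))
```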

Example 4: testCustomScorer

# Required import: from gensim.models.phrases import Phrases
# export_phrases is a method of Phrases, so it is called on an instance rather than imported.
    def testCustomScorer(self):
        """ test using a custom scoring function """

        bigram = Phrases(self.sentences, min_count=1, threshold=.001, scoring=dumb_scorer)

        seen_scores = []
        test_sentences = [['graph', 'minors', 'survey', 'human', 'interface', 'system']]
        for phrase, score in bigram.export_phrases(test_sentences):
            seen_scores.append(score)

        assert all(score == 1 for score in seen_scores)  # every score is 1
        assert len(seen_scores) == 3  # 'graph minors' and 'survey human' and 'interface system'
Developer: lopusz, Project: gensim, Lines of code: 14, Source: test_phrases.py
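
Example 4 relies on `dumb_scorer`, which is defined elsewhere in the test module and is not shown here. Judging by the "all scores 1" comment, it presumably returns a constant. The sketch below is a hypothetical stand-in named `constant_scorer`; it assumes the six-argument signature that gensim passes to scoring callables, which should be checked against the installed version.

```python
def constant_scorer(worda_count, wordb_count, bigram_count,
                    len_vocab, min_count, corpus_word_count):
    """Hypothetical stand-in for dumb_scorer: every candidate pair scores 1."""
    return 1


# With threshold=.001, as in Example 4, a constant score of 1 clears the threshold.
threshold = 0.001
passes = constant_scorer(3, 4, 2, 20, 1, 100) > threshold
print(passes)
```

Because every adjacent pair passes, the six-token test sentence yields three non-overlapping bigrams, which matches the final assertion in the example.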

Example 5: testScoringNpmi

# Required import: from gensim.models.phrases import Phrases
# export_phrases is a method of Phrases, so it is called on an instance rather than imported.
    def testScoringNpmi(self):
        """ test normalized pointwise mutual information scoring """
        bigram = Phrases(self.sentences, min_count=1, threshold=.5, scoring='npmi')

        seen_scores = set()
        test_sentences = [['graph', 'minors', 'survey', 'human', 'interface']]
        for phrase, score in bigram.export_phrases(test_sentences):
            seen_scores.add(round(score, 3))

        assert seen_scores == {
            .882,  # score for graph minors
            .714  # score for human interface
        }
Developer: lopusz, Project: gensim, Lines of code: 15, Source: test_phrases.py
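
Example 5 selects `scoring='npmi'`. Normalized pointwise mutual information is commonly defined as ln(p(a,b) / (p(a) p(b))) / -ln(p(a,b)) and ranges from -1 to 1. The sketch below computes it from raw counts. The function name `npmi` and the counts are illustrative assumptions, and gensim's own implementation may treat edge cases differently.

```python
import math


def npmi(worda_count, wordb_count, bigram_count, corpus_word_count):
    """Normalized pointwise mutual information from raw counts.

    Hypothetical helper; assumes 0 < bigram_count < corpus_word_count so
    that both logarithms are defined and the denominator is nonzero.
    """
    pa = worda_count / corpus_word_count
    pb = wordb_count / corpus_word_count
    pab = bigram_count / corpus_word_count
    return math.log(pab / (pa * pb)) / -math.log(pab)


# Two words that only ever appear together score close to 1.
always_together = npmi(worda_count=2, wordb_count=2, bigram_count=2,
                       corpus_word_count=100)
# Two words that co-occur exactly as often as chance predicts score close to 0.
independent = npmi(worda_count=10, wordb_count=10, bigram_count=1,
                   corpus_word_count=100)
print(round(always_together, 3), round(independent, 3))
```

Since NPMI is bounded by 1, a threshold such as the `.5` used in Example 5 sits on a different scale from the thresholds used with default scoring.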

Example 6: testExportPhrases

# Required import: from gensim.models.phrases import Phrases
# export_phrases is a method of Phrases, so it is called on an instance rather than imported.
    def testExportPhrases(self):
        """Test Phrases bigram export_phrases functionality."""
        bigram = Phrases(self.sentences, min_count=1, threshold=1, common_terms=self.common_terms)

        seen_bigrams = set()

        for phrase, score in bigram.export_phrases(self.sentences):
            seen_bigrams.add(phrase)

        assert seen_bigrams == set([
            b'human interface',
            b'graph of trees',
            b'data and graph',
            b'lack of interest',
        ])
Developer: lopusz, Project: gensim, Lines of code: 17, Source: test_phrases.py

Example 7: testExportPhrases

# Required import: from gensim.models.phrases import Phrases
# export_phrases is a method of Phrases, so it is called on an instance rather than imported.
    def testExportPhrases(self):
        """Test Phrases bigram export_phrases functionality."""
        bigram = Phrases(sentences, min_count=1, threshold=1)

        # with this setting we should get response_time and graph_minors
        bigram1_seen = False
        bigram2_seen = False

        for phrase, score in bigram.export_phrases(sentences):
            if not bigram1_seen and b'response time' == phrase:
                bigram1_seen = True
            elif not bigram2_seen and b'graph minors' == phrase:
                bigram2_seen = True
            if bigram1_seen and bigram2_seen:
                break

        self.assertTrue(bigram1_seen)
        self.assertTrue(bigram2_seen)
Developer: ArifAhmed1995, Project: gensim, Lines of code: 20, Source: test_phrases.py


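Note: The gensim.models.phrases.Phrases.export_phrases examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects, and copyright in the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.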