Python BigramAssocMeasures.raw_freq Method Code Examples

This article collects typical usage examples of the nltk.metrics.BigramAssocMeasures.raw_freq method in Python. If you are wondering how BigramAssocMeasures.raw_freq is used in practice, the selected examples below may help. You can also look at usage examples of the enclosing class, nltk.metrics.BigramAssocMeasures.

Two code examples of the BigramAssocMeasures.raw_freq method are shown below, sorted by popularity by default.

Example 1: demo

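# Required import: from nltk.metrics import BigramAssocMeasures
# raw_freq is accessed through the class, as BigramAssocMeasures.raw_freq.
# BigramCollocationFinder is defined next to demo() in the original collocations.py;
# import it explicitly when running this snippet on its own.
# The 'stopwords' and 'webtext' corpora must be available (see nltk.download).
from nltk.collocations import BigramCollocationFinder


def demo(scorer=None, compare_scorer=None):
    """Finds bigram collocations in the files of the WebText corpus."""
    from nltk.metrics import BigramAssocMeasures, spearman_correlation, ranks_from_scores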

    if scorer is None:
        scorer = BigramAssocMeasures.likelihood_ratio
    if compare_scorer is None:
        compare_scorer = BigramAssocMeasures.raw_freq

    from nltk.corpus import stopwords, webtext

    # A set makes the membership test in word_filter fast
    ignored_words = set(stopwords.words('english'))
    word_filter = lambda w: len(w) < 3 or w.lower() in ignored_words

    for file in webtext.fileids():
        words = [word.lower()
                 for word in webtext.words(file)]

        cf = BigramCollocationFinder.from_words(words)
        cf.apply_freq_filter(3)
        cf.apply_word_filter(word_filter)

        corr = spearman_correlation(ranks_from_scores(cf.score_ngrams(scorer)),
                                    ranks_from_scores(cf.score_ngrams(compare_scorer)))
        print(file)
        print('\t', [' '.join(tup) for tup in cf.nbest(scorer, 15)])
        print('\t Correlation to %s: %0.4f' % (compare_scorer.__name__, corr))

# Slows down loading too much
# bigram_measures = BigramAssocMeasures()
# trigram_measures = TrigramAssocMeasures()

Developer ID: Thejas-1, project: Price-Comparator, lines of code: 33, source file: collocations.py
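The example above passes raw_freq around as a scorer without showing what it computes. As a rough sketch, assuming raw_freq scores a bigram as its count divided by the total number of tokens (NLTK documents it only as scoring n-grams "by their frequency"), the same quantity can be reproduced in plain Python. The helper name `raw_freq_scores` and the toy sentence are illustrative and not part of NLTK.

```python
from collections import Counter


def raw_freq_scores(words):
    """Return {bigram: count / total_tokens} for adjacent word pairs.

    Hypothetical helper for illustration; it is not an NLTK function.
    """
    total = len(words)
    if total == 0:
        return {}
    # Count each pair of neighbouring tokens
    bigram_counts = Counter(zip(words, words[1:]))
    return {bigram: count / total for bigram, count in bigram_counts.items()}


# Toy input, made up for this sketch
tokens = "the cat sat on the mat the cat ran".split()
scores = raw_freq_scores(tokens)
best = max(scores, key=scores.get)
print(best, scores[best])
```

Because this score depends only on how often a pair occurs, very common pairs rank highest. That is presumably why the demo uses it as a baseline against likelihood_ratio.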

Example 2: demo

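# Required import: from nltk.metrics import BigramAssocMeasures
# raw_freq is accessed through the class, as BigramAssocMeasures.raw_freq.
# BigramCollocationFinder is defined next to demo() in the original collocations.py;
# import it explicitly when running this snippet on its own.
# The 'stopwords' and 'webtext' corpora must be available (see nltk.download).
from nltk.collocations import BigramCollocationFinder


def demo(scorer=None, compare_scorer=None):
    """Finds bigram collocations in the files of the WebText corpus."""
    from nltk.metrics import BigramAssocMeasures, spearman_correlation, ranks_from_scores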

    if scorer is None:
        scorer = BigramAssocMeasures.likelihood_ratio
    if compare_scorer is None:
        compare_scorer = BigramAssocMeasures.raw_freq

    from nltk.corpus import stopwords, webtext

    # A set makes the membership test in word_filter fast
    ignored_words = set(stopwords.words('english'))
    word_filter = lambda w: len(w) < 3 or w.lower() in ignored_words

    for file in webtext.fileids():
        words = [word.lower()
                 for word in webtext.words(file)]

        cf = BigramCollocationFinder.from_words(words)
        cf.apply_freq_filter(3)
        cf.apply_word_filter(word_filter)

        print(file)
        print('\t', [' '.join(tup) for tup in cf.nbest(scorer, 15)])
        print('\t Correlation to %s: %0.4f' % (compare_scorer.__name__,
                                               spearman_correlation(
                                                   ranks_from_scores(cf.score_ngrams(scorer)),
                                                   ranks_from_scores(cf.score_ngrams(compare_scorer)))))

# Slows down loading too much
# bigram_measures = BigramAssocMeasures()
# trigram_measures = TrigramAssocMeasures()

Developer ID: EastonLee, project: FancyWord, lines of code: 34, source file: collocations.py
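Both examples report a Spearman correlation between the rankings produced by two scorers. The sketch below is a simplified stand-in for that step, not NLTK's implementation. It assumes no tied scores, ranks items by descending score, and applies the textbook formula 1 - 6 * sum(d^2) / (n * (n^2 - 1)) to the keys present in both rankings. The helpers `ranks` and `spearman` and the score values are made up for illustration.

```python
def ranks(scored):
    """Map each key to its 0-based position when sorted by descending score."""
    ordered = sorted(scored, key=lambda kv: kv[1], reverse=True)
    return {key: rank for rank, (key, _) in enumerate(ordered)}


def spearman(ranks_a, ranks_b):
    """Spearman rank correlation over keys present in both rankings (no ties)."""
    shared = ranks_a.keys() & ranks_b.keys()
    n = len(shared)
    if n < 2:
        return 0.0  # correlation is undefined for fewer than two items
    d2 = sum((ranks_a[k] - ranks_b[k]) ** 2 for k in shared)
    return 1 - 6 * d2 / (n * (n * n - 1))


# Illustrative scores only: one association-style scorer, one frequency-style scorer
scores_a = [("new york", 9.1), ("ice cream", 7.4), ("of the", 0.2)]
scores_b = [("of the", 0.30), ("new york", 0.05), ("ice cream", 0.02)]

same = spearman(ranks(scores_a), ranks(scores_a))
mixed = spearman(ranks(scores_a), ranks(scores_b))
print(same, mixed)
```

A value near 1 means the two scorers order the bigrams almost identically. A value near 0 or below means they disagree about which bigrams matter most.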

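Note: The nltk.metrics.BigramAssocMeasures.raw_freq examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects, and copyright in the source code remains with the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.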