

Python Corpus.full_targets code example

This article shows a typical use of corpus.Corpus.full_targets in Python. Although the page title calls it a method, the example assigns full_targets as a plain attribute on Corpus instances: a list holding the target labels of each instance. For further context, see other usage examples of the enclosing class, corpus.Corpus.


One code example using Corpus.full_targets is shown below.

Example 1: process_corpus

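# Required import: from corpus import Corpus [as alias]
# full_targets is set as an attribute on Corpus instances, so it needs no import of its own.
# get_features and _get_repr are called below but are not shown in this excerpt.
import pickle

from numpy import int8
from scipy.sparse import csr_matrix

from corpus import Corpus
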
def process_corpus(tr_in_filename, te_in_filename, u_in_filename,
                   tr_out_filename, te_out_filename, u_out_filename):
    # Pickle files must be opened in binary mode.
    with open(tr_in_filename, 'rb') as input_f:
        tr_original_corpus = pickle.load(input_f)

    with open(te_in_filename, 'rb') as input_f:
        te_original_corpus = pickle.load(input_f)

    with open(u_in_filename, 'rb') as input_f:
        u_original_corpus = pickle.load(input_f)

    # Keep only documents whose target list has no empty label.
    tr_instances = [d['question'] for d in tr_original_corpus
                    if '' not in d['target']]
    te_instances = [d['question'] for d in te_original_corpus
                    if '' not in d['target']]
    u_instances = [d['question'] for d in u_original_corpus
                   if 'target' not in d or '' not in d['target']]

    vect = get_features()
    vect.fit(tr_instances + te_instances + u_instances)
    v_instances = vect.transform(tr_instances + te_instances + u_instances)
    v_instances = csr_matrix(v_instances > 0, dtype=int8)
    print(v_instances.shape)

    tr_corpus = Corpus()
    tr_corpus.instances = v_instances[:len(tr_instances)]
    tr_corpus.full_targets = [d['target'] for d in tr_original_corpus
                              if '' not in d['target']]
    tr_corpus.representations = [_get_repr(i[0]) for i in tr_instances]
    tr_corpus._features_vectorizer = vect
    tr_corpus.save_to_file(tr_out_filename)

    te_corpus = Corpus()
    # Test rows follow the training rows in the stacked matrix.
    te_corpus.instances = v_instances[
        len(tr_instances):len(tr_instances) + len(te_instances)]
    te_corpus.full_targets = [d['target'] for d in te_original_corpus
                              if '' not in d['target']]
    te_corpus.representations = [_get_repr(i[0]) for i in te_instances]
    te_corpus._features_vectorizer = vect
    te_corpus.save_to_file(te_out_filename)

    u_corpus = Corpus()
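    # Unlabeled rows come last, after the training and test rows.
    u_corpus.instances = v_instances[len(tr_instances) + len(te_instances):]
    # Apply the same filter used for u_instances so both lists stay aligned;
    # documents without a 'target' key get an empty target list.
    u_corpus.full_targets = [d['target'] if 'target' in d else []
                             for d in u_original_corpus
                             if 'target' not in d or '' not in d['target']]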
    u_corpus.representations = [_get_repr(i[0]) for i in u_instances]
    u_corpus._features_vectorizer = vect
    u_corpus.save_to_file(u_out_filename)
Developer: lucianosilvi, project: mit0110_tesis, lines of code: 52, source file: build_corpus.py
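
The example vectorizes the training, test and unlabeled questions as one stacked matrix, so each corpus needs its own consecutive block of rows from that matrix. The following is a minimal sketch of that bookkeeping using plain Python lists. The helper name `split_by_lengths` and the toy data are assumptions made for illustration and are not part of the original project.

```python
def split_by_lengths(rows, lengths):
    """Cut `rows` into consecutive chunks whose sizes are given by `lengths`."""
    chunks = []
    start = 0
    for n in lengths:
        # Each chunk starts where the previous one ended.
        chunks.append(rows[start:start + n])
        start += n
    return chunks


# Toy stand-ins for the training, test and unlabeled questions.
tr = ['a', 'b']
te = ['c']
u = ['d', 'e', 'f']

# Stack them in the same order the example uses, then split them back.
stacked = tr + te + u
tr_part, te_part, u_part = split_by_lengths(
    stacked, [len(tr), len(te), len(u)])
```

Because the helper only relies on slice syntax, the same offsets apply to the rows of a stacked matrix: the test block starts at `len(tr)` and the unlabeled block at `len(tr) + len(te)`.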


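Note: the corpus.Corpus.full_targets example in this article was compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippet was selected from an open-source project, and copyright in the source code belongs to its original author. For redistribution and use, refer to the License of the corresponding project. Do not reproduce without permission.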