

Python Feature.tfidf Method Code Example

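This article collects typical usage examples of the feature.Feature.tfidf method in Python. If you are wondering how Feature.tfidf is used in practice, or are looking for a working example of it, the selected code example here may help. You can also look further into usage examples of feature.Feature, the class this method belongs to.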


One code example of the Feature.tfidf method is shown below. Examples are sorted by popularity by default.

Example 1: showResults

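# Required import: from feature import Feature [as alias]
# Or: from feature.Feature import tfidf [as alias]
import math

import jieba
import nltk
from django.http import JsonResponse

# TagDict, QuestionAnswer, Is_rela, ret_em, write_file and the module-level
# QUERY / RET_ANS cache come from the HealthSearch project and are not shown here.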
def showResults(request):
    global QUERY
    global RET_ANS
    query = request.GET['query']
    query = query.encode('UTF-8')
    if query == QUERY:
        return JsonResponse(RET_ANS, safe=False)
    else:
        QUERY = query
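        # words = jieba.cut_for_search(query)  # search-engine mode segmentation
        ch_q = jieba.cut(query)  # accurate-mode segmentation
        kw_ch = list(ch_q)
        tag_obj = TagDict.objects.filter(tag_ch__in=kw_ch)  # tags matching the Chinese keywords
        cujiansuo = sum([tag.tag_class for tag in tag_obj], [])  # coarse retrieval: candidate ids
        kw_en = [tag.tag_en for tag in tag_obj]  # keywords (tag_en) of the matched tags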
        cujiansuo_res = sorted(set(cujiansuo), key=cujiansuo.index)
        qa_obj = QuestionAnswer.objects.filter(id__in=cujiansuo_res)
        print(len(qa_obj))
        kw_en_len = len(kw_en)
        count_en = [0]*kw_en_len
        res_list = []  # final list to return
        kw = kw_en
        for item in qa_obj:
            q = item.question.lower()
            a = item.answer.lower()
            for i in range(kw_en_len):
                k = kw_en[i]
                if k in q or k in a:
                    count_en[i] += 1
            if Is_rela(q, kw_en):
                item_t = [item.id, item.question, ret_em(kw, item.answer), item.answer]
                res_list.append(item_t)
        D = len(res_list)
        Idf = []
        if D != 0:
            # Smoothed IDF per keyword; abs() keeps the weight non-negative when t + 1 > D.
            Idf = [abs(math.log(D / float(t + 1))) for t in count_en]
        theta1 = 1.0
        theta2 = 1.0
        theta3 = 1.0
        mmax = 0.0
        an_b = None
        for item in res_list:
            ans_sen = nltk.sent_tokenize(item[3])
            en_a = sum([nltk.word_tokenize(t) for t in ans_sen],[])
            en_q =  nltk.word_tokenize(item[1])
            score_f = Feature(kw_ch, en_q, kw, en_a, Idf)
            # Weighted sum of the length, per-word and tf-idf features.
            score = (theta1 * score_f.length_feature()
                     + sum(x * theta2 for x in score_f.word_feature())
                     + sum(x * theta3 for x in score_f.tfidf()))
            if mmax < score:
                mmax = score
                an_b = item
        RET_ANS = res_list
        write_file(res_list, query)
        return JsonResponse(RET_ANS, safe=False)
Developer: GrittyChen, Project: HealthSearch, Lines of code: 55, Source: views.py
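
The Feature class itself is not part of the listing, so what tfidf() computes internally is not visible here. What the listing does show is how the Idf weights are built and how the three feature groups are folded into one score. The following is a minimal sketch of just those two steps. It assumes the features are plain numbers, and the function names smoothed_idf and combined_score are hypothetical helpers, not part of HealthSearch.

```python
import math


def smoothed_idf(doc_count, match_counts):
    """Return one IDF-style weight per keyword, using the listing's formula.

    doc_count    -- number of retained records (D in the listing)
    match_counts -- per-keyword count of records containing that keyword
    """
    if doc_count == 0:
        # The listing leaves Idf empty when nothing was retained.
        return []
    # abs() flips the sign whenever t + 1 exceeds doc_count and the log is negative.
    return [abs(math.log(doc_count / float(t + 1))) for t in match_counts]


def combined_score(length_feature, word_features, tfidf_features,
                   theta1=1.0, theta2=1.0, theta3=1.0):
    """Weighted sum of a scalar length feature and two lists of features."""
    return (theta1 * length_feature
            + sum(theta2 * x for x in word_features)
            + sum(theta3 * x for x in tfidf_features))


if __name__ == "__main__":
    weights = smoothed_idf(4, [1, 3])
    print(weights)
    print(combined_score(1.0, [0.5, 0.5], weights))
```

In the listing, D counts only the records that pass Is_rela, while count_en is accumulated over every candidate record, so t + 1 can exceed D. In that case the logarithm is negative and abs() turns it into a positive weight.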


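Note: The feature.Feature.tfidf method example in this article was compiled by 纯净天空 from open-source code and documentation platforms such as Github/MSDocs. The code snippet was selected from open-source projects contributed by their developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.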