Python Recommender.calc_neighbors Method Code Examples

This article collects typical usage examples of the recommender.Recommender.calc_neighbors method in Python. If you are wondering how Recommender.calc_neighbors is called in practice, the selected examples below may help.


Two code examples of the Recommender.calc_neighbors method are shown below, sorted by popularity by default.

Example 1: run

# Required import: from recommender import Recommender [as alias]
# Or: from recommender.Recommender import calc_neighbors [as alias]
import json  # used by json.loads / json.dump below
def run(source, target, num_topics = 100, passes = 20, lang = 'en', distance_measure = euclidean, percentage = 0.05):
	"""
	Main entry point for this package. Contains and executes the whole data pipeline. 

	Arguments:
	source -- The path string to the source file containing all reviews
	target -- The path string to the target directory where the neighbors for all users will be saved

	Keyword arguments:
	num_topics -- The number of topics LDA is supposed to discover (default 100)
	passes -- The number of iterations for the statistical inference algorithm (default 20)
	lang -- The language the reviews are filtered by (default 'en')
	distance_measure -- A python function that measures the distance between two vectors in a num_topics-dimensional vector space. 
				Must take two numpy arrays and return a float. (default euclidean)
	percentage -- The cutoff for being a close neighbor, i.e. two users are close if their distance is 
			among the closest `percentage` fraction of all distances (default 0.05, i.e. 5%)
	"""
	with open(source) as f:
		all_reviews = []
		for line in f:
			all_reviews.append(json.loads(line))

	reviews = filter_by_language(all_reviews, lang)

	rt = ReviewTokenizer(reviews)
	rt.tokenize()

	db = DictionaryBuilder(rt.tokenized_docs)
	db.build()

	dtmb = DTMBuilder(db.dictionary, db.srcTexts)
	dtmb.build()

	ldaw = LDAWrapper(dtmb.dtm, db.dictionary)
	ldaw.run(num_topics = num_topics, passes = passes)

	modelwrapper = LDAModelWrapper(ldaw.ldamodel, db.dictionary, sortByUsers(rt.tokenized_docs))
	posteriors = modelwrapper.get_all_posteriors()

	means = {}
	for key, value in posteriors.items():  # .items() works on Python 2 and 3
		means[key] = mean(value).tolist()

	x = Recommender(means)
	y = x.calc_distances(distance_measure)

	threshhold = fivePercent(y, percentage)

	for user in means:  # iterating a dict yields its keys on Python 2 and 3
		z = x.calc_neighbors(user, distance_measure, threshhold = threshhold)
		if len(target) > 0:
			fileName = target + '/' + user + '.json'
		else:
			fileName = user + '.json'
		with open(fileName, 'w') as g:
			json.dump(z, g) 
Developer: koschr, project: ldaforyelpchallenge, lines of code: 58, source: __init__.py
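The example above calls calc_distances and calc_neighbors but does not show what they do. The following is a minimal sketch of one plausible behavior, inferred only from the call sites and the docstring. SketchRecommender and cutoff are hypothetical names, not the project's actual Recommender or fivePercent; the returned data shapes and the use of np.quantile are assumptions.

```python
import itertools

import numpy as np


def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    return float(np.sqrt(np.sum((np.asarray(x) - np.asarray(y)) ** 2)))


class SketchRecommender:
    """Hypothetical stand-in; NOT the real recommender.Recommender."""

    def __init__(self, means):
        # means: dict mapping user id -> mean topic vector (list of floats)
        self.means = {u: np.asarray(v, dtype=float) for u, v in means.items()}

    def calc_distances(self, distance_measure):
        # One distance per unordered pair of distinct users.
        return [
            distance_measure(self.means[a], self.means[b])
            for a, b in itertools.combinations(self.means, 2)
        ]

    def calc_neighbors(self, user, distance_measure, threshhold):
        # The keyword is spelled "threshhold" to match the calls in the examples.
        origin = self.means[user]
        neighbors = {}
        for other, vec in self.means.items():
            if other == user:
                continue
            d = distance_measure(origin, vec)
            if d <= threshhold:
                neighbors[other] = d
        return neighbors


def cutoff(distances, percentage):
    # Assumed reading of the docstring: the distance below which roughly
    # `percentage` (a fraction, e.g. 0.05) of all pairwise distances fall.
    return float(np.quantile(distances, percentage))


means = {"a": [0.0, 0.0], "b": [0.0, 1.0], "c": [3.0, 4.0]}
rec = SketchRecommender(means)
dists = rec.calc_distances(euclidean)
near_a = rec.calc_neighbors("a", euclidean, threshhold=1.5)
```

With these three toy users, only "b" lies within distance 1.5 of "a". The returned mapping from neighbor id to distance is JSON-serializable, which fits the json.dump call in the example, although the real method may return a different structure.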

Example 2: euclidean

# Required import: from recommender import Recommender [as alias]
# Or: from recommender.Recommender import calc_neighbors [as alias]
import json
from ldamodelwrapper import LdaModelWrapper as LMW
from gensim import corpora
import os
import numpy as np
from recommender import Recommender

def euclidean(x, y):
	return np.sqrt(np.sum((x - y) ** 2))

userCurrPart = []
with open('parts/part5.json') as f:
	for line in f:
		dct = json.loads(line)
		key = next(iter(dct))  # first key; dct.keys()[0] fails on Python 3
		userCurrPart.append(key)

with open('means.json') as f:
	means = json.loads(f.read())

x = Recommender(means)
for user in userCurrPart:	
	y = x.calc_neighbors(user, euclidean, threshhold = 0.21)
	with open('close_neighbors/close_neighbors_neighbors_' + user + '.json', 'w') as f:
		json.dump(y, f)
	#neighbors[user] = y


Developer: julien-bergner, project: yelp-challenge-api, lines of code: 28, source: build_neighbors_parts.py
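Both examples pass the distance function to calc_neighbors as an argument, and the docstring in Example 1 requires only that it take two numpy arrays and return a float. Under that assumption, another measure could be swapped in. The sketch below uses a cosine distance as an illustration; cosine_distance is a hypothetical helper that appears in neither project.

```python
import numpy as np


def cosine_distance(x, y):
    """1 - cosine similarity: 0.0 for parallel vectors, 1.0 for orthogonal ones."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    norm = np.linalg.norm(x) * np.linalg.norm(y)
    if norm == 0.0:
        # Assumption: treat an all-zero vector as maximally distant.
        return 1.0
    return float(1.0 - np.dot(x, y) / norm)
```

A fixed threshold such as the 0.21 used above was presumably chosen for Euclidean distances, so it would need to be chosen again for any other measure.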


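Note: The recommender.Recommender.calc_neighbors examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as Github/MSDocs. The snippets are taken from open-source projects and copyright remains with the original authors; consult each project's License before distributing or using the code. Do not reproduce without permission.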