This page collects typical usage examples of the Python method document.Document.tokenize. If you are wondering how Document.tokenize is called in practice, the selected code example below may help. You can also look into usage examples of the containing class, document.Document.
One code example of the Document.tokenize method is shown below; examples are sorted by popularity by default.
Example 1: analyze
# Required import: from document import Document [as alias]
# Or: from document.Document import tokenize [as alias]
def analyze(self, filename=None, text=None):
    # Analyze a new document using the stored log values.
    # If a filename is given, create a new Document object and tokenize it.
    if filename is not None:
        doc = Document(None, filename)
        words = doc.tokenize()
    # Otherwise, tokenize the given text (Util comes from the surrounding module).
    elif text is not None:
        words = Util.tokenize(text)
    # If both are None, report the problem and return None.
    else:
        print("Analyzer requires a filename or text to analyze. Please try again.")
        return None

    # Sum the log values of all words for every heuristic.
    log_sums = {}
    for key in self.log_values:
        current_sum = 0.0
        for word in words:
            # Assumption: a word missing from the table contributes 0.0;
            # a bare .get(word) would return None and raise a TypeError.
            current_sum += self.log_values[key].get(word, 0.0)
        log_sums[key] = current_sum

    # Find the key with the largest log sum. This could be done inside the
    # loop above; a separate loop is kept for clarity.
    # Start from negative infinity, because sums of log values can fall
    # below -1.0 and would then never beat a -1.0 starting point.
    largest = float("-inf")
    largest_heuristic = ""
    for key in log_sums:
        if log_sums[key] > largest:
            largest = log_sums[key]
            largest_heuristic = key
    return largest_heuristic
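
The comment in the example notes that the largest sum could be tracked inside the scoring loop instead of in a second pass. A minimal sketch of that single-pass variant follows, assuming Python 3 and a plain dict of dicts. The function name best_heuristic, the unseen parameter, and the sample table are hypothetical and introduced only for illustration; they are not part of the original module.

```python
import math


def best_heuristic(log_values, words, unseen=0.0):
    """Return the key whose summed log values over `words` is largest.

    `log_values` maps each heuristic name to a dict of word -> log value.
    `unseen` is the assumed contribution of a word missing from a table.
    Returns None when `log_values` is empty.
    """
    best_key = None
    best_sum = -math.inf
    for key, table in log_values.items():
        # Score this heuristic and compare it in the same pass.
        total = sum(table.get(word, unseen) for word in words)
        if total > best_sum:
            best_key, best_sum = key, total
    return best_key


# Hypothetical two-heuristic table, for illustration only.
tables = {
    "spam": {"free": math.log(0.5), "hello": math.log(0.1)},
    "ham": {"free": math.log(0.1), "hello": math.log(0.5)},
}
print(best_heuristic(tables, ["free", "free", "hello"]))
```

Starting from negative infinity rather than a fixed number keeps the comparison valid however negative the sums become. Treating unseen words as contributing 0.0 is only one possible choice; a smoothed log value may suit a real classifier better.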