This page collects a typical usage example of Job.params["fit_model"] from disco.core in Python. Note that params is an attribute of the Job object and "fit_model" is a key stored in it, not a method. If you are wondering how Job.params["fit_model"] is used in practice, the example below may help. You can also look at usage examples for the enclosing class, disco.core.Job.
One code example that uses Job.params["fit_model"] is shown below.
Example 1: predict
# Required import: from disco.core import Job
def predict(dataset, fitmodel_url, m=1, save_results=True, show=False):
    """
    Start a job that makes predictions on the input data with a given model.

    Parameters
    ----------
    dataset - dataset object with input urls and other parameters
    fitmodel_url - model created in fit phase
    m - m estimate is used with discrete features
    save_results - save results to ddfs
    show - show info about job execution

    Returns
    -------
    Urls of predictions on ddfs
    """
    from disco.worker.pipeline.worker import Worker, Stage
    from disco.core import Job, result_iterator
    import numpy as np

    try:
        m = float(m)
    except (TypeError, ValueError):
        raise Exception("Parameter m should be numerical.")

    if "naivebayes_fitmodel" in fitmodel_url:
        # fit model is loaded from ddfs
        fit_model = dict((k, v) for k, v in result_iterator(fitmodel_url["naivebayes_fitmodel"]))
        if len(fit_model["y_labels"]) < 2:
            print("There is only one class in training data.")
            return []
    else:
        raise Exception("Incorrect fit model.")

    if dataset.params["X_meta"].count("d") > 0:  # the model contains discrete features
        # Precompute the logarithms once here, so that the mappers do not
        # have to recompute them during the predict phase.
        np.seterr(divide='ignore')
        for iv in fit_model["iv"]:
            dist = [fit_model.pop((y,) + iv, 0) for y in fit_model["y_labels"]]
            fit_model[iv] = np.nan_to_num(
                np.log(np.true_divide(np.array(dist) + m * fit_model["prior"], np.sum(dist) + m))
            ) - fit_model["prior_log"]
        del fit_model["iv"]

    # define a job and set save of results to ddfs
    job = Job(worker=Worker(save_results=save_results))

    # job parallelizes execution of mappers
    # (simple_init and map_predict are not defined in this snippet; they are
    # expected to come from the enclosing module)
    job.pipeline = [
        ("split", Stage("map", input_chain=dataset.params["input_chain"], init=simple_init, process=map_predict))]

    job.params = dataset.params  # job parameters (dataset object)
    job.params["fit_model"] = fit_model

    # define name of a job and input data urls
    job.run(name="naivebayes_predict", input=dataset.params["data_tag"])
    results = job.wait(show=show)
    return results
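
The discrete-feature step in predict is the least obvious part of the example, so here is a minimal standalone sketch of that computation in plain NumPy, with no Disco dependency. The helper name m_estimate_log_ratio and the sample counts and priors are assumptions made for illustration; they do not appear in the original module. The sketch also assumes that prior holds strictly positive class probabilities and that fit_model["prior_log"] in the original equals the logarithm of fit_model["prior"].

```python
import numpy as np


def m_estimate_log_ratio(counts, prior, m=1.0):
    """Hypothetical helper mirroring the loop body in predict().

    counts - per-class counts for one (feature index, value) pair
    prior  - per-class prior probabilities (assumed > 0)
    m      - weight of the m-estimate smoothing
    """
    counts = np.asarray(counts, dtype=float)
    prior = np.asarray(prior, dtype=float)
    # Silence log(0) and 0/0 warnings locally instead of calling np.seterr globally.
    with np.errstate(divide="ignore", invalid="ignore"):
        # m-estimate of P(class | feature value)
        smoothed = np.true_divide(counts + m * prior, counts.sum() + m)
        # nan_to_num replaces -inf and nan with finite numbers
        log_smoothed = np.nan_to_num(np.log(smoothed))
    # Subtracting the log prior gives log(P(class | value) / P(class))
    return log_smoothed - np.log(prior)


# Illustrative numbers: the value was seen 3 times with class 0 and once with class 1.
counts = [3, 1]
prior = [0.5, 0.5]
scores = m_estimate_log_ratio(counts, prior, m=1.0)
print(scores)
```

With these numbers the smoothed probabilities are (3 + 0.5) / 5 = 0.7 and (1 + 0.5) / 5 = 0.3, so the scores are log(0.7 / 0.5) and log(0.3 / 0.5). Doing this once before the job starts, as predict does, means each mapper only has to look up and sum the stored values.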