

Python TfidfVectorizer.predict Method Code Examples

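This article collects typical examples of predict being used in Python code that works with sklearn.feature_extraction.text.TfidfVectorizer. Note that TfidfVectorizer itself does not define a predict method: it only converts text into TF-IDF features, and in each example below predict is called on an estimator trained on a feature matrix. If you are wondering how predict is used alongside TfidfVectorizer in practice, the examples selected here may help. You can also look further into usage examples of sklearn.feature_extraction.text.TfidfVectorizer itself.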


Two code examples involving predict are shown below, sorted by popularity by default.
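As a minimal sketch of how the two pieces usually fit together: TfidfVectorizer turns text into features, and a separate estimator supplies predict. The toy corpus, the labels, and the choice of LogisticRegression are assumptions made for illustration; none of them come from the projects quoted below.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus and labels, invented for illustration (1 = positive, 0 = negative).
texts = [
    "great product, works well",
    "excellent quality and great value",
    "terrible product, broke quickly",
    "awful quality, very disappointing",
]
labels = [1, 1, 0, 0]

# The vectorizer only converts text to TF-IDF features; it cannot predict.
vectorizer = TfidfVectorizer(stop_words="english")
has_predict = hasattr(vectorizer, "predict")

# A pipeline's predict() vectorizes the input, then calls the classifier's predict().
model = make_pipeline(TfidfVectorizer(stop_words="english"), LogisticRegression())
model.fit(texts, labels)
predictions = model.predict(["great value", "terrible quality"])
```

Wrapping both steps in a pipeline also means new text is transformed with the vocabulary learned during fit, rather than being vectorized separately.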

Example 1: getHistoricalVolatility

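# Required import: from sklearn.feature_extraction.text import TfidfVectorizer
# Note: TfidfVectorizer has no predict method; predict() below belongs to LinearRegression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_score, train_test_split

# getHistoricalVolatility, getScrapedContent and combineHistVolColumn are
# helpers defined elsewhere in the source project (not shown here).
sp_df      = getHistoricalVolatility()
content_df = getScrapedContent()
X, y       = combineHistVolColumn(content_df, sp_df)

# vectorize text into TF-IDF features
vectorizer = TfidfVectorizer(stop_words='english')
X_tfidf    = vectorizer.fit_transform(X)

# hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X_tfidf, y, test_size=0.2, random_state=42)

# fit a linear regression model (assumes y is a continuous volatility value)
clf = LinearRegression()
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)

# 1 estimator score method (R^2 for a regressor)
print("Estimator score method: ", clf.score(X_test, y_test))
# 2 scoring parameter ('accuracy' is only defined for classifiers, so use 'r2')
scores = cross_val_score(clf, X_train, y_train, cv=5, scoring='r2')

print("Scoring parameter 'r2' from cross val: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() / 2))

# 3 scoring via metric functions (a confusion matrix needs class labels, so report MSE)
print("Mean squared error: ", mean_squared_error(y_test, y_pred))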


    
Developer: OspreyX, project: VolatilityPrediction, lines of code: 30, source file: sentimentmodel.py
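Because Example 1 fits a LinearRegression, regression metrics such as mean squared error and R² are the natural way to score its predictions. The sketch below shows that flow end to end on a self-contained toy dataset. The documents and target values are invented for illustration and stand in for the project's scraped content and volatility column, which are not shown in the source.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Toy documents and continuous targets, invented purely for illustration.
docs = [
    "markets calm ahead of earnings",
    "stocks plunge as panic spreads",
    "quiet trading session, markets calm",
    "panic selling hits stocks again",
    "earnings beat expectations, calm trading",
    "stocks plunge after surprise rate hike",
    "calm session with light trading",
    "panic spreads after rate hike",
]
targets = [0.10, 0.45, 0.08, 0.50, 0.12, 0.48, 0.09, 0.47]

# Text -> sparse TF-IDF matrix, then hold out 25% of the rows for testing.
features = TfidfVectorizer(stop_words="english").fit_transform(docs)
X_train, X_test, y_train, y_test = train_test_split(
    features, targets, test_size=0.25, random_state=42
)

# predict() comes from the regressor, not from the vectorizer.
reg = LinearRegression()
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)

# Regression metrics computed on the held-out rows.
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
```

With so few samples the scores themselves mean little; the point is only which metric functions accept continuous predictions.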

Example 2: _alpha_grid

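# Required import: from sklearn.feature_extraction.text import TfidfVectorizer
# Note: TfidfVectorizer has no predict method; predict() below belongs to ElasticNet.
from time import time

import numpy as np
from sklearn.linear_model import ElasticNet
# _alpha_grid is a private scikit-learn helper; its module path and signature may vary between releases
from sklearn.linear_model._coordinate_descent import _alpha_grid
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# X, y are the feature matrix and labels prepared earlier in the source script (not shown here)
alphas = _alpha_grid(X, y, n_alphas=20)
for alpha in alphas:

    # accumulators for the random (r_) and cyclic (c_) coordinate selection runs
    r_time4, r_iter4, r_score4, r_time8, r_iter8, r_score8 = 0, 0, 0, 0, 0, 0
    c_time4, c_iter4, c_score4, c_time8, c_iter8, c_score8 = 0, 0, 0, 0, 0, 0

    for n_iter in [0, 1, 2]:
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=n_iter)

        # default (cyclic) coordinate selection
        clf = ElasticNet(max_iter=500000, alpha=alpha, tol=1e-4)
        print("......" + str(alpha))
        t = time()
        clf.fit(X_train, y_train)
        c_time4 += time() - t
        # turn the regression output into -1/+1 class labels
        y_pred = np.sign(clf.predict(X_test))
        c_iter4 += clf.n_iter_
        c_score4 += accuracy_score(y_test, y_pred)
        print(c_iter4)
        print(c_time4)
        print(c_score4)

        # random coordinate selection
        clf = ElasticNet(max_iter=500000, alpha=alpha, tol=1e-4, random_state=0, selection='random')
        print("......" + str(alpha))
        t = time()
        clf.fit(X_train, y_train)
        r_time4 += time() - t
        y_pred = np.sign(clf.predict(X_test))
        r_iter4 += clf.n_iter_
        r_score4 += accuracy_score(y_test, y_pred)
        print(r_iter4)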
Developer: MechCoder, project: Sklearn_benchmarks, lines of code: 32, source file: newsgroup_random_vs_cyclic.py
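Example 2 uses a regressor as a binary classifier by taking the sign of its predictions, and compares cyclic against random coordinate selection. A self-contained sketch of that idea follows. The synthetic data, the alpha value, and the results dictionary are assumptions made for illustration; the original benchmark ran on features that the excerpt does not show.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative only): labels are the sign of a random linear model.
rng = np.random.RandomState(0)
X = rng.randn(200, 10)
true_coef = rng.randn(10)
y = np.sign(X @ true_coef)  # labels in {-1, +1}

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0
)

results = {}
for selection in ("cyclic", "random"):
    clf = ElasticNet(alpha=0.01, tol=1e-4, max_iter=10000,
                     selection=selection, random_state=0)
    clf.fit(X_train, y_train)
    # Threshold the continuous regression output at zero to get class labels.
    y_pred = np.sign(clf.predict(X_test))
    # Store the iteration count and the accuracy for each selection strategy.
    results[selection] = (clf.n_iter_, accuracy_score(y_test, y_pred))
```

Both strategies optimize the same objective, so any difference should show up mainly in the iteration count and fit time rather than in accuracy.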


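Note: The sklearn.feature_extraction.text.TfidfVectorizer.predict examples in this article were compiled by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects, and copyright in the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.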