

Python Normalizer.toarray Method Code Examples

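This article collects typical examples of toarray used together with sklearn.preprocessing.Normalizer in Python. Note that Normalizer itself does not define a toarray method: toarray belongs to the SciPy sparse matrix that Normalizer.transform returns when it is given sparse input, and it converts that matrix into a dense NumPy array. If you are looking for how toarray is used in a Normalizer-based pipeline, the examples below may help. You can also look further into usage examples of sklearn.preprocessing.Normalizer, the class involved.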
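As a minimal sketch of that relationship, the snippet below normalizes a small sparse matrix and then converts the result to a dense array. The input values are made up for illustration, and fit_transform is used here in place of the bare transform call seen in the examples.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import Normalizer

# A small sparse matrix standing in for vectorizer output (illustrative data).
X_sparse = sparse.csr_matrix(np.array([[3.0, 4.0, 0.0],
                                       [0.0, 0.0, 2.0]]))

# Normalizer rescales each row to unit L2 norm; sparse input stays sparse.
X_norm = Normalizer(norm="l2").fit_transform(X_sparse)

# toarray() is a method of the SciPy sparse result, not of Normalizer.
X_dense = X_norm.toarray()

print(sparse.issparse(X_norm))
print(type(X_dense).__name__)
print(X_dense)
```

Calling toarray on a large, high-dimensional matrix can use a lot of memory, so it is usually reserved for steps that cannot work with sparse input.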


Four code examples involving Normalizer and toarray are shown below, sorted by popularity by default.

Example 1: time

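from sklearn.preprocessing import Normalizer  # required import
# toarray is a method of the SciPy sparse matrix returned by transform;
# it cannot be imported from Normalizer.
from sklearn.feature_selection import SelectKBest, chi2
from time import time

# Vectorizer and data_set come from earlier in the original script, outside this excerpt.
print()

# Extract features
print("Extracting features from the training dataset using a sparse vectorizer")
t0 = time()
vectorizer = Vectorizer(max_features=10000)
X = vectorizer.fit_transform(data_set.data)
X = Normalizer(norm="l2", copy=False).transform(X)

y = data_set.target

# Feature selection: keep the 1800 features with the highest chi-squared scores
ch2 = SelectKBest(chi2, k=1800)
X = ch2.fit_transform(X, y)

# Convert the SciPy sparse matrix to a dense NumPy array
X = X.toarray()

n_samples, n_features = X.shape
print("done in %fs" % (time() - t0))
print("n_samples: %d, n_features: %d" % (n_samples, n_features))
print()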


###############################################################################
# Test a classifier using K-fold Cross Validation

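# Set up 10-fold cross-validation.
# The original called the legacy signature KFold(n_samples, k=num_fold, indices=True);
# current scikit-learn takes n_splits and yields index arrays from kf.split(X).
from sklearn.model_selection import KFold
num_fold = 10
kf = KFold(n_splits=num_fold)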

# Note: NBs are not working
Developer: YuanhaoSun, Project: PPLearn, Lines of code: 33, Source: 06_k_fold_cross_validation.py
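The excerpt above depends on names defined elsewhere in the project. A self-contained sketch of the same flow (vectorize, normalize, select features with chi-squared, densify, then set up K-fold splits) might look like the following. The corpus and labels are hypothetical, TfidfVectorizer is assumed as a stand-in for the older Vectorizer class, and k and n_splits are scaled down to suit the toy data.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import KFold
from sklearn.preprocessing import Normalizer

# Tiny illustrative corpus and labels (hypothetical stand-ins for data_set).
docs = [
    "cookies track user browsing data",
    "we share data with third parties",
    "the weather is sunny today",
    "rain is expected this weekend",
    "users may opt out of data sharing",
    "storm clouds and heavy rain tonight",
]
y = np.array([1, 1, 0, 0, 1, 0])

# Sparse TF-IDF features, then row-wise L2 normalization (still sparse).
vectorizer = TfidfVectorizer(max_features=10000)
X = vectorizer.fit_transform(docs)
X = Normalizer(norm="l2").fit_transform(X)

# Keep the 5 features with the highest chi-squared scores.
X = SelectKBest(chi2, k=5).fit_transform(X, y)

# Densify only after the feature count has been reduced.
X = X.toarray()
n_samples, n_features = X.shape

# Modern K-fold API: split() yields (train_indices, test_indices) pairs.
kf = KFold(n_splits=3, shuffle=True, random_state=0)
fold_sizes = [len(test_idx) for _, test_idx in kf.split(X)]
print(n_samples, n_features, fold_sizes)
```

Running feature selection before toarray keeps the dense array small, which matches the order used in the example.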

Example 2: print

from sklearn.preprocessing import Normalizer  # required import
# toarray is a method of the SciPy sparse matrix returned by transform;
# it cannot be imported from Normalizer.
print("V-measure: %0.3f" % metrics.v_measure_score(labels, km.labels_))

# Adjusted Rand index: measures the similarity of the two assignments, ignoring permutations
print("Adjusted Rand-Index: %.3f" % metrics.adjusted_rand_score(labels, km.labels_))

# Silhouette coefficient: a higher score indicates better-defined clusters
print("Silhouette Coefficient: %0.3f" % metrics.silhouette_score(X, labels, sample_size=1000))

###############################################################################
# Visualize the results on PCA-reduced data

if opts.print_visualization:
    np.random.seed(42)
    sample_size = 300

    data = X.toarray()
    n_digits = source_num
    n_samples, n_features = data.shape

    reduced_data = PCA(n_components=2).fit_transform(data)
    kmeans = KMeans(init='k-means++', n_clusters=n_digits, n_init=10)
    kmeans.fit(reduced_data)

    # Step size of the mesh. Decrease to increase the quality of the VQ.
    h = .02

    # Plot the decision boundary. For that, we will assign a color to each
    # point in the mesh [x_min, x_max] x [y_min, y_max].
    x_min, x_max = reduced_data[:, 0].min(), reduced_data[:, 0].max()
    y_min, y_max = reduced_data[:, 1].min(), reduced_data[:, 1].max()
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    # Obtain labels for each point in mesh. Use last trained model.
Developer: 2dpodcast, Project: CS109-1, Lines of code: 33, Source: document_clustering.py
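The excerpt stops just before the mesh labels are computed. The sketch below shows one way that step could be completed, using a randomly generated sparse matrix as a hypothetical stand-in for the document features and three clusters in place of source_num. Plotting is left out so the snippet stays runnable without a display.

```python
import numpy as np
from scipy import sparse
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.RandomState(42)

# Hypothetical sparse feature matrix: 60 samples, 20 features.
X = sparse.random(60, 20, density=0.3, format="csr", random_state=rng)

# Dense copy for PCA, as in the example above.
data = X.toarray()
reduced_data = PCA(n_components=2).fit_transform(data)

kmeans = KMeans(init="k-means++", n_clusters=3, n_init=10, random_state=42)
kmeans.fit(reduced_data)

# Build a mesh covering the reduced data with step size h.
h = 0.02
x_min, x_max = reduced_data[:, 0].min(), reduced_data[:, 0].max()
y_min, y_max = reduced_data[:, 1].min(), reduced_data[:, 1].max()
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

# Assign every mesh point to its nearest cluster centre.
Z = kmeans.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
print(reduced_data.shape, Z.shape)
```

The array Z can then be passed to an image or contour plotting routine to colour each region by cluster.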

Example 3: Normalizer

from sklearn.preprocessing import Normalizer  # required import
# toarray is a method of the SciPy sparse matrix returned by transform;
# it cannot be imported from Normalizer.
X = vectorizer.fit_transform(data_set.data)
X = Normalizer(norm="l2", copy=False).transform(X)

y = data_set.target

# # Feature selection
# select_chi2 = 1900
# print("Extracting %d best features by a chi-squared test" % select_chi2)
# t0 = time()
# ch2 = SelectKBest(chi2, k=select_chi2)
# X = ch2.fit_transform(X, y)
# print("Done in %fs" % (time() - t0))
# print("L1:      n_samples: %d, n_features: %d" % X.shape)
# print()

# Dense copy of the sparse feature matrix
X_den = X.toarray()

n_samples, n_features = X.shape
print("done in %fs" % (time() - t0))
print("n_samples: %d, n_features: %d" % (n_samples, n_features))
print()


###############################################################################
# Setup part
# 
# Notation:
# N: number of training examples; K: number of models in level 0
# X: feature matrix; y: result array; z_k: prediction result array for the k-th model
# 
Developer: YuanhaoSun, Project: PPLearn, Lines of code: 32, Source: 20_ensemble_stacking_prob.py
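The notation in the setup comments (N examples, K level-0 models, one prediction array z_k per model) suggests a stacking ensemble, but the excerpt ends before any of it is implemented. The following is only a sketch of how such a scheme is commonly built. The data is synthetic, and the three level-0 models and the logistic-regression level-1 model are assumptions, not taken from the original project.

```python
import numpy as np
from scipy import sparse
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data wrapped as a sparse matrix to mimic vectorizer output.
X_raw, y = make_classification(n_samples=120, n_features=10, random_state=0)
X = sparse.csr_matrix(X_raw)

# Dense copy, since GaussianNB does not accept sparse input.
X_den = X.toarray()

# K = 3 level-0 models (hypothetical choices).
level0 = [
    LogisticRegression(max_iter=1000),
    GaussianNB(),
    KNeighborsClassifier(n_neighbors=5),
]

# z_k: out-of-fold probability of class 1 from the k-th model.
Z = np.column_stack([
    cross_val_predict(model, X_den, y, cv=5, method="predict_proba")[:, 1]
    for model in level0
])

# Level-1 model trained on the N x K matrix of level-0 predictions.
meta = LogisticRegression(max_iter=1000).fit(Z, y)
print(Z.shape, meta.score(Z, y))
```

Using out-of-fold predictions for Z keeps the level-1 model from training on predictions that the level-0 models made for examples they had already seen.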

Example 4: len

from sklearn.preprocessing import Normalizer  # required import
# toarray is a method of the SciPy sparse matrix returned by transform;
# it cannot be imported from Normalizer.
from time import time

print(len(data_train.data))
print(len(data_test.data))
print()

# Extract features
print("Extracting features from the training dataset using a sparse vectorizer")
t0 = time()

# NOTE: as in the original source, the vectorizer is fitted on the test documents
# and then applied to the training documents; fitting on the training set is the
# more common practice.
vectorizer = Vectorizer(max_features=10000)
X_test = vectorizer.fit_transform(data_test.data)
X_test = Normalizer(norm="l2", copy=False).transform(X_test)

X = vectorizer.transform(data_train.data)
X = Normalizer(norm="l2", copy=False).transform(X)

# Convert both sparse matrices to dense NumPy arrays
X = X.toarray()
X_test = X_test.toarray()

n_samples, n_features = X.shape
test_samples, test_features = X_test.shape
print("done in %fs" % (time() - t0))
print("Train set - n_samples: %d, n_features: %d" % (n_samples, n_features))
print("Test set  - n_samples: %d, n_features: %d" % (test_samples, test_features))
print()


# Fit the model
# When nu=0.01, gamma=0.0034607 is the smallest value that generates a >0 result
clf = OneClassSVM(nu=0.01, kernel="rbf", gamma=0.05)
clf.fit(X)
# Predict on X_test
Developer: YuanhaoSun, Project: PPLearn, Lines of code: 33, Source: detect_oneclass_svm.py
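A self-contained sketch of the same one-class workflow is given below, including the prediction step that the excerpt cuts off. The documents are hypothetical, TfidfVectorizer is assumed in place of the older Vectorizer class, and, unlike the excerpt, the vectorizer is fitted on the training documents.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import Normalizer
from sklearn.svm import OneClassSVM

# Hypothetical training documents, all from the "normal" class.
train_docs = [
    "we collect personal data from users",
    "users may request deletion of personal data",
    "personal data is shared with partners",
    "we store user data on secure servers",
]
# Hypothetical test documents: one similar, one unrelated.
test_docs = [
    "user data is collected and stored",
    "the recipe needs flour sugar and butter",
]

# Fit the vocabulary on the training set, then reuse it for the test set.
vectorizer = TfidfVectorizer(max_features=10000)
X = vectorizer.fit_transform(train_docs)
X_test = vectorizer.transform(test_docs)

# L2-normalize the rows, then convert the sparse results to dense arrays.
normalizer = Normalizer(norm="l2")
X = normalizer.fit_transform(X).toarray()
X_test = normalizer.transform(X_test).toarray()

# Fit on the training data only, with the same hyperparameters as the example.
clf = OneClassSVM(nu=0.01, kernel="rbf", gamma=0.05)
clf.fit(X)

# predict returns +1 for inliers and -1 for outliers.
pred = clf.predict(X_test)
print(X.shape, X_test.shape, pred)
```

OneClassSVM also accepts sparse input, so the toarray calls here mirror the example rather than being strictly required.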


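Note: The sklearn.preprocessing.Normalizer.toarray examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects, and copyright in the source code remains with the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.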