This article briefly introduces the usage of pyspark.ml.classification.LinearSVC.

Usage:
class pyspark.ml.classification.LinearSVC(*, featuresCol='features', labelCol='label', predictionCol='prediction', maxIter=100, regParam=0.0, tol=1e-06, rawPredictionCol='rawPrediction', fitIntercept=True, standardization=True, threshold=0.0, weightCol=None, aggregationDepth=2, maxBlockSizeInMB=0.0)
This binary classifier optimizes the hinge loss using the OWLQN optimizer. Currently only L2 regularization is supported.

New in version 2.2.0.
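The hinge loss being optimized is simple to state. Below is a minimal pure-Python sketch (an illustration only, not Spark code) of the per-example quantity that OWLQN minimizes, assuming labels are mapped to {-1, +1} and `margin` denotes the linear model's raw score w·x + b:

```python
# Hinge loss for a single example: labels are mapped to {-1, +1}.
# margin = w . x + b is the raw score of the linear model.
def hinge_loss(label_pm1, margin):
    """max(0, 1 - y * margin): zero once the example is classified
    correctly with a margin of at least 1."""
    return max(0.0, 1.0 - label_pm1 * margin)

# Correct with a comfortable margin -> no loss.
print(hinge_loss(+1.0, 2.0))   # 0.0
# Correct but inside the margin -> small loss.
print(hinge_loss(+1.0, 0.5))   # 0.5
# Misclassified -> loss grows linearly with the violation.
print(hinge_loss(-1.0, 0.5))   # 1.5
```

LinearSVC adds an L2 penalty `regParam * ||w||^2 / 2` to the average of this loss over the training data.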
Examples:
>>> from pyspark.sql import Row
>>> from pyspark.ml.linalg import Vectors
>>> df = sc.parallelize([
...     Row(label=1.0, features=Vectors.dense(1.0, 1.0, 1.0)),
...     Row(label=0.0, features=Vectors.dense(1.0, 2.0, 3.0))]).toDF()
>>> svm = LinearSVC()
>>> svm.getMaxIter()
100
>>> svm.setMaxIter(5)
LinearSVC...
>>> svm.getMaxIter()
5
>>> svm.getRegParam()
0.0
>>> svm.setRegParam(0.01)
LinearSVC...
>>> svm.getRegParam()
0.01
>>> model = svm.fit(df)
>>> model.setPredictionCol("newPrediction")
LinearSVCModel...
>>> model.getPredictionCol()
'newPrediction'
>>> model.setThreshold(0.5)
LinearSVCModel...
>>> model.getThreshold()
0.5
>>> model.getMaxBlockSizeInMB()
0.0
>>> model.coefficients
DenseVector([0.0, -1.0319, -0.5159])
>>> model.intercept
2.579645978780695
>>> model.numClasses
2
>>> model.numFeatures
3
>>> test0 = sc.parallelize([Row(features=Vectors.dense(-1.0, -1.0, -1.0))]).toDF()
>>> model.predict(test0.head().features)
1.0
>>> model.predictRaw(test0.head().features)
DenseVector([-4.1274, 4.1274])
>>> result = model.transform(test0).head()
>>> result.newPrediction
1.0
>>> result.rawPrediction
DenseVector([-4.1274, 4.1274])
>>> svm_path = temp_path + "/svm"
>>> svm.save(svm_path)
>>> svm2 = LinearSVC.load(svm_path)
>>> svm2.getMaxIter()
5
>>> model_path = temp_path + "/svm_model"
>>> model.save(model_path)
>>> model2 = LinearSVCModel.load(model_path)
>>> model.coefficients[0] == model2.coefficients[0]
True
>>> model.intercept == model2.intercept
True
>>> model.transform(test0).take(1) == model2.transform(test0).take(1)
True
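As a cross-check, the `rawPrediction` and prediction values in the doctest can be reproduced by hand from the fitted coefficients and intercept. This sketch uses the rounded values printed in the example, so the small rounding tolerance is an assumption:

```python
# Coefficients and intercept as printed in the example (rounded).
coefficients = [0.0, -1.0319, -0.5159]
intercept = 2.579645978780695
features = [-1.0, -1.0, -1.0]

# The raw score is the margin w . x + b of the linear model.
margin = sum(c * x for c, x in zip(coefficients, features)) + intercept
print(round(margin, 4))   # 4.1274, matching rawPrediction[1]

# LinearSVC predicts class 1.0 when the margin exceeds the threshold
# (set to 0.5 via model.setThreshold(0.5) in the example).
threshold = 0.5
prediction = 1.0 if margin > threshold else 0.0
print(prediction)         # 1.0
```

This also explains the shape of `rawPrediction`: it is `[-margin, margin]`, so the second component is the score compared against `threshold`.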
Related usage
- Python pyspark LinearRegressionModel usage and code examples
- Python pyspark LinearRegression usage and code examples
- Python pyspark LDA.setLearningDecay usage and code examples
- Python pyspark LogisticRegressionWithLBFGS.train usage and code examples
- Python pyspark LDA.setDocConcentration usage and code examples
- Python pyspark LDA usage and code examples
- Python pyspark LDAModel usage and code examples
- Python pyspark LDA.setOptimizer usage and code examples
- Python pyspark LDA.setK usage and code examples
- Python pyspark LDA.setLearningOffset usage and code examples
- Python pyspark LDA.setTopicDistributionCol usage and code examples
- Python pyspark LassoModel usage and code examples
- Python pyspark LogisticRegressionModel usage and code examples
- Python pyspark LogisticRegression usage and code examples
- Python pyspark LDA.setKeepLastCheckpoint usage and code examples
- Python pyspark LDA.setSubsamplingRate usage and code examples
- Python pyspark LDA.setTopicConcentration usage and code examples
- Python pyspark LDA.setOptimizeDocConcentration usage and code examples
- Python pyspark create_map usage and code examples
- Python pyspark date_add usage and code examples
- Python pyspark DataFrame.to_latex usage and code examples
- Python pyspark DataStreamReader.schema usage and code examples
- Python pyspark MultiIndex.size usage and code examples
- Python pyspark arrays_overlap usage and code examples
- Python pyspark Series.asof usage and code examples
Note: this article was selected and compiled by 纯净天空 from the original English work pyspark.ml.classification.LinearSVC on spark.apache.org. Unless otherwise stated, the copyright of the original code belongs to its original authors; this translation may not be reproduced or copied without permission or authorization.