本文简要介绍
pyspark.ml.evaluation.RegressionEvaluator
的用法。用法:
class pyspark.ml.evaluation.RegressionEvaluator(*, predictionCol='prediction', labelCol='label', metricName='rmse', weightCol=None, throughOrigin=False)
回归评估器,它需要输入列预测、标签和可选的权重列。
1.4.0 版中的新函数。
例子:
>>> scoreAndLabels = [(-28.98343821, -27.0), (20.21491975, 21.5), ... (-25.98418959, -22.0), (30.69731842, 33.0), (74.69283752, 71.0)] >>> dataset = spark.createDataFrame(scoreAndLabels, ["raw", "label"]) ... >>> evaluator = RegressionEvaluator() >>> evaluator.setPredictionCol("raw") RegressionEvaluator... >>> evaluator.evaluate(dataset) 2.842... >>> evaluator.evaluate(dataset, {evaluator.metricName: "r2"}) 0.993... >>> evaluator.evaluate(dataset, {evaluator.metricName: "mae"}) 2.649... >>> re_path = temp_path + "/re" >>> evaluator.save(re_path) >>> evaluator2 = RegressionEvaluator.load(re_path) >>> str(evaluator2.getPredictionCol()) 'raw' >>> scoreAndLabelsAndWeight = [(-28.98343821, -27.0, 1.0), (20.21491975, 21.5, 0.8), ... (-25.98418959, -22.0, 1.0), (30.69731842, 33.0, 0.6), (74.69283752, 71.0, 0.2)] >>> dataset = spark.createDataFrame(scoreAndLabelsAndWeight, ["raw", "label", "weight"]) ... >>> evaluator = RegressionEvaluator(predictionCol="raw", weightCol="weight") >>> evaluator.evaluate(dataset) 2.740... >>> evaluator.getThroughOrigin() False
相关用法
- Python pyspark RegressionMetrics用法及代码示例
- Python pyspark RegexTokenizer用法及代码示例
- Python pyspark RDD.saveAsTextFile用法及代码示例
- Python pyspark RDD.keyBy用法及代码示例
- Python pyspark RDD.sumApprox用法及代码示例
- Python pyspark RowMatrix.numCols用法及代码示例
- Python pyspark RowMatrix.computePrincipalComponents用法及代码示例
- Python pyspark RDD.lookup用法及代码示例
- Python pyspark RDD.zipWithIndex用法及代码示例
- Python pyspark RDD.sampleByKey用法及代码示例
- Python pyspark Rolling.mean用法及代码示例
- Python pyspark Rolling.max用法及代码示例
- Python pyspark RDD.coalesce用法及代码示例
- Python pyspark RDD.subtract用法及代码示例
- Python pyspark RDD.count用法及代码示例
- Python pyspark RankingEvaluator用法及代码示例
- Python pyspark RandomRDDs.uniformRDD用法及代码示例
- Python pyspark RDD.groupWith用法及代码示例
- Python pyspark RDD.distinct用法及代码示例
- Python pyspark RDD.treeAggregate用法及代码示例
- Python pyspark RowMatrix.computeSVD用法及代码示例
- Python pyspark RowMatrix.multiply用法及代码示例
- Python pyspark RandomForest.trainRegressor用法及代码示例
- Python pyspark RandomRDDs.exponentialRDD用法及代码示例
- Python pyspark RDD.mapPartitionsWithIndex用法及代码示例
注:本文由纯净天空筛选整理自spark.apache.org大神的英文原创作品 pyspark.ml.evaluation.RegressionEvaluator。非经特殊声明,原始代码版权归原作者所有,本译文未经允许或授权,请勿转载或复制。