This article briefly introduces the usage of
pyspark.ml.evaluation.MulticlassClassificationEvaluator.

Usage:
class pyspark.ml.evaluation.MulticlassClassificationEvaluator(*, predictionCol='prediction', labelCol='label', metricName='f1', weightCol=None, metricLabel=0.0, beta=1.0, probabilityCol='probability', eps=1e-15)
Evaluator for multiclass classification, which expects the input columns: prediction, label, weight (optional), and probabilityCol (only used for logLoss).
New in version 1.5.0.
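In practice, the prediction and probability columns the evaluator reads are usually produced by a fitted classifier's transform() output. The following is a minimal end-to-end sketch; the toy feature vectors, the choice of LogisticRegression, and the assumption of an active SparkSession named spark are illustrative additions, not part of the original documentation.

from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import MulticlassClassificationEvaluator
from pyspark.ml.linalg import Vectors

# Tiny illustrative training set with three classes (assumed data, not from the original doc).
train = spark.createDataFrame(
    [(Vectors.dense([0.0, 1.0]), 0.0),
     (Vectors.dense([1.0, 0.0]), 1.0),
     (Vectors.dense([1.0, 1.0]), 2.0)],
    ["features", "label"])

model = LogisticRegression(maxIter=10).fit(train)
# transform() adds the "prediction" and "probability" columns the evaluator expects.
predictions = model.transform(train)

evaluator = MulticlassClassificationEvaluator(metricName="accuracy")
print(evaluator.evaluate(predictions))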
Examples:
>>> scoreAndLabels = [(0.0, 0.0), (0.0, 1.0), (0.0, 0.0),
...     (1.0, 0.0), (1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (2.0, 2.0), (2.0, 0.0)]
>>> dataset = spark.createDataFrame(scoreAndLabels, ["prediction", "label"])
>>> evaluator = MulticlassClassificationEvaluator()
>>> evaluator.setPredictionCol("prediction")
MulticlassClassificationEvaluator...
>>> evaluator.evaluate(dataset)
0.66...
>>> evaluator.evaluate(dataset, {evaluator.metricName: "accuracy"})
0.66...
>>> evaluator.evaluate(dataset, {evaluator.metricName: "truePositiveRateByLabel",
...     evaluator.metricLabel: 1.0})
0.75...
>>> evaluator.setMetricName("hammingLoss")
MulticlassClassificationEvaluator...
>>> evaluator.evaluate(dataset)
0.33...
>>> mce_path = temp_path + "/mce"
>>> evaluator.save(mce_path)
>>> evaluator2 = MulticlassClassificationEvaluator.load(mce_path)
>>> str(evaluator2.getPredictionCol())
'prediction'
>>> scoreAndLabelsAndWeight = [(0.0, 0.0, 1.0), (0.0, 1.0, 1.0), (0.0, 0.0, 1.0),
...     (1.0, 0.0, 1.0), (1.0, 1.0, 1.0), (1.0, 1.0, 1.0), (1.0, 1.0, 1.0),
...     (2.0, 2.0, 1.0), (2.0, 0.0, 1.0)]
>>> dataset = spark.createDataFrame(scoreAndLabelsAndWeight, ["prediction", "label", "weight"])
>>> evaluator = MulticlassClassificationEvaluator(predictionCol="prediction",
...     weightCol="weight")
>>> evaluator.evaluate(dataset)
0.66...
>>> evaluator.evaluate(dataset, {evaluator.metricName: "accuracy"})
0.66...
>>> predictionAndLabelsWithProbabilities = [
...     (1.0, 1.0, 1.0, [0.1, 0.8, 0.1]), (0.0, 2.0, 1.0, [0.9, 0.05, 0.05]),
...     (0.0, 0.0, 1.0, [0.8, 0.2, 0.0]), (1.0, 1.0, 1.0, [0.3, 0.65, 0.05])]
>>> dataset = spark.createDataFrame(predictionAndLabelsWithProbabilities, ["prediction",
...     "label", "weight", "probability"])
>>> evaluator = MulticlassClassificationEvaluator(predictionCol="prediction",
...     probabilityCol="probability")
>>> evaluator.setMetricName("logLoss")
MulticlassClassificationEvaluator...
>>> evaluator.evaluate(dataset)
0.9682...
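The rounded values in the doctest above can be checked by hand. Below is a small sanity-check sketch in plain Python (no Spark required); it mirrors the unweighted definitions of accuracy, hamming loss, per-label true positive rate, and log loss for these particular toy rows, and is only an illustration, not Spark's actual implementation (Spark additionally clips probabilities with the eps parameter before taking logarithms).

import math

# Same (prediction, label) pairs as in the first example dataset above.
pairs = [(0.0, 0.0), (0.0, 1.0), (0.0, 0.0),
         (1.0, 0.0), (1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (2.0, 2.0), (2.0, 0.0)]

correct = sum(1 for p, l in pairs if p == l)
accuracy = correct / len(pairs)       # 6 of 9 rows are correct -> 0.666...
hamming_loss = 1.0 - accuracy         # 3 of 9 rows are wrong   -> 0.333...

# truePositiveRateByLabel with metricLabel=1.0:
# fraction of rows whose true label is 1.0 that were also predicted as 1.0.
label_1 = [(p, l) for p, l in pairs if l == 1.0]
tpr_label_1 = sum(1 for p, l in label_1 if p == 1.0) / len(label_1)  # 3/4 = 0.75

# logLoss for the probability example: mean of -log(probability assigned to the true label).
rows = [(1.0, [0.1, 0.8, 0.1]), (2.0, [0.9, 0.05, 0.05]),
        (0.0, [0.8, 0.2, 0.0]), (1.0, [0.3, 0.65, 0.05])]
log_loss = sum(-math.log(probs[int(label)]) for label, probs in rows) / len(rows)

print(accuracy, hamming_loss, tpr_label_1, round(log_loss, 4))  # 0.666... 0.333... 0.75 0.9682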
Related usage
- Python pyspark MulticlassMetrics usage and code examples
- Python pyspark MultiIndex.size usage and code examples
- Python pyspark MultiIndex.hasnans usage and code examples
- Python pyspark MultiIndex.to_numpy usage and code examples
- Python pyspark MultiIndex.levshape usage and code examples
- Python pyspark MultiIndex.max usage and code examples
- Python pyspark MultiIndex.drop usage and code examples
- Python pyspark MultiIndex.min usage and code examples
- Python pyspark MultiIndex.unique usage and code examples
- Python pyspark MultiIndex.rename usage and code examples
- Python pyspark MultiIndex.value_counts usage and code examples
- Python pyspark MultiIndex.values usage and code examples
- Python pyspark MultiIndex.difference usage and code examples
- Python pyspark MultiIndex.sort_values usage and code examples
- Python pyspark MultiIndex.spark.transform usage and code examples
- Python pyspark MultiIndex.T usage and code examples
- Python pyspark MultiIndex usage and code examples
- Python pyspark MultiIndex.ndim usage and code examples
- Python pyspark MultiIndex.copy usage and code examples
- Python pyspark MultiIndex.to_frame usage and code examples
- Python pyspark MultiIndex.shape usage and code examples
- Python pyspark MultilabelClassificationEvaluator usage and code examples
- Python pyspark MultiIndex.equals usage and code examples
- Python pyspark MultiIndex.empty usage and code examples
- Python pyspark MultiIndex.to_series usage and code examples
Note: This article was compiled by 纯净天空 from the original English work pyspark.ml.evaluation.MulticlassClassificationEvaluator on spark.apache.org. Unless otherwise stated, the copyright of the original code belongs to its original authors; please do not reproduce or copy this translation without permission or authorization.