本文简要介绍
pyspark.ml.classification.FMClassifier
的用法。用法:
class pyspark.ml.classification.FMClassifier(*, featuresCol='features', labelCol='label', predictionCol='prediction', probabilityCol='probability', rawPredictionCol='rawPrediction', factorSize=8, fitIntercept=True, fitLinear=True, regParam=0.0, miniBatchFraction=1.0, initStd=0.01, maxIter=100, stepSize=1.0, tol=1e-06, solver='adamW', thresholds=None, seed=None)
用于分类的分解机器学习算法。
求解器支持:
gd(正常的小批量梯度下降)
adamW(默认)
3.0.0 版中的新函数。
例子:
>>> from pyspark.ml.linalg import Vectors >>> from pyspark.ml.classification import FMClassifier >>> df = spark.createDataFrame([ ... (1.0, Vectors.dense(1.0)), ... (0.0, Vectors.sparse(1, [], []))], ["label", "features"]) >>> fm = FMClassifier(factorSize=2) >>> fm.setSeed(11) FMClassifier... >>> model = fm.fit(df) >>> model.getMaxIter() 100 >>> test0 = spark.createDataFrame([ ... (Vectors.dense(-1.0),), ... (Vectors.dense(0.5),), ... (Vectors.dense(1.0),), ... (Vectors.dense(2.0),)], ["features"]) >>> model.predictRaw(test0.head().features) DenseVector([22.13..., -22.13...]) >>> model.predictProbability(test0.head().features) DenseVector([1.0, 0.0]) >>> model.transform(test0).select("features", "probability").show(10, False) +--------+------------------------------------------+ |features|probability | +--------+------------------------------------------+ |[-1.0] |[0.9999999997574736,2.425264676902229E-10]| |[0.5] |[0.47627851732981163,0.5237214826701884] | |[1.0] |[5.491554426243495E-4,0.9994508445573757] | |[2.0] |[2.005766663870645E-10,0.9999999997994233]| +--------+------------------------------------------+ ... >>> model.intercept -7.316665276826291 >>> model.linear DenseVector([14.8232]) >>> model.factors DenseMatrix(1, 2, [0.0163, -0.0051], 1) >>> model_path = temp_path + "/fm_model" >>> model.save(model_path) >>> model2 = FMClassificationModel.load(model_path) >>> model2.intercept -7.316665276826291 >>> model2.linear DenseVector([14.8232]) >>> model2.factors DenseMatrix(1, 2, [0.0163, -0.0051], 1) >>> model.transform(test0).take(1) == model2.transform(test0).take(1) True
相关用法
- Python pyspark FMRegressor用法及代码示例
- Python pyspark FPGrowth用法及代码示例
- Python pyspark Float64Index用法及代码示例
- Python pyspark FeatureHasher用法及代码示例
- Python pyspark FPGrowthModel用法及代码示例
- Python pyspark create_map用法及代码示例
- Python pyspark date_add用法及代码示例
- Python pyspark DataFrame.to_latex用法及代码示例
- Python pyspark DataStreamReader.schema用法及代码示例
- Python pyspark MultiIndex.size用法及代码示例
- Python pyspark arrays_overlap用法及代码示例
- Python pyspark Series.asof用法及代码示例
- Python pyspark DataFrame.align用法及代码示例
- Python pyspark Index.is_monotonic_decreasing用法及代码示例
- Python pyspark IsotonicRegression用法及代码示例
- Python pyspark DataFrame.plot.bar用法及代码示例
- Python pyspark DataFrame.to_delta用法及代码示例
- Python pyspark element_at用法及代码示例
- Python pyspark explode用法及代码示例
- Python pyspark MultiIndex.hasnans用法及代码示例
- Python pyspark Series.to_frame用法及代码示例
- Python pyspark DataFrame.quantile用法及代码示例
- Python pyspark Column.withField用法及代码示例
- Python pyspark Index.values用法及代码示例
- Python pyspark Index.drop_duplicates用法及代码示例
注:本文由纯净天空筛选整理自spark.apache.org大神的英文原创作品 pyspark.ml.classification.FMClassifier。非经特殊声明,原始代码版权归原作者所有,本译文未经允许或授权,请勿转载或复制。