Python VectorAssembler.getOutputCol Method Code Examples

This article collects typical usage examples of the pyspark.ml.feature.VectorAssembler.getOutputCol method in Python. If you are wondering how VectorAssembler.getOutputCol is used in practice, or are looking for a working example of it, the selected code example here may help. You can also explore further usage examples of the enclosing class, pyspark.ml.feature.VectorAssembler.


One code example of the VectorAssembler.getOutputCol method is shown below. Examples are sorted by popularity by default.

Example 1: OneHotEncoder

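# Required imports (getOutputCol is an instance method of VectorAssembler,
# so it is not imported on its own):
from pyspark.ml.clustering import KMeans
from pyspark.ml.feature import OneHotEncoder, VectorAssembler

# NOTE: the source snippet starts mid-function; the function header and loop
# below are reconstructed from the call to oneHotEncodeColumns further down.
# The body targets the Spark 1.x/2.x API, where OneHotEncoder is a Transformer;
# in Spark 3.x it is an Estimator and needs .fit(newdf) before .transform(newdf).
def oneHotEncodeColumns(df, cols):
    newdf = df
    for c in cols:
        onehotenc = OneHotEncoder(inputCol=c, outputCol=c+"-onehot", dropLast=False)
        newdf = onehotenc.transform(newdf).drop(c)
        newdf = newdf.withColumnRenamed(c+"-onehot", c)
    return newdf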

dfhot = oneHotEncodeColumns(dfnumeric, ["Take-out","GoodFor_lunch", "GoodFor_dinner", "GoodFor_breakfast"])

dfhot.show(5)

# Training set
assembler = VectorAssembler(inputCols = list(set(dfhot.columns) | set(['stars','review_count'])), outputCol="features")
train = assembler.transform(dfhot)

# Set up KMeans for 5 clusters
knum = 5
kmeans = KMeans(featuresCol=assembler.getOutputCol(), predictionCol="cluster", k=knum, seed=0)
model = kmeans.fit(train)
print("Model Created!")

# See cluster centers:
centers = model.clusterCenters()
print("Cluster Centers: ")
for center in centers:
    print(center)
    
# Apply the clustering model to our data:
prediction = model.transform(train)
prediction.groupBy("cluster").count().orderBy("cluster").show()

# Look at the features of each cluster
customerCluster = {}
Developer ID: raul-arrabales, Project: Spark-Hands-on, Lines of code: 33, Source file: Session6.py
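
In the example above, the `inputCols` list is built from a Python `set`, whose iteration order is arbitrary and can change between interpreter runs. The position of each column inside the `features` vector, and therefore inside each printed cluster center, may not be stable as a result. The following is a minimal sketch, in plain Python with no Spark dependency, of one way to build a de-duplicated column list in a predictable order. The helper name `ordered_union` and the sample column names are hypothetical and do not come from the original project.

```python
def ordered_union(*column_lists):
    """Return the names from all lists, de-duplicated, in first-seen order."""
    seen = set()
    result = []
    for columns in column_lists:
        for name in columns:
            if name not in seen:
                seen.add(name)
                result.append(name)
    return result


# Hypothetical column names standing in for dfhot.columns.
df_columns = ["stars", "review_count", "Take-out", "GoodFor_lunch"]
extra_columns = ["stars", "review_count"]

# Same contents as set(df_columns) | set(extra_columns), but in a fixed order.
input_cols = ordered_union(df_columns, extra_columns)
print(input_cols)
```

A list built this way could then be passed as `inputCols=input_cols` to `VectorAssembler`, so that each index of a cluster center maps to a known column.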


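Note: The pyspark.ml.feature.VectorAssembler.getOutputCol examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their respective developers, and copyright of the source code remains with the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.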