This page collects typical usage examples of the Java method org.apache.spark.api.java.JavaPairRDD.values, which returns a JavaRDD containing only the value of each key-value pair. If you are wondering how JavaPairRDD.values is used in practice, the examples here may help. You can also look up usage examples for the enclosing class, org.apache.spark.api.java.JavaPairRDD.
Two code examples of JavaPairRDD.values are shown, sorted by popularity by default.
Example 1: getSVDMatrix
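import java.util.ArrayList;
import java.util.List;

import org.apache.spark.api.java.JavaPairRDD; // class the method depends on
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.distributed.RowMatrix;

// MatrixUtil and the `spark` field belong to the surrounding project and are not shown here.

/**
 * Creates an SVD matrix CSV file from the original CSV file.
 *
 * @param csvFileName       CSV file in which each row is a term and each column is a document
 * @param svdDimension      dimension of the SVD matrix
 * @param svdMatrixFileName name of the CSV file the SVD matrix is written to
 */
public void getSVDMatrix(String csvFileName, int svdDimension, String svdMatrixFileName) {
    // Load (row key, row vector) pairs from the CSV file.
    JavaPairRDD<String, Vector> importRDD = MatrixUtil.loadVectorFromCSV(spark, csvFileName, 1);
    // values() drops the row keys and keeps only the vectors.
    JavaRDD<Vector> vectorRDD = importRDD.values();
    RowMatrix wordDocMatrix = new RowMatrix(vectorRDD.rdd());
    RowMatrix tfidfMatrix = MatrixUtil.createTFIDFMatrix(wordDocMatrix);
    RowMatrix svdMatrix = MatrixUtil.buildSVDMatrix(tfidfMatrix, svdDimension);
    // keys() is the counterpart of values(): it keeps only the row keys.
    List<String> rowKeys = importRDD.keys().collect();
    // Label the reduced dimensions "dimension0", "dimension1", ...
    List<String> colKeys = new ArrayList<>();
    for (int i = 0; i < svdDimension; i++) {
        colKeys.add("dimension" + i);
    }
    MatrixUtil.exportToCSV(svdMatrix, rowKeys, colKeys, svdMatrixFileName);
}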
Example 2: buildSVDMatrix
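import org.apache.spark.api.java.JavaPairRDD; // class the method depends on
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.distributed.RowMatrix;

// MatrixUtil, `spark`, and the `svdMatrix` and `wordRDD` fields belong to the surrounding class and are not shown here.

/**
 * Builds an SVD matrix from a CSV file.
 *
 * @param tfidfCSVfile TF-IDF matrix CSV file
 * @param svdDimension dimension of the matrix after singular value decomposition
 * @return the resulting row matrix
 */
public RowMatrix buildSVDMatrix(String tfidfCSVfile, int svdDimension) {
    // Load (word, TF-IDF vector) pairs from the CSV file.
    JavaPairRDD<String, Vector> tfidfRDD = MatrixUtil.loadVectorFromCSV(spark, tfidfCSVfile, 2);
    // values() keeps only the TF-IDF vectors.
    JavaRDD<Vector> vectorRDD = tfidfRDD.values();
    RowMatrix svdMatrix = MatrixUtil.buildSVDMatrix(vectorRDD, svdDimension);
    // Store the result and the matching words on the instance for later use.
    this.svdMatrix = svdMatrix;
    this.wordRDD = tfidfRDD.keys();
    return svdMatrix;
}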
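Both examples rely on MatrixUtil helpers and instance fields that are not shown on this page, so neither runs on its own. As a self-contained illustration of values() by itself, here is a minimal sketch. It assumes spark-core is on the classpath and uses a local master; the class name PairRddValuesSketch, the method collectValues, and the sample data are hypothetical and not taken from the examples.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class PairRddValuesSketch {

    /** Builds a tiny pair RDD and returns its values collected to the driver. */
    static List<Integer> collectValues(JavaSparkContext sc) {
        // Hypothetical sample data: (term, count) pairs. Duplicate keys are allowed.
        List<Tuple2<String, Integer>> pairs = Arrays.asList(
                new Tuple2<String, Integer>("spark", 3),
                new Tuple2<String, Integer>("rdd", 5),
                new Tuple2<String, Integer>("spark", 7));

        // A single partition keeps the collected order easy to reason about.
        JavaPairRDD<String, Integer> pairRdd = sc.parallelizePairs(pairs, 1);

        // values() drops the keys and returns an RDD of the second tuple elements.
        JavaRDD<Integer> values = pairRdd.values();
        return values.collect();
    }

    public static void main(String[] args) {
        // Assumption: run locally with one worker thread; adjust the master for a real cluster.
        SparkConf conf = new SparkConf().setAppName("values-sketch").setMaster("local[1]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            System.out.println(collectValues(sc)); // prints [3, 5, 7]
        }
    }
}
```

Note that values() does not deduplicate: the two "spark" entries both contribute a value, so the result has one element per input pair.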