Java JavaPairRDD.keys Method Code Examples

This article collects typical usage examples of the org.apache.spark.api.java.JavaPairRDD.keys method in Java. If you are wondering how JavaPairRDD.keys is used in practice, or are looking for working examples of it, the selected code samples here may help. You can also look at further usage examples of the enclosing class, org.apache.spark.api.java.JavaPairRDD.


Three code examples of the JavaPairRDD.keys method are shown below, sorted by popularity by default.
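Before the project examples, it may help to see what keys() returns. In Spark, keys() is a map over the pairs that keeps only the first element of each tuple, so duplicates are preserved and there is one output element per input pair. The following is a plain-Java analogue with no Spark dependency; the class KeysSketch, its keys helper, and the sample data are hypothetical and exist only for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class KeysSketch {

    // Plain-Java stand-in for JavaPairRDD.keys(): project each pair to its key.
    // Nothing is deduplicated; the result has one element per input pair.
    static <K, V> List<K> keys(List<Map.Entry<K, V>> pairs) {
        return pairs.stream()
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample data: user name -> score.
        List<Map.Entry<String, Double>> pairs = List.of(
                Map.entry("alice", 3.0),
                Map.entry("bob", 1.5),
                Map.entry("alice", 2.0));

        System.out.println(keys(pairs)); // prints [alice, bob, alice]
    }
}
```

With a real JavaPairRDD<String, Double>, the corresponding call would be pairRdd.keys(), which returns a JavaRDD<String>.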

Example 1: parallizeUsers

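import org.apache.spark.api.java.JavaPairRDD; // package/class that the method depends on
import org.apache.spark.api.java.JavaRDD;
import scala.Tuple2;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
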
public JavaRDD<String> parallizeUsers(Map<String, Double> userDocs) {

    // build the list of (user, value) pairs to parallelize
    List<Tuple2<String, Double>> list = new ArrayList<>();
    for (Map.Entry<String, Double> entry : userDocs.entrySet()) {
      list.add(new Tuple2<>(entry.getKey(), entry.getValue()));
    }

    // assign each user to a partition group
    ThePartitionProblemSolver solution = new KGreedyPartitionSolver();
    Map<String, Integer> userGroups = solution.solve(userDocs, this.partition);

    JavaPairRDD<String, Double> pairRdd = spark.sc.parallelizePairs(list);
    JavaPairRDD<String, Double> userPairRDD = pairRdd.partitionBy(new logPartitioner(userGroups, this.partition));

    // return only the user names from the repartitioned pair RDD
    return userPairRDD.keys();
  }
 
Developer: apache, Project: incubator-sdap-mudrod, Lines of code: 19, Source: LogAbstract.java

Example 2: buildSVDMatrix

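import org.apache.spark.api.java.JavaPairRDD; // package/class that the method depends on
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.distributed.RowMatrix;
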
/**
 * Build an SVD matrix from a CSV file.
 *
 * @param tfidfCSVfile CSV file containing the tf-idf matrix
 * @param svdDimension dimension of the matrix after singular value decomposition
 * @return row matrix
 */
public RowMatrix buildSVDMatrix(String tfidfCSVfile, int svdDimension) {
  RowMatrix svdMatrix = null;
  JavaPairRDD<String, Vector> tfidfRDD = MatrixUtil.loadVectorFromCSV(spark, tfidfCSVfile, 2);
  JavaRDD<Vector> vectorRDD = tfidfRDD.values();

  svdMatrix = MatrixUtil.buildSVDMatrix(vectorRDD, svdDimension);
  this.svdMatrix = svdMatrix;

  // keep the row keys (terms) alongside the matrix
  this.wordRDD = tfidfRDD.keys();

  return svdMatrix;
}
 
Developer: apache, Project: incubator-sdap-mudrod, Lines of code: 20, Source: SVDUtil.java

Example 3: calTermSimfromMatrix

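import org.apache.spark.api.java.JavaPairRDD; // package/class that the method depends on
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.distributed.CoordinateMatrix;

import java.util.List;
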
/**
 * Calculate term similarity from a CSV matrix.
 *
 * @param csvFileName CSV file of the matrix; each row is a term and each column is a
 *                    dimension in feature space
 * @param skipRow number of rows to skip in the input CSV file, e.g. a header
 * @return list of linkage triples, or null if no vectors could be loaded
 */
public List<LinkageTriple> calTermSimfromMatrix(String csvFileName, int skipRow) {

  JavaPairRDD<String, Vector> importRDD = MatrixUtil.loadVectorFromCSV(spark, csvFileName, skipRow);
  if (importRDD == null || importRDD.values().first().size() == 0) {
    return null;
  }

  CoordinateMatrix simMatrix = SimilarityUtil.calculateSimilarityFromVector(importRDD.values());
  JavaRDD<String> rowKeyRDD = importRDD.keys();
  return SimilarityUtil.matrixToTriples(rowKeyRDD, simMatrix);
}
 
Developer: apache, Project: incubator-sdap-mudrod, Lines of code: 20, Source: SemanticAnalyzer.java
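Examples 2 and 3 both split one pair RDD into keys() and values(), then use the keys to label rows of a matrix built from the values. How the project's helper methods match the two up is not shown in the source. The sketch below only illustrates the general idea in plain Java, assuming both projections keep the original element order so that position i in the keys corresponds to position i in the values. The class KeyValueAlignmentSketch, its labelRows method, and the data are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class KeyValueAlignmentSketch {

    // Split (term, vector) rows into keys and values, then label each row
    // index with the key at the same position and the vector length.
    static List<String> labelRows(List<Map.Entry<String, double[]>> rows) {
        List<String> keys = new ArrayList<>();
        List<double[]> values = new ArrayList<>();
        for (Map.Entry<String, double[]> row : rows) {
            keys.add(row.getKey());     // analogue of keys()
            values.add(row.getValue()); // analogue of values()
        }

        List<String> labels = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            // Assumption: index i refers to the same row in both lists.
            labels.add(i + ":" + keys.get(i) + ":" + values.get(i).length);
        }
        return labels;
    }

    public static void main(String[] args) {
        // Hypothetical term vectors.
        List<Map.Entry<String, double[]>> rows = List.of(
                Map.entry("ocean", new double[] {0.1, 0.9}),
                Map.entry("sea", new double[] {0.2, 0.8}));

        System.out.println(labelRows(rows)); // prints [0:ocean:2, 1:sea:2]
    }
}
```

In Spark the same reasoning depends on keys() and values() being derived from the same parent RDD; if the parent is recomputed in a different order between the two calls, positional matching would no longer be safe.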


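Note: The org.apache.spark.api.java.JavaPairRDD.keys examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.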