

Java JavaRDD.collect Method Code Examples

This article collects typical usage examples of the org.apache.spark.api.java.JavaRDD.collect method in Java. If you are wondering how JavaRDD.collect is used in practice, the selected examples below may help. You can also look further into usage examples of the enclosing class, org.apache.spark.api.java.JavaRDD.


Three code examples of the JavaRDD.collect method are shown below, sorted by popularity by default.

Example 1: sparkTrain

import java.util.List;

import org.apache.spark.api.java.JavaRDD; // import the package/class the method depends on
public boolean sparkTrain(JavaRDD<String> rdd) {
    JavaRDD<String> repartition = rdd.repartition(slaveNum);
    JavaRDD<Boolean> partRDD = repartition.mapPartitionsWithIndex(trainFunc, true);
    List<Boolean> res = partRDD.collect();
    for (boolean result : res) {
        if (!result) {
            return false;
        }
    }
    return true;
}
 
Developer: yuantiku, Project: ytk-learn, Lines of code: 12, Source file: SparkTrainWorker.java
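
Once collect() has returned, the per-partition results are an ordinary java.util.List on the driver, so the success check in Example 1 needs no Spark API at all. A minimal sketch of that check in plain Java follows. The class name CollectedResults and the method allSucceeded are hypothetical, and treating a null entry as a failure is an assumption (the loop in Example 1 would instead throw a NullPointerException when unboxing a null).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of ytk-learn or Spark.
final class CollectedResults {

    private CollectedResults() {
    }

    // Returns true only if every collected per-partition flag is TRUE.
    // Assumption: a null flag counts as a failure instead of throwing.
    static boolean allSucceeded(List<Boolean> results) {
        return results.stream().allMatch(Boolean.TRUE::equals);
    }

    public static void main(String[] args) {
        // Stand-ins for what partRDD.collect() might return on the driver.
        System.out.println(allSucceeded(Arrays.asList(true, true, true)));
        System.out.println(allSucceeded(Arrays.asList(true, false, true)));
        System.out.println(allSucceeded(Collections.<Boolean>emptyList()));
    }
}
```

An empty list yields true here, which matches the behavior of the loop in Example 1.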

Example 2: writeMatrixToFileInHDFS

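import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.spark.api.java.JavaRDD; // import the package/class the method depends on
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.distributed.BlockMatrix;
import org.apache.spark.mllib.linalg.distributed.CoordinateMatrix;
import org.apache.spark.mllib.linalg.distributed.DistributedMatrix;
import org.apache.spark.mllib.linalg.distributed.IndexedRow;
import org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix;
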
public static void writeMatrixToFileInHDFS(String file, DistributedMatrix matrix, Configuration conf){

		try {
			List<IndexedRow> localRows;
			long numRows = 0;
			long numCols = 0;

			FileSystem fs = FileSystem.get(conf);

			Path pt = new Path(file);

			//FileSystem fileSystem = FileSystem.get(context.getConfiguration());
			BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(fs.create(pt, true)));

			JavaRDD<IndexedRow> rows;

			if( matrix.getClass() == IndexedRowMatrix.class) {
				rows = ((IndexedRowMatrix) matrix).rows().toJavaRDD();
			}
			else if (matrix.getClass() == CoordinateMatrix.class) {
				rows = ((CoordinateMatrix)matrix).toIndexedRowMatrix().rows().toJavaRDD();
			}
			else if (matrix.getClass() == BlockMatrix.class){
				rows = ((BlockMatrix)matrix).toIndexedRowMatrix().rows().toJavaRDD();
			}
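			else {
				// Fail fast on an unsupported matrix type instead of hitting a NullPointerException at rows.collect().
				throw new IllegalArgumentException("Unsupported matrix type: " + matrix.getClass().getName());
			}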

			localRows = rows.collect();

			Vector vectors[] = new Vector[localRows.size()];

			for(int i = 0; i< localRows.size(); i++) {
				vectors[(int)localRows.get(i).index()] = localRows.get(i).vector();
			}

			numRows = matrix.numRows();
			numCols = matrix.numCols();

			bw.write("%%MatrixMarket matrix array real general");
			bw.newLine();
			bw.write(numRows+" "+numCols+" "+(numRows * numCols));
			bw.newLine();

			for(int i = 0; i< vectors.length; i++) {
				bw.write(i+":");
				for(int j = 0; j< vectors[i].size(); j++) {
					bw.write(String.valueOf(vectors[i].apply(j))+",");
				}

				bw.newLine();
			}

			bw.close();
			//fs.close();


		} catch (IOException e) {
			LOG.error("Error in " + IO.class.getName() + ": " + e.getMessage());
			e.printStackTrace();
			System.exit(1);
		}

	}
 
Developer: jmabuin, Project: BLASpark, Lines of code: 66, Source file: IO.java
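
Example 2 sizes its array by the number of collected rows and then indexes it by each row's index, which only works when the indices are exactly 0 to n-1. It also leaves the writer open if a write fails. The sketch below shows one way to handle both points in plain Java. Everything in it is an assumption made for illustration: the class DenseRowWriter and its nested Row type are hypothetical stand-ins (Row plays the role of Spark's IndexedRow), a java.io.Writer stands in for the HDFS output stream, and writing missing rows as zeros is a choice of this sketch, not of BLASpark.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the BLASpark implementation.
final class DenseRowWriter {

    // Stand-in for Spark's IndexedRow: a row index plus its values.
    static final class Row {
        final long index;
        final double[] values;

        Row(long index, double[] values) {
            this.index = index;
            this.values = values;
        }
    }

    private DenseRowWriter() {
    }

    // Writes collected rows in the same line layout as Example 2.
    // Assumption: rows absent from the list are written as all zeros.
    // Note: the supplied writer is closed when this method returns.
    static void write(Writer out, List<Row> rows, int numRows, int numCols) {
        double[][] dense = new double[numRows][numCols]; // zero-filled by default
        for (Row row : rows) {
            if (row.index < 0 || row.index >= numRows) {
                throw new IllegalArgumentException("Row index out of range: " + row.index);
            }
            // Place each row by its own index, whatever order collect() returned.
            int count = Math.min(row.values.length, numCols);
            System.arraycopy(row.values, 0, dense[(int) row.index], 0, count);
        }
        // try-with-resources closes the writer even if a write fails.
        try (BufferedWriter bw = new BufferedWriter(out)) {
            bw.write("%%MatrixMarket matrix array real general");
            bw.newLine();
            bw.write(numRows + " " + numCols + " " + ((long) numRows * numCols));
            bw.newLine();
            for (int i = 0; i < numRows; i++) {
                bw.write(i + ":");
                for (int j = 0; j < numCols; j++) {
                    bw.write(dense[i][j] + ",");
                }
                bw.newLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Rows arrive out of order and row 1 is missing.
        List<Row> collected = Arrays.asList(
                new Row(2, new double[] {5.0, 6.0}),
                new Row(0, new double[] {1.0, 2.0}));
        StringWriter out = new StringWriter();
        write(out, collected, 3, 2);
        System.out.print(out);
    }
}
```

Sizing the array from numRows rather than from the collected list keeps a sparse or out-of-order result from causing an ArrayIndexOutOfBoundsException.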

Example 3: calTermSimfromMatrix

import java.util.List;

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD; // import the package/class the method depends on

/**
 * Calculate term similarity from CSV matrix.
 *
 * @param csvFileName csv file of matrix, each row is a term, and each column is a
 *                    dimension in feature space
 * @param simType the type of similarity calculation to execute e.g.
 * <ul>
 * <li>{@link org.apache.sdap.mudrod.utils.SimilarityUtil#SIM_COSINE} - 3,</li>
 * <li>{@link org.apache.sdap.mudrod.utils.SimilarityUtil#SIM_HELLINGER} - 2,</li>
 * <li>{@link org.apache.sdap.mudrod.utils.SimilarityUtil#SIM_PEARSON} - 1</li>
 * </ul>
 * @param skipRow number of rows to skip in input CSV file e.g. header
 * @return Linkage triple list
 */
public List<LinkageTriple> calTermSimfromMatrix(String csvFileName, int simType, int skipRow) {

  JavaPairRDD<String, Vector> importRDD = MatrixUtil.loadVectorFromCSV(spark, csvFileName, skipRow);
  if (importRDD.values().first().size() == 0) {
    return null;
  }

  JavaRDD<LinkageTriple> triples = SimilarityUtil.calculateSimilarityFromVector(importRDD, simType);

  return triples.collect();
}
 
Developer: apache, Project: incubator-sdap-mudrod, Lines of code: 26, Source file: SemanticAnalyzer.java
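
Example 3 delegates the actual computation to SimilarityUtil.calculateSimilarityFromVector, whose source is not shown here. To make the idea of a "linkage triple list" concrete, the following plain-Java sketch computes pairwise cosine similarity between term vectors. It is only an illustration under stated assumptions: the class TermSimilaritySketch, its Triple type, and both methods are hypothetical and are not the MUDROD implementation. Returning 0 for a zero vector and an empty list (rather than null) for empty input are also choices of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the MUDROD SimilarityUtil implementation.
final class TermSimilaritySketch {

    // Stand-in for a linkage triple: two term names and their similarity.
    static final class Triple {
        final String termA;
        final String termB;
        final double weight;

        Triple(String termA, String termB, double weight) {
            this.termA = termA;
            this.termB = termB;
            this.weight = weight;
        }
    }

    private TermSimilaritySketch() {
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|).
    // Assumption: a zero vector is treated as having similarity 0 to anything.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Builds one triple per unordered pair of terms; empty input gives an empty list.
    static List<Triple> pairwise(Map<String, double[]> vectors) {
        List<Map.Entry<String, double[]>> entries = new ArrayList<>(vectors.entrySet());
        List<Triple> triples = new ArrayList<>();
        for (int i = 0; i < entries.size(); i++) {
            for (int j = i + 1; j < entries.size(); j++) {
                double sim = cosine(entries.get(i).getValue(), entries.get(j).getValue());
                triples.add(new Triple(entries.get(i).getKey(), entries.get(j).getKey(), sim));
            }
        }
        return triples;
    }

    public static void main(String[] args) {
        // Each entry is one term (a CSV row) with its feature-space vector.
        Map<String, double[]> terms = new LinkedHashMap<>();
        terms.put("ocean", new double[] {1.0, 0.0});
        terms.put("sea", new double[] {2.0, 0.0});
        terms.put("wind", new double[] {0.0, 3.0});
        for (Triple t : pairwise(terms)) {
            System.out.println(t.termA + " - " + t.termB + ": " + t.weight);
        }
    }
}
```

In the Spark version the same pairing would run across the cluster, and collect() would then bring the resulting triples back to the driver as a List.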


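Note: The org.apache.spark.api.java.JavaRDD.collect examples in this article were compiled by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their developers, and copyright in the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.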