

Java JavaRDD.cartesian Method Code Examples

This article collects typical usage examples of the org.apache.spark.api.java.JavaRDD.cartesian method in Java. If you are wondering how JavaRDD.cartesian is used in practice, or are looking for working examples, the selected code below may help. You can also explore further usage examples of the enclosing class, org.apache.spark.api.java.JavaRDD.


One code example of the JavaRDD.cartesian method is shown below. Examples are sorted by popularity by default.

Example 1: calculateSimilarityFromVector

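import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD; // the class that provides cartesian()
import org.apache.spark.api.java.function.Function;
import scala.Tuple2;
// Vector is assumed to be org.apache.spark.mllib.linalg.Vector;
// LinkageTriple and SimilarityUtil come from the project itself.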
/**
 * Calculate term similarity from vector.
 *
 * @param importRDD the {@link org.apache.spark.api.java.JavaPairRDD}
 *                  data structure containing the vectors.
 * @param simType   the similarity calculation to execute e.g. 
 * <ul>
 * <li>{@link org.apache.sdap.mudrod.utils.SimilarityUtil#SIM_COSINE} - 3,</li>
 * <li>{@link org.apache.sdap.mudrod.utils.SimilarityUtil#SIM_HELLINGER} - 2,</li>
 * <li>{@link org.apache.sdap.mudrod.utils.SimilarityUtil#SIM_PEARSON} - 1</li>
 * </ul>
 * @return a new {@link org.apache.spark.api.java.JavaRDD} of {@link LinkageTriple},
 *         one per ordered pair of distinct keys
 */
public static JavaRDD<LinkageTriple> calculateSimilarityFromVector(JavaPairRDD<String, Vector> importRDD, int simType) {
  JavaRDD<Tuple2<String, Vector>> importRDD1 = importRDD.map(f -> new Tuple2<String, Vector>(f._1, f._2));
  JavaPairRDD<Tuple2<String, Vector>, Tuple2<String, Vector>> cartesianRDD = importRDD1.cartesian(importRDD1);

  return cartesianRDD.map(new Function<Tuple2<Tuple2<String, Vector>, Tuple2<String, Vector>>, LinkageTriple>() {
    private static final long serialVersionUID = 1L;

    @Override
    public LinkageTriple call(Tuple2<Tuple2<String, Vector>, Tuple2<String, Vector>> arg) {
      String keyA = arg._1._1;
      String keyB = arg._2._1;

      // Skip self-pairs; the null is removed by the filter applied after map().
      if (keyA.equals(keyB)) {
        return null;
      }

      Vector vecA = arg._1._2;
      Vector vecB = arg._2._2;
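      Double weight = 0.0;

      // Only SIM_PEARSON and SIM_HELLINGER are handled here; any other
      // simType (including SIM_COSINE) leaves the weight at 0.0.
      if (simType == SimilarityUtil.SIM_PEARSON) {
        weight = SimilarityUtil.pearsonDistance(vecA, vecB);
      } else if (simType == SimilarityUtil.SIM_HELLINGER) {
        weight = SimilarityUtil.hellingerDistance(vecA, vecB);
      }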

      LinkageTriple triple = new LinkageTriple();
      triple.keyA = keyA;
      triple.keyB = keyB;
      triple.weight = weight;
      return triple;
    }
  }).filter(new Function<LinkageTriple, Boolean>() {
    private static final long serialVersionUID = 1L;

    @Override
    public Boolean call(LinkageTriple arg0) throws Exception {
      // Keep only real triples; drop the nulls produced for self-pairs.
      return arg0 != null;
    }
  });
}
 
Developer: apache, Project: incubator-sdap-mudrod, Lines of code: 65, Source: SimilarityUtil.java
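
To make the pairing in Example 1 easier to picture, here is a minimal plain-Java sketch that does not use Spark. It mimics what calling cartesian on an RDD with itself and then dropping self-pairs produces: every ordered pair of distinct keys. The class name CartesianSketch and the helper pairsWithoutSelf are hypothetical names introduced only for this illustration; they are not part of Spark or of the mudrod project.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class CartesianSketch {

    // Hypothetical helper: a plain-Java stand-in for rdd.cartesian(rdd)
    // followed by the self-pair filter. Returns every ordered pair (a, b)
    // of keys where a is not equal to b.
    static List<Map.Entry<String, String>> pairsWithoutSelf(List<String> keys) {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        for (String a : keys) {
            for (String b : keys) {
                if (!a.equals(b)) {
                    pairs.add(new SimpleImmutableEntry<>(a, b));
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> pairs =
                pairsWithoutSelf(Arrays.asList("a", "b", "c"));
        // 3 * 3 pairs in the full product, minus 3 self-pairs, leaves 6.
        System.out.println(pairs.size());
        for (Map.Entry<String, String> p : pairs) {
            System.out.println(p.getKey() + " -> " + p.getValue());
        }
    }
}
```

Both (a, b) and (b, a) appear in the result, so a symmetric measure is computed twice per unordered pair, and the number of pairs grows quadratically with the number of keys. That is worth keeping in mind before applying the same pattern to a large RDD.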


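Note: The org.apache.spark.api.java.JavaRDD.cartesian method example in this article was compiled by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The code snippet was selected from open-source projects contributed by their developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.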