

Java PCollection.getTypeFamily Method Code Examples

This article collects typical usage examples of the Java method org.apache.crunch.PCollection.getTypeFamily. If you are wondering how PCollection.getTypeFamily is used in practice, or are looking for working examples of it, the selected method examples here may help. You can also look further into usage examples of its enclosing class, org.apache.crunch.PCollection.


Four code examples of the PCollection.getTypeFamily method are shown below, sorted by popularity by default.

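Example 1: lloydsAlgorithm

import java.util.List;

import org.apache.crunch.Aggregator;
import org.apache.crunch.PCollection; // import the package/class this method depends on
import org.apache.crunch.Pair;
import org.apache.crunch.types.PTableType;
import org.apache.crunch.types.PTypeFamily;
// RealVector, Centers, KSketchIndex, LloydsMapFn, LloydsCenters and SumVectorsAggregator
// come from the project and its dependencies; their imports are not shown in this excerpt.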
/**
 * Runs Lloyd's algorithm on the given points for a given number of iterations, returning the final
 * centers that result.
 * 
 * @param points The data points to cluster
 * @param centers The list of initial centers
 * @param numIterations The number of iterations to run, with each iteration corresponding to a MapReduce job
 * @param approx Whether to use random projection for assigning points to centers
 */
public <V extends RealVector> List<Centers> lloydsAlgorithm(PCollection<V> points, List<Centers> centers,
    int numIterations, boolean approx) {
  PTypeFamily tf = points.getTypeFamily();
  PTableType<Pair<Integer, Integer>, Pair<V, Long>> ptt = tf.tableOf(tf.pairs(tf.ints(), tf.ints()),
      tf.pairs(points.getPType(), tf.longs()));
  Aggregator<Pair<V, Long>> agg = new SumVectorsAggregator<V>();
  for (int i = 0; i < numIterations; i++) {
    KSketchIndex index = new KSketchIndex(centers, projectionBits, projectionSamples, seed);
    LloydsMapFn<V> mapFn = new LloydsMapFn<V>(index, approx);
    centers = new LloydsCenters<V>(points.parallelDo("lloyds-" + i, mapFn, ptt)
        .groupByKey()
        .combineValues(agg), centers.size()).getValue();
  }
  return centers;
}
 
Developer: apsaltis, Project: oryx, Lines of code: 25, Source: KMeansParallel.java
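
The Javadoc in Example 1 describes running Lloyd's algorithm for a fixed number of iterations, with each iteration as one MapReduce job. The excerpt does not show LloydsMapFn or LloydsCenters, so the following is only a single-machine sketch of the textbook algorithm, not oryx's implementation. The class name LloydSketch and its methods are hypothetical. The sketch keeps a vector sum and a count for each center, which mirrors the Pair<V, Long> values that the Crunch version aggregates.

```java
import java.util.Arrays;

/** In-memory sketch of Lloyd's iterations; all names are illustrative, not taken from oryx. */
final class LloydSketch {

  /** Returns the index of the center closest to p by squared Euclidean distance. */
  static int nearest(double[] p, double[][] centers) {
    int best = 0;
    double bestDist = Double.POSITIVE_INFINITY;
    for (int c = 0; c < centers.length; c++) {
      double d = 0.0;
      for (int j = 0; j < p.length; j++) {
        double diff = p[j] - centers[c][j];
        d += diff * diff;
      }
      if (d < bestDist) {
        bestDist = d;
        best = c;
      }
    }
    return best;
  }

  /** Runs numIterations rounds of assign-then-average and returns the final centers. */
  static double[][] lloyds(double[][] points, double[][] initialCenters, int numIterations) {
    int k = initialCenters.length;
    int dim = initialCenters[0].length;
    double[][] centers = new double[k][];
    for (int c = 0; c < k; c++) {
      centers[c] = initialCenters[c].clone(); // do not mutate the caller's arrays
    }
    for (int iter = 0; iter < numIterations; iter++) {
      double[][] sums = new double[k][dim];
      long[] counts = new long[k];
      // "Map" step: assign each point to its nearest center, accumulating (sum, count).
      for (double[] p : points) {
        int c = nearest(p, centers);
        counts[c]++;
        for (int j = 0; j < dim; j++) {
          sums[c][j] += p[j];
        }
      }
      // "Reduce" step: each new center is the mean of the points assigned to it.
      for (int c = 0; c < k; c++) {
        if (counts[c] == 0) {
          continue; // an empty cluster keeps its previous center
        }
        for (int j = 0; j < dim; j++) {
          centers[c][j] = sums[c][j] / counts[c];
        }
      }
    }
    return centers;
  }

  public static void main(String[] args) {
    double[][] points = {{0, 0}, {0, 2}, {10, 10}, {10, 12}};
    double[][] start = {{1, 1}, {9, 9}};
    // prints [[0.0, 1.0], [10.0, 11.0]]
    System.out.println(Arrays.deepToString(lloyds(points, start, 3)));
  }
}
```

Keeping sums and counts separate until the end is what makes the per-center update safe to combine across partitions, which is presumably why the Crunch version aggregates them with combineValues.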

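Example 2: apply

import org.apache.crunch.MapFn;
import org.apache.crunch.PCollection; // import the package/class this method depends on
import org.apache.crunch.Pair;
import org.apache.crunch.types.PType;
import org.apache.crunch.types.PTypeFamily;
// RandomGenerator and RandomManager come from the project and its dependencies;
// seed and numFolds are defined outside this excerpt.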
public <T> PCollection<Pair<Integer, T>> apply(PCollection<T> pcollect) {
  PTypeFamily ptf = pcollect.getTypeFamily();
  PType<Pair<Integer, T>> pt = ptf.pairs(ptf.ints(), pcollect.getPType());
  return pcollect.parallelDo("crossfold", new MapFn<T, Pair<Integer, T>>() {
    private transient RandomGenerator rand;
    
    @Override
    public void initialize() {
      if (rand == null) {
        this.rand = RandomManager.getSeededRandom(seed);
      }
    }
    
    @Override
    public Pair<Integer, T> map(T t) {
      return Pair.of(rand.nextInt(numFolds), t);
    }
    
  }, pt);
}
 
Developer: apsaltis, Project: oryx, Lines of code: 21, Source: Crossfold.java
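
Example 2 tags every element with a random fold number in the range [0, numFolds) using a seeded generator. The same idea can be sketched without Crunch as follows. CrossfoldSketch and assignFolds are hypothetical names, and java.util.Random stands in for the RandomGenerator used in the source.

```java
import java.util.Random;

/** In-memory sketch of random fold assignment; names are illustrative only. */
final class CrossfoldSketch {

  /** Returns one fold index in [0, numFolds) for each of numItems items. */
  static int[] assignFolds(int numItems, int numFolds, long seed) {
    Random rand = new Random(seed); // seeded so that repeated runs give the same folds
    int[] folds = new int[numItems];
    for (int i = 0; i < numItems; i++) {
      folds[i] = rand.nextInt(numFolds);
    }
    return folds;
  }

  public static void main(String[] args) {
    int[] folds = assignFolds(10, 3, 42L);
    for (int f : folds) {
      System.out.print(f + " ");
    }
    System.out.println();
  }
}
```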

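Example 3: sample

import org.apache.crunch.MapFn;
import org.apache.crunch.PCollection; // import the package/class this method depends on
import org.apache.crunch.Pair;
import org.apache.crunch.types.PType;
import org.apache.crunch.types.PTypeFamily;
// RandomGenerator comes from the project's dependencies; its import is not shown in this excerpt.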
public static <T> PCollection<T> sample(
    PCollection<T> input,
    int sampleSize,
    RandomGenerator random) {
  PTypeFamily ptf = input.getTypeFamily();
  PType<Pair<T, Integer>> ptype = ptf.pairs(input.getPType(), ptf.ints());
  return weightedSample(
      input.parallelDo(new MapFn<T, Pair<T, Integer>>() {
        @Override
        public Pair<T, Integer> map(T t) { return Pair.of(t, 1); }
      }, ptype),
      sampleSize,
      random);
}
 
Developer: apsaltis, Project: oryx, Lines of code: 15, Source: ReservoirSampling.java
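
Example 3 gives every element a weight of 1 and delegates to weightedSample, which makes it a uniform sample of sampleSize elements. The body of the underlying sampler is not shown in these excerpts, so the sketch below uses the classic uniform reservoir algorithm (Algorithm R) on a single machine instead. ReservoirSketch is a hypothetical name, and java.util.Random stands in for the RandomGenerator used in the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** In-memory sketch of uniform reservoir sampling (Algorithm R); names are illustrative only. */
final class ReservoirSketch {

  /** Returns up to sampleSize elements of input, each element being equally likely to be kept. */
  static <T> List<T> sample(Iterable<T> input, int sampleSize, Random random) {
    List<T> reservoir = new ArrayList<>(sampleSize);
    long seen = 0;
    for (T item : input) {
      seen++;
      if (reservoir.size() < sampleSize) {
        // Fill phase: the first sampleSize elements are always kept.
        reservoir.add(item);
      } else {
        // Pick a position in [0, seen); the item replaces a slot only if the
        // position falls inside the reservoir, i.e. with probability sampleSize / seen.
        long j = (long) (random.nextDouble() * seen);
        if (j < sampleSize) {
          reservoir.set((int) j, item);
        }
      }
    }
    return reservoir;
  }

  public static void main(String[] args) {
    List<Integer> numbers = new ArrayList<>();
    for (int i = 0; i < 100; i++) {
      numbers.add(i);
    }
    System.out.println(sample(numbers, 5, new Random(7L)));
  }
}
```

The reservoir needs only one pass and memory proportional to sampleSize, which is why this family of algorithms suits inputs whose size is not known in advance.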

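Example 4: weightedSample

import org.apache.crunch.MapFn;
import org.apache.crunch.PCollection; // import the package/class this method depends on
import org.apache.crunch.PTable;
import org.apache.crunch.Pair;
import org.apache.crunch.types.PTypeFamily;
// RandomGenerator comes from the project's dependencies; groupedWeightedSample is defined
// elsewhere in the same source file and is not shown in this excerpt.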
public static <T, N extends Number> PCollection<T> weightedSample(
    PCollection<Pair<T, N>> input,
    int sampleSize,
    RandomGenerator random) {
  PTypeFamily ptf = input.getTypeFamily();
  PTable<Integer, Pair<T, N>> groupedIn = input.parallelDo(
      new MapFn<Pair<T, N>, Pair<Integer, Pair<T, N>>>() {
        @Override
        public Pair<Integer, Pair<T, N>> map(Pair<T, N> p) {
          return Pair.of(0, p);
        }
      }, ptf.tableOf(ptf.ints(), input.getPType()));
  return groupedWeightedSample(groupedIn, sampleSize, random).values();
}
 
Developer: apsaltis, Project: oryx, Lines of code: 15, Source: ReservoirSampling.java


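Note: The org.apache.crunch.PCollection.getTypeFamily examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their respective developers. Copyright in the source code remains with the original authors, and distribution and use are subject to each project's License. Do not reproduce without permission.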