Java DirichletClusteringPolicy Class Code Examples

This article collects typical usage examples of the Java class org.apache.mahout.clustering.iterator.DirichletClusteringPolicy. If you are wondering what DirichletClusteringPolicy is for or how to use it in practice, the selected examples below may help.


The DirichletClusteringPolicy class belongs to the org.apache.mahout.clustering.iterator package. Two code examples of the class are shown below, sorted by popularity by default.

Example 1: buildClusters

import org.apache.mahout.clustering.iterator.DirichletClusteringPolicy; // import the required package/class
/**
 * Iterate over the input vectors to produce cluster directories for each iteration
 * 
 * @param conf
 *          the hadoop configuration
 * @param input
 *          the directory Path for input points
 * @param output
 *          the directory Path for output points
 * @param description
 *          model distribution parameters
 * @param numClusters
 *          the number of models to iterate over
 * @param maxIterations
 *          the maximum number of iterations
 * @param alpha0
 *          the alpha_0 value for the DirichletDistribution
 * @param runSequential
 *          execute sequentially if true
 * 
 * @return the Path of the final clusters directory
 */
public static Path buildClusters(Configuration conf, Path input, Path output, DistributionDescription description,
    int numClusters, int maxIterations, double alpha0, boolean runSequential) throws IOException,
    ClassNotFoundException, InterruptedException {
  Path clustersIn = new Path(output, Cluster.INITIAL_CLUSTERS_DIR);
  ModelDistribution<VectorWritable> modelDist = description.createModelDistribution(conf);
  
  List<Cluster> models = Lists.newArrayList();
  for (Model<VectorWritable> cluster : modelDist.sampleFromPrior(numClusters)) {
    models.add((Cluster) cluster);
  }
  
  ClusterClassifier prior = new ClusterClassifier(models, new DirichletClusteringPolicy(numClusters, alpha0));
  prior.writeToSeqFiles(clustersIn);
  
  if (runSequential) {
    ClusterIterator.iterateSeq(conf, input, clustersIn, output, maxIterations);
  } else {
    ClusterIterator.iterateMR(conf, input, clustersIn, output, maxIterations);
  }
  return output;
  
}
 
Developer: saradelrio, project: Chi-FRBCS-BigDataCS, lines of code: 45, source: DirichletDriver.java
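
Example 1 passes numClusters and alpha0 to the policy but does not show what a Dirichlet-style policy does with them. The following is a minimal conceptual sketch, not Mahout's implementation: the class name DirichletPolicySketch and all of its methods are hypothetical. It assumes a symmetric prior of alpha0 / numClusters per cluster and, to stay deterministic, uses the posterior mean of the mixture weights where a real sampler would draw from a Dirichlet distribution.

```java
import java.util.Arrays;
import java.util.Random;

/** Conceptual sketch only (hypothetical class); not the Mahout implementation. */
public class DirichletPolicySketch {
    private final double alpha0;
    private final double[] mixture; // current mixture weight of each cluster
    private final Random rng;

    public DirichletPolicySketch(int numClusters, double alpha0, long seed) {
        this.alpha0 = alpha0;
        this.mixture = new double[numClusters];
        Arrays.fill(this.mixture, 1.0 / numClusters); // start from uniform weights
        this.rng = new Random(seed);
    }

    /** Pick one cluster index with probability proportional to pdfs[k] * mixture[k]. */
    public int select(double[] pdfs) {
        double[] weights = new double[pdfs.length];
        double total = 0.0;
        for (int k = 0; k < pdfs.length; k++) {
            weights[k] = pdfs[k] * mixture[k];
            total += weights[k];
        }
        double u = rng.nextDouble() * total;
        for (int k = 0; k < weights.length; k++) {
            u -= weights[k];
            if (u < 0.0) {
                return k;
            }
        }
        // Reached only through rounding or when all weights are zero.
        return weights.length - 1;
    }

    /**
     * Recompute the mixture from per-cluster observation counts.
     * Assumption: posterior mean under a symmetric prior of alpha0 / K,
     * used here in place of an actual Dirichlet draw.
     */
    public void update(long[] counts) {
        long n = 0;
        for (long c : counts) {
            n += c;
        }
        double prior = alpha0 / counts.length;
        for (int k = 0; k < counts.length; k++) {
            mixture[k] = (counts[k] + prior) / (n + alpha0);
        }
    }

    /** Defensive copy of the current mixture weights. */
    public double[] mixture() {
        return mixture.clone();
    }
}
```

In this sketch a larger alpha0 pulls the weights toward uniform, so sparsely populated clusters keep a better chance of being selected. A smaller alpha0 lets the observed counts dominate.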

Example 2: clusterData

import org.apache.mahout.clustering.iterator.DirichletClusteringPolicy; // import the required package/class
/**
 * Run the job using supplied arguments
 * 
 * @param conf
 *          the hadoop configuration
 * @param input
 *          the directory pathname for input points
 * @param stateIn
 *          the directory pathname for input state
 * @param output
 *          the directory pathname for output points
 * @param alpha0
 *          the alpha_0 value for the DirichletDistribution
 * @param numModels
 *          the number of models
 * @param emitMostLikely
 *          if true, emit only the most likely cluster for each point
 * @param threshold
 *          when emitMostLikely is false, emit all clusters whose pdf exceeds this value
 * @param runSequential
 *          execute sequentially if true
 */
public static void clusterData(Configuration conf, Path input, Path stateIn, Path output, double alpha0,
    int numModels, boolean emitMostLikely, double threshold, boolean runSequential) throws IOException,
    InterruptedException, ClassNotFoundException {
  ClusterClassifier.writePolicy(new DirichletClusteringPolicy(numModels, alpha0), stateIn);
  ClusterClassificationDriver.run(conf, input, output, new Path(output, PathDirectory.CLUSTERED_POINTS_DIRECTORY), threshold,
      emitMostLikely, runSequential);
}
 
Developer: saradelrio, project: Chi-FRBCS-BigDataCS, lines of code: 25, source: DirichletDriver.java
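
The Javadoc for emitMostLikely and threshold in Example 2 describes two emission modes. The sketch below illustrates that rule in isolation, assuming the per-cluster pdf values for a point are already available as an array. The class EmitSelectionSketch and the method clustersToEmit are hypothetical names for illustration and are not part of Mahout.

```java
import java.util.ArrayList;
import java.util.List;

/** Conceptual sketch only (hypothetical class); not the Mahout implementation. */
public class EmitSelectionSketch {

    /**
     * Return the indices of the clusters a point would be emitted to.
     * If emitMostLikely is true, only the cluster with the highest pdf is returned;
     * otherwise every cluster whose pdf is strictly greater than threshold is returned.
     */
    public static List<Integer> clustersToEmit(double[] pdfs, boolean emitMostLikely,
                                               double threshold) {
        List<Integer> result = new ArrayList<>();
        if (emitMostLikely) {
            if (pdfs.length > 0) {
                int best = 0;
                for (int k = 1; k < pdfs.length; k++) {
                    if (pdfs[k] > pdfs[best]) {
                        best = k;
                    }
                }
                result.add(best);
            }
        } else {
            for (int k = 0; k < pdfs.length; k++) {
                if (pdfs[k] > threshold) {
                    result.add(k);
                }
            }
        }
        return result;
    }
}
```

With pdfs of {0.1, 0.6, 0.3}, setting emitMostLikely to true yields only index 1. Setting it to false with a threshold of 0.2 yields indices 1 and 2.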


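Note: The org.apache.mahout.clustering.iterator.DirichletClusteringPolicy examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as Github/MSDocs. The snippets come from open-source projects, and copyright in the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.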