

Java DirichletClusteringPolicy Class Code Examples

This article collects typical usage examples of the Java class org.apache.mahout.clustering.iterator.DirichletClusteringPolicy. If you are wondering how DirichletClusteringPolicy is used in practice, the selected examples here may help.


The DirichletClusteringPolicy class belongs to the org.apache.mahout.clustering.iterator package. Two code examples of the class are shown below, sorted by popularity by default.

Example 1: buildClusters

import org.apache.mahout.clustering.iterator.DirichletClusteringPolicy; // import the required package/class
/**
 * Iterate over the input vectors to produce cluster directories for each iteration
 * 
 * @param conf
 *          the hadoop configuration
 * @param input
 *          the directory Path for input points
 * @param output
 *          the directory Path for output points
 * @param description
 *          model distribution parameters
 * @param numClusters
 *          the number of models to iterate over
 * @param maxIterations
 *          the maximum number of iterations
 * @param alpha0
 *          the alpha_0 value for the DirichletDistribution
 * @param runSequential
 *          execute sequentially if true
 * 
 * @return the output Path, under which the cluster directories for each iteration are written
 */
public static Path buildClusters(Configuration conf, Path input, Path output, DistributionDescription description,
    int numClusters, int maxIterations, double alpha0, boolean runSequential) throws IOException,
    ClassNotFoundException, InterruptedException {
  Path clustersIn = new Path(output, Cluster.INITIAL_CLUSTERS_DIR);
  ModelDistribution<VectorWritable> modelDist = description.createModelDistribution(conf);
  
  List<Cluster> models = Lists.newArrayList();
  for (Model<VectorWritable> cluster : modelDist.sampleFromPrior(numClusters)) {
    models.add((Cluster) cluster);
  }
  
  ClusterClassifier prior = new ClusterClassifier(models, new DirichletClusteringPolicy(numClusters, alpha0));
  prior.writeToSeqFiles(clustersIn);
  
  if (runSequential) {
    ClusterIterator.iterateSeq(conf, input, clustersIn, output, maxIterations);
  } else {
    ClusterIterator.iterateMR(conf, input, clustersIn, output, maxIterations);
  }
  return output;
  
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 45, Source: DirichletDriver.java
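
To make the roles of numClusters and alpha0 more concrete, the following is a minimal standalone sketch of the general idea behind a Dirichlet-style clustering policy: mixture weights are drawn from a Dirichlet distribution based on per-cluster observation counts, and a point is then assigned to a cluster with probability proportional to weight times pdf. It is not Mahout's implementation. The class name DirichletPolicySketch, its method names, and the choice of alpha0 / k as the per-cluster prior are assumptions made for illustration only.

```java
import java.util.Random;

/** Standalone illustration only; this is not Mahout's DirichletClusteringPolicy. */
final class DirichletPolicySketch {

  /** Draws from Gamma(shape, 1) using the Marsaglia-Tsang method. */
  static double sampleGamma(double shape, Random rng) {
    if (shape < 1.0) {
      // Boost small shapes: Gamma(a) = Gamma(a + 1) * U^(1/a)
      return sampleGamma(shape + 1.0, rng) * Math.pow(rng.nextDouble(), 1.0 / shape);
    }
    double d = shape - 1.0 / 3.0;
    double c = 1.0 / Math.sqrt(9.0 * d);
    while (true) {
      double x = rng.nextGaussian();
      double v = 1.0 + c * x;
      if (v <= 0.0) {
        continue; // reject proposals outside the support
      }
      v = v * v * v;
      double u = rng.nextDouble();
      if (Math.log(u) < 0.5 * x * x + d - d * v + d * Math.log(v)) {
        return d * v;
      }
    }
  }

  /** Samples mixture weights from Dirichlet(counts[i] + alpha0 / k) (assumed prior). */
  static double[] sampleMixture(double[] counts, double alpha0, Random rng) {
    int k = counts.length;
    double[] weights = new double[k];
    double sum = 0.0;
    for (int i = 0; i < k; i++) {
      weights[i] = sampleGamma(counts[i] + alpha0 / k, rng);
      sum += weights[i];
    }
    for (int i = 0; i < k; i++) {
      weights[i] /= sum; // normalized gamma draws form a Dirichlet sample
    }
    return weights;
  }

  /** Picks a cluster index with probability proportional to mixture[i] * pdf[i]. */
  static int select(double[] mixture, double[] pdf, Random rng) {
    double total = 0.0;
    for (int i = 0; i < pdf.length; i++) {
      total += mixture[i] * pdf[i];
    }
    double r = rng.nextDouble() * total;
    double cumulative = 0.0;
    for (int i = 0; i < pdf.length; i++) {
      cumulative += mixture[i] * pdf[i];
      if (r < cumulative) {
        return i;
      }
    }
    return pdf.length - 1; // guard against floating-point rounding
  }
}
```

In this sketch, a larger alpha0 pulls the sampled weights toward a more even spread across clusters, while large observation counts let the data dominate. How Mahout itself samples and applies these weights may differ in detail.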

Example 2: clusterData

import org.apache.mahout.clustering.iterator.DirichletClusteringPolicy; // import the required package/class
/**
 * Run the job using supplied arguments
 * 
 * @param conf
 *          the hadoop configuration
 * @param input
 *          the directory pathname for input points
 * @param stateIn
 *          the directory pathname for input state
 * @param output
 *          the directory pathname for output points
 * @param alpha0
 *          the alpha_0 value for the DirichletDistribution
 * @param numModels
 *          the number of models
 * @param emitMostLikely
 *          if true, emit only the most likely cluster for each point
 * @param threshold
 *          if emitMostLikely is false, emit all clusters whose pdf exceeds this value
 * @param runSequential
 *          execute sequentially if true
 */
public static void clusterData(Configuration conf, Path input, Path stateIn, Path output, double alpha0,
    int numModels, boolean emitMostLikely, double threshold, boolean runSequential) throws IOException,
    InterruptedException, ClassNotFoundException {
  ClusterClassifier.writePolicy(new DirichletClusteringPolicy(numModels, alpha0), stateIn);
  ClusterClassificationDriver.run(conf, input, output, new Path(output, PathDirectory.CLUSTERED_POINTS_DIRECTORY), threshold,
      emitMostLikely, runSequential);
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 25, Source: DirichletDriver.java
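
The emitMostLikely and threshold parameters documented above can be illustrated with a small standalone sketch that follows the Javadoc wording literally: either emit the single cluster with the highest pdf, or emit every cluster whose pdf exceeds the threshold. The class EmitSketch and the method clustersToEmit are hypothetical names, and the actual ClusterClassificationDriver may apply the threshold differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone illustration only; not Mahout's ClusterClassificationDriver. */
final class EmitSketch {

  /** Returns the indices of the clusters a point would be emitted with. */
  static List<Integer> clustersToEmit(double[] pdf, boolean emitMostLikely, double threshold) {
    List<Integer> result = new ArrayList<>();
    if (pdf.length == 0) {
      return result; // nothing to emit for an empty pdf vector
    }
    if (emitMostLikely) {
      // Keep only the index of the largest pdf value.
      int best = 0;
      for (int i = 1; i < pdf.length; i++) {
        if (pdf[i] > pdf[best]) {
          best = i;
        }
      }
      result.add(best);
    } else {
      // Keep every cluster whose pdf is strictly greater than the threshold.
      for (int i = 0; i < pdf.length; i++) {
        if (pdf[i] > threshold) {
          result.add(i);
        }
      }
    }
    return result;
  }
}
```

For a point with pdf values {0.1, 0.6, 0.3}, this sketch would emit only cluster 1 when emitMostLikely is true, and clusters 1 and 2 when emitMostLikely is false with a threshold of 0.2.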


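Note: The org.apache.mahout.clustering.iterator.DirichletClusteringPolicy examples in this article were compiled by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets are selected from open-source projects, and copyright in the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.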