

Java InfoGain.getValueAtRank Method Code Example

This article collects a typical usage example of the cc.mallet.types.InfoGain.getValueAtRank method in Java. If you are wondering how InfoGain.getValueAtRank is used in practice, the example here may help. You can also look up usage examples for the enclosing class, cc.mallet.types.InfoGain.


One code example of the InfoGain.getValueAtRank method is shown below.

Example 1: labelFeatures

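import java.util.ArrayList;
import java.util.HashMap;

import cc.mallet.types.InfoGain; // the class this method depends on
import cc.mallet.types.InstanceList;
import cc.mallet.types.MatrixOps;
// getFeatureLabelCounts, getMaxIndices and logger are members of the enclosing class (FeatureConstraintUtil).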
/**
 * Label features using heuristic described in 
 * "Learning from Labeled Features using Generalized Expectation Criteria"
 * Gregory Druck, Gideon Mann, Andrew McCallum.
 * 
 * @param list InstanceList used to compute statistics for labeling features.
 * @param features List of features to label.
 * @param reject Whether the labeler may reject a feature (information gain below the cutoff, or too many candidate labels).
 * @return Labeled features, HashMap mapping feature indices to list of labels.
 */
public static HashMap<Integer, ArrayList<Integer>> labelFeatures(InstanceList list, ArrayList<Integer> features, boolean reject) {
  HashMap<Integer,ArrayList<Integer>> labeledFeatures = new HashMap<Integer,ArrayList<Integer>>();
  
  double[][] featureLabelCounts = getFeatureLabelCounts(list,true);
  
  int numLabels = list.getTargetAlphabet().size();
  
  int minRank = 100 * numLabels;
  
  InfoGain infogain = new InfoGain(list);
  double sum = 0;
  for (int rank = 0; rank < minRank; rank++) {
    sum += infogain.getValueAtRank(rank);
  }
  double mean = sum / minRank;
  
  for (int i = 0; i < features.size(); i++) {
    int fi = features.get(i);
    
    // reject features whose information gain is below the cutoff,
    // i.e. the mean information gain of the top minRank features
    if (reject && infogain.value(fi) < mean) {
      //System.err.println("Oracle labeler rejected labeling: " + list.getDataAlphabet().lookupObject(fi));
      logger.info("Oracle labeler rejected labeling: " + list.getDataAlphabet().lookupObject(fi));
      continue;
    }
    
    double[] prob = featureLabelCounts[fi];
    MatrixOps.plusEquals(prob,1e-8);
    MatrixOps.timesEquals(prob, 1./MatrixOps.sum(prob));
    int[] sortedIndices = getMaxIndices(prob);
    ArrayList<Integer> labels = new ArrayList<Integer>();

    if (numLabels > 2) {
      // take anything within a factor of 2 of the best
      // but no more than numLabels/2
      boolean discard = false;
      double threshold = prob[sortedIndices[0]] / 2;
      for (int li = 0; li < numLabels; li++) {
        if (prob[li] > threshold) {
          labels.add(li);
        }
        if (reject && labels.size() > (numLabels / 2)) {
          //System.err.println("Oracle labeler rejected labeling: " + list.getDataAlphabet().lookupObject(fi));
          logger.info("Oracle labeler rejected labeling: " + list.getDataAlphabet().lookupObject(fi));
          discard = true;
          break;
        }
      }
      if (discard) {
        continue;
      }
    }
    else {
      labels.add(sortedIndices[0]);
    }
    
    labeledFeatures.put(fi, labels);
  }
  return labeledFeatures;
}
 
Developer ID: kostagiolasn, project: NucleosomePatternClassifier, lines of code: 72, source file: FeatureConstraintUtil.java
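
The two heuristics in labelFeatures can be tried out without MALLET on the classpath. The sketch below is a plain-Java approximation, not the library's implementation: the class name InfoGainCutoffSketch and the methods meanOfTopRanked and labelsWithinFactorOfTwo are hypothetical names introduced here, and a sorted array of scores stands in for what infogain.getValueAtRank(rank) returns. It also clamps the number of ranks to the number of features, a safeguard the example above does not have (its loop assumes at least 100 * numLabels ranked features exist), and it leaves out the numLabels / 2 rejection check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InfoGainCutoffSketch {

    // Mean of the topK highest scores. This mirrors summing
    // infogain.getValueAtRank(rank) for rank = 0..topK-1 and dividing by topK.
    // Assumption: topK is clamped to scores.length, unlike the original loop.
    public static double meanOfTopRanked(double[] scores, int topK) {
        int k = Math.min(topK, scores.length);
        if (k <= 0) {
            throw new IllegalArgumentException("need at least one score and topK > 0");
        }
        double[] sorted = scores.clone();
        Arrays.sort(sorted); // ascending order
        double sum = 0;
        for (int rank = 0; rank < k; rank++) {
            sum += sorted[sorted.length - 1 - rank]; // rank 0 is the highest score
        }
        return sum / k;
    }

    // Indices of labels whose smoothed, normalized count is more than half
    // of the best label's probability (the "factor of 2" rule).
    public static List<Integer> labelsWithinFactorOfTwo(double[] counts) {
        double[] prob = counts.clone();
        double total = 0;
        for (int i = 0; i < prob.length; i++) {
            prob[i] += 1e-8; // same smoothing constant as the example
            total += prob[i];
        }
        double best = 0;
        for (int i = 0; i < prob.length; i++) {
            prob[i] /= total;
            best = Math.max(best, prob[i]);
        }
        List<Integer> labels = new ArrayList<>();
        for (int li = 0; li < prob.length; li++) {
            if (prob[li] > best / 2) {
                labels.add(li);
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        double[] scores = {0.9, 0.1, 0.5, 0.3}; // made-up information gain values
        double cutoff = meanOfTopRanked(scores, 2); // mean of 0.9 and 0.5
        for (int fi = 0; fi < scores.length; fi++) {
            String verdict = scores[fi] < cutoff ? "rejected" : "kept";
            System.out.println("feature " + fi + " " + verdict);
        }
        // made-up per-label counts for a single feature
        System.out.println(labelsWithinFactorOfTwo(new double[] {8, 5, 1}));
    }
}
```

With these made-up scores only feature 0 clears the cutoff, which shows how strict a mean-of-the-top-ranks threshold is: any feature below the average of the best few is dropped, even if it ranks highly overall.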


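Note: The cc.mallet.types.InfoGain.getValueAtRank example in this article was compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The code snippet was selected from an open-source project; copyright of the source code belongs to the original authors, and distribution and use should follow the corresponding project's license. Do not reproduce without permission.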