

Java Treebank.iterator Method Code Examples

This article collects typical usage examples of the Java method edu.stanford.nlp.trees.Treebank.iterator. If you are wondering how Treebank.iterator is used in practice, the selected examples below may help. You can also look further into usage examples of its enclosing class, edu.stanford.nlp.trees.Treebank.


Three code examples of the Treebank.iterator method are shown below, sorted by popularity by default.

Example 1: getSegmentedWordLengthDistribution

import edu.stanford.nlp.trees.Treebank; // import the package/class this method depends on
private Distribution<Integer> getSegmentedWordLengthDistribution(Treebank tb) {
  // CharacterLevelTagExtender ext = new CharacterLevelTagExtender();
  ClassicCounter<Integer> c = new ClassicCounter<Integer>();
  // Walk every gold tree in the treebank
  for (Iterator iterator = tb.iterator(); iterator.hasNext();) {
    Tree gold = (Tree) iterator.next();
    // Concatenate the gold words into one unsegmented character string
    StringBuilder goldChars = new StringBuilder();
    Sentence goldYield = gold.yield();
    for (Iterator wordIter = goldYield.iterator(); wordIter.hasNext();) {
      Word word = (Word) wordIter.next();
      goldChars.append(word);
    }
    // Re-segment the raw string and count the length of each resulting word
    Sentence ourWords = segmentWords(goldChars.toString());
    for (int i = 0; i < ourWords.size(); i++) {
      c.incrementCount(Integer.valueOf(ourWords.get(i).toString().length()));
    }
  }
  // Turn the raw counts into a distribution over word lengths
  return Distribution.getDistribution(c);
}
 
Developer: FabianFriedrich, Project: Text2Process, Lines of code: 19, Source: ChineseMarkovWordSegmenter.java

Example 2: getSegmentedWordLengthDistribution

import edu.stanford.nlp.trees.Treebank; // import the package/class this method depends on
private Distribution<Integer> getSegmentedWordLengthDistribution(Treebank tb) {
  // CharacterLevelTagExtender ext = new CharacterLevelTagExtender();
  ClassicCounter<Integer> c = new ClassicCounter<Integer>();
  // Walk every gold tree in the treebank
  for (Iterator iterator = tb.iterator(); iterator.hasNext();) {
    Tree gold = (Tree) iterator.next();
    // Concatenate the gold words into one unsegmented character string
    StringBuilder goldChars = new StringBuilder();
    ArrayList goldYield = gold.yield();
    for (Iterator wordIter = goldYield.iterator(); wordIter.hasNext();) {
      Word word = (Word) wordIter.next();
      goldChars.append(word);
    }
    // Re-segment the raw string and count the length of each resulting word
    List<HasWord> ourWords = segment(goldChars.toString());
    for (int i = 0; i < ourWords.size(); i++) {
      c.incrementCount(Integer.valueOf(ourWords.get(i).word().length()));
    }
  }
  // Turn the raw counts into a distribution over word lengths
  return Distribution.getDistribution(c);
}
 
Developer: amark-india, Project: eventspotter, Lines of code: 19, Source: ChineseMarkovWordSegmenter.java
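
The two snippets above depend on Stanford NLP classes and on a segmenter method that the excerpts do not show, so they cannot be run on their own. The following is a minimal sketch of the same iterate, concatenate, re-segment, and count pattern in plain Java. It is not the original projects' code: the treebank is modeled as an Iterable of word lists, and WordLengthSketch, wordLengthDistribution, and the toy segmenter chunkByTwo are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical stand-in; no Stanford NLP classes are used here.
class WordLengthSketch {

  // Each inner list plays the role of one tree's yield (its words).
  static Map<Integer, Double> wordLengthDistribution(
      Iterable<List<String>> yields, Function<String, List<String>> segmenter) {
    Map<Integer, Integer> counts = new TreeMap<>();
    int total = 0;
    // Explicit iterator, mirroring the tb.iterator() loop in the examples.
    for (Iterator<List<String>> it = yields.iterator(); it.hasNext();) {
      List<String> words = it.next();
      // Concatenate the words into one unsegmented string.
      StringBuilder chars = new StringBuilder();
      for (String word : words) {
        chars.append(word);
      }
      // Re-segment and count the length of every resulting word.
      for (String segment : segmenter.apply(chars.toString())) {
        counts.merge(segment.length(), 1, Integer::sum);
        total++;
      }
    }
    // Normalize the counts into relative frequencies.
    Map<Integer, Double> distribution = new TreeMap<>();
    for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
      distribution.put(e.getKey(), e.getValue() / (double) total);
    }
    return distribution;
  }

  // Toy segmenter (an assumption for the demo): cut the string into 2-character chunks.
  static List<String> chunkByTwo(String s) {
    List<String> out = new ArrayList<>();
    for (int i = 0; i < s.length(); i += 2) {
      out.add(s.substring(i, Math.min(i + 2, s.length())));
    }
    return out;
  }

  public static void main(String[] args) {
    List<List<String>> yields = Arrays.asList(
        Arrays.asList("ab", "c"),
        Arrays.asList("de"));
    System.out.println(wordLengthDistribution(yields, WordLengthSketch::chunkByTwo));
  }
}
```

Passing the segmenter as a Function keeps the iteration logic separate from the segmentation strategy, which is the only part that differs between Example 1 (segmentWords) and Example 2 (segment).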

Example 3: main

import edu.stanford.nlp.trees.Treebank; // import the package/class this method depends on
/**
 * Execute with no arguments for usage.
 */
public static void main(String[] args) {

  if(!validateCommandLine(args)) {
    System.err.println(usage);
    System.exit(-1);
  }

  final TreebankLangParserParams tlpp = Languages.getLanguageParams(LANGUAGE);
  final PrintWriter pwOut = tlpp.pw();

  final Treebank guessTreebank = tlpp.diskTreebank();
  guessTreebank.loadPath(guessFile);
  pwOut.println("GUESS TREEBANK:");
  pwOut.println(guessTreebank.textualSummary());

  final Treebank goldTreebank = tlpp.diskTreebank();
  goldTreebank.loadPath(goldFile);
  pwOut.println("GOLD TREEBANK:");
  pwOut.println(goldTreebank.textualSummary());

  final LeafAncestorEval metric = new LeafAncestorEval("LeafAncestor");

  final TreeTransformer tc = tlpp.collinizer();

  //The evalb ref implementation assigns status for each tree pair as follows:
  //
  //   0 - Ok (yields match)
  //   1 - length mismatch
  //   2 - null parse e.g. (()).
  //
  //In the cases of 1,2, evalb does not include the tree pair in the LP/LR computation.
  final Iterator<Tree> goldItr = goldTreebank.iterator();
  final Iterator<Tree> guessItr = guessTreebank.iterator();
  int goldLineId = 0;
  int guessLineId = 0;
  int skippedGuessTrees = 0;
  while( guessItr.hasNext() && goldItr.hasNext() ) {
    Tree guessTree = guessItr.next();
    List<Label> guessYield = guessTree.yield();
    guessLineId++;

    Tree goldTree = goldItr.next();
    List<Label> goldYield = goldTree.yield();
    goldLineId++;

    // Check that we should evaluate this tree
    if(goldYield.size() > MAX_GOLD_YIELD) {
      skippedGuessTrees++;
      continue;
    }

    // Only trees with equal yields can be evaluated
    if(goldYield.size() != guessYield.size()) {
      pwOut.printf("Yield mismatch gold: %d tokens vs. guess: %d tokens (lines: gold %d guess %d)%n", goldYield.size(), guessYield.size(), goldLineId, guessLineId);
      skippedGuessTrees++;
      continue;
    }
    
    final Tree evalGuess = tc.transformTree(guessTree);
    final Tree evalGold = tc.transformTree(goldTree);

    metric.evaluate(evalGuess, evalGold, ((VERBOSE) ? pwOut : null));
  }
  
  if(guessItr.hasNext() || goldItr.hasNext()) {
    System.err.printf("Guess/gold files do not have equal lengths (guess: %d gold: %d).%n", guessLineId, goldLineId);
  }
  
  pwOut.println("================================================================================");
  if(skippedGuessTrees != 0) pwOut.printf("%s %d guess trees%n", "Unable to evaluate", skippedGuessTrees);
  metric.display(true, pwOut);
  pwOut.close();
}
 
Developer: benblamey, Project: stanford-nlp, Lines of code: 77, Source: LeafAncestorEval.java
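
The core of Example 3 is advancing two iterators in lock step, skipping pairs that cannot be compared, and checking afterwards whether either side still has items. A minimal sketch of that control flow in plain Java follows. It is an illustration under stated assumptions rather than the evaluator itself: trees are modeled as word lists, the scoring step is reduced to a counter, and PairedIterationSketch, Result, and walk are hypothetical names.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in; no Stanford NLP classes are used here.
class PairedIterationSketch {

  // Summary of one pass over both sequences.
  static final class Result {
    final int evaluated;
    final int skipped;
    final boolean unequalLengths;

    Result(int evaluated, int skipped, boolean unequalLengths) {
      this.evaluated = evaluated;
      this.skipped = skipped;
      this.unequalLengths = unequalLengths;
    }
  }

  static Result walk(Iterable<List<String>> guess, Iterable<List<String>> gold, int maxGoldYield) {
    Iterator<List<String>> guessItr = guess.iterator();
    Iterator<List<String>> goldItr = gold.iterator();
    int evaluated = 0;
    int skipped = 0;
    // Advance both iterators together; stop when either one runs out.
    while (guessItr.hasNext() && goldItr.hasNext()) {
      List<String> guessWords = guessItr.next();
      List<String> goldWords = goldItr.next();
      // Skip pairs that are too long or whose yields differ in length.
      if (goldWords.size() > maxGoldYield || goldWords.size() != guessWords.size()) {
        skipped++;
        continue;
      }
      evaluated++; // a real evaluator would score the pair here
    }
    // Leftover items on either side mean the inputs had different lengths.
    boolean unequal = guessItr.hasNext() || goldItr.hasNext();
    return new Result(evaluated, skipped, unequal);
  }

  public static void main(String[] args) {
    List<List<String>> guess = Arrays.asList(
        Arrays.asList("a", "b"), Arrays.asList("c"));
    List<List<String>> gold = Arrays.asList(
        Arrays.asList("a", "b"), Arrays.asList("c", "d"));
    Result r = walk(guess, gold, 10);
    System.out.printf("evaluated=%d skipped=%d unequal=%b%n",
        r.evaluated, r.skipped, r.unequalLengths);
  }
}
```

Both next() calls happen before any skip check, so the two iterators stay aligned even when a pair is skipped.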


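Note: The edu.stanford.nlp.trees.Treebank.iterator examples in this article were compiled by 纯净天空 from GitHub, MSDocs, and other open-source code and documentation platforms. The snippets are taken from open-source projects, and copyright in the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.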