

Java TaggedToken Class Code Examples

This article collects typical usage examples of the Java class edu.jhu.hlt.concrete.TaggedToken. If you are wondering how the TaggedToken class is used in practice, the selected examples below may help.


The TaggedToken class belongs to the edu.jhu.hlt.concrete package. The 11 code examples below are sorted by popularity by default.

Example 1: addPos

import edu.jhu.hlt.concrete.TaggedToken; // import the required package/class
private void addPos(AnnoSentenceCollection sents, Communication comm) {
    if (!sents.someHaveAt(AT.POS)) { return; }
    List<Tokenization> ts = getTokenizationsCorrespondingTo(sents, comm);
    AnnotationMetadata meta = new AnnotationMetadata();
    meta.setTool(POS_TOOL);
    meta.setTimestamp(timestamp);
    for(int i=0; i<sents.size(); i++) {
        Tokenization t = ts.get(i);
        AnnoSentence s = sents.get(i);
        List<TaggedToken> taggedTokens = new ArrayList<>();
        for (int j=0; j < s.size(); j++) {
            TaggedToken taggedToken = new TaggedToken();
            taggedToken.setTag(s.getPosTag(j));
            taggedToken.setTokenIndex(j);
            taggedTokens.add(taggedToken);
        }
        TokenTagging tokenTagging = new TokenTagging(getUUID(), meta, taggedTokens);
        tokenTagging.setTaggingType("POS");
        t.addToTokenTaggingList(tokenTagging);
    }
}
 
Developer: mgormley, Project: pacaya-nlp, Lines of code: 22, Source file: ConcreteWriter.java

Example 2: addLemmata

import edu.jhu.hlt.concrete.TaggedToken; // import the required package/class
private void addLemmata(AnnoSentenceCollection sents, Communication comm) {
    if (!sents.someHaveAt(AT.LEMMA)) { return; }
    List<Tokenization> ts = getTokenizationsCorrespondingTo(sents, comm);
    AnnotationMetadata meta = new AnnotationMetadata();
    meta.setTool(LEMMA_TOOL);
    meta.setTimestamp(timestamp);
    for(int i=0; i<sents.size(); i++) {
        Tokenization t = ts.get(i);
        AnnoSentence s = sents.get(i);
        List<TaggedToken> taggedTokens = new ArrayList<>();
        for (int j=0; j < s.size(); j++) {
            TaggedToken taggedToken = new TaggedToken();
            taggedToken.setTag(s.getLemma(j));
            taggedToken.setTokenIndex(j);
            taggedTokens.add(taggedToken);
        }
        TokenTagging tokenTagging = new TokenTagging(getUUID(), meta, taggedTokens);
        tokenTagging.setTaggingType("LEMMA");
        t.addToTokenTaggingList(tokenTagging);
    }
}
 
Developer: mgormley, Project: pacaya-nlp, Lines of code: 22, Source file: ConcreteWriter.java

Example 3: addTagging

import edu.jhu.hlt.concrete.TaggedToken; // import the required package/class
private static void addTagging(Tokenization tokenization, String tagType, String toolName, String[] tags) {
    List<TaggedToken> taggedTokenList = new ArrayList<>();
    int i = 0;
    for (String tag : tags) {
        TaggedToken t = new TaggedToken();
        t.setTag(tag);
        t.setTokenIndex(i++);
        taggedTokenList.add(t);
    }
    TokenTagging tt = new TokenTagging();
    tt.setUuid(getUUID());
    tt.setMetadata(getMetadata(toolName));
    tt.setTaggedTokenList(taggedTokenList);
    tt.setTaggingType(tagType);
    tokenization.addToTokenTaggingList(tt);
}
 
Developer: mgormley, Project: pacaya-nlp, Lines of code: 17, Source file: ConcreteReaderTest.java

Example 4: generateConcreteTokenization

import edu.jhu.hlt.concrete.TaggedToken; // import the required package/class
/**
 * Generate a {@link Tokenization} object from an array of tokens, an array of tags, an array of offsets, and the start position of the text (e.g., first text character in
 * the text). Assumes tags are part of speech tags.
 *
 * Invokes {@link #generateConcreteTokenization(List, int[], int)} then adds tagging.
 *
 * @see #generateConcreteTokenization(List, int[], int)
 *
 * @param tokens
 *          - an array of tokens (Strings)
 * @param tokenTags
 *          - an array of tags, one per token; null entries are skipped
 * @param offsets
 *          - an array of integers (offsets)
 * @param startPos
 *          - starting position of the text
 * @return a {@link Tokenization} object with correct tokenization and token tagging
 */
public static Tokenization generateConcreteTokenization(String[] tokens, String[] tokenTags, int[] offsets, int startPos) {
  Tokenization tokenization = generateConcreteTokenization(tokens, offsets, startPos);
  TokenTagging tt = new TokenTagging();
  tt.setUuid(UUIDFactory.newUUID());
  tt.setTaggingType("twitter");
  tt.setMetadata(new AnnotationMetadata(tiftMetadata));
  for (int i = 0; i < tokens.length; i++) {
    String tag = tokenTags[i];
    if (tag != null) {
      TaggedToken tok = new TaggedToken();
      tok.setTokenIndex(i).setTag(tag);
      tt.addToTaggedTokenList(tok);
    }
  }

  // Do not set the tags if everything was "null".
  if (tt.isSetTaggedTokenList())
    tokenization.addToTokenTaggingList(tt);

  return tokenization;
}
 
Developer: hltcoe, Project: concrete-java, Lines of code: 38, Source file: ConcreteTokenization.java
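
The generateConcreteTokenization snippet above skips null tags, so the resulting tagged-token list can be shorter than the token list, and each entry relies on its token index to point back at the right token. The following is a minimal sketch of that idea under stated assumptions: SimpleTaggedToken and tagNonNull are hypothetical stand-ins written for illustration, not part of the concrete library, and the length check is an addition that the original method does not perform.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: SimpleTaggedToken is a hypothetical stand-in,
// not the generated edu.jhu.hlt.concrete.TaggedToken class.
final class SparseTaggingSketch {

  static final class SimpleTaggedToken {
    final int tokenIndex;
    final String tag;

    SimpleTaggedToken(int tokenIndex, String tag) {
      this.tokenIndex = tokenIndex;
      this.tag = tag;
    }
  }

  // Emits one tagged token per non-null tag. The index is the token's position,
  // so the result may be shorter than the token array.
  static List<SimpleTaggedToken> tagNonNull(String[] tokens, String[] tags) {
    // Assumption: a length mismatch is rejected up front (the original does not check this).
    if (tokens.length != tags.length) {
      throw new IllegalArgumentException("tokens and tags must have the same length");
    }
    List<SimpleTaggedToken> out = new ArrayList<>();
    for (int i = 0; i < tokens.length; i++) {
      if (tags[i] != null) {
        out.add(new SimpleTaggedToken(i, tags[i]));
      }
    }
    return out;
  }
}
```

Because the index travels with each tag, a consumer can still map a tag to its token even when most tokens carry no tag.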

Example 5: toTaggedToken

import edu.jhu.hlt.concrete.TaggedToken; // import the required package/class
private TaggedToken toTaggedToken(final String tag) {
  TaggedToken tt = new TaggedToken();
  int idx = this.orig.getIndex() - 1;

  tt.setTokenIndex(idx);
  tt.setTag(tag);

  return tt;
}
 
Developer: hltcoe, Project: concrete-stanford-deprecated2, Lines of code: 10, Source file: PreNERCoreLabelWrapper.java

Example 6: getTagging

import edu.jhu.hlt.concrete.TaggedToken; // import the required package/class
private static List<String> getTagging(TokenTagging tagging) {
    if (tagging == null) {
        return null;
    }
    List<String> tags = new ArrayList<String>();
    for (TaggedToken tok : tagging.getTaggedTokenList()) {
        tags.add(tok.getTag());
    }
    return tags;
}
 
Developer: mgormley, Project: pacaya-nlp, Lines of code: 11, Source file: ConcreteReader.java
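
The getTagging snippet above returns tags in list order, which lines up with token positions only when every token carries exactly one tag. If a tagging may be sparse, a positional array keyed by token index could be built instead. This is a rough sketch under that assumption: SimpleTaggedToken and alignTags are hypothetical names introduced here, not library API.

```java
import java.util.List;

// Illustration only: SimpleTaggedToken is a hypothetical stand-in,
// not the generated edu.jhu.hlt.concrete.TaggedToken class.
final class AlignedTagsSketch {

  static final class SimpleTaggedToken {
    final int tokenIndex;
    final String tag;

    SimpleTaggedToken(int tokenIndex, String tag) {
      this.tokenIndex = tokenIndex;
      this.tag = tag;
    }
  }

  // Places each tag at its token index; positions without a tag stay null.
  static String[] alignTags(List<SimpleTaggedToken> tagged, int tokenCount) {
    String[] aligned = new String[tokenCount];
    for (SimpleTaggedToken t : tagged) {
      if (t.tokenIndex < 0 || t.tokenIndex >= tokenCount) {
        throw new IllegalArgumentException("token index out of range: " + t.tokenIndex);
      }
      aligned[t.tokenIndex] = t.tag;
    }
    return aligned;
  }
}
```

Callers then look up a tag by token position and treat null as "no tag for this token".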

Example 7: testEndURL

import edu.jhu.hlt.concrete.TaggedToken; // import the required package/class
@Test
public void testEndURL() {
  final String test = "'La traición vendrá de un general de alto rango que generará un gran caos' - http://t.co/MgLypirfTV http://…";
  Tokenization t = Tokenizer.TWITTER.tokenizeToConcrete(test);
  assertTrue(t.isSetTokenTaggingList());
  List<TokenTagging> ttl = t.getTokenTaggingList();
  assertEquals(1, ttl.size());
  TokenTagging tt = ttl.get(0);
  assertEquals("twitter", tt.getTaggingType());
  List<TaggedToken> tagTL = tt.getTaggedTokenList().stream()
      .filter(tagtok -> tagtok.getTag().equals("URL"))
      .collect(Collectors.toList());
  logger.debug("Tags:");
  tagTL.stream()
      .map(TaggedToken::getTag)
      .forEach(logger::debug);
  assertEquals(2, tagTL.size());
  TaggedToken last = tagTL.get(tagTL.size() - 1);
  assertTrue(t.isSetTokenList());
  List<Token> tl = t.getTokenList().getTokenList();
  logger.debug("tokens:");
  tl.stream()
    .map(Token::getText)
    .forEach(logger::debug);
  assertEquals("Should get 'http://…' as text for last token.", "http://…", tl.get(last.getTokenIndex()).getText());
  assertEquals("Type of last token should be 'URL'.", "URL", last.getTag());
}
 
Developer: hltcoe, Project: concrete-java, Lines of code: 28, Source file: TokenizerTest.java

Example 8: toNERToken

import edu.jhu.hlt.concrete.TaggedToken; // import the required package/class
public Optional<TaggedToken> toNERToken() {
  return this.nerTag.map(x -> this.toTaggedToken(x));
}
 
Developer: hltcoe, Project: concrete-stanford-deprecated2, Lines of code: 4, Source file: PreNERCoreLabelWrapper.java

Example 9: toLemmaToken

import edu.jhu.hlt.concrete.TaggedToken; // import the required package/class
public Optional<TaggedToken> toLemmaToken() {
  return this.lemmaTag.map(x -> this.toTaggedToken(x));
}
 
Developer: hltcoe, Project: concrete-stanford-deprecated2, Lines of code: 4, Source file: PreNERCoreLabelWrapper.java

Example 10: toPOSToken

import edu.jhu.hlt.concrete.TaggedToken; // import the required package/class
public Optional<TaggedToken> toPOSToken() {
  return this.posTag.map(x -> this.toTaggedToken(x));
}
 
Developer: hltcoe, Project: concrete-stanford-deprecated2, Lines of code: 4, Source file: PreNERCoreLabelWrapper.java
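
The toNERToken, toLemmaToken, and toPOSToken snippets all map an Optional tag through toTaggedToken, which in turn subtracts 1 from the wrapped label's index. The sketch below shows how that combination behaves with a present and an absent tag. It assumes the source index is 1-based, which the "- 1" in toTaggedToken suggests but the snippets do not state. SimpleTaggedToken is a hypothetical stand-in, not the library class.

```java
import java.util.Optional;

// Illustration only: SimpleTaggedToken is a hypothetical stand-in,
// not the generated edu.jhu.hlt.concrete.TaggedToken class.
final class OptionalTagSketch {

  static final class SimpleTaggedToken {
    final int tokenIndex;
    final String tag;

    SimpleTaggedToken(int tokenIndex, String tag) {
      this.tokenIndex = tokenIndex;
      this.tag = tag;
    }
  }

  // Assumption: oneBasedIndex counts tokens from 1, so it is shifted to a 0-based token index.
  // An empty Optional stays empty, so no tagged token is created for a missing tag.
  static Optional<SimpleTaggedToken> toTaggedToken(Optional<String> tag, int oneBasedIndex) {
    return tag.map(t -> new SimpleTaggedToken(oneBasedIndex - 1, t));
  }
}
```

Returning Optional lets the caller decide whether a missing tag means "skip this token" or "report an error", rather than storing a null tag.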

Example 11: ValidatableTokenTagging

import edu.jhu.hlt.concrete.TaggedToken; // import the required package/class
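/**
 * Wraps a {@link TokenTagging} together with its parent {@link Tokenization} and
 * collects the token indices of both, along with their maxima, for later validation.
 */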
public ValidatableTokenTagging(TokenTagging tagging, Tokenization parent) {
  this.tagging = tagging;
  this.parent = parent;
  
  // TODO: only accept correct Tokenization
  TokenizationKind kind = parent.getKind();
  switch (kind) {
  case TOKEN_LIST:
    TokenList tok = parent.getTokenList();
    List<Token> tokList = tok.getTokenList();
    List<Integer> tokIndicesList = new ArrayList<Integer>();
    int tmpIdx = -1;
    for (Token t : tokList) {
      final int tidx = t.getTokenIndex();
      tokIndicesList.add(tidx);
      if (tmpIdx < tidx)
        tmpIdx = tidx;  
    }
    
    this.maxTokenIdx = tmpIdx;
    this.tokIndices = tokIndicesList;
    break;
  default:
    throw new IllegalArgumentException("Validating of tokenization type: " + parent.getKind() + " not supported.");
  }
  
  List<TaggedToken> ttList = this.tagging.getTaggedTokenList();
  this.tokenTaggings = ttList;
  if (ttList.size() > 0) {
    this.ttIndices = new ArrayList<Integer>();
    
    int tmpMaxIdx = -1;
    for (TaggedToken tt : ttList) {
      int ttIndex = tt.getTokenIndex();
      if (tmpMaxIdx < ttIndex)
        tmpMaxIdx = ttIndex;
      this.ttIndices.add(ttIndex);
    }
    
    this.maxTTIndex = tmpMaxIdx;
  } else {
    this.ttIndices = new ArrayList<>();
    this.maxTTIndex = -1;
  }
}
 
Developer: hltcoe, Project: concrete-java, Lines of code: 49, Source file: ValidatableTokenTagging.java
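
The ValidatableTokenTagging constructor above only gathers index lists; the checks that use them are not shown. As a loose illustration of what such a check might look like, the sketch below tests that every tagged index refers to a known token and that no index is tagged twice. Both rules are assumptions made for this example, not the validation that concrete-java actually performs, and indicesAreConsistent is a hypothetical helper.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: the rules below are assumed, not taken from concrete-java.
final class TaggingIndexCheckSketch {

  // Returns true when each tagged index exists among the token indices
  // and appears at most once in the tagging.
  static boolean indicesAreConsistent(List<Integer> tokIndices, List<Integer> ttIndices) {
    Set<Integer> known = new HashSet<>(tokIndices);
    Set<Integer> seen = new HashSet<>();
    for (Integer idx : ttIndices) {
      if (!known.contains(idx)) {
        return false; // tag points at a token that does not exist
      }
      if (!seen.add(idx)) {
        return false; // the same token is tagged twice
      }
    }
    return true;
  }
}
```

An empty tagging passes this check, which matches the constructor's handling of an empty tagged-token list as a legal state.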


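Note: The edu.jhu.hlt.concrete.TaggedToken class examples in this article were compiled by 纯净天空 from GitHub, MSDocs, and other open-source code and documentation platforms. The snippets were selected from open-source projects, and copyright of the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.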