

Java TokenTagging Class Code Examples

This article collects typical usage examples of edu.jhu.hlt.concrete.TokenTagging in Java. If you are wondering how the Java TokenTagging class is used in practice, the selected examples below may help.


The TokenTagging class belongs to the edu.jhu.hlt.concrete package. 15 code examples of the class are shown below, sorted by popularity by default.

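Example 1: StanfordPreNERCommunication

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
/**
 * Collects every {@link TokenTagging} from the tokenizations of the given {@link Communication} and splits them
 * into POS, NER, and lemma lists by tagging type; dependency parses are gathered separately.
 */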
StanfordPreNERCommunication(final Communication c) throws MiscommunicationException {
  this.ctc = new CachedTokenizationCommunication(c);
  final List<TokenTagging> ttList = new ArrayList<TokenTagging>();
  List<Tokenization> tkzList = this.ctc.getTokenizations();
  tkzList.stream().filter(tkz -> tkz.isSetTokenTaggingList())
      .forEach(tkz -> ttList.addAll(tkz.getTokenTaggingList()));

  this.posTTList = ttList.stream()
      .filter(tt -> tt.getTaggingType().equalsIgnoreCase("POS"))
      .collect(Collectors.toList());

  this.nerTTList = ttList.stream()
      .filter(tt -> tt.getTaggingType().equalsIgnoreCase("NER"))
      .collect(Collectors.toList());

  this.lemmaTTList = ttList.stream()
      .filter(tt -> tt.getTaggingType().equalsIgnoreCase("lemma"))
      .collect(Collectors.toList());

  this.depParseList = new ArrayList<>();
  this.ctc.getTokenizations().stream()
      .filter(tkz -> tkz.isSetDependencyParseList())
      .forEach(tkz -> this.depParseList.addAll(tkz.getDependencyParseList()));
}
 
Developer: hltcoe, Project: concrete-stanford-deprecated2, Lines of code: 28, Source: StanfordPreNERCommunication.java
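The constructor in Example 1 walks the collected taggings once per tagging type, and it calls `equalsIgnoreCase` on the tagging type directly, which would fail if a tagging's type were unset. The following is a minimal sketch of the same idea done in a single pass. It does not use the Concrete library: `Tagging` and `groupByType` are hypothetical stand-ins introduced only for illustration, and only the tagging type is modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

final class TaggingGroupingSketch {

  // Hypothetical stand-in for edu.jhu.hlt.concrete.TokenTagging;
  // only the tagging type and a label are modelled here.
  static final class Tagging {
    final String taggingType;
    final String label;

    Tagging(String taggingType, String label) {
      this.taggingType = taggingType;
      this.label = label;
    }
  }

  // Groups taggings by upper-cased tagging type in a single pass,
  // skipping entries whose type is null.
  static Map<String, List<Tagging>> groupByType(List<Tagging> taggings) {
    return taggings.stream()
        .filter(t -> t.taggingType != null)
        .collect(Collectors.groupingBy(t -> t.taggingType.toUpperCase(Locale.ROOT)));
  }

  public static void main(String[] args) {
    List<Tagging> all = new ArrayList<>();
    all.add(new Tagging("POS", "a"));
    all.add(new Tagging("pos", "b"));
    all.add(new Tagging("NER", "c"));
    all.add(new Tagging("lemma", "d"));
    all.add(new Tagging(null, "e"));

    Map<String, List<Tagging>> byType = groupByType(all);
    System.out.println(byType.get("POS").size());     // prints 2
    System.out.println(byType.containsKey("LEMMA"));  // prints true
  }
}
```

Upper-casing the key mirrors the case-insensitive comparison in the original, so "POS" and "pos" land in the same group.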

Example 2: addPos

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
private void addPos(AnnoSentenceCollection sents, Communication comm) {
    if (!sents.someHaveAt(AT.POS)) { return; }
    List<Tokenization> ts = getTokenizationsCorrespondingTo(sents, comm);
    AnnotationMetadata meta = new AnnotationMetadata();
    meta.setTool(POS_TOOL);
    meta.setTimestamp(timestamp);
    for(int i=0; i<sents.size(); i++) {
        Tokenization t = ts.get(i);
        AnnoSentence s = sents.get(i);
        List<TaggedToken> taggedTokens = new ArrayList<>();
        for (int j=0; j < s.size(); j++) {
            TaggedToken taggedToken = new TaggedToken();
            taggedToken.setTag(s.getPosTag(j));
            taggedToken.setTokenIndex(j);
            taggedTokens.add(taggedToken);
        }
        TokenTagging tokenTagging = new TokenTagging(getUUID(), meta, taggedTokens);
        tokenTagging.setTaggingType("POS");
        t.addToTokenTaggingList(tokenTagging);
    }
}
 
Developer: mgormley, Project: pacaya-nlp, Lines of code: 22, Source: ConcreteWriter.java

Example 3: addLemmata

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
private void addLemmata(AnnoSentenceCollection sents, Communication comm) {
    if (!sents.someHaveAt(AT.LEMMA)) { return; }
    List<Tokenization> ts = getTokenizationsCorrespondingTo(sents, comm);
    AnnotationMetadata meta = new AnnotationMetadata();
    meta.setTool(LEMMA_TOOL);
    meta.setTimestamp(timestamp);
    for(int i=0; i<sents.size(); i++) {
        Tokenization t = ts.get(i);
        AnnoSentence s = sents.get(i);
        List<TaggedToken> taggedTokens = new ArrayList<>();
        for (int j=0; j < s.size(); j++) {
            TaggedToken taggedToken = new TaggedToken();
            taggedToken.setTag(s.getLemma(j));
            taggedToken.setTokenIndex(j);
            taggedTokens.add(taggedToken);
        }
        TokenTagging tokenTagging = new TokenTagging(getUUID(), meta, taggedTokens);
        tokenTagging.setTaggingType("LEMMA");
        t.addToTokenTaggingList(tokenTagging);
    }
}
 
Developer: mgormley, Project: pacaya-nlp, Lines of code: 22, Source: ConcreteWriter.java

Example 4: addTagging

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
private static void addTagging(Tokenization tokenization, String tagType, String toolName, String[] tags) {
    List<TaggedToken> taggedTokenList = new ArrayList<>();
    int i = 0;
    for (String tag : tags) {
        TaggedToken t = new TaggedToken();
        t.setTag(tag);
        t.setTokenIndex(i++);
        taggedTokenList.add(t);
    }
    TokenTagging tt = new TokenTagging();
    tt.setUuid(getUUID());
    tt.setMetadata(getMetadata(toolName));
    tt.setTaggedTokenList(taggedTokenList);
    tt.setTaggingType(tagType);
    tokenization.addToTokenTaggingList(tt);
}
 
Developer: mgormley, Project: pacaya-nlp, Lines of code: 17, Source: ConcreteReaderTest.java

Example 5: generateConcreteTokenization

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
/**
 * Generate a {@link Tokenization} object from an array of tokens, an array of tags, an array of offsets, and the start position of the text (e.g., first
 * text character in the text). Assumes tags are part of speech tags; they are stored under the tagging type "twitter".
 *
 * Invokes {@link #generateConcreteTokenization(String[], int[], int)} then adds tagging.
 *
 * @see #generateConcreteTokenization(String[], int[], int)
 *
 * @param tokens
 *          - an array of tokens (Strings)
 * @param tokenTags
 *          - an array of tags parallel to tokens; null entries are skipped
 * @param offsets
 *          - an array of integers (offsets)
 * @param startPos
 *          - starting position of the text
 * @return a {@link Tokenization} object with correct tokenization and token tagging
 */
public static Tokenization generateConcreteTokenization(String[] tokens, String[] tokenTags, int[] offsets, int startPos) {
  Tokenization tokenization = generateConcreteTokenization(tokens, offsets, startPos);
  TokenTagging tt = new TokenTagging();
  tt.setUuid(UUIDFactory.newUUID());
  tt.setTaggingType("twitter");
  tt.setMetadata(new AnnotationMetadata(tiftMetadata));
  for (int i = 0; i < tokens.length; i++) {
    String tag = tokenTags[i];
    if (tag != null) {
      TaggedToken tok = new TaggedToken();
      tok.setTokenIndex(i).setTag(tokenTags[i]);
      tt.addToTaggedTokenList(tok);
    }
  }

  // Do not set the tags if everything was "null".
  if (tt.isSetTaggedTokenList())
    tokenization.addToTokenTaggingList(tt);

  return tokenization;
}
 
Developer: hltcoe, Project: concrete-java, Lines of code: 38, Source: ConcreteTokenization.java
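Example 5 indexes `tokenTags` with a loop bounded by `tokens.length`, so it relies on the two arrays having the same length. The sketch below illustrates the null-skipping step with an explicit length check added. It is not the Concrete API: `IndexedTag` and `toIndexedTags` are hypothetical names used only for this illustration, and the sample tokens and tags are made up.

```java
import java.util.ArrayList;
import java.util.List;

final class NullTagSkippingSketch {

  // Hypothetical stand-in for edu.jhu.hlt.concrete.TaggedToken.
  static final class IndexedTag {
    final int tokenIndex;
    final String tag;

    IndexedTag(int tokenIndex, String tag) {
      this.tokenIndex = tokenIndex;
      this.tag = tag;
    }
  }

  // Pairs each non-null tag with the index of its token.
  // Rejects inputs whose lengths differ instead of failing mid-loop.
  static List<IndexedTag> toIndexedTags(String[] tokens, String[] tags) {
    if (tokens.length != tags.length) {
      throw new IllegalArgumentException("tokens and tags must have the same length");
    }
    List<IndexedTag> out = new ArrayList<>();
    for (int i = 0; i < tags.length; i++) {
      if (tags[i] != null) {
        out.add(new IndexedTag(i, tags[i]));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    String[] tokens = {"hello", "http://example.org", "world"};
    String[] tags = {null, "URL", null};

    List<IndexedTag> tagged = toIndexedTags(tokens, tags);
    System.out.println(tagged.size());             // prints 1
    System.out.println(tagged.get(0).tokenIndex);  // prints 1
  }
}
```

An empty result corresponds to the case in Example 5 where no tagging is attached because every tag was null.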

Example 6: getFirstXTagsWithName

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
private TokenTagging getFirstXTagsWithName(Tokenization tokenization, TagTypes which,
                                           String toolName) throws ConcreteException {
  if (!tokenization.isSetTokenTaggingList())
    throw new ConcreteException("No TokenTaggings for tokenization: " + tokenization.getUuid());
  
  List<TokenTagging> tokenTaggingLists = tokenization.getTokenTaggingList();
  for(int i = 0; i < tokenTaggingLists.size(); i++) {
    TokenTagging tt = tokenTaggingLists.get(i);
    if(tt.isSetTaggingType() && 
       tt.getTaggingType().equals(which.name()) &&
       tt.getMetadata().getTool().contains(toolName))
      return tt;
  }
  
  throw new ConcreteException("Did not find any tag theories with taggingType == " + which +" in tokenization " + tokenization.getUuid() + " with toolname containing " + toolName);
}
 
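Developer: hltcoe, Project: concrete-java, Lines of code: 17, Source: TokenizationUtils.java

Example 7: StanfordToConcreteConversionOutput

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
/**
 * Bundles the token list with the NER, POS, and lemma {@link TokenTagging} objects.
 */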
public StanfordToConcreteConversionOutput(final List<Token> tokenList,
    final TokenTagging nerTT, final TokenTagging posTT, final TokenTagging lemmaTT) {
  this.tokenList = tokenList;
  this.nerTT = nerTT;
  this.posTT = posTT;
  this.lemmaTT = lemmaTT;
}
 
Developer: hltcoe, Project: concrete-stanford-deprecated2, Lines of code: 11, Source: StanfordToConcreteConversionOutput.java

Example 8: getTagging

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
private static List<String> getTagging(TokenTagging tagging) {
    if (tagging == null) {
        return null;
    }
    List<String> tags = new ArrayList<String>();
    for (TaggedToken tok : tagging.getTaggedTokenList()) {
        tags.add(tok.getTag());
    }
    return tags;
}
 
Developer: mgormley, Project: pacaya-nlp, Lines of code: 11, Source: ConcreteReader.java

Example 9: getFirstXTagsWithName

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
public static TokenTagging getFirstXTagsWithName(Tokenization tokenization, String taggingType, String toolName) {
    if (!tokenization.isSetTokenTaggingList()) {
        return null;
    }
    List<TokenTagging> tokenTaggingLists = tokenization.getTokenTaggingList();
    for (int i = 0; i < tokenTaggingLists.size(); i++) {
        TokenTagging tt = tokenTaggingLists.get(i);
        if (tt.isSetTaggingType() && tt.getTaggingType().equals(taggingType)
                && (toolName == null || tt.getMetadata().getTool().contains(toolName))) {
            return tt;
        }
    }
    return null;
}
 
Developer: mgormley, Project: pacaya-nlp, Lines of code: 15, Source: ConcreteUtils.java
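Examples 6 and 9 perform the same lookup but signal a miss differently: one throws, the other returns null. As a third option, the sketch below returns an `Optional`, which makes the "not found" case explicit at the call site. It is written against hypothetical stand-in types (`Tagging`, `findFirst`) and made-up tool names rather than the Concrete classes, so treat it as an illustration of the pattern only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class TaggingLookupSketch {

  // Hypothetical stand-in for a TokenTagging plus the tool name
  // from its metadata.
  static final class Tagging {
    final String taggingType;
    final String tool;

    Tagging(String taggingType, String tool) {
      this.taggingType = taggingType;
      this.tool = tool;
    }
  }

  // Returns the first tagging with the given type whose tool name
  // contains toolName; a null toolName matches any tool.
  static Optional<Tagging> findFirst(List<Tagging> taggings, String taggingType, String toolName) {
    if (taggings == null) {
      return Optional.empty();
    }
    return taggings.stream()
        .filter(t -> taggingType.equals(t.taggingType))
        .filter(t -> toolName == null || (t.tool != null && t.tool.contains(toolName)))
        .findFirst();
  }

  public static void main(String[] args) {
    List<Tagging> taggings = Arrays.asList(
        new Tagging("POS", "tool-a"),
        new Tagging("NER", "tool-b"),
        new Tagging("POS", "tool-c"));

    System.out.println(findFirst(taggings, "POS", "tool-c").get().tool);    // prints tool-c
    System.out.println(findFirst(taggings, "POS", null).get().tool);        // prints tool-a
    System.out.println(findFirst(taggings, "LEMMA", null).isPresent());     // prints false
  }
}
```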

Example 10: validateTokenTaggings

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
/**
 * @return true if {@link TokenTagging} list is not present in this {@link Tokenization}, or if all TokenTagging
 * objects in the list are valid.
 */
private boolean validateTokenTaggings() {
  boolean ttsValid = true;
  if (this.annotation.isSetTokenTaggingList()) {
    Iterator<TokenTagging> iter = this.annotation.getTokenTaggingListIterator();

    while (ttsValid && iter.hasNext()) {
      // Check validity of each TokenTagging.
      TokenTagging tt = iter.next();
      ttsValid = new ValidatableTokenTagging(tt, this.annotation).isValid();
    }
  }

  return ttsValid;
}
 
Developer: hltcoe, Project: concrete-java, Lines of code: 19, Source: ValidatableTokenization.java

Example 11: convert

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
private static final TaggedTokenGroup convert(TokenTagging tt) {
  TaggedTokenGroup.Builder b = new TaggedTokenGroup.Builder();
  b.setUUID(convert(tt.getUuid()));
  AnnotationMetadata amd = tt.getMetadata();
  b.setTool(NonEmptyNonWhitespaceString.create(amd.getTool()))
    .setKBest(IntGreaterThanZero.create(amd.getKBest()))
    .setTimestamp(UnixTimestamp.create(amd.getTimestamp()));
  b.setNullableTaggingType(tt.getTaggingType());
  for (edu.jhu.hlt.concrete.TaggedToken tok : tt.getTaggedTokenList()) {
    TaggedToken pt = convert(tok);
    b.putIndexToTaggedTokenMap(pt.getIndex().getVal(), pt);
  }

  return b.build();
}
 
Developer: hltcoe, Project: concrete-java, Lines of code: 16, Source: FromConcrete.java

Example 12: testEndURL

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
@Test
public void testEndURL() {
  final String test = "'La traición vendrá de un general de alto rango que generará un gran caos' - http://t.co/MgLypirfTV http://…";
  Tokenization t = Tokenizer.TWITTER.tokenizeToConcrete(test);
  assertTrue(t.isSetTokenTaggingList());
  List<TokenTagging> ttl = t.getTokenTaggingList();
  assertEquals(1, ttl.size());
  TokenTagging tt = ttl.get(0);
  assertEquals("twitter", tt.getTaggingType());
  List<TaggedToken> tagTL = tt.getTaggedTokenList().stream()
      .filter(tagtok -> tagtok.getTag().equals("URL"))
      .collect(Collectors.toList());
  logger.debug("Tags:");
  tagTL.stream()
      .map(TaggedToken::getTag)
      .forEach(logger::debug);
  assertEquals(2, tagTL.size());
  TaggedToken last = tagTL.get(tagTL.size() - 1);
  assertTrue(t.isSetTokenList());
  List<Token> tl = t.getTokenList().getTokenList();
  logger.debug("tokens:");
  tl.stream()
    .map(Token::getText)
    .forEach(logger::debug);
  assertEquals("Should get 'http://…' as text for last token.", "http://…", tl.get(last.getTokenIndex()).getText());
  assertEquals("Type of last token should be 'URL'.", "URL", last.getTag());
}
 
Developer: hltcoe, Project: concrete-java, Lines of code: 28, Source: TokenizerTest.java

Example 13: getFirstXTags

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
private TokenTagging getFirstXTags(Tokenization tokenization, TagTypes which) throws ConcreteException {
  if (!tokenization.isSetTokenTaggingList())
    throw new ConcreteException("No TokenTaggings for tokenization: " + tokenization.getUuid());
  
  List<TokenTagging> tokenTaggingLists = tokenization.getTokenTaggingList();
  for(int i = 0; i < tokenTaggingLists.size(); i++) {
    TokenTagging tt = tokenTaggingLists.get(i);
    if(tt.isSetTaggingType() && tt.getTaggingType().equals(which.name()))
      return tt;
  }
  
  throw new ConcreteException("Did not find any tag theories with taggingType == " + which +" in tokenization " + tokenization.getUuid());
}
 
Developer: hltcoe, Project: concrete-java, Lines of code: 14, Source: TokenizationUtils.java

Example 14: getNerTT

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
/**
 * @return the nerTT
 */
public TokenTagging getNerTT() {
  return nerTT;
}
 
Developer: hltcoe, Project: concrete-stanford-deprecated2, Lines of code: 7, Source: StanfordToConcreteConversionOutput.java

Example 15: getPosTT

import edu.jhu.hlt.concrete.TokenTagging; // import the required package/class
/**
 * @return the posTT
 */
public TokenTagging getPosTT() {
  return posTT;
}
 
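Developer: hltcoe, Project: concrete-stanford-deprecated2, Lines of code: 7, Source: StanfordToConcreteConversionOutput.java


Note: The edu.jhu.hlt.concrete.TokenTagging class examples in this article were compiled by 純淨天空 from GitHub, MSDocs, and other open-source code and documentation platforms. The snippets were selected from open-source projects, and copyright of the source code belongs to the original authors. For distribution and use, refer to each project's license. Do not reproduce without permission.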