Java CAS.getDocumentText Method Code Examples

This article collects typical usage examples of the org.apache.uima.cas.CAS.getDocumentText method in Java. If you are wondering how CAS.getDocumentText is used in practice, the selected examples here may help. You can also look further into usage examples of the enclosing class, org.apache.uima.cas.CAS.


Four code examples of the CAS.getDocumentText method are shown below, sorted by popularity by default.

Example 1: process

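import java.util.List;

import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.cas.CAS; // import the package/class the method depends on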
@Override
public void process(CAS cas)
        throws AnalysisEngineProcessException
{
    String text = cas.getDocumentText();

    // NOTE: Twokenize provides an API call that performs normalization first, but that would
    // require mapping the normalized tokens back to the text as it is present in the CAS object.
    // Because HTML escaping would make that mapping really messy, we use the call that does
    // not perform any normalization.
    List<String> tokenize = Twokenize.tokenize(text);
    int offset = 0;
    for (String t : tokenize) {
        int start = text.indexOf(t, offset);
        if (start < 0) {
            // Token not found verbatim in the text; skip it rather than create a bogus span.
            continue;
        }
        int end = start + t.length();
        createTokenAnnotation(cas, start, end);
        offset = end;
    }

}
 
Developer: UKPLab, Project: argument-reasoning-comprehension-task, Lines of code: 20, Source: ArkTweetTokenizerFixed.java
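
The offset bookkeeping in Example 1 can be tried without UIMA or Twokenize. The following is a minimal sketch, not code from the original project: the class `TokenOffsets` and its method `align` are hypothetical names, and the token list is passed in directly instead of coming from a tokenizer. It shows how a left-to-right `indexOf` search recovers character spans for tokens, and how a token that does not occur verbatim can be skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of the original project. */
class TokenOffsets {

    /**
     * Maps each token to a [begin, end) span in text, searching left to right.
     * Tokens that cannot be found at or after the current offset are skipped.
     */
    static List<int[]> align(String text, List<String> tokens) {
        List<int[]> spans = new ArrayList<>();
        int offset = 0;
        for (String token : tokens) {
            int begin = text.indexOf(token, offset);
            if (begin < 0) {
                continue; // token does not occur verbatim; no span to record
            }
            int end = begin + token.length();
            spans.add(new int[] {begin, end});
            offset = end; // the next search starts after this token
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "Hello  world !";
        for (int[] span : align(text, Arrays.asList("Hello", "world", "!"))) {
            System.out.println(span[0] + "-" + span[1] + " " + text.substring(span[0], span[1]));
        }
    }
}
```

Advancing `offset` to the end of each match keeps repeated tokens from mapping to the same span twice, which is why the search is not restarted from position 0.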

Example 2: process

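import java.io.IOException;
import java.io.StringReader;
import java.util.Objects;

import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.cas.CAS; // import the package/class the method depends on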
@Override
public void process(CAS aCAS) throws AnalysisEngineProcessException {
  Objects.requireNonNull(casRtfParser);
  Objects.requireNonNull(originalDocumentViewName);
  Objects.requireNonNull(targetViewName);

  LOGGER.debug("Parsing an rtf document from {} into CAS", originalDocumentViewName);

  CAS originalDocument = aCAS.getView(originalDocumentViewName);

  String documentText = originalDocument.getDocumentText();

  CAS targetView;
  boolean isRtf;
  if (documentText.startsWith("{\\rtf1")) {
    StringReader reader = new StringReader(documentText);
    RtfSource rtfSource = new ReaderRtfSource(reader);

    try {
      casRtfParser.parseFile(aCAS, rtfSource);
    } catch (IOException | RtfReaderException e) {
      throw new AnalysisEngineProcessException(e);
    }

    isRtf = true;
  } else {
    targetView = aCAS.createView(targetViewName);
    targetView.setDocumentText(documentText);
    isRtf = false;
  }

  Document document = UimaAdapters.getDocument(aCAS, null);
  document.putMetadata("isRtf", Boolean.toString(isRtf));
}
 
Developer: nlpie, Project: biomedicus, Lines of code: 35, Source: Parser.java
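
Example 2 decides whether the document is RTF by checking that its text begins with the `{\rtf1` header. Since Example 3 checks the value returned by `getDocumentText()` for `null`, a null-tolerant version of that test may be useful. The sketch below is an assumption-laden illustration in plain Java: `RtfSniffer` and `looksLikeRtf` are hypothetical names and do not exist in biomedicus.

```java
/** Hypothetical helper for illustration; not part of biomedicus. */
class RtfSniffer {

    // RTF documents start with this control sequence.
    private static final String RTF_HEADER = "{\\rtf1";

    /** Returns true only if the text is non-null and begins with the RTF header. */
    static boolean looksLikeRtf(String documentText) {
        return documentText != null && documentText.startsWith(RTF_HEADER);
    }

    public static void main(String[] args) {
        System.out.println(looksLikeRtf("{\\rtf1\\ansi Hello}"));
        System.out.println(looksLikeRtf("plain text"));
        System.out.println(looksLikeRtf(null));
    }
}
```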

Example 3: entityProcessComplete

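import java.util.List;

import org.apache.uima.cas.CAS; // import the package/class the method depends on
import org.apache.uima.collection.EntityProcessStatus;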
/**
 * Called when the processing of a Document is completed. <br>
 * The process status can be looked at and corresponding actions taken.
 * 
 * @param aCas
 *          CAS corresponding to the completed processing
 * @param aStatus
 *          EntityProcessStatus that holds the status of all the events for aEntity
 */
public void entityProcessComplete(CAS aCas, EntityProcessStatus aStatus) {
  if (aStatus.isException()) {
    List exceptions = aStatus.getExceptions();
    for (int i = 0; i < exceptions.size(); i++) {
      ((Throwable) exceptions.get(i)).printStackTrace();
    }
    return;
  }
  entityCount++;
  String docText = aCas.getDocumentText();
  if (docText != null) {
    size += docText.length();
  }
}
 
Developer: oaqa, Project: knn4qa, Lines of code: 24, Source: SimpleRunCPE_fixed.java

Example 4: fromView

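import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.uima.cas.CAS; // import the package/class the method depends on
import org.apache.uima.cas.Feature;
import org.apache.uima.cas.Type;
import org.apache.uima.cas.text.AnnotationFS;
import org.apache.uima.cas.text.AnnotationIndex;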
/**
 * Indexes all the symbols from an original document.
 *
 * @param originalDocumentView jCas original document view.
 * @return The newly created symbol indexed document.
 */
public static SymbolIndexedDocument fromView(CAS originalDocumentView) {
  Type viewIndexType = originalDocumentView.getTypeSystem()
      .getType("edu.umn.biomedicus.rtfuima.type.ViewIndex");

  Feature destinationNameFeature = viewIndexType
      .getFeatureByBaseName("destinationName");
  Feature destinationIndexFeature = viewIndexType
      .getFeatureByBaseName("destinationIndex");

  AnnotationIndex<AnnotationFS> viewIndexAI = originalDocumentView
      .getAnnotationIndex(viewIndexType);

  List<SymbolLocation> symbolLocations = new ArrayList<>();

  Map<String, Map<Integer, Integer>> destinationMap = new HashMap<>();

  int index = 0;
  int lastEnd = 0;
  for (AnnotationFS annotation : viewIndexAI) {
    int begin = annotation.getBegin();
    int end = annotation.getEnd();

    String destinationName
        = annotation.getStringValue(destinationNameFeature);

    SymbolLocation symbolLocation = new SymbolLocation(
        destinationName,
        begin - lastEnd,
        end - begin,
        index++
    );

    symbolLocations.add(symbolLocation);

    int destinationIndex
        = annotation.getIntValue(destinationIndexFeature);

    destinationMap.compute(destinationName,
        (String key, @Nullable Map<Integer, Integer> value) -> {
          if (value == null) {
            value = new HashMap<>();
          }
          value.put(destinationIndex, symbolLocations.size() - 1);

          return value;
        });
    lastEnd = end;
  }
  return new SymbolIndexedDocument(symbolLocations, destinationMap,
      originalDocumentView.getDocumentText());
}
 
Developer: nlpie, Project: biomedicus, Lines of code: 58, Source: SymbolIndexedDocument.java
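
Example 4 stores each annotation as a gap from the previous annotation's end (`begin - lastEnd`) plus a length (`end - begin`) instead of absolute offsets. The sketch below illustrates only that arithmetic in plain Java. `RelativeSpans`, `encode`, and `decode` are hypothetical names, spans are plain `int[]` pairs, and the real `SymbolLocation` class may store its fields differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of biomedicus. */
class RelativeSpans {

    /** Converts sorted, non-overlapping [begin, end) spans to {gap, length} pairs. */
    static List<int[]> encode(List<int[]> spans) {
        List<int[]> encoded = new ArrayList<>();
        int lastEnd = 0;
        for (int[] span : spans) {
            encoded.add(new int[] {span[0] - lastEnd, span[1] - span[0]});
            lastEnd = span[1];
        }
        return encoded;
    }

    /** Rebuilds absolute [begin, end) spans from {gap, length} pairs. */
    static List<int[]> decode(List<int[]> encoded) {
        List<int[]> spans = new ArrayList<>();
        int lastEnd = 0;
        for (int[] pair : encoded) {
            int begin = lastEnd + pair[0];
            int end = begin + pair[1];
            spans.add(new int[] {begin, end});
            lastEnd = end;
        }
        return spans;
    }

    public static void main(String[] args) {
        List<int[]> spans = Arrays.asList(new int[] {2, 5}, new int[] {5, 6}, new int[] {9, 12});
        for (int[] pair : encode(spans)) {
            System.out.println("gap=" + pair[0] + " length=" + pair[1]);
        }
    }
}
```

One property of this encoding is that the pairs stay valid if everything before the first annotation shifts by a constant, since only the first gap depends on the absolute starting position.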


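Note: The org.apache.uima.cas.CAS.getDocumentText examples in this article were compiled by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects, and copyright in the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.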