Java CAS.setDocumentText Method Code Examples

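This article collects typical usage examples of the org.apache.uima.cas.CAS.setDocumentText method in Java. If you are wondering how CAS.setDocumentText is used in practice, or are looking for working examples of it, the selected code examples below may help. You can also look further into usage examples of the enclosing type, org.apache.uima.cas.CAS.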


A total of 5 code examples of the CAS.setDocumentText method are shown below, sorted by popularity by default.

Example 1: process

import org.apache.uima.cas.CAS; // import the package/class the method depends on
/**
 * Use the given analysis engine to process the given text.
 * You must release the returned CAS yourself.
 * @param text the text to process
 * @return the processed CAS
 */
public CAS process(String text) {
    CAS cas = retrieve();

    cas.setDocumentText(text);
    try {
        analysisEngine.process(cas);
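    } catch (AnalysisEngineProcessException e) {
        // NOTE: this retries the same text with another CAS from retrieve().
        // If the failure is deterministic, the recursion never terminates,
        // and the CAS that failed is not released here.
        if (text != null && !text.isEmpty())
            return process(text);
        throw new RuntimeException(e);
    }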

    return cas;
}
 
Developer ID: deeplearning4j, Project: DataVec, Lines of code: 23, Source file: UimaResource.java
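
The catch block in the `process` example above retries by calling itself with the same text, with no limit on the number of attempts. One possible alternative is a bounded retry loop, sketched below. `BoundedRetry` and its `apply` method are hypothetical names introduced only for illustration; they are not part of UIMA or DataVec. The sketch also assumes the operation reports failure with an unchecked exception, whereas the original catches the checked `AnalysisEngineProcessException`.

```java
import java.util.function.Function;

/** Hypothetical helper (not part of UIMA or DataVec): retries an operation a limited number of times. */
final class BoundedRetry {
    private BoundedRetry() {}

    /**
     * Applies the operation to the input, retrying on RuntimeException
     * until it succeeds or maxAttempts attempts have been made.
     */
    static <T, R> R apply(Function<T, R> operation, T input, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.apply(input);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        // Every attempt failed: rethrow the most recent failure.
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for an analysis step that fails twice before succeeding.
        String result = apply(text -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient failure");
            }
            return text.toUpperCase();
        }, "hello", 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With a limit in place, a deterministic failure surfaces as an exception after a fixed number of attempts instead of recursing until the stack overflows. Whether retrying is appropriate at all depends on why the engine failed.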

Example 2: adaptFile

import org.apache.uima.cas.CAS; // import the package/class the method depends on
@Override
public void adaptFile(CAS cas, Path path) throws CollectionException, IOException {
  LOGGER.debug("Reading text into a CAS view.");
  CAS targetView = cas.createView(viewName);

  byte[] bytes = Files.readAllBytes(path);
  String documentText = new String(bytes,
      Objects.requireNonNull(encoding,
          "Encoding must not be null"));
  targetView.setDocumentText(documentText);

  String fileName = path.getFileName().toString();
  int period = fileName.lastIndexOf('.');
  if (period == -1) {
    period = fileName.length();
  }
  String documentId = fileName.substring(0, period);
  UimaAdapters.createDocument(cas, null, documentId);
}
 
Developer ID: nlpie, Project: biomedicus, Lines of code: 20, Source file: PlainTextInputFileAdapter.java
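
The `adaptFile` example above derives the document ID by cutting the file name at its last period. The standalone sketch below isolates that step so its edge cases are easy to check. `DocumentIds` and `fromPath` are hypothetical names used only for illustration and are not part of biomedicus.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of biomedicus): derives a document ID from a file name. */
final class DocumentIds {
    private DocumentIds() {}

    /** Returns the file name up to, but not including, its last period. */
    static String fromPath(Path path) {
        String fileName = path.getFileName().toString();
        int period = fileName.lastIndexOf('.');
        // No period at all: the whole file name is the ID.
        return period == -1 ? fileName : fileName.substring(0, period);
    }

    public static void main(String[] args) {
        System.out.println(fromPath(Paths.get("notes/report.txt"))); // report
        System.out.println(fromPath(Paths.get("archive.tar.gz")));   // archive.tar
        System.out.println(fromPath(Paths.get("README")));           // README
    }
}
```

Two consequences of this rule are worth keeping in mind: only the last extension is removed, and a file name that starts with its only period (such as `.gitignore`) yields an empty ID.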

Example 3: process

import org.apache.uima.cas.CAS; // import the package/class the method depends on
@Override
public void process(CAS aCAS) throws AnalysisEngineProcessException {
  Objects.requireNonNull(casRtfParser);
  Objects.requireNonNull(originalDocumentViewName);
  Objects.requireNonNull(targetViewName);

  LOGGER.debug("Parsing an rtf document from {} into CAS", originalDocumentViewName);

  CAS originalDocument = aCAS.getView(originalDocumentViewName);

  String documentText = originalDocument.getDocumentText();

  CAS targetView;
  boolean isRtf;
  if (documentText.startsWith("{\\rtf1")) {
    StringReader reader = new StringReader(documentText);
    RtfSource rtfSource = new ReaderRtfSource(reader);

    try {
      casRtfParser.parseFile(aCAS, rtfSource);
    } catch (IOException | RtfReaderException e) {
      throw new AnalysisEngineProcessException(e);
    }

    isRtf = true;
  } else {
    targetView = aCAS.createView(targetViewName);
    targetView.setDocumentText(documentText);
    isRtf = false;
  }

  Document document = UimaAdapters.getDocument(aCAS, null);
  document.putMetadata("isRtf", Boolean.toString(isRtf));
}
 
Developer ID: nlpie, Project: biomedicus, Lines of code: 35, Source file: Parser.java

Example 4: main

import org.apache.uima.cas.CAS; // import the package/class the method depends on
public static void main(String[] args) throws IOException, InvalidXMLException, ResourceInitializationException,
		AnalysisEngineProcessException, CASException {
	if (args.length != 2) {
		System.err.println("Usage: OpenNlpTrainerExtractor <input folder> <output file>");
		// Stop here; otherwise args[0] / args[1] below would fail on missing arguments.
		return;
	}

	AnalysisEngineDescription descriptor = (AnalysisEngineDescription) createResourceCreationSpecifier(
			new XMLInputSource(OpenNlpTrainerExtractor.class.getClassLoader().getResourceAsStream(
					"org/ie4opendata/octroy/SimpleFrenchTokenAndSentenceAnnotator.xml"), new File(".")),
			new Object[0]);
	AnalysisEngine engine = AnalysisEngineFactory.createEngine(descriptor);
	CAS cas = engine.newCAS();

	PrintWriter pw = new PrintWriter(new FileWriter(args[1]));

	for (File file : new File(args[0]).listFiles()) {
		BufferedReader br = new BufferedReader(new FileReader(file));

		StringBuilder doc = new StringBuilder();
		String line = br.readLine();
		while (line != null) {
			doc.append(line).append('\n');
			line = br.readLine();
		}
		br.close();

		cas.reset();
		cas.setDocumentText(doc.toString());
		cas.setDocumentLanguage("fr");

		DocumentAnnotation documentAnnotation = new DocumentAnnotation(cas.getJCas());
		documentAnnotation.setDocumentName(file.getName());
		documentAnnotation.setClassified(false);
		documentAnnotation.addToIndexes();

		engine.process(cas);

		// one sentence per line, tokens separated by spaces
		JCas jcas = cas.getJCas();
		for (Sentence sentence : JCasUtil.select(jcas, Sentence.class)) {
			for (Token token : JCasUtil.selectCovered(Token.class, sentence)) {
				pw.print(token.getCoveredText() + " ");
			}
			pw.println();
		}
		// each document separated by an empty line
		pw.println();
	}
	pw.close();
}
 
Developer ID: IE4OpenData, Project: Octroy, Lines of code: 51, Source file: OpenNlpTrainerExtractor.java
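
The `main` example above reads each file with a `FileReader`, which on Java versions before 18 decodes with the platform default charset, and it closes the reader only when no exception occurs. The sketch below shows one way to read the same newline-normalized text with an explicit charset and try-with-resources. `TextFiles`, `readLines`, and `readFile` are hypothetical names, and UTF-8 is an assumption; the Octroy project does not state the encoding of its input files.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not part of Octroy): reads text line by line, ending every line with '\n'. */
final class TextFiles {
    private TextFiles() {}

    /** Reads all lines from the reader, closes it, and returns the text with '\n' line endings. */
    static String readLines(Reader reader) {
        StringBuilder doc = new StringBuilder();
        // try-with-resources closes the reader even if readLine() throws.
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                doc.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return doc.toString();
    }

    /** Reads a file as UTF-8 (an assumption; adjust the charset to match the real input). */
    static String readFile(Path file) {
        try {
            return readLines(Files.newBufferedReader(file, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Windows-style line endings are normalized to '\n', as in the loop above.
        String text = readLines(new StringReader("Bonjour\r\nle monde"));
        System.out.print(text);
    }
}
```

The resulting string can be passed to `cas.setDocumentText(...)` exactly as in the original loop. Since annotation offsets refer to this string, the normalized line endings mean offsets may not match byte positions in the original file.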

Example 5: setupView

import org.apache.uima.cas.CAS; // import the package/class the method depends on
@Override
public void setupView(CAS fromView, CAS toView) {
  toView.setDocumentText(fromView.getDocumentText());
}
 
Developer ID: nlpie, Project: biomedicus, Lines of code: 5, Source file: MtsamplesTo1_7_0.java


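Note: The org.apache.uima.cas.CAS.setDocumentText examples in this article were compiled by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The code snippets were selected from open-source projects contributed by their respective developers, and copyright in the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.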