

Java DocumentSource Class Code Examples

This article collects typical usage examples of org.apache.any23.source.DocumentSource in Java. If you are wondering how the Java DocumentSource class is used in practice, the selected examples below may help.


The DocumentSource class belongs to the org.apache.any23.source package. Two code examples of the class are shown below, sorted by popularity by default.

Example 1: _extractTopicsFrom

import org.apache.any23.source.DocumentSource; // import the required package/class
@Override
public boolean _extractTopicsFrom(URL url, TopicMap topicMap) throws Exception {
    if(url != null) {
        tripletSource = url.toExternalForm();
        Any23 runner = new Any23();

        runner.setHTTPUserAgent("Wandora ANY23 Extractor");
        HTTPClient httpClient = runner.getHTTPClient();
        DocumentSource source = new HTTPDocumentSource(
            httpClient,
            url.toExternalForm()
        );
        namespace = url.toExternalForm();
        TripleHandler handler = new TopicMapsCreator(topicMap);
        runner.extract(source, handler);
    }
    tripletSource = null;
    return true;
}
 
Developer: wandora-team, Project: wandora, Lines of code: 20, Source: Any23Extractor.java

Example 2: run

import org.apache.any23.source.DocumentSource; // import the required package/class
private static void run(String uri, String outputDir, String outputFormat) throws IOException, URISyntaxException, ExtractionException {
  Any23 runner = new Any23();
  runner.setHTTPUserAgent("Eurosentiment Crawler");
  HTTPClient httpClient = runner.getHTTPClient();
  DocumentSource source = new HTTPDocumentSource(
      httpClient,
      uri
      );
  ByteArrayOutputStream out = new ByteArrayOutputStream();
  TripleHandler handler = null;
  if (outputFormat != null) {
    switch (outputFormat) {
    case "turtle":
      handler = new TurtleWriter(out);
      break;
    case "ntriples":
      handler = new NTriplesWriter(out);
      break;
    case "rdfxml":
      handler = new RDFXMLWriter(out);
      break;
    case "nquads":
      handler = new NQuadsWriter(out);
      break;
    case "trix":
      handler = new TriXWriter(out);
      break;
    case "json":
      handler = new JSONWriter(out);
      break;
    default:
      System.out.println("No output writer found for type: " + outputFormat);
      System.out.println("Defaulting to Turtle output serialization");
      handler = new TurtleWriter(out);
      break;
    }
    System.out.println("Selected " + handler.getClass().getSimpleName() + " as output writer.");
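  } else {
    // No format requested: fall back to Turtle so that handler is never null below.
    handler = new TurtleWriter(out);
  }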
  try {
    runner.extract(source, handler);
  } finally {
    try {
      handler.close();
    } catch (TripleHandlerException e) {
      e.printStackTrace();
    }
  }
  // Pass the charset explicitly so the written file does not depend on the platform default.
  if (outputDir != null) {
    FileUtils.writeStringToFile(new File(outputDir + "/sentiment.txt"), out.toString("UTF-8"), "UTF-8");
    System.out.println("Successfully wrote file to: " + outputDir + "/sentiment.txt");
  } else {
    FileUtils.writeStringToFile(new File("sentiment.txt"), out.toString("UTF-8"), "UTF-8");
    System.out.println("Successfully wrote file to sentiment.txt");
  }
}
 
Developer: eurocent, Project: sentimentCrawler, Lines of code: 56, Source: Runner.java
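
The format selection in Example 2 can also be written as a lookup table with a default, which avoids a null handler when no format is given. The sketch below is not part of the sentimentCrawler project and is only an illustration: the class name OutputFormatResolver and its resolve method are hypothetical, the writer class names are kept as plain strings so that the snippet runs without Any23 on the classpath, and the trimming and lower-casing of the key is an added assumption (Example 2 compares the format string exactly).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not from the original project: maps a format key
// to the simple name of the writer class that Example 2 would instantiate.
class OutputFormatResolver {

    // Same fallback as the default branch of the switch in Example 2.
    static final String DEFAULT_WRITER = "TurtleWriter";

    private static final Map<String, String> WRITERS = new HashMap<>();
    static {
        WRITERS.put("turtle", "TurtleWriter");
        WRITERS.put("ntriples", "NTriplesWriter");
        WRITERS.put("rdfxml", "RDFXMLWriter");
        WRITERS.put("nquads", "NQuadsWriter");
        WRITERS.put("trix", "TriXWriter");
        WRITERS.put("json", "JSONWriter");
    }

    // Returns the writer name for the given key; null or unknown keys
    // resolve to the default instead of leaving the caller with null.
    static String resolve(String outputFormat) {
        if (outputFormat == null) {
            return DEFAULT_WRITER;
        }
        // Assumption: keys are matched case-insensitively after trimming.
        String key = outputFormat.trim().toLowerCase(Locale.ROOT);
        return WRITERS.getOrDefault(key, DEFAULT_WRITER);
    }

    public static void main(String[] args) {
        System.out.println(resolve("ntriples"));
        System.out.println(resolve(null));
    }
}
```

In real code the map values would more likely be factories that take the output stream and return a TripleHandler; strings are used here only to keep the sketch free of external dependencies.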


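Note: The org.apache.any23.source.DocumentSource class examples in this article were compiled by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects; copyright of the source code belongs to the original authors, and distribution and use should follow the license of the corresponding project. Do not reproduce without permission.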