Java Document.text Method Code Examples

This article collects typical usage examples of the org.jsoup.nodes.Document.text method in Java. If you are wondering how Document.text is used in practice, the selected examples below may help. You can also look further into usage examples of its enclosing class, org.jsoup.nodes.Document.


A total of 6 code examples of the Document.text method are shown below, sorted by popularity by default.

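Example 1: process

import java.util.HashMap;
import java.util.Map;
import org.jsoup.nodes.Document; // import the package/class this method depends on
/**
 * Parses the page.
 * The process function must do the following:
 * 1. Extract the useful information and put it into the Page's items; save then stores it.
 *
 * @param page the page to parse
 * @return the same page that was passed in
 */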
public Page process(Page page) {
    Document doc = page.getDocument();

    String title = doc.title();
    String text = doc.text();
    Map<String, String> items = new HashMap<String, String>();
    items.put("title", title);
    items.put("text", text);
    items.put("url", page.getUrlSeed().getUrl());

    page.setItems(items);

    return page;
}
 
Developer ID: xjtushilei, Project: ScriptSpider, Lines of code: 23, Source: TextPageProcessor.java

Example 2: canGetC3PO

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document; // import the package/class this method depends on
@Test
public void canGetC3PO() throws IOException {

    Document doc = Jsoup.connect("http://swapi.co/api/people/2/?format=json").ignoreContentType(true).get();

    String json = doc.text();
    System.out.println(json);

    // JSoup does not supply JSON parsing routines
    Assert.assertTrue(json.contains("C-3PO"));
}
 
Developer ID: eviltester, Project: libraryexamples, Lines of code: 12, Source: SwapiApiFromJsoupUsageTest.java
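
The test above only checks that the response text contains a substring, because jsoup itself has no JSON parser. As a rough illustration of going one step further without extra dependencies, the sketch below pulls a single string field out of JSON text with a regular expression. The class name `JsonFieldSketch` and the method `stringField` are hypothetical names introduced here, and this is not a JSON parser: it ignores nesting and returns the value with any escape sequences left as-is. A dedicated JSON library is likely the better choice for real code.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup or of the original project.
final class JsonFieldSketch {

    // Returns the raw value of the first "field": "value" pair found in the text.
    // Assumption: the value is a JSON string; numbers, objects and arrays are not matched.
    public static Optional<String> stringField(String json, String field) {
        Pattern pattern = Pattern.compile(
                "\"" + Pattern.quote(field) + "\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");
        Matcher matcher = pattern.matcher(json);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for the string returned by doc.text() in the test above.
        String json = "{\"name\": \"C-3PO\", \"height\": \"167\"}";
        System.out.println(stringField(json, "name").orElse("<missing>"));
    }
}
```

With this helper, an assertion could compare the extracted name for equality instead of searching the whole response for a substring.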

Example 3: fetchAndSave

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document; // import the package/class this method depends on
public AbstractMap.SimpleEntry<Integer, Integer> fetchAndSave() throws Exception {

        URL url = new URL(this.url);

        SyndFeedInput input = new SyndFeedInput();
        SyndFeed feed = input.build(new XmlReader(url));


        int items = feed.getEntries().size();

        if(items > 0){
            log.info("Attempting to parse rss feed: "+ this.url );
            log.info("This Feed has "+items +" items");
        }

        List <SyndEntry> entries = feed.getEntries();

        for (SyndEntry item : entries){
            log.info("Title: " + item.getTitle());
            log.info("Link: " + item.getLink());
            SyndContentImpl contentHolder = (SyndContentImpl) item.getContents().get(0);
            String content = contentHolder.getValue();

            //content might contain html data, let's clean it up
            Document doc = Jsoup.parse(content);
            content = doc.text();
            try {
                    Result result = ld.detectLanguage(content, language);
                    if (result.languageCode.equals(language) && result.isReliable) {

                        FileSaver file = new FileSaver(content, this.language, "bs", item.getLink(), item.getUri(), String.valueOf(content.hashCode()));
                        String fileName = file.getFileName();
                        BlogPost post = new BlogPost(content,this.language,null,"bs",item.getLink(),item.getUri(),fileName);
                        if(DAO.saveEntry(post)) {
                            file.save(this.logDb);
                            numOfFiles++;
                            wrongCount = 0;
                        }

                    }

                    else{
                        log.info("Item " + item.getTitle() + " is in a different languageCode, skipping this post: "+ result.languageCode);
                        wrongCount ++;
                        // give up on this feed only after more than 3 consecutive wrong-language posts
                        if(wrongCount > 3){
                            log.info("Found more than 3 posts in the wrong languageCode, skipping this blog");
                            break;
                        }
                    }

            }
            catch(Exception e){
                log.error(e);
                break;
            }


        }
        return new AbstractMap.SimpleEntry<>(numOfFiles,wrongCount);
    }
 
Developer ID: gidim, Project: Babler, Lines of code: 61, Source: RSSScraper.java
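
The loop above appears intended to abandon a feed once several posts in a row turn out to be in the wrong language, resetting the counter whenever a post is accepted. A minimal sketch of just that counting logic follows, with no RSS or jsoup dependencies. The class `WrongLanguageCounterSketch`, the method `countAccepted`, and the list of pre-detected language codes are all hypothetical stand-ins for the project's own detector and storage calls.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the "stop after too many consecutive misses" pattern.
final class WrongLanguageCounterSketch {

    // Counts posts whose detected language equals `wanted`, stopping as soon as
    // more than `maxConsecutiveMisses` wrong-language posts occur in a row.
    public static int countAccepted(List<String> detectedLanguages,
                                    String wanted,
                                    int maxConsecutiveMisses) {
        int accepted = 0;
        int misses = 0;
        for (String language : detectedLanguages) {
            if (language.equals(wanted)) {
                accepted++;
                misses = 0; // a hit resets the streak of misses
            } else {
                misses++;
                if (misses > maxConsecutiveMisses) {
                    break; // too many misses in a row: give up on this feed
                }
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<String> detected = Arrays.asList(
                "en", "fr", "fr", "en", "fr", "fr", "fr", "fr", "en");
        // The fourth consecutive "fr" ends the scan, so the last "en" is never counted.
        System.out.println(countAccepted(detected, "en", 3)); // prints 2
    }
}
```

Keeping the `break` inside the threshold check is what lets a feed survive an occasional wrong-language post.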

Example 4: getText

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document; // import the package/class this method depends on
String getText(final HtmlBlock node) {
  final Document document = Jsoup.parseBodyFragment(node.getChars().toString());
  return document.text();
}
 
Developer ID: camunda, Project: camunda-bpm-swagger, Lines of code: 5, Source: HtmlDocumentInterpreter.java

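Example 5: main

import java.io.IOException;
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document; // import the package/class this method depends on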
public static void main(String[] args) {
    
    try{
        
        // 1. connect to the website
        Connection connection = Jsoup.connect("http://www.bluetata.com");
        
        // 2. fetch and parse the HTML document
        Document doc = connection.get();
        
        // 3. extract the visible text from the document
        String strHTML = doc.text();
        
        // 4. print the extracted text (plain text, not the DOM or raw HTML)
        System.out.println(strHTML);
        
    }catch(IOException ioex){
        ioex.printStackTrace();
    }
 
}
 
Developer ID: bluetata, Project: crawler-jsoup-maven, Lines of code: 22, Source: Jsoup403ForbiddenExample.java

Example 6: canGetLuke

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document; // import the package/class this method depends on
@Test
public void canGetLuke() throws IOException {

    // have to ignore content type or it throws exception if not text/*, application/xml, or application/xhtml+xml
    Document doc = Jsoup.connect("http://swapi.co/api/people/1/?format=json").ignoreContentType(true).get();

    String json = doc.text();
    System.out.println(json);

    // JSoup does not supply JSON parsing routines
    Assert.assertTrue(json.contains("Luke Skywalker"));

}
 
Developer ID: eviltester, Project: libraryexamples, Lines of code: 14, Source: SwapiApiFromJsoupUsageTest.java


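Note: The org.jsoup.nodes.Document.text method examples in this article were compiled by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The code snippets were selected from open-source projects contributed by their respective developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to each project's License. Do not reproduce without permission.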