Java Jsoup.parseBodyFragment Method Code Examples

This article collects typical usage examples of the org.jsoup.Jsoup.parseBodyFragment method in Java. If you are wondering how Jsoup.parseBodyFragment is used in practice, the selected examples below may help. You can also look at further usage examples of its enclosing class, org.jsoup.Jsoup.


A total of 10 code examples of the Jsoup.parseBodyFragment method are shown below, sorted by popularity by default.

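Example 1: parseZhihuTopics1

import org.jsoup.Jsoup; // import the package/class this method depends on
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;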
public static void parseZhihuTopics1(Page page, Result result) {
    String json = page.getContent();
    JSONObject object = JSON.parseObject(json);
    JSONArray array = object.getJSONArray("msg");
    if(array.size()==0) {
        result.setSkip(true);
        return;
    }
    for (int i = 0; i < array.size(); i++) {
        String topicStr = array.getString(i);
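        Document doc = Jsoup.parseBodyFragment(topicStr);
        // first() returns null when nothing matches, so guard before dereferencing
        Element item = doc.body().select("div.item").first();
        Element a = item == null ? null : item.select("a[target]").first();
        if (a == null) {
            continue; // skip fragments that lack the expected topic link
        }
        String href = "https://www.zhihu.com" + a.attr("href") + "/newest";
        result.addRequest(new Request(href, HttpMethod.GET));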
    }
    Request request = new Request("https://www.zhihu.com/node/TopicsPlazzaListV2", HttpMethod.POST);
    JSONObject object1 = new JSONObject();
    object1.put("topic_id", page.getRequest().getAddch("topic_id"));
    object1.put("offset", Integer.valueOf(((Integer) page.getRequest().getAddch("offset")) + 20));
    object1.put("hash_id", "22e50cd21ed9df7085ff76d62175e986");
    request.addParame("method", "next")
            .addParame("params", object1.toJSONString()).addAttach("offset", Integer.valueOf(((Integer) page.getRequest().getAddch("offset")) + 20)).addAttach("topic_id", page.getRequest().getAddch("topic_id"));
    result.addRequest(request);
}
 
Developer: StevenKin, Project: ZhihuQuestionsSpider, Lines of code: 25, Source: ParseRegularUtil.java

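Example 2: cleanContent

import org.jsoup.Jsoup; // import the package/class this method depends on
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Entities.EscapeMode;
import org.jsoup.safety.Cleaner;
import org.jsoup.safety.Whitelist;
import java.util.Arrays;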
/**
 * Cleans the HTML content, keeping only the following tags: b, em, i, strong, u, br, cite, p, img, li, ul, ol, sup, sub, s
 * @param content HTML content
 * @param extraTags any other tags that you may want to keep, e.g. "a"
 * @return the cleaned HTML of the body
 */
public String cleanContent(String content, String ... extraTags) {
	// Whitelist.simpleText() allows only simple text formatting: b, em, i, strong, u. All other HTML (tags and attributes) is removed.
	// Note: jsoup 1.14.2 deprecated Whitelist in favor of Safelist.
	Whitelist allowedTags = Whitelist.simpleText();
	allowedTags.addTags("br", "cite", "em", "i", "p", "strong", "img", "li", "ul", "ol", "sup", "sub", "s");
	allowedTags.addTags(extraTags);
	allowedTags.addAttributes("p", "style"); // Needed for left and right alignment
	allowedTags.addAttributes("img", "src", "style", "class"); 
	if (Arrays.asList(extraTags).contains("a")) {
		allowedTags.addAttributes("a", "href", "target"); 
	}
	Document dirty = Jsoup.parseBodyFragment(content, "");
	Cleaner cleaner = new Cleaner(allowedTags);
	Document clean = cleaner.clean(dirty);
	clean.outputSettings().escapeMode(EscapeMode.xhtml); // Does not escape UTF-8 characters
	String safe = clean.body().html();
	return safe;
}
 
Developer: xtianus, Project: yadaframework, Lines of code: 23, Source: YadaWebUtil.java

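Example 3: assertContainsLink

import org.jsoup.Jsoup; // import the package/class this method depends on
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;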
public static void assertContainsLink(String expected, StringBuffer actual) {
    String linkifiedUri = actual.toString();
    Document document = Jsoup.parseBodyFragment(linkifiedUri);
    Element anchorElement = document.select("a").first();
    assertNotNull("No <a> element found", anchorElement);
    assertEquals(expected, anchorElement.text());
    assertEquals(expected, anchorElement.attr("href"));
}
 
Developer: philipwhiuk, Project: q-mail, Lines of code: 9, Source: UriParserTestHelper.java

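Example 4: assertLinkOnly

import org.jsoup.Jsoup; // import the package/class this method depends on
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;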
public static void assertLinkOnly(String expected, StringBuffer actual) {
    String linkifiedUri = actual.toString();
    Document document = Jsoup.parseBodyFragment(linkifiedUri);
    Element anchorElement = document.select("a").first();
    assertNotNull("No <a> element found", anchorElement);
    assertEquals(expected, anchorElement.text());
    assertEquals(expected, anchorElement.attr("href"));

    assertAnchorElementIsSoleContent(document, anchorElement);
}
 
Developer: philipwhiuk, Project: q-mail, Lines of code: 11, Source: UriParserTestHelper.java
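
Examples 3 and 4 both rely on `select("a").first()` returning null when the fragment contains no anchor, which is why they assert non-null before reading the element. The following is a minimal sketch of that behavior; the class name `LinkPresenceSketch` and the helper `hasLink` are hypothetical and not part of q-mail.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class LinkPresenceSketch {
    /** Returns true if the body fragment contains at least one <a> element. */
    static boolean hasLink(String bodyHtml) {
        Document document = Jsoup.parseBodyFragment(bodyHtml);
        // Elements.first() yields null for an empty result set
        Element anchor = document.select("a").first();
        return anchor != null;
    }

    public static void main(String[] args) {
        System.out.println(hasLink("<a href=\"http://example.org\">http://example.org</a>"));
        System.out.println(hasLink("plain text without markup"));
    }
}
```

Checking for null before calling `text()` or `attr()` avoids a NullPointerException on input that was not linkified.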

Example 5: handle

import org.jsoup.Jsoup; // import the package/class this method depends on
import org.jsoup.nodes.Document;
import java.io.IOException;
/**
 * Jsoup.parse(html)
 * Jsoup.parse(html, baseUri)
 * Jsoup.parseBodyFragment(bodyHtml)
 * Jsoup.parseBodyFragment(bodyHtml, baseUri)
 */
@Override
public Document handle( String html,boolean fragment) throws IOException{
	// Read the Jsoup base URI setting
	String baseUri = Docx4jProperties.getProperty(Docx4jConstants.DOCX4J_JSOUP_PARSE_BASEURI,"");
	// Use Jsoup to parse the HTML into a Document object
	Document doc = fragment ? Jsoup.parseBodyFragment( html, baseUri) : Jsoup.parse( html,baseUri);
	// Return the Document object
	return doc;
}
 
Developer: vindell, Project: docx4j-template, Lines of code: 16, Source: XHTMLDocumentHandler.java
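
Example 5 switches between `Jsoup.parse` and `Jsoup.parseBodyFragment`. As a rough illustration of what the fragment variant produces, the sketch below parses a snippet as body content and inspects the resulting document shell. The class `FragmentShellSketch` and its helper methods are hypothetical names chosen for this illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class FragmentShellSketch {
    /** Number of direct children placed in <body> for the given fragment. */
    static int bodyChildCount(String bodyHtml) {
        Document doc = Jsoup.parseBodyFragment(bodyHtml);
        return doc.body().children().size();
    }

    /** Tag name of the first element in <body>. */
    static String firstBodyTag(String bodyHtml) {
        Document doc = Jsoup.parseBodyFragment(bodyHtml);
        return doc.body().child(0).tagName();
    }

    /** The fragment is wrapped in a full html/head/body shell, so head() is present. */
    static boolean hasHead(String bodyHtml) {
        return Jsoup.parseBodyFragment(bodyHtml).head() != null;
    }

    public static void main(String[] args) {
        String fragment = "<p>Hello</p><p>World</p>";
        System.out.println(bodyChildCount(fragment));
        System.out.println(firstBodyTag(fragment));
        System.out.println(hasHead(fragment));
    }
}
```

Because the result is still a complete `Document`, callers usually read the parsed content back through `doc.body()`.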

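Example 6: postProcess

import org.jsoup.Jsoup; // import the package/class this method depends on
import org.jsoup.nodes.Document;
import org.jsoup.safety.Cleaner;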
@Override
public String postProcess(String html) {
	// Use a faked baseURI, otherwise all relative urls will be stripped out
	Document body = Jsoup.parseBodyFragment(html, "http://localhost/sanitize");
	
	Cleaner cleaner = new Cleaner(whiteList);
	body = cleaner.clean(body);

	for (HtmlTransformer transformer : htmlTransformers)
		transformer.transform(body);
	return body.body().html();
}
 
Developer: jmfgdev, Project: gitplex-mit, Lines of code: 13, Source: DefaultMarkdownManager.java
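
The comment in Example 6 concerns relative URLs, which can only be resolved when a base URI is supplied. The sketch below shows just the resolution step with `absUrl`, not the cleaner itself; the class `BaseUriSketch` and the helper `absoluteHref` are hypothetical names used for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class BaseUriSketch {
    /** Resolves the href of the first link in the fragment against baseUri. */
    static String absoluteHref(String bodyHtml, String baseUri) {
        Document doc = Jsoup.parseBodyFragment(bodyHtml, baseUri);
        Element link = doc.select("a").first();
        // absUrl returns an empty string when the URL cannot be made absolute
        return link == null ? "" : link.absUrl("href");
    }

    public static void main(String[] args) {
        String html = "<a href=\"/docs\">Docs</a>";
        System.out.println(absoluteHref(html, "http://localhost/sanitize"));
        System.out.println(absoluteHref(html, "").isEmpty());
    }
}
```

With an empty base URI the relative link has no absolute form, which is consistent with the workaround of passing a placeholder base URI before cleaning.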

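Example 7: formatToXHtml

import org.jsoup.Jsoup; // import the package/class this method depends on
import org.jsoup.nodes.Document;
import java.nio.charset.Charset;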
/**
 * Uses Jsoup to convert from HTML to XHTML
 */
private byte[] formatToXHtml(String html, Charset charset) {
    Document document = Jsoup.parseBodyFragment(html);
    document.outputSettings().syntax(Document.OutputSettings.Syntax.xml);
    document.outputSettings().charset(charset);
    return document.toString().getBytes(charset);
}
 
Developer: Asymmetrik, Project: nifi-nars, Lines of code: 10, Source: GetWebpage.java
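
To make the effect of the XML syntax setting in Example 7 more concrete, here is a small sketch that serializes a fragment containing a void element. The class `XmlSyntaxSketch` and the helper `toXhtmlBody` are hypothetical; pretty-printing is switched off only to keep the output on one line.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class XmlSyntaxSketch {
    /** Serializes the parsed fragment's body content using XML output syntax. */
    static String toXhtmlBody(String bodyHtml) {
        Document doc = Jsoup.parseBodyFragment(bodyHtml);
        doc.outputSettings()
           .syntax(Document.OutputSettings.Syntax.xml) // self-close void tags such as <br>
           .prettyPrint(false);                        // keep the markup on one line
        return doc.body().html();
    }

    public static void main(String[] args) {
        // The <br> in the input is written as a self-closed tag in the output
        System.out.println(toXhtmlBody("<p>a<br>b</p>"));
    }
}
```

The exact spacing inside the self-closed tag may vary between jsoup versions, so downstream code should not depend on it.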

Example 8: generateFormattedTextObjects

import org.jsoup.Jsoup; // import the package/class this method depends on
import org.jsoup.nodes.Document;
import java.util.LinkedList;
private void generateFormattedTextObjects(String text) throws IllegalArgumentException {
	Document document = Jsoup.parseBodyFragment(text);
	// Disable pretty-printing so jsoup does not re-indent the markup on output
	document.outputSettings(new Document.OutputSettings().prettyPrint(false));
	parseFormattedMessageNode(document.body(), new LinkedList<>());
}
 
Developer: Gurgy, Project: Cypher, Lines of code: 7, Source: EventListItemPresenter.java

Example 9: getText

import org.jsoup.Jsoup; // import the package/class this method depends on
import org.jsoup.nodes.Document;
String getText(final HtmlBlock node) {
  final Document document = Jsoup.parseBodyFragment(node.getChars().toString());
  return document.text();
}
 
Developer: camunda, Project: camunda-bpm-swagger, Lines of code: 5, Source: HtmlDocumentInterpreter.java
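
Example 9 uses `Document.text()` to reduce an HTML block to plain text. A minimal standalone sketch of the same idea follows; the class `PlainTextSketch` and the helper `plainText` are hypothetical and take a plain String rather than the project's `HtmlBlock` type.

```java
import org.jsoup.Jsoup;

public class PlainTextSketch {
    /** Strips markup from a body fragment and returns the combined text. */
    static String plainText(String bodyHtml) {
        return Jsoup.parseBodyFragment(bodyHtml).text();
    }

    public static void main(String[] args) {
        // prints "Hello world"
        System.out.println(plainText("<p>Hello <b>world</b></p>"));
    }
}
```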

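Example 10: htmlNodeToMap

import org.jsoup.Jsoup; // import the package/class this method depends on
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;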
private Map<String, ParameterDescription> htmlNodeToMap(final HtmlBlock htmlBlock) {
  final String htmlBlockBody = prepareHTML(htmlBlock);
  final Document document = Jsoup.parseBodyFragment(htmlBlockBody);
  final Elements trs = document.select("tr");
  Integer nameIdx = null;
  Integer descriptionIdx = null;
  Integer typeIdx = null;
  Integer requiredIdx = null;
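  if (trs.isEmpty()) {
    // No table rows: nothing to map (avoids an IndexOutOfBoundsException on trs.get(0))
    return new HashMap<>();
  }
  final Elements ths = trs.get(0).select("th");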

  if(ths.size() == 0) {
    // Workaround for missing table header
    nameIdx = 0;
    switch(trs.get(0).select("td").size()) {
    case 2:
      descriptionIdx = 1;
      break;
    case 3:
      typeIdx = 1;
      descriptionIdx = 2;
      break;
    }
  }
  for (int i = 0; i < ths.size(); i++) {
    final Element element = ths.get(i);
    switch(element.text()) {
    case "Name":
    case "Code":
    case "Form Part Name":
      nameIdx = i;
      break;
    case "Description":
      descriptionIdx = i;
      break;
    case "Media type":
    case "Type":
    case "Content Type":
    case "Value":
      typeIdx = i;
      break;
    case "Required?":
      requiredIdx = i;
      break;
    default:
      log.debug("Fieldname unknown: " + element.text());
      break;
    }
  }
  final HashMap<String, ParameterDescription> result = new HashMap<>();
  for (final Element tr : trs) {
    final Elements tds = tr.select("td");
    if (tds.size() >= 2) {
      final ParameterDescription.ParameterDescriptionBuilder builder = ParameterDescription.builder();
      Optional.ofNullable(nameIdx).map(tds::get).map(Element::text).ifPresent(builder::id);
      Optional.ofNullable(descriptionIdx).map(tds::get).map(Element::text).ifPresent(builder::description);
      Optional.ofNullable(typeIdx).map(tds::get).map(Element::text).ifPresent(builder::type);
      Optional.ofNullable(requiredIdx).map(tds::get).map(Element::text).map(o -> o.equals("Yes")).ifPresent(builder::required);
      final ParameterDescription parameterDescription = builder.build();
      result.put(parameterDescription.getId(), parameterDescription);
    }
  }
  return result;
}
 
Developer: camunda, Project: camunda-bpm-swagger, Lines of code: 64, Source: HtmlDocumentInterpreter.java


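Note: The org.jsoup.Jsoup.parseBodyFragment examples in this article were compiled by 純淨天空 from GitHub, MSDocs, and other open-source code and documentation platforms. The snippets are taken from open-source projects contributed by their respective developers, and copyright in the source code remains with the original authors. For distribution and use, refer to each project's License. Do not reproduce without permission.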