

Java HtmlPage.getByXPath Method Code Examples

This article collects typical usage examples of the com.gargoylesoftware.htmlunit.html.HtmlPage.getByXPath method in Java. If you are wondering how HtmlPage.getByXPath is used in practice, or are looking for working examples of it, the selected code samples here may help. You can also look further into usage examples of the enclosing class, com.gargoylesoftware.htmlunit.html.HtmlPage.


A total of 8 code examples of the HtmlPage.getByXPath method are shown below, sorted by popularity by default.
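Several of the examples below rely on the fact that the kind of object in the returned list depends on the XPath expression: Example 2 reads a number from a count(...) expression and casts //@href results to DomAttr, and Example 3 reads text() nodes. The XPath expressions themselves can be tried without HtmlUnit. The following is a minimal sketch using the JDK's own javax.xml.xpath API, not HtmlUnit; the class name XPathResultKinds and the MARKUP string are hypothetical, and the markup is assumed to be well-formed XML. Unlike getByXPath, the JDK API asks the caller to name the expected result type up front.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathResultKinds {
    // Hypothetical, well-formed markup standing in for a fetched page.
    static final String MARKUP =
        "<html><body>"
        + "<div class='card'><a href='/owners/1'>George</a><a href='/pets/1'>Leo</a></div>"
        + "<div class='card'><a href='/owners/2'>Betty</a></div>"
        + "</body></html>";

    // Evaluates an XPath expression against MARKUP and returns the result as the requested type.
    static Object evaluate(String expression, QName type) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(MARKUP)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    // count(...) yields a number rather than a set of nodes.
    static double countCards() {
        return (Double) evaluate("count(//div[@class='card'])", XPathConstants.NUMBER);
    }

    // //@href selects attribute nodes; their values are the link targets.
    static List<String> firstCardLinks() {
        NodeList attrs = (NodeList) evaluate("//div[@class='card'][1]//@href", XPathConstants.NODESET);
        List<String> links = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            links.add(attrs.item(i).getNodeValue());
        }
        return links;
    }

    public static void main(String[] args) {
        System.out.println(countCards());      // prints 2.0
        System.out.println(firstCardLinks());  // prints [/owners/1, /pets/1]
    }
}
```

Real-world HTML is often not well-formed XML, which is one reason to query the DOM that HtmlUnit builds rather than parsing the page source this way.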

Example 1: searchDuck

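import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlAnchor;
import com.gargoylesoftware.htmlunit.html.HtmlPage; // import the package/class this method depends on
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

public static List<String> searchDuck(String keyword) {
    List<String> searchResults = new ArrayList<>();
    // try-with-resources closes the WebClient even if the request fails
    try (WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
        // URL-encode the keyword so spaces and special characters survive in the query string
        String query = URLEncoder.encode(keyword, "UTF-8");
        HtmlPage page = webClient.getPage("https://duckduckgo.com/html/?q=" + query);
        List<HtmlAnchor> anchors = page.getByXPath("//a[@class='result__url']");
        for (HtmlAnchor a : anchors) {
            searchResults.add(a.getHrefAttribute());
        }
    } catch (Exception e) {
        System.err.println(e);
    }
    return searchResults;
}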
 
Developer ID: nitroignika, Project: duck-feed-2, Lines of code: 18, Source: DuckScrape.java

Example 2: shouldShowPetIndexPage

import com.gargoylesoftware.htmlunit.html.HtmlPage; // import the package/class this method depends on
@Test
public void shouldShowPetIndexPage() throws Exception {
  HtmlPage ownerIndexPage = getPage("/pets");
  assertEquals(13.0, ownerIndexPage.getByXPath("count(//div[@class='card'])").get(0));

  String content = ownerIndexPage.asText();
  assertTrue(content.contains("Leo"));
  assertTrue(content.contains("Owned by: George Franklin"));
  assertTrue(content.contains("Birthday: 2010-09-07"));
  assertTrue(content.contains("Type: cat"));


  List viewEditDeleteLinks = ownerIndexPage.getByXPath("//div[@class='card'][1]//@href");
  assertEquals("/owners/1", ((DomAttr) viewEditDeleteLinks.get(0)).getValue());
  assertEquals("/pets/1", ((DomAttr) viewEditDeleteLinks.get(1)).getValue());
  assertEquals("/pets/1/edit", ((DomAttr) viewEditDeleteLinks.get(2)).getValue());
  assertEquals("/pets/1/delete", ((DomAttr) viewEditDeleteLinks.get(3)).getValue());
}
 
Developer ID: puncha, Project: petclinic, Lines of code: 19, Source: PetControllerTests.java

Example 3: shouldShowOwnerIndexPage

import com.gargoylesoftware.htmlunit.html.HtmlPage; // import the package/class this method depends on
@Test
public void shouldShowOwnerIndexPage() throws Exception {
  HtmlPage ownerIndexPage = getPage("/owners");
  assertEquals(10.0, ownerIndexPage.getByXPath("count(//tbody/tr)").get(0));

  List ownerProperties = ownerIndexPage.getByXPath("//tbody/tr[1]/td/text()");
  assertEquals("George", ownerProperties.get(0).toString());
  assertEquals("Franklin", ownerProperties.get(1).toString());
  assertEquals("110 W. Liberty St.", ownerProperties.get(2).toString());
  assertEquals("Madison", ownerProperties.get(3).toString());
  assertEquals("6085551023", ownerProperties.get(4).toString());
}
 
Developer ID: puncha, Project: petclinic, Lines of code: 13, Source: OwnerControllerTests.java

Example 4: searchInBaidu

import com.gargoylesoftware.htmlunit.html.HtmlPage; // import the package/class this method depends on
public void searchInBaidu() throws Exception {
	HtmlPage page = webClient.getPage("https://www.baidu.com/");
	HtmlForm form = page.getFormByName("f");

	HtmlTextInput input = form.getInputByName("wd");
	HtmlSubmitInput button = form.getInputByValue("百度一下"); // the "Baidu Search" submit button

	input.setValueAttribute("无锡"); // search term: Wuxi
	HtmlPage nextPage = button.click();

	// Find the first link whose text mentions "百度百科" (Baidu Baike)
	HtmlAnchor next = null;
	List<?> list = nextPage.getByXPath("//a");
	for (Object obj : list) {
		if (obj instanceof HtmlAnchor) {
			HtmlAnchor ha = (HtmlAnchor) obj;
			if (ha.getTextContent().contains("百度百科")) {
				next = ha;
				break;
			}
		}
	}

	// Guard against a results page that contains no such link
	if (next == null) {
		System.out.println("No matching link found");
		return;
	}

	System.out.println(next.asXml());
	System.out.println("--------------------------");
	HtmlPage p = next.click();
	System.out.println(p.asXml());
}
 
Developer ID: knshen, Project: JSearcher, Lines of code: 34, Source: PostDemo.java

Example 5: shouldOwnerIndexPageNavigateToOwnerDetailPage

import com.gargoylesoftware.htmlunit.html.HtmlPage; // import the package/class this method depends on
@Test
public void shouldOwnerIndexPageNavigateToOwnerDetailPage() throws Exception {
  HtmlPage ownerIndexPage = getPage("/owners");
  List viewEditDeleteButtons = ownerIndexPage.getByXPath("//tbody/tr[1]/td[6]//a");
  HtmlAnchor aHref = (HtmlAnchor) viewEditDeleteButtons.get(0);
  HtmlPage viewOwnerPage = aHref.click();
  assertTrue(viewOwnerPage.getUrl().toString().matches(".*/owners/1$"));
}
 
Developer ID: puncha, Project: petclinic, Lines of code: 9, Source: OwnerControllerTests.java

Example 6: fetchLabelsWithAnyTitle

import com.gargoylesoftware.htmlunit.html.HtmlPage; // import the package/class this method depends on
private static List<HtmlLabel> fetchLabelsWithAnyTitle(HtmlPage page) {
    return (List<HtmlLabel>) page.getByXPath(LABEL_XPATH);
}
 
Developer ID: theovier, Project: lernplattform-crawler, Lines of code: 4, Source: TermCrawler.java
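Examples 4, 6 and 8 each cast elements of the result list to a specific node type. Where an XPath expression might match more than one kind of object, one possible alternative to an unchecked cast of the whole list is to copy only the elements of the expected type into a typed list. The sketch below uses plain JDK collections; the class name TypedResults and the helper filterByType are hypothetical and not part of HtmlUnit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TypedResults {
    // Copies the elements of items that are instances of type into a new typed list.
    // Elements of other types, and nulls, are skipped rather than causing a ClassCastException.
    static <T> List<T> filterByType(List<?> items, Class<T> type) {
        List<T> typed = new ArrayList<>();
        for (Object item : items) {
            if (type.isInstance(item)) {
                typed.add(type.cast(item));
            }
        }
        return typed;
    }

    public static void main(String[] args) {
        // Stand-in for a mixed result list; with HtmlUnit this would be the list returned by getByXPath.
        List<?> mixed = Arrays.asList("a", 1, "b", 2.5, null);
        System.out.println(filterByType(mixed, String.class)); // prints [a, b]
    }
}
```

With HtmlUnit on the classpath, a call such as filterByType(page.getByXPath(LABEL_XPATH), HtmlLabel.class) would play the role of the cast in Example 6, at the cost of silently dropping non-matching elements.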

Example 7: fetchResourceIDs

import com.gargoylesoftware.htmlunit.html.HtmlPage; // import the package/class this method depends on
protected static List<String> fetchResourceIDs(HtmlPage coursePage, final String LIST_ITEMS_XPATH) {
    List<String> courseIDs = new ArrayList<>();
    List<?> courseListItems = coursePage.getByXPath(LIST_ITEMS_XPATH);
    getIDs(courseListItems).forEach(id -> courseIDs.add(id));
    return courseIDs;
}
 
Developer ID: theovier, Project: lernplattform-crawler, Lines of code: 7, Source: ResourceIDCrawler.java

Example 8: obtainPersonas

import com.gargoylesoftware.htmlunit.html.HtmlPage; // import the package/class this method depends on
public Persona obtainPersonas(String host)
    throws FailingHttpStatusCodeException, MalformedURLException,
    IOException {
  if (this.patterns == null || this.patterns.isEmpty())
    initPatterns();

  WebClient webClient = new WebClient();
  webClient.getOptions().setJavaScriptEnabled(false);
  webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
  webClient.getOptions().setThrowExceptionOnScriptError(false);
  HtmlPage htmlPage = null;
  Persona persona = new Persona();
  persona.setHostPatternKey(host);
  persona.setPageId(page.toURI().toString());

  try {
    htmlPage = webClient.getPage(page.toURL());
  } catch (Exception e) {
    e.printStackTrace(System.out);
    webClient.close();
    return persona;
  }

  String pattern = patterns.get(host);
  if (pattern == null) {
    // No XPath pattern is registered for this host, so there is nothing to extract
    webClient.close();
    return persona;
  }
  boolean isAnchor = pattern.contains("@href");

  List<?> elements = htmlPage.getByXPath(pattern);
  for (int i = 0; i < elements.size(); i++) {
    String username = null;
    if (isAnchor) {
      String link = ((HtmlAnchor) elements.get(i)).getHrefAttribute();
      if (isUserLink(link)) {
        int index = link.lastIndexOf('/');
        username = link.substring(index + 1);
      }
    } else {
      if (elements.get(i) instanceof String) {
        username = ((String) elements.get(i)).trim();
      } else {
        username = ((DomNode) elements.get(i)).asText();
      }
    }

    if (username != null && !username.equals("")) {
      persona.getUsernames().add(username);
    }

  }

  webClient.close();
  return persona;
}
 
Developer ID: USCDataScience, Project: PersonaExtraction, Lines of code: 58, Source: PersonaExtractor.java


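Note: The com.gargoylesoftware.htmlunit.html.HtmlPage.getByXPath examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their developers. Copyright in the source code belongs to the original authors; for distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.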