

Java Page.setUrl Method Code Examples

This article collects typical usage examples of the us.codecraft.webmagic.Page.setUrl method in Java. If you are wondering how Page.setUrl is used in practice, the selected examples below may help. You can also look into further usage examples of its enclosing class, us.codecraft.webmagic.Page.


Ten code examples of Page.setUrl are shown below, sorted by popularity by default.

Example 1: download

import us.codecraft.webmagic.Page; // import the package/class this method depends on
@Override
public Page download(Request request, Task task) {
	String html = null;
	try {
		html = casperjs.gatherHtml(new cn.nest.spider.entity.commons.Request(request.getUrl(), true));
	} catch(IOException e) {
		request.putExtra("EXCEPTION", e);
		onError(request);
		return null;
	}
	Page page = new Page().setRawText(html);
	page.setRequest(request);
	page.setUrl(new PlainText(request.getUrl()));
	onSuccess(request);
	return page;
}
 
Developer ID: TransientBuckwheat, Project: nest-spider, Lines of code: 17, Source: CasperjsDownloader.java

Example 2: download

import us.codecraft.webmagic.Page; // import the package/class this method depends on
@Override
public Page download(Request request, Task task) {
    String html = null;
    Site site = null;
    if (task != null) {
        site = task.getSite();
    }
    try {
        html = casperjs.gatherHtml(new com.gs.spider.model.commons.Request(request.getUrl(), true));
    } catch (Exception e) {
        // site stays null when task is null, so guard before dereferencing it
        if (site != null && site.getCycleRetryTimes() > 0) {
            return addToCycleRetry(request, site);
        }
        request.putExtra("EXCEPTION", e);
        onError(request);
        return null;
    }
    Page page = new Page();
    page.setRawText(html);
    page.setUrl(new PlainText(request.getUrl()));
    page.setRequest(request);
    onSuccess(request);
    return page;
}
 
Developer ID: gsh199449, Project: spider, Lines of code: 25, Source: CasperjsDownloader.java

Example 3: handleResponse

import us.codecraft.webmagic.Page; // import the package/class this method depends on
protected Page handleResponse(Request request, String charset, HttpResponse httpResponse, Task task)
		throws IOException {
	String content = IOUtils.toString(httpResponse.getEntity().getContent(), charset);
	Page page = new Page();
	page.setHtml(new Html(UrlUtils.fixAllRelativeHrefs(content, request.getUrl())));
	page.setUrl(new PlainText(request.getUrl()));
	page.setRequest(request);

	// copy the HTTP status code and the response headers onto the page
	page.putHttpResponse(Constant.STATUS_CODE, String.valueOf(httpResponse.getStatusLine().getStatusCode()));
	Header[] headers = httpResponse.getAllHeaders();
	for (Header header : headers) {
		page.putHttpResponse(header.getName(), header.getValue());
	}

	return page;
}
 
Developer ID: yuany, Project: en-webmagic, Lines of code: 18, Source: HttpClientDownloader.java

Example 4: handleResponse

import us.codecraft.webmagic.Page; // import the package/class this method depends on
protected Page handleResponse(Request request, String charset, HttpResponse httpResponse, Task task) throws IOException {
    byte[] bytes = IOUtils.toByteArray(httpResponse.getEntity().getContent());
    String contentType = httpResponse.getEntity().getContentType() == null ? "" : httpResponse.getEntity().getContentType().getValue();
    Page page = new Page();
    page.setBytes(bytes);
    if (!request.isBinaryContent()){
        if (charset == null) {
            charset = getHtmlCharset(contentType, bytes);
        }
        page.setCharset(charset);
        page.setRawText(new String(bytes, charset));
    }
    page.setUrl(new PlainText(request.getUrl()));
    page.setRequest(request);
    page.setStatusCode(httpResponse.getStatusLine().getStatusCode());
    page.setDownloadSuccess(true);
    if (responseHeader) {
        page.setHeaders(HttpClientUtils.convertHeaders(httpResponse.getAllHeaders()));
    }
    return page;
}
 
Developer ID: code4craft, Project: webmagic, Lines of code: 22, Source: HttpClientDownloader.java
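
Example 4 falls back to a getHtmlCharset(contentType, bytes) helper whose body is not shown when no charset is supplied. The following is only a rough sketch of one part of what such a helper might do: reading the charset parameter from a Content-Type value and defaulting to UTF-8. The class name CharsetSketch, its method names, and the UTF-8 default are assumptions made for illustration, not WebMagic's actual implementation, which may also inspect the page bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of WebMagic.
final class CharsetSketch {
    private CharsetSketch() {}

    /** Returns the charset named in a Content-Type value, or UTF-8 if it is absent or unsupported. */
    static Charset fromContentType(String contentType) {
        if (contentType == null) {
            return StandardCharsets.UTF_8;
        }
        // A Content-Type value looks like "text/html; charset=ISO-8859-1".
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Malformed or unsupported charset name: use the assumed default.
                    return StandardCharsets.UTF_8;
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    /** Decodes the response bytes with the charset derived from the Content-Type value. */
    static String decode(byte[] bytes, String contentType) {
        return new String(bytes, fromContentType(contentType));
    }
}
```

With a helper along these lines, the decoded string is what would be passed to page.setRawText before page.setUrl and page.setRequest are called, as in Example 4.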

Example 5: test

import us.codecraft.webmagic.Page; // import the package/class this method depends on
@Ignore
@Test
public void test() {
    ModelPageProcessor modelPageProcessor = ModelPageProcessor.create(Site.me(), OschinaBlog.class);
    Page page = new Page();
    page.setRequest(new Request("http://my.oschina.net/flashsword/blog"));
    page.setUrl(new PlainText("http://my.oschina.net/flashsword/blog"));
    page.setHtml(new Html(html));
    long time = System.currentTimeMillis();
    for (int i = 0; i < 1000; i++) {
        modelPageProcessor.process(page);
    }
    System.out.println(System.currentTimeMillis() - time);
    time = System.currentTimeMillis();
    for (int i = 0; i < 1000; i++) {
        modelPageProcessor.process(page);
    }
    System.out.println(System.currentTimeMillis() - time);
}
 
Developer ID: code4craft, Project: webmagic, Lines of code: 20, Source: ProcessorBenchmark.java
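
Example 5 runs the same 1000-iteration loop twice and prints both timings. Presumably the first pass absorbs class loading and JIT warm-up, so the second number is the more representative one. That pattern can be sketched in isolation as below. WarmupTimer and timeAfterWarmup are hypothetical names introduced here. For serious measurements a harness such as JMH is usually a better choice than hand-rolled timing.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper illustrating the warm-up-then-measure pattern.
final class WarmupTimer {
    private WarmupTimer() {}

    /**
     * Runs the task `iterations` times without timing it (warm-up),
     * then runs it `iterations` more times and returns the elapsed
     * milliseconds of that second batch only.
     */
    static long timeAfterWarmup(Runnable task, int iterations) {
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }
}
```

In the benchmark above, the task would be a lambda wrapping modelPageProcessor.process(page), with 1000 as the iteration count.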

Example 6: download

import us.codecraft.webmagic.Page; // import the package/class this method depends on
@Override
public Page download(Request request, Task task) {
    Page page = new Page();
    InputStream resourceAsStream = this.getClass().getResourceAsStream("/html/mock-github.html");
    try {
        page.setRawText(IOUtils.toString(resourceAsStream));
    } catch (IOException e) {
        e.printStackTrace();
    }
    page.setRequest(new Request("https://github.com/code4craft/webmagic"));
    page.setUrl(new PlainText("https://github.com/code4craft/webmagic"));
    return page;
}
 
Developer ID: code4craft, Project: webmagic, Lines of code: 14, Source: MockGithubDownloader.java
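
Example 6 reads a classpath resource without naming a charset, never closes the stream, and would fail with a less obvious error if getResourceAsStream returned null for a missing file. A minimal standard-library sketch of a stricter read follows, assuming the mock file is UTF-8 encoded. ResourceText and readUtf8 are hypothetical names, not part of WebMagic or Commons IO.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for reading a mock resource as text.
final class ResourceText {
    private ResourceText() {}

    /** Reads the whole stream as UTF-8 text and closes it; fails fast on a missing resource. */
    static String readUtf8(InputStream in) {
        if (in == null) {
            // getResourceAsStream returns null when the resource does not exist.
            throw new UncheckedIOException(new IOException("resource not found"));
        }
        try (InputStream stream = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int read;
            while ((read = stream.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a mock downloader like the one above, the result would be passed straight to page.setRawText, so a missing fixture surfaces as an exception rather than as a page with no text.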

Example 7: getMockJsonPage

import us.codecraft.webmagic.Page; // import the package/class this method depends on
public Page getMockJsonPage() throws IOException {
    Page page = new Page();
    page.setRawText(IOUtils.toString(PageMocker.class.getClassLoader().getResourceAsStream("json/mock-githubrepo.json")));
    page.setRequest(new Request("https://api.github.com/repos/code4craft/webmagic"));
    page.setUrl(new PlainText("https://api.github.com/repos/code4craft/webmagic"));
    return page;
}
 
Developer ID: code4craft, Project: webmagic, Lines of code: 8, Source: PageMocker.java

Example 8: getMockPage

import us.codecraft.webmagic.Page; // import the package/class this method depends on
public Page getMockPage() throws IOException {
    Page page = new Page();
    page.setRawText(IOUtils.toString(PageMocker.class.getClassLoader().getResourceAsStream("html/mock-webmagic.html")));
    page.setRequest(new Request("http://webmagic.io/list/0"));
    page.setUrl(new PlainText("http://webmagic.io/list/0"));
    return page;
}
 
Developer ID: code4craft, Project: webmagic, Lines of code: 8, Source: PageMocker.java

Example 9: testMultiModel_should_not_skip_when_match

import us.codecraft.webmagic.Page; // import the package/class this method depends on
@Test
public void testMultiModel_should_not_skip_when_match() throws Exception {
    Page page = new Page();
    page.setRawText("<div foo='foo'></div>");
    page.setRequest(new Request("http://codecraft.us/foo"));
    page.setUrl(PlainText.create("http://codecraft.us/foo"));
    ModelPageProcessor modelPageProcessor = ModelPageProcessor.create(null, ModelFoo.class, ModelBar.class);
    modelPageProcessor.process(page);
    assertThat(page.getResultItems().isSkip()).isFalse();
}
 
Developer ID: code4craft, Project: webmagic, Lines of code: 11, Source: ModelPageProcessorTest.java

Example 10: download

import us.codecraft.webmagic.Page; // import the package/class this method depends on
@Override
public Page download(Request request, Task task) {
    Page page = new Page();
    page.setRawText(html);
    page.setStatusCode(200);
    page.setRequest(new Request("https://github.com/code4craft/webmagic"));
    page.setUrl(new PlainText("https://github.com/code4craft/webmagic"));
    return page;
}
 
Developer ID: code4craft, Project: webmagic, Lines of code: 10, Source: MockGithubDownloader.java


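Note: The us.codecraft.webmagic.Page.setUrl examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects, and copyright in the source code remains with the original authors. For distribution and use, refer to each project's License. Do not reproduce without permission.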