当前位置: 首页>>代码示例>>PHP>>正文


PHP Crawler::filter方法代码示例

本文整理汇总了PHP中Crawler::filter方法的典型用法代码示例。如果您正苦于以下问题:PHP Crawler::filter方法的具体用法?PHP Crawler::filter怎么用?PHP Crawler::filter使用的例子?那么恭喜您, 这里精选的方法代码示例或许可以为您提供帮助。您也可以进一步了解该方法所在Crawler的用法示例。


在下文中一共展示了Crawler::filter方法的3个代码示例,这些例子默认根据受欢迎程度排序。您可以为喜欢或者感觉有用的代码点赞,您的评价将有助于系统推荐出更棒的PHP代码示例。

示例1: perform

 function perform()
 {
     $ps = DB::prepare('INSERT INTO listings SET scraped=FALSE, code=:code, title=:title, link=:link, date=:date, price=:price, neighborhood=:neighborhood');
     array_map(function ($url) use($ps) {
         $code = substr($url, 30, 3);
         // SUPER brittle obvs
         $crawler = new Crawler(Guzzle::get($url)->getBody());
         $crawler->filter('.row > .txt')->each(function ($node) use($ps, $code) {
             try {
                 $a = $node->filter('.pl > a.hdrlnk');
                 $ps->execute([':code' => $code, ':title' => $a->text(), ':link' => $a->attr('href'), ':date' => strftime('%Y-%m-%d', strtotime($node->filter('.pl > .date')->text())), ':price' => ($n = $node->filter('.l2 > .price')) && $n->count() ? preg_replace('/\\D/', '', $n->text()) : null, ':neighborhood' => ($n = $node->filter('.l2 > .pnr > small')) && $n->count() ? $n->text() : null]);
             } catch (Exception $e) {
                 Logger::error($e->getMessage(), $ps->errorinfo());
             }
         });
     }, ['http://newyork.craigslist.org/nfa/', 'http://newyork.craigslist.org/roo/', 'http://newyork.craigslist.org/sub/']);
 }
开发者ID:dthtvwls,项目名称:padscraper,代码行数:17,代码来源:Crawl.php

示例2: perform

 function perform()
 {
     $q = DB::query('SELECT link, neighborhood FROM listings WHERE scraped != TRUE', PDO::FETCH_ASSOC);
     $ps = DB::prepare('UPDATE listings SET scraped=TRUE, street=:street, description=:description, lat=:lat, lng=:lng WHERE link=:link');
     /*
     Guzzle::sendAll(array_map(function ($listing) {
       return Guzzle::createRequest('GET', 'http://newyork.craigslist.org' . $listing['link']);
     }, iterator_to_array($q)), ['complete' => function ($event) use($ps) {
       try {        
         $body = $event->getResponse()->getBody();
       
         $crawler = new Crawler($body);
         $readability = new Readability($body);
     
         $street = $crawler->filter('.mapAndAttrs > .mapbox > div.mapaddress');
       
         $ps->execute([
           ':link' => parse_url($event->getRequest()->getUrl())['path'],
           ':lat'  => null,
           ':lng'  => null,
           ':street' => $street->count() ? $street->text() : null,
           ':description' => $readability->init() ? trim(strip_tags(tidy_parse_string($readability->getContent()->innerHTML, [], 'UTF8'))) : null    
         ]);
       } catch (Exception $e) {
         Logger::error($e->getMessage(), $ps->errorinfo());
       }
     }]);
     */
     foreach ($q as $listing) {
         try {
             $body = Guzzle::get('http://newyork.craigslist.org' . $listing['link'])->getBody();
             $crawler = new Crawler($body);
             $readability = new Readability($body);
             $street = $crawler->filter('.mapAndAttrs > .mapbox > div.mapaddress');
             $url = 'http://maps.googleapis.com/maps/api/geocode/json?address=' . ($street->count() ? $street->text() : $listing['neighborhood']);
             $json = json_decode(Guzzle::get($url)->getBody(), true);
             $loc = isset($json['results'][0]) ? $json['results'][0]['geometry']['location'] : null;
             $ps->execute([':link' => $listing['link'], ':lat' => isset($loc['lat']) ? $loc['lat'] : null, ':lng' => isset($loc['lng']) ? $loc['lng'] : null, ':street' => $street->count() ? $street->text() : null, ':description' => $readability->init() ? trim(strip_tags(tidy_parse_string($readability->getContent()->innerHTML, [], 'UTF8'))) : null]);
         } catch (Exception $e) {
             Logger::error($e->getMessage(), $ps->errorinfo());
         }
     }
 }
开发者ID:dthtvwls,项目名称:padscraper,代码行数:43,代码来源:Scrape.php

示例3: testFacade

 function testFacade()
 {
     $crawler = new Crawler('<html/>');
     $this->assertEquals($crawler->filter('html')->count(), 1);
 }
开发者ID:dthtvwls,项目名称:chisel,代码行数:5,代码来源:CrawlerTest.php


注:本文中的Crawler::filter方法示例由纯净天空整理自Github/MSDocs等开源代码及文档管理平台,相关代码片段筛选自各路编程大神贡献的开源项目,源码版权归原作者所有,传播和使用请参考对应项目的License;未经允许,请勿转载。