PHP HtmlDomParser::file_get_html Method Code Examples

This article collects typical usage examples of the Sunra\PhpSimple\HtmlDomParser::file_get_html method in PHP. If you are wondering how HtmlDomParser::file_get_html is used in practice, or are looking for working examples of it, the selected code samples below may help. You can also look further into usage examples of the class that contains the method, Sunra\PhpSimple\HtmlDomParser.

A total of 15 code examples of HtmlDomParser::file_get_html are shown below, sorted by popularity by default.

Example 1: getUrlsFromSitemap

 protected static function getUrlsFromSitemap($sitemapLocation)
 {
     $sitemap = HtmlDomParser::file_get_html($sitemapLocation);
     $urls = [];
     foreach ($sitemap->find('loc') as $loc) {
         $urls[] = $loc->innertext;
     }
     return $urls;
 }

Developer: shortlist-digital, Project: agreable-catfish-importer-plugin, Lines of code: 9, Source file: Sitemap.php

Example 2: get_news

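function get_news()
{
    $base = "http://p-karaj.tvu.ac.ir";
    $news_page = HtmlDomParser::file_get_html($base . "/");
    // find() with an index returns a single node (or null), not an array.
    $elem = $news_page ? $news_page->find("#simple-list_11643", 0) : null;
    if (!$elem) {
        return null;
    }
    echo $elem->plaintext;
    // Assumption: the news link is the first anchor inside the list element.
    $anchor = $elem->find('a', 0);
    $link = $anchor ? $anchor->href : '';
    // Turn a "./path" relative link into an absolute URL.
    return $base . str_replace('./', '/', $link);
}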

Developer: moeinrahimi, Project: beheshtinotifier, Lines of code: 10, Source file: dom.php

Example 3: downloadUrl

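 protected function downloadUrl($url)
 {
     // file_get_html() returns false when the page cannot be loaded or is empty.
     $html = HtmlDomParser::file_get_html($url);
     if (!$html) {
         // Fall back to wget; escape the URL so it cannot inject shell commands.
         exec("wget -qO- " . escapeshellarg($url) . " 2>&1", $wget_result);
         // exec() fills an array of output lines, but str_get_html() expects a string.
         $html = HtmlDomParser::str_get_html(implode("\n", $wget_result));
     }
     return $html;
 }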

Developer: volrac, Project: scraper, Lines of code: 10, Source file: SiteScraper.php

Example 4: fetch

 /**
  * Retrieves information about a page from its URL.
  *
  * @param string $url The full address of the page.
  *
  * @return array An array containing the information.
  *
  * @throws \InvalidArgumentException If the address is malformed.
  * @throws \RuntimeException         If the page cannot be processed.
  */
 public function fetch($url)
 {
     // Check that this is a valid URL.
     if (!$this->testUrl($url)) {
         throw new \InvalidArgumentException($this->text->sprintf('APP_ERROR_BAD_URL', $url));
     }
     // Fetch the content at the address.
     $dom = @HtmlDomParser::file_get_html($url);
     // If an error occurred.
     if (empty($dom)) {
         throw new \RuntimeException($this->text->sprintf('APP_ERROR_UNABLE_TO_LOAD_URL', $url));
     }
     // Get the page title.
     $page_title = $dom->find('title')[0]->text();
     // Get the meta tags.
     $metas = [];
     foreach ($dom->find('head')[0]->find('meta') as $element) {
         foreach ($this->meta as $meta) {
             if ($element->hasAttribute($meta['key'])) {
                 if (strtolower($element->getAttribute($meta['key'])) == $meta['tag']) {
                     $content = $element->getAttribute('content');
                     if (!empty($content)) {
                         $metas[$meta['name']] = $content;
                     }
                 }
             }
         }
     }
     // Get the page text, truncated at the first space after 200 characters.
     $body = trim($dom->find('body')[0]->plaintext);
     $body = preg_replace('/\\s+/', ' ', $body);
     // strpos() fails when the offset exceeds the string length, so guard short bodies.
     $pos = strlen($body) > 200 ? strpos($body, ' ', 200) : false;
     if ($pos !== false) {
         $body = substr($body, 0, $pos);
     }
     // Get the images.
     $images = [];
     foreach ($dom->find('img') as $element) {
         // Check that it is a valid URL.
         if ($this->testUrl($element->src)) {
             // Only keep image file extensions.
             $parts = UriHelper::parse_url($element->src);
             if (in_array($this->file_ext($parts['path']), $this->image_extensions)) {
                 $images[] = $parts['scheme'] . '://' . $parts['host'] . $parts['path'];
             }
         }
     }
     // If we get here, everything went well.
     return ['title' => $page_title, 'text' => $body, 'images' => $images, 'metas' => $metas];
 }

Developer: etd-framework, Project: fetcher, Lines of code: 58, Source file: Fetcher.php

Example 5: getPrice

 public static function getPrice($url)
 {
     $parser = new HtmlDomParser();
     $dom = $parser->file_get_html($url);
     // Guard against a failed download or a page without a price label.
     $node = $dom ? $dom->find('div.pricelabel strong', 0) : null;
     $price = $node ? $node->plaintext : '';
     unset($dom);
     if (!empty($price)) {
         // Strip spaces, then collect every run of digits.
         preg_match_all("/(\\d+)/", str_replace(" ", "", $price), $price);
         // $price[0] holds the array of all full matches.
         if (!empty($price[0])) {
             return $price[0];
         }
     }
     return "0";
 }

Developer: Sywooch, Project: find-parser, Lines of code: 17, Source file: RabotaOlx.php
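
The getPrice example above returns the array of digit runs produced by preg_match_all. When a single numeric string is wanted instead, the digit extraction can be sketched on its own, without the DOM parser. The helper name extractPriceDigits is hypothetical and is not part of find-parser or HtmlDomParser; the sample strings are made up for illustration.

```php
<?php
// Hypothetical helper, not part of find-parser or HtmlDomParser.
function extractPriceDigits(string $text): string
{
    // Drop every non-digit character (spaces, currency labels, punctuation).
    $digits = preg_replace('/\D+/', '', $text);
    // Fall back to "0" when nothing numeric is left, mirroring getPrice().
    return ($digits === null || $digits === '') ? '0' : $digits;
}

echo extractPriceDigits('1 500 грн.'), "\n"; // prints "1500"
echo extractPriceDigits('n/a'), "\n";        // prints "0"
```

Note that this collapses decimal separators as well, so "12.50" becomes "1250". That is acceptable only if the prices being scraped are whole numbers.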

Example 6: ReadPage

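 /**
  * Reads an HTML page.
  *
  * @param $page URL or path of the page to load
  *
  * @return object|null The parsed DOM, or null if the page could not be loaded
  */
 public function ReadPage($page)
 {
     // First, fetch and parse the page.
     $page = HTMLDomParser::file_get_html($page);
     // file_get_html() yields an empty value on failure.
     if (empty($page)) {
         return null;
     }
     // Otherwise return the parsed DOM.
     return $page;
 }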

Developer: CrashfortStudios, Project: FurAffinitySDK, Lines of code: 24, Source file: PageReader.php

Example 7: players

 /**
  * Get players info
  *
  * @param int $offset First page to fetch
  * @param int|null $limit Last page to fetch; defaults to $this->pages()
  * @return array
  */
 public function players($offset = 1, $limit = null)
 {
     $players = [];
     if (is_null($limit)) {
         $limit = $this->pages();
     }
     foreach (range($offset, $limit) as $page) {
         $html = Html::file_get_html($this->url . $page);
         foreach ($html->find('.player-row') as $player) {
             $data = $this->player($player);
             if (null == $this->before or null != $this->before and call_user_func($this->before, $data)) {
                 $players[] = null != $this->after ? call_user_func($this->after, $data) : $data;
             }
         }
     }
     return $players;
 }

Developer: balatsky, Project: futhead, Lines of code: 25, Source file: Parser.php

Example 8: getRSS

 private function getRSS(CraigslistRequest $request)
 {
     $body = file_get_contents($request->url());
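     // utf8_encode() is deprecated as of PHP 8.2; this is the mbstring equivalent (requires ext-mbstring).
     $listings = simplexml_load_string(mb_convert_encoding($body, 'UTF-8', 'ISO-8859-1'));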
     $results = [];
     foreach ($listings as $item) {
         $id = substr($item->link, -15, -5);
         if (!is_numeric($id)) {
             continue;
         }
         if ($this->remove_duplicates) {
             if (in_array($id, $this->ids) || in_array((string) $item->title, $this->titles)) {
                 continue;
             }
             $this->ids[] = $id;
             $this->titles[] = (string) $item->title;
         }
         $results[$id] = ['id' => $id, 'link' => (string) $item->link, 'title' => (string) $item->title, 'description' => (string) $item->description];
         if ($request->follow()) {
             $results[$id]['content'] = [];
             $dom = HtmlDomParser::file_get_html($item->link);
             @($results[$id]['date'] = $dom->find('time', 0)->datetime);
             @($results[$id]['page_title'] = $dom->find('.postingtitletext', 0)->innertext);
             @($results[$id]['location'] = str_replace(['(', ')'], '', $dom->find('.postingtitletext small', 0)->innertext));
             @($results[$id]['price'] = $dom->find('.price', 0)->innertext);
             @($results[$id]['body'] = $dom->find('.postingbody, #postingbody, #postingBody', 0)->innertext);
             foreach ($dom->find('.attrgroup span') as $attr) {
                 $results[$id]['attributes'][] = $attr->innertext;
             }
             foreach ($request->selectors as $selector) {
                 $target = $selector['target'];
                 foreach ($dom->find($selector['element']) as $k => $attr) {
                     if (isset($selector['limit']) && $k > $selector['limit'] - 1) {
                         continue;
                     }
                     $results[$id][$selector['label']][] = $attr->{$target};
                 }
             }
         }
     }
     return $results;
 }

Developer: andrewevansmith, Project: php-craigslist-api-utility, Lines of code: 42, Source file: CraigslistApi.php

Example 9: findMovies

 public static function findMovies($str, $cinema_name)
 {
     $res = array();
     $dom = HtmlDomParser::file_get_html($str);
     $cinemas = $dom->find(".movie_results");
     foreach ($cinemas as $cinema) {
         foreach ($cinema->children() as $theater) {
             foreach ($theater->find('.desc') as $els) {
                 foreach ($els->find('h2') as $title) {
                     if (strtolower($title->text()) == $cinema_name) {
                         foreach ($theater->find('.name') as $name) {
                             $res[] = array($name->text());
                         }
                     }
                 }
             }
         }
     }
     return $res;
 }

Developer: enrike1983, Project: telegram_bot, Lines of code: 20, Source file: Crawler.php

Example 10: downloadTorrentFile

 public function downloadTorrentFile($torrentUrlCard)
 {
     $dom = HtmlDomParser::file_get_html($torrentUrlCard);
     $urlTorrentFile = $dom->find('a[id=telecharger]', 0)->href;
     $urlTorrentFile = self::CPASBIEN_BASE_URL . $urlTorrentFile;
     $curl = curl_init($urlTorrentFile);
     curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
     curl_setopt($curl, CURLOPT_COOKIESESSION, true);
     $fileContent = curl_exec($curl);
     curl_close($curl);
     // Keep only the part of the URL after the last '/'.
     $slash = strrpos($torrentUrlCard, '/');
     $filename = $slash === false ? $torrentUrlCard : substr($torrentUrlCard, $slash + 1);
     // Drop the last 5 characters (presumably an extension such as ".html") and append ".torrent".
     $filename = substr($filename, 0, strlen($filename) - 5) . '.torrent';
     $filePath = $this->tmpFolder . '/' . $filename;
     file_put_contents($filePath, $fileContent);
     return $filePath;
 }

Developer: elpiafo3, Project: Torrent-Streamer, Lines of code: 21, Source file: CPasBienExtractor.php
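
The downloadTorrentFile example above derives a local file name by cutting a fixed 5 characters off the end of the URL. A less length-dependent variant can be sketched with PHP's built-in parse_url and pathinfo. The helper name fileNameFromUrl, its "download" fallback, and the example.com URL are assumptions made for illustration and do not come from Torrent-Streamer.

```php
<?php
// Hypothetical helper; the name and the "download" fallback are illustrative.
function fileNameFromUrl(string $url, string $extension = 'torrent'): string
{
    // parse_url() isolates the path so a query string never leaks into the name.
    $path = (string) parse_url($url, PHP_URL_PATH);
    // PATHINFO_FILENAME is the last path segment without its extension.
    $name = pathinfo($path, PATHINFO_FILENAME);
    return ($name === '' ? 'download' : $name) . '.' . $extension;
}

echo fileNameFromUrl('http://example.com/dl-torrent/films/some-film.html?ref=1'), "\n";
// prints "some-film.torrent"
```

Because the extension is removed by pathinfo rather than by character count, the same helper works for pages ending in .htm, .php, or no extension at all.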

Example 11: search

 public function search($query)
 {
     $dom = HtmlDomParser::file_get_html(sprintf(self::SERVICE_URL, urlencode($query)));
     $songs = $dom->find('ul.songs li');
     $results = array();
     if ($songs) {
         /* @var $song \simple_html_dom_node */
         foreach ($songs as $song) {
             $result = new mp3withResult();
             $result->id = $song->attr['data-id'];
             $result->url = 'http://mp3with.co' . $song->attr['data-mp3'];
             $song = $song->find('.song', 0);
             if ($song) {
                 $result->title = trim($song->find('strong', 0)->innertext);
                 $result->artist = trim($song->find('strong.artist', 0)->innertext);
             }
             $results[] = $result;
         }
     }
     return $results;
 }

Developer: skipperbent, Project: mp3vibez, Lines of code: 21, Source file: mp3with.php

Example 12: getPartiesInfo

 /**
  * Get party information from URL
  *
  * @param $url
  * @return array of party information defined as above
  */
 protected function getPartiesInfo($url)
 {
     $html = HtmlDomParser::file_get_html($url);
     $party = [];
     foreach ($html->find('.borderbox1') as $index => $partyHtml) {
         $nameHtml = $partyHtml->find('h3.partytitle', 0);
         // If name section has website link
         if (strpos($nameHtml, 'href') !== false) {
             $party[$index]['name'] = $nameHtml->find('a', 0)->innertext;
             $party[$index]['website'] = $nameHtml->find('a', 0)->href;
         } else {
             $party[$index]['name'] = $nameHtml->innertext;
         }
         // Get party info from left column
         $infoPartOneHtml = $partyHtml->find('div.colun', 0);
         $infoPartOneIndex = 0;
         $shortNameHtml = $infoPartOneHtml->find('p', $infoPartOneIndex++)->innertext;
         $party[$index]['short_name'] = $this->getAfterFirstBr($shortNameHtml);
         $leaderHtml = $infoPartOneHtml->find('p', $infoPartOneIndex++)->innertext;
         $party[$index]['leader'] = $this->getAfterFirstBr($leaderHtml);
         $headquartersHtml = $infoPartOneHtml->find('p', $infoPartOneIndex++)->innertext;
         $party[$index]['headquarters'] = $this->getAfterFirstBr($headquartersHtml);
         // Get party info from right column
         $infoPartTwoHtml = $partyHtml->find('div.coldeux', 0);
         $infoPartTwoIndex = 0;
         $eligibleHtml = $infoPartTwoHtml->find('p', $infoPartTwoIndex++)->innertext;
         $party[$index]['eligible_date'] = $this->getAfterSpan($eligibleHtml);
         $registeredHtml = $infoPartTwoHtml->find('p', $infoPartTwoIndex++)->innertext;
         $party[$index]['registered_date'] = $this->getAfterSpan($registeredHtml);
         if (strpos($infoPartTwoHtml->innertext, 'Deregistered') !== false) {
             $deregisteredHtml = $infoPartTwoHtml->find('p', $infoPartTwoIndex++)->innertext;
             $party[$index]['deregistered_date'] = $this->getAfterSpan($deregisteredHtml);
         }
         $chefAgentHtml = $infoPartTwoHtml->find('p', $infoPartTwoIndex++)->innertext;
         $party[$index]['chef_agent'] = $this->getAfterFirstBr($chefAgentHtml);
         $auditorHtml = $infoPartTwoHtml->find('p', $infoPartTwoIndex++)->innertext;
         $party[$index]['auditor'] = $this->getAfterFirstBr($auditorHtml);
     }
     return $party;
 }

Developer: Haolicopter, Project: thug-election, Lines of code: 46, Source file: PartiesTableSeeder.php

Example 13: getFind

 public function getFind()
 {
     $url_for_check = App\Film::where('check', '0')->orderBy('id', 'desc')->first();
     $url_for_check->check = 1;
     $url_for_check->save();
     if (parse_url($url_for_check->url)['host'] == 'filmiha.com' && substr(parse_url($url_for_check->url)['path'], 1, 3) != 'tag') {
         $dom = HtmlDomParser::file_get_html($url_for_check->url);
         foreach ($dom->find('a') as $link) {
             $href = $link->href;
             $hrefs = App\Film::where('url', $href);
             if ($hrefs->count() == 0) {
                 $new_href = new App\Film();
                 if (parse_url($href)['host'] != 'filmiha.com' || substr(parse_url($href)['path'], 1, 3) == 'tag') {
                     $new_href->check = 1;
                 }
                 $new_href->url = $href;
                 $new_href->save();
             }
         }
     }
     dd("Operation Was Successful! :) ");
 }

Developer: tuytoosh, Project: search, Lines of code: 22, Source file: SearchController.php
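
The getFind example above reads parse_url($href)['host'] directly, which raises an undefined-index warning for relative links such as "/about", because parse_url returns no host component for them. A guarded version of the same host and tag-path check might look like the sketch below. The function name isCrawlable is hypothetical and not part of the search project; the default host is the one used in the example.

```php
<?php
// Hypothetical helper mirroring the host/tag check in getFind().
function isCrawlable(string $href, string $host = 'filmiha.com'): bool
{
    $parts = parse_url($href);
    // Relative links, mailto: links, and malformed URLs have no host component.
    if ($parts === false || !isset($parts['host'])) {
        return false;
    }
    $path = $parts['path'] ?? '';
    // Same rule as the example: right host, and the path must not start with "/tag".
    return $parts['host'] === $host && substr($path, 1, 3) !== 'tag';
}

var_dump(isCrawlable('http://filmiha.com/some-film/')); // bool(true)
var_dump(isCrawlable('http://filmiha.com/tag/drama/')); // bool(false)
var_dump(isCrawlable('/about'));                        // bool(false)
```

Treating host-less links as not crawlable is a design choice of this sketch; a crawler that wants to follow relative links would resolve them against the page URL first.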

Example 14: getPhoneNumber

 protected static function getPhoneNumber($url)
 {
     $out = null;
     $parser = new HtmlDomParser();
     $dom = $parser->file_get_html($url);
     $uuid = $dom->find('div.rel ul.brbott-12 li');
     unset($dom);
     if (isset($uuid) && !empty($uuid)) {
         preg_match("/\\'id\\'\\:\\'(.*?)\\'/", $uuid[0]->class, $uuid);
         $uuid = explode(':', $uuid[0]);
         $uuid = str_replace("'", "", $uuid[1]);
         if ($curl = curl_init()) {
             curl_setopt($curl, CURLOPT_URL, 'http://olx.ua/ajax/misc/contact/phone/' . $uuid . '/white/');
             curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
             curl_setopt($curl, CURLOPT_POST, true);
             curl_setopt($curl, CURLOPT_POSTFIELDS, "");
             $out = curl_exec($curl);
             curl_close($curl);
         }
         $phone = json_decode($out);
         if (isset($phone->value) && !empty($phone->value)) {
             if (preg_match("/<span\\sclass=\"block\">(.*)<\\/span\\>/", $phone->value)) {
                 $ddom = $parser->str_get_html($phone->value);
                 $phone = $ddom->find('span[class=block]')[0]->innertext;
                 unset($ddom);
                 return $phone;
             } else {
                 return $phone->value;
             }
         } else {
             return false;
         }
     } else {
         return false;
     }
 }

Developer: Sywooch, Project: find-parser, Lines of code: 36, Source file: ParserOlx.php

Example 15: getImage

 public function getImage($returnJson = true)
 {
     // Validate the request before doing any lookups.
     $url = isset($_POST['url']) ? $_POST['url'] : '';
     $crawlerId = isset($_POST['crawler']) ? $_POST['crawler'] : '';
     if (empty($url) || empty($crawlerId)) {
         return false;
     }
     $crawlerModel = new CrawlerModel();
     $crawlerInfo = $crawlerModel->getCrawlerInfo($crawlerId);
     $source = $crawlerModel->getSourceById($crawlerInfo['source_type']);
     $dom = HtmlDomParser::file_get_html($url);
     $images = $dom->find($source['main_image']);
     $imgSrc = $images[0]->src;
     /**
      * find more sources
      */
     $moreSources = array();
     $moreFromBlock = $dom->find('.domainLinkWrapper');
     if (isset($moreFromBlock[0])) {
         $moreSources[] = $moreFromBlock[0]->href;
     }
     // Paged collection (suggested boards under the pin)
     $pagedCollection = $dom->find('.PagedCollection');
     if (isset($pagedCollection[0]) && !empty($pagedCollection[0])) {
         $boardLinkWrapper = $pagedCollection[0]->find('.boardLinkWrapper');
         foreach ($boardLinkWrapper as $blw) {
             $moreSources[] = $blw->href;
         }
     }
     $imgData = array('image' => $imgSrc, 'more-sources' => $moreSources);
     if ($returnJson) {
         echo json_encode($imgData);
         exit;
     } else {
         return $imgData;
     }
 }

Developer: krike, Project: crawler-tool, Lines of code: 41, Source file: Pinterest.php

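Note: The Sunra\PhpSimple\HtmlDomParser::file_get_html examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their respective developers, and copyright of the source code remains with the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.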