

Python crawler.Crawler Method Code Examples

This article collects typical usage examples of the Python crawler.Crawler method. If you are wondering how crawler.Crawler is used in practice, or are looking for concrete examples of it, the curated code samples below may help. You can also explore other usage examples from the crawler module.


Five code examples of the crawler.Crawler method are shown below, sorted by popularity by default. You can upvote the examples you find useful; your feedback helps the system recommend better Python code examples.
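Taken together, the examples below call `crawler.Crawler` in two ways: constructed with job parameters and driven via `Run()`, or used as a context manager exposing methods such as `crawl(url)` and `collect_set_of_traces(...)`. The stub below is a hypothetical sketch of that shared shape, written only to illustrate the call patterns; it is not the real `crawler` module, and the method bodies are placeholders:

```python
# Hypothetical stand-in for the real `crawler` module, illustrating the
# call patterns used in the examples below: constructor arguments,
# context-manager usage, and a crawl() method returning discovered requests.
class Crawler:
    def __init__(self, *args, **kwargs):
        # Real implementations take job parameters (JobID, RunnerID, ...)
        # or keyword options such as db_handler / restart_on_sketchy_exception.
        self.args = args
        self.kwargs = kwargs

    def __enter__(self):
        # Support `with Crawler(...) as c:` as in examples 4 and 5.
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # do not suppress exceptions

    def crawl(self, url):
        # A real crawler would fetch `url` and return the requests it found;
        # this placeholder just echoes the URL back.
        return [url]


if __name__ == "__main__":
    with Crawler(restart_on_sketchy_exception=True) as c:
        print(c.crawl("http://example.onion"))
```

This mirrors example 2 (`w = Crawler(); w.crawl(url)`) and examples 4–5 (`with Crawler(...) as crawler:`), with all names other than `Crawler` itself being assumptions for illustration.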

Example 1: DoCrawler

# Required module import: import crawler [as alias]
# Or: from crawler import Crawler [as alias]
def DoCrawler(message):
    print("DO CRAWLER MESSAGE: " + message)
    import crawler
    Jconf = json.loads(message)
    RunnerID = Jconf["RunnerID"]
    RunnerList = Jconf["RunnerList"]
    JobID = Jconf["JobID"]
    JobOwner = Jconf["JobOwner"]
    # `json` and `client` are defined elsewhere in the enclosing Dmqtt.py module
    client.JobDict[JobID] = Jconf
    Cclass = crawler.Crawler(JobID, RunnerID, RunnerList, JobOwner)
    Cclass.Run()
Developer: yenkuanlee, Project: IPDC, Lines: 13, Source file: Dmqtt.py

Example 2: scan_web

# Required module import: import crawler [as alias]
# Or: from crawler import Crawler [as alias]
def scan_web(self, url):
    '''Crawl the given URL and scan every request discovered.'''
    w = Crawler()
    req_list = w.crawl(url)

    for item in req_list:
        print(item)
        self.scan_request(item)
Developer: imiyoo2010, Project: teye_scanner_for_book, Lines: 11, Source file: tcore.py

Example 3: test_load_crawler

# Required module import: import crawler [as alias]
# Or: from crawler import Crawler [as alias]
def test_load_crawler():
    ini = Ini('files/config.ini')
    crawler = Crawler(ini)
    assert crawler

    report = crawler.scan('http://wikitjerrta4qgz4.onion')
    assert type(report) == DynamicObject
    assert report.webpage.url == 'http://wikitjerrta4qgz4.onion'
    assert report.webpage.domain == 'wikitjerrta4qgz4.onion'

    del crawler 
Developer: bunseokbot, Project: darklight, Lines: 13, Source file: test_crawler.py

Example 4: test_sortcrawl_sd_dir

# Required module import: import crawler [as alias]
# Or: from crawler import Crawler [as alias]
def test_sortcrawl_sd_dir(self):
        with Sorter(db_handler=self.db_handler) as sortbot9k:
            sortbot9k.scrape_directories(self.sd_directory)
            sortbot9k.sort_onions(self.class_tests)

        uptodate_class, uptodate_name = \
                self.db_handler.get_onion_class(self.get_cur_runtime(), True)
        self.assertEqual(type(uptodate_class), dict)
        # At least 10 of our instances should be on the latest version
        self.assertGreaterEqual(len(uptodate_class), 10)
        self.assertRegex(list(uptodate_class)[0], "http")
        self.assertRegex(list(uptodate_class)[0], ".onion")

        outofdate_class, outofdate_name = \
                self.db_handler.get_onion_class(self.get_cur_runtime(), False)
        self.assertEqual(type(outofdate_class), dict)
        # At least 1 of our instances will be lagging behind versions :'(
        self.assertGreaterEqual(len(outofdate_class), 1)
        self.assertRegex(list(outofdate_class)[0], "http")
        self.assertRegex(list(outofdate_class)[0], ".onion")

        class_data = self.db_handler.get_onions(self.get_cur_runtime())
        nonmonitored_name, monitored_name = class_data.keys()
        # Test that we get the expected class names, and data types back
        self.assertEqual(nonmonitored_name, 'nonmonitored')
        self.assertRegex(monitored_name, 'sd')
        nonmonitored_class, monitored_class = class_data.values()
        self.assertEqual(type(nonmonitored_class), dict)
        self.assertEqual(type(monitored_class), dict)

        with Crawler(db_handler=self.db_handler) as crawlbot9k:
            crawlbot9k.collect_set_of_traces(nonmonitored_class)

        # There are not yet methods to query crawled data, but in the future,
        # tests will be added here to verify Crawler-related data is being
        # read/written to the database in the expected manner. 
Developer: freedomofpress, Project: fingerprint-securedrop, Lines: 38, Source file: test_database.py

Example 5: test_crawl_of_bad_sites

# Required module import: import crawler [as alias]
# Or: from crawler import Crawler [as alias]
def test_crawl_of_bad_sites(self):
        with Crawler(restart_on_sketchy_exception=True) as crawler:
            crawler.collect_set_of_traces(self.bad_sites) 
Developer: freedomofpress, Project: fingerprint-securedrop, Lines: 5, Source file: test_sketchy_sites.py


Note: The crawler.Crawler method examples in this article were compiled by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by various developers, and copyright of the source code remains with the original authors. Please refer to each project's license before distributing or using the code; do not repost without permission.