This page collects typical usage examples of the readability.Document.get_clean_article method in Python, for readers who want to see how Document.get_clean_article is called in practice. Usage examples for the enclosing class, readability.Document, may also be helpful.
Two code examples of the Document.get_clean_article method are shown below, sorted by popularity by default.
Example 1: test_si_sample_html_partial
# Required import: from readability import Document [as alias]
# get_clean_article is a method of Document, so it is called on a Document instance
def test_si_sample_html_partial(self):
    """Using the si sample, make sure we can get the article alone."""
    sample = load_sample('si-game.sample.html')
    doc = Document('http://sportsillustrated.cnn.com/baseball/mlb/gameflash/2012/04/16/40630_preview.html',
                   sample)
    res = doc.get_clean_article()
    self.assertEqual('<div><div class="', res[0:17])
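Both examples on this page call a load_sample helper that is not shown here. A minimal sketch of what such a helper might look like follows, assuming the samples are UTF-8 HTML files stored in one directory; the samples_dir parameter and its default value are hypothetical and are not taken from the original test suite.

```python
import io
import os


def load_sample(filename, samples_dir='samples'):
    """Return the text of a saved HTML page.

    Hypothetical helper: the real test suite's directory layout and
    encoding are not shown on this page, so both are assumptions.
    """
    path = os.path.join(samples_dir, filename)
    # io.open with an explicit encoding behaves the same on Python 2 and 3.
    with io.open(path, encoding='utf-8') as handle:
        return handle.read()
```

With a helper like this, load_sample('si-game.sample.html') would return the page source as a string, which is then passed to Document as its second argument.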
Example 2: test_lazy_images
# Required import: from readability import Document [as alias]
# get_clean_article is a method of Document, so it is called on a Document instance
def test_lazy_images(self):
    """
    Some sites use <img> elements with data-lazy-src attributes pointing to the actual image.
    """
    sample = load_sample('wired.sample.html')
    doc = Document('http://www.wired.com/design/2014/01/will-influential-ui-design-minority-report/', sample)
    article = doc.get_clean_article()
    self.assertIn('<img src="http://www.wired.com/images_blogs/design/2014/01/her-joaquin-phoenix-41-660x371.jpg"', article)
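The test above expects the cleaned article to carry the real image URL in the src attribute. One way such a rewrite could work is sketched below. This is not the readability package's implementation: promote_lazy_src and its regular expressions are hypothetical, use only the standard library, and handle only double-quoted attribute values.

```python
import re

# Hypothetical sketch, not part of the readability package.
_IMG_TAG = re.compile(r'<img\b[^>]*>', re.IGNORECASE)
# Matches the lazy attribute together with any whitespace in front of it.
_LAZY_ATTR = re.compile(r'\s*\bdata-lazy-src\s*=\s*"([^"]*)"', re.IGNORECASE)
# Matches a plain src attribute, but not the "src" inside "data-lazy-src".
_SRC_ATTR = re.compile(r'(?<![\w-])src\s*=\s*"[^"]*"', re.IGNORECASE)


def promote_lazy_src(html):
    """Copy each <img> tag's data-lazy-src value into its src attribute."""

    def fix_tag(match):
        tag = match.group(0)
        lazy = _LAZY_ATTR.search(tag)
        if lazy is None:
            return tag  # nothing lazy about this image
        new_src = 'src="%s"' % lazy.group(1)
        # Drop the lazy attribute, then replace or add src.
        tag = _LAZY_ATTR.sub('', tag, count=1)
        if _SRC_ATTR.search(tag):
            # A function replacement avoids backslash handling in the URL.
            return _SRC_ATTR.sub(lambda m: new_src, tag, count=1)
        # No src present: insert one right after "<img".
        return tag[:4] + ' ' + new_src + tag[4:]

    return _IMG_TAG.sub(fix_tag, html)
```

A regex is used here only to keep the sketch dependency-free; a real cleaner would more likely do this on a parsed DOM, where unquoted or single-quoted attributes are handled by the parser.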