This article collects typical usage examples of amcat.models.Article.text in Python (the example below uses it as an assignable attribute). If you are wondering how Article.text is used in practice, the selected code example here may help. You can also look further into usage examples of the class it belongs to, amcat.models.Article.
One code example for Article.text is shown below.
Example 1: _scrape_unit
# Required import. META and PROCESSORS are not shown in this excerpt; they are
# assumed to be module-level names defined elsewhere in the original scraper.
from amcat.models import Article
def _scrape_unit(self, document):
    article = Article()
    metadata = list(META)

    # We select all 'div' elements directly under '.article'
    divs = document.cssselect("* > div")

    # Check for an author field. If present, store the author and remove
    # the element from the list of divs.
    try:
        author_field = document.cssselect(".author")[0]
    except IndexError:
        pass
    else:
        author = author_field.text_content().strip()
        # Drop the leading "Von" ("by"). str.lstrip("Von") would strip any
        # run of the characters 'V', 'o', 'n', not the prefix itself.
        if author.startswith("Von"):
            author = author[len("Von"):].strip()
        article.author = author
        divs.remove(author_field)

    # Strip everything before the headline
    headline_field = document.cssselect("b.deHeadline")[0].getparent()
    divs = divs[divs.index(headline_field):]

    # Parse metadata. Loop through each 'div' within an article, along with
    # its field name according to META (thus based on its position)
    for field_name, element in zip(metadata, divs):
        if field_name is None:
            continue

        processor = PROCESSORS.get(field_name, lambda x: x)
        text_content = element.text_content().strip()
        setattr(article, field_name, processor(text_content))

    # Fetch the body text: the contents of all 'p' elements, joined by blank lines
    paragraphs = [p.text_content() for p in document.cssselect("p")]
    article.text = ("\n\n".join(paragraphs)).strip()

    # We must return an iterable, so we return a one-tuple
    return (article,)
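
The position-based mapping in _scrape_unit can be tried without amcat or an HTML parser. The sketch below is a minimal, self-contained illustration, not the original scraper's code: the contents of META and PROCESSORS, the helper names strip_prefix and map_fields, and the sample strings are all hypothetical stand-ins, and plain strings take the place of the text content of each 'div'. It also shows why str.lstrip is a poor fit for removing a literal prefix such as "Von".

```python
from datetime import datetime

# Hypothetical field layout: one entry per 'div', in document order.
# None marks a position whose content is skipped.
META = ["headline", None, "date", "medium"]

# Hypothetical per-field converters; fields without an entry are kept as-is.
PROCESSORS = {
    "date": lambda s: datetime.strptime(s, "%d.%m.%Y").date(),
}


def strip_prefix(text, prefix):
    """Remove `prefix` once from the start of `text`, if it is there."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


def map_fields(texts, meta=META, processors=PROCESSORS):
    """Pair each text with the field name at the same position."""
    fields = {}
    # zip stops at the shorter sequence, so surplus texts are ignored.
    for field_name, text in zip(meta, texts):
        if field_name is None:
            continue
        processor = processors.get(field_name, lambda x: x)
        fields[field_name] = processor(text.strip())
    return fields


# Stand-ins for the text content of each 'div', in document order.
texts = [" Ein Titel ", "123 words", "03.05.2012", "Die Zeitung", "extra"]
fields = map_fields(texts)

# lstrip treats its argument as a set of characters, not as a prefix.
mangled = "Volker Braun".lstrip("Von")
intact = strip_prefix("Volker Braun", "Von")
author = strip_prefix("Von Volker Braun", "Von").strip()
```

Here fields holds the headline, a parsed date, and the medium, while the second and fifth strings are dropped. The name "Volker Braun" survives strip_prefix unchanged, whereas lstrip("Von") eats its first two letters.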