当前位置: 首页>>代码示例>>Python>>正文


Python html.fragment_fromstring方法代码示例

本文整理汇总了Python中lxml.html.fragment_fromstring方法的典型用法代码示例。如果您正苦于以下问题:Python html.fragment_fromstring方法的具体用法?Python html.fragment_fromstring怎么用?Python html.fragment_fromstring使用的例子?那么, 这里精选的方法代码示例或许可以为您提供帮助。您也可以进一步了解该方法所在lxml.html的用法示例。


在下文中一共展示了html.fragment_fromstring方法的5个代码示例,这些例子默认根据受欢迎程度排序。您可以为喜欢或者感觉有用的代码点赞,您的评价将有助于系统推荐出更棒的Python代码示例。

示例1: parse_html

# 需要导入模块: from lxml import html [as 别名]
# 或者: from lxml.html import fragment_fromstring [as 别名]
def parse_html(html, cleanup=True):
    """
    Parses an HTML fragment, returning an lxml element.  Note that the HTML will be
    wrapped in a <div> tag that was not in the original document.

    If cleanup is true, make sure there's no <head> or <body>, and get
    rid of any <ins> and <del> tags.
    """
    if cleanup:
        # This removes any extra markup or structure like <head>:
        html = cleanup_html(html)
    return fragment_fromstring(html, create_parent=True) 
开发者ID:JFox,项目名称:aws-lambda-lxml,代码行数:14,代码来源:diff.py

示例2: convert_json_to_html

# 需要导入模块: from lxml import html [as 别名]
# 或者: from lxml.html import fragment_fromstring [as 别名]
def convert_json_to_html(elements):
    content = html.fragment_fromstring('<div></div>')
    for element in elements:
        content.append(_recursive_convert_json(element))
    content.make_links_absolute(base_url=base_url)
    for x in content.xpath('.//span'):
        x.drop_tag()
    html_string = html.tostring(content, encoding='unicode')
    html_string = replace_line_breaks_except_pre(html_string, '<br/>')
    html_string = html_string[5:-6]
    return html_string 
开发者ID:mercuree,项目名称:html-telegraph-poster,代码行数:13,代码来源:html_to_telegraph.py

示例3: transform_misused_divs_into_paragraphs

# 需要导入模块: from lxml import html [as 别名]
# 或者: from lxml.html import fragment_fromstring [as 别名]
def transform_misused_divs_into_paragraphs(self):
        """
        Transforms <div> without other block elements into <p>, merges near-standing <p> together.
        """
        for elem in self.tags(self._html, 'div'):
            # transform <div>s that do not contain other block elements into
            # <p>s
            # FIXME: The current implementation ignores all descendants that are not direct children of elem
            # This results in incorrect results in case there is an <img> buried within an <a> for example

            if not REGEXES['divToPElementsRe'].search(tostring(elem).decode()):
                elem.tag = "p"

        for elem in self.tags(self._html, 'div'):
            if elem.text and elem.text.strip():
                p = fragment_fromstring('<p/>')
                p.text = elem.text
                elem.text = None
                elem.insert(0, p)

            for pos, child in reversed(list(enumerate(elem))):
                if child.tail and child.tail.strip():
                    p = fragment_fromstring('<p/>')
                    p.text = child.tail
                    child.tail = None
                    elem.insert(pos + 1, p)

                if child.tag == 'br':
                    child.drop_tree() 
开发者ID:neegor,项目名称:wanish,代码行数:31,代码来源:cleaner.py

示例4: initial_output

# 需要导入模块: from lxml import html [as 别名]
# 或者: from lxml.html import fragment_fromstring [as 别名]
def initial_output(html_partial=False):
        """
        Creates initial HTML document according to the given flag
        :param html_partial: determines if there should be the html page or only a fragment
        :return: html output element
        """
        return fragment_fromstring('<div/>') if html_partial else document_fromstring('<div/>') 
开发者ID:neegor,项目名称:wanish,代码行数:9,代码来源:cleaner.py

示例5: parse_html

# 需要导入模块: from lxml import html [as 别名]
# 或者: from lxml.html import fragment_fromstring [as 别名]
def parse_html(text):
    return html.fragment_fromstring(text, parser=_HTML_PARSER) 
开发者ID:guohuadeng,项目名称:odoo13-x64,代码行数:4,代码来源:translate.py


注:本文中的lxml.html.fragment_fromstring方法示例由纯净天空整理自Github/MSDocs等开源代码及文档管理平台,相关代码片段筛选自各路编程大神贡献的开源项目,源码版权归原作者所有,传播和使用请参考对应项目的License;未经允许,请勿转载。