html-to-dash

Convert HTML to dash format.

Installation

pip install html-to-dash

Examples

Basic usage

from html_to_dash import parse_html

element_str = """

<div>

    <div class='bg-gray-800' style='color:red;margin:10px'>

        <svg aria-label="Ripples. Logo" role="img" xmlns="http://www.w3.org/2000/svg"></svg>

        <a href="#" id="link1">A</a>

    </div>

    <div>text</div>

    <div><a href="#" id="link1">a1</a>tail1<a href="#" id="link2">a2</a>tail2</div>

</div>

"""

parse_html(element_str)

Print:


# Tags : Unsupported [svg] removed.

Result:

html.Div(

    children=[

        html.Div(

            className="bg-gray-800",

            style={"color": "red", "margin": "10px"},

            children=[html.A(href="#", id="link1", children=["A"])],

        ),

        html.Div(children=["text"]),

        html.Div(

            children=[

                html.A(href="#", id="link1", children=["a1"]),

                html.Span(children=["tail1"]),

                html.A(href="#", id="link2", children=["a2"]),

                html.Span(children=["tail2"]),

            ]

        ),

    ]

)

  • By default, only tags in the dash.html module are supported.

  • Tags and attributes are checked, and those that are not supported are automatically removed.

  • The tags and attributes are case-insensitive.

  • If the provided HTML string is not enclosed in a root tag, a div is automatically added as the root tag.

  • The html, body, and head tags will be automatically removed without notification, as these tags may be automatically supplemented by the lxml module and are not supported in dash.

  • The tail (the text after an element's end tag but before the next sibling element's start tag) is automatically converted into the text of a span tag.
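The tail rule in the last bullet may be easier to see with a small illustration. The sketch below uses the standard library's xml.etree.ElementTree, which follows the same text/tail model as lxml; it only demonstrates the concept and is not html-to-dash's own code. The helper name text_and_tail is made up for this example.

```python
import xml.etree.ElementTree as ET

# Same fragment as the last inner <div> of the basic example.
fragment = (
    '<div><a href="#" id="link1">a1</a>tail1'
    '<a href="#" id="link2">a2</a>tail2</div>'
)
root = ET.fromstring(fragment)


def text_and_tail(element):
    """Return (tag, text, tail) for each direct child of `element`."""
    return [(child.tag, child.text, child.tail) for child in element]


pairs = text_and_tail(root)
for tag, text, tail in pairs:
    # `text` sits inside the element; `tail` follows its closing tag.
    print(f"<{tag}> text={text!r} tail={tail!r}")
```

Each tail string ("tail1", "tail2") belongs to the preceding a element rather than to the parent div, which is why the converted output wraps it in its own html.Span.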

Enable dash_svg

Use the dash-svg module to render SVG tags.

from html_to_dash import parse_html



element_str = """

<svg xmlns="http://www.w3.org/2000/svg" version="1.1" width="300" height="300">

  <rect x="100" y="100" width="100" height="100" fill="#e74c3c"></rect>

  <polygon points="100,100 200,100 150,50" fill="#c0392b"></polygon>

  <polygon points="200,100 200,200 250,150" fill="#f39c12"></polygon>

  <polygon points="100,100 150,50 150,150 100,200" fill="#f1c40f"></polygon>

  <polygon points="150,50 200,100 250,50 200,0" fill="#2ecc71"></polygon>

  <polygon points="100,200 150,150 200,200 150,250" fill="#3498db"></polygon>

</svg>

"""



parse_html(element_str, enable_dash_svg=True)

Print:


Result:

dash_svg.Svg(

    xmlns="http://www.w3.org/2000/svg",

    version="1.1",

    width="300",

    height="300",

    children=[

        dash_svg.Rect(x="100", y="100", width="100", height="100", fill="#e74c3c"),

        dash_svg.Polygon(points="100,100 200,100 150,50", fill="#c0392b"),

        dash_svg.Polygon(points="200,100 200,200 250,150", fill="#f39c12"),

        dash_svg.Polygon(points="100,100 150,50 150,150 100,200", fill="#f1c40f"),

        dash_svg.Polygon(points="150,50 200,100 250,50 200,0", fill="#2ecc71"),

        dash_svg.Polygon(points="100,200 150,150 200,200 150,250", fill="#3498db"),

    ],

)

  • In a Dash application, the generated components render normally once the dash_svg module is imported.

  • dash_svg has higher priority than dash.html, but lower priority than any module passed via extra_mod.
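The priority order described above (extra modules first, then dash_svg, then dash.html) can be pictured as a first-match lookup. The following is a minimal sketch under stated assumptions: the tables, their contents, and the function resolve_module are all hypothetical and do not reflect html-to-dash internals or the real contents of any module.

```python
# Illustrative tag sets only; they do not list what each real module provides.
EXTRA_MODULES = [("dcc", {"input"})]          # assumed user-supplied modules, in order
DASH_SVG_TAGS = {"svg", "rect", "polygon", "a"}
DASH_HTML_TAGS = {"div", "a", "span", "img", "input"}


def resolve_module(tag, enable_dash_svg=False):
    """Return the first module name that provides `tag`, or None if unsupported."""
    tag = tag.lower()  # tags are matched case-insensitively
    candidates = list(EXTRA_MODULES)  # highest priority, in the order given
    if enable_dash_svg:
        candidates.append(("dash_svg", DASH_SVG_TAGS))
    candidates.append(("html", DASH_HTML_TAGS))  # default fallback
    for module_name, tags in candidates:
        if tag in tags:
            return module_name
    return None


print(resolve_module("Input"))                    # extra module wins over html
print(resolve_module("a", enable_dash_svg=True))  # dash_svg wins over html
print(resolve_module("svg"))                      # unsupported without dash_svg
```

A tag that no candidate module provides resolves to None here, which corresponds to the "Unsupported [...] removed" messages shown in the examples.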

Expanded usage

from html_to_dash import parse_html

element_str = """

<html>

<body>

<div>

    <input type="text" id="username" name="username" aria-label="Enter your username" aria-required="true">

    <div class='bg-gray-800' style='color:red;margin:10px'>

        <a href="#" id="link1">A</a>

    </div>

    <div>text</div>

    <svg></svg>

    <script></script>

    <div><a href="#" id="link2">B</a></div>

</div>

</body>

</html>

"""



extra_mod = [{"dcc": {"Input": ["id", "type", "placeholder", "aria-*"]}}]



def tag_attr_func(tag, items):

    if tag == "Input":

        k, v = items

        if "-" in k:

            return f'**{{"{k}": "{v}"}}'



parsed_ret = parse_html(

    element_str,

    tag_map={"svg": "img"},

    skip_tags=['script'],

    extra_mod=extra_mod,

    tag_attr_func=tag_attr_func,

    if_return=True,

)

print(parsed_ret)

Print:


# Tags : Unsupported [script] removed.

# Attrs: Unsupported [name] in dcc.Input removed.

html.Div(

    children=[

        dcc.Input(

            type="text",

            id="username",

            **{"aria-label": "Enter your username"},

            **{"aria-required": "true"}

        ),

        html.Div(

            className="bg-gray-800",

            style={"color": "red", "margin": "10px"},

            children=[html.A(href="#", id="link1", children=["A"])],

        ),

        html.Div(children=["text"]),

        html.Img(),

        html.Div(children=[html.A(href="#", id="link2", children=["B"])]),

    ]

)
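In the output above, the inline style string 'color:red;margin:10px' appears as a Python dict. How the package does this internally is not documented here; the sketch below shows one plausible way to perform such a conversion. The function name style_to_dict is an assumption for illustration.

```python
def style_to_dict(style):
    """Split an inline CSS string such as 'color:red;margin:10px' into a dict."""
    result = {}
    for declaration in style.split(";"):
        if ":" not in declaration:
            continue  # skip empty pieces, e.g. after a trailing semicolon
        # Split only once so values containing ':' (such as URLs) stay intact.
        name, value = declaration.split(":", 1)
        result[name.strip()] = value.strip()
    return result


converted = style_to_dict("color:red;margin:10px")
print(converted)
```

This naive split would mishandle values that themselves contain a semicolon, and it does not rename hyphenated properties; Dash style dicts use camelCase keys, so a fuller version would need that step as well.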

  • The * sign is supported as a wildcard, like data-*, aria-*.

  • Both class and className can be handled correctly.

  • Attributes containing the "-" symbol are in fact already processed by default; tag_attr_func is used here only as an example. Similarly, the style attribute is handled correctly.

  • If the tag_map parameter is provided, the corresponding tag names in the HTML are converted according to the dict before formal processing.

  • A tag listed in skip_tags is removed together with its text. tag_map takes priority over skip_tags.

  • Any custom module is supported, not only html and dcc. Essentially, the conversion is string processing.

  • Custom modules are prioritized in the order given, and all of them rank above the default dash.html module.

  • The tag_attr_func parameter is a function that handles attribute formatting for a tag.

    When adding quotation marks within the returned string, use double quotation marks; otherwise the black module may be unable to parse the result.

    For example, use f'**{{"{k}": "{v}"}}' instead of f"**{{'{k}': '{v}'}}", and f'{k}="{v}"' instead of f"{k}='{v}'".

  • If the HTML structure is huge, set huge_tree to True.
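The wildcard behaviour mentioned in the first bullet of this list can be illustrated with the standard library's fnmatch module. This is a sketch of the idea only; whether html-to-dash uses fnmatch internally is not stated, and the helper is_allowed is a hypothetical name.

```python
from fnmatch import fnmatchcase

# Allowed attributes copied from the extra_mod example above.
ALLOWED = ["id", "type", "placeholder", "aria-*"]


def is_allowed(attr, patterns=ALLOWED):
    """Return True if `attr` matches any pattern; '*' matches any run of characters."""
    attr = attr.lower()  # attribute names are treated case-insensitively
    return any(fnmatchcase(attr, pattern) for pattern in patterns)


# Attributes of the <input> element in the expanded example.
attrs = ["type", "id", "name", "aria-label", "aria-required"]
kept = [a for a in attrs if is_allowed(a)]
removed = [a for a in attrs if not is_allowed(a)]
print(kept)
print(removed)
```

With these patterns only name is rejected, which is consistent with the "Unsupported [name] in dcc.Input removed" message in the printed output.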

