Democritus Html

The Democritus Project uses semver version 2.0.0 and uses black to format code. License: LGPL v3

Democritus functions[1] for working with HTML.

[1] Democritus functions are simple, effective, modular, well-tested, and well-documented Python functions.

We use d8s as an abbreviation for democritus (you can read more about this here).

Functions

  • def html_text(html_content: StringOrBeautifulSoupObject) -> str:
        """."""
    
  • def html_unescape(html_content: StringOrBeautifulSoupObject) -> str:
        """."""
    
  • def html_escape(html_content: StringOrBeautifulSoupObject) -> str:
        """."""
    
  • def html_to_markdown(html_content: StringOrBeautifulSoupObject, **kwargs) -> str:
        """Convert the html string to markdown."""
    
  • def html_find_comments(html_content: StringOrBeautifulSoupObject) -> str:
        """Get a list of all of the comments in the html strings."""
    
  • def html_soupify(html_string: str, parser: str = 'html.parser') -> bs4.BeautifulSoup:
        """Return an instance of beautifulsoup with the html."""
    
  • def html_remove_tags(html_content: StringOrBeautifulSoupObject) -> bs4.BeautifulSoup:
        """."""
    
  • def html_remove_element(html_content: StringOrBeautifulSoupObject, element_tag: str) -> bs4.BeautifulSoup:
        """."""
    
  • def html_find_css_path(html_content: StringOrBeautifulSoupObject, css_path: str) -> ListOfBeautifulSoupTags:
        """Find the given css_path in the html_content."""
    
  • def html_elements_with_class(
        html_content: StringOrBeautifulSoupObject, html_element_class: str
    ) -> ListOfBeautifulSoupTags:
        """Find all elements with the given class from the html string."""
    
  • def html_elements_with_id(html_content: StringOrBeautifulSoupObject, html_element_id: str) -> ListOfBeautifulSoupTags:
        """Find all elements with the given html_element_id from the html_content."""
    
  • def html_elements_with_tag(html_content: StringOrBeautifulSoupObject, tag: str) -> ListOfBeautifulSoupTags:
        """."""
    
  • def html_headings_table_of_contents(html_content: StringOrBeautifulSoupObject) -> ListOfBeautifulSoupTags:
        """."""
    
  • def html_headings_table_of_contents_string(
        html_content: StringOrBeautifulSoupObject, *, indentation: str = '  '
    ) -> str:
        """."""
    
  • def html_headings(html_content: StringOrBeautifulSoupObject) -> ListOfBeautifulSoupTags:
        """."""
    
  • def html_to_json(html_content: StringOrBeautifulSoupObject, *, convert_only_tables: bool = False):
        """Convert the html to json using https://gitlab.com/fhightower/html-to-json."""
    
  • def html_soupify_first_arg_string(func):
        """Return a Beautiful Soup instance of the first argument (if it is a string)."""
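
The signatures above accept either a string or an already-parsed object, and `html_soupify_first_arg_string` is described as converting a string first argument into a Beautiful Soup instance. The following is a minimal sketch of that decorator pattern using only the standard library; it is not the package's implementation. `ParsedHtml`, `parse_first_arg_string`, and `visible_text` are hypothetical names, and `ParsedHtml` is only a stand-in for a Beautiful Soup object.

```python
import functools
from html.parser import HTMLParser


class ParsedHtml(HTMLParser):
    """Hypothetical stand-in for a parsed document (the package uses Beautiful Soup)."""

    def __init__(self, html_string: str):
        super().__init__()
        self.text_chunks = []
        self.feed(html_string)
        self.close()

    def handle_data(self, data: str) -> None:
        # Collect the text found between tags.
        self.text_chunks.append(data)


def parse_first_arg_string(func):
    """Parse the first positional argument if it is a string; pass anything else through."""

    @functools.wraps(func)
    def wrapper(first_arg, *args, **kwargs):
        if isinstance(first_arg, str):
            first_arg = ParsedHtml(first_arg)
        return func(first_arg, *args, **kwargs)

    return wrapper


@parse_first_arg_string
def visible_text(parsed: ParsedHtml) -> str:
    """Join the collected text chunks (hypothetical example function)."""
    return ''.join(parsed.text_chunks)


# The decorated function gives the same result for a string and for a parsed object.
from_string = visible_text('<p>Hello <b>world</b></p>')
from_parsed = visible_text(ParsedHtml('<p>Hello <b>world</b></p>'))
```

With a decorator like this, each function body only has to handle the parsed form, while callers can pass raw HTML strings directly.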
    
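
`html_headings_table_of_contents_string` takes an `indentation` keyword argument, but the listing does not show what its output looks like. As an illustration only, the sketch below assumes one line per heading, indented once per heading level below `h1`. `HeadingCollector` and `headings_outline` are hypothetical names built on the standard library, and the package's actual output format may differ.

```python
from html.parser import HTMLParser

HEADING_TAGS = {'h1', 'h2', 'h3', 'h4', 'h5', 'h6'}


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading currently open, if any
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._chunks = []

    def handle_data(self, data):
        # Only keep text that appears inside an open heading.
        if self._level is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            self.headings.append((self._level, ''.join(self._chunks).strip()))
            self._level = None


def headings_outline(html_string: str, *, indentation: str = '  ') -> str:
    """Return one line per heading, indented by (level - 1) units (assumed format)."""
    collector = HeadingCollector()
    collector.feed(html_string)
    collector.close()
    return '\n'.join(indentation * (level - 1) + text for level, text in collector.headings)


sample = '<h1>Guide</h1><p>Intro</p><h2>Install</h2><h2>Usage</h2><h3>Options</h3>'
outline = headings_outline(sample)
```

Under these assumptions, `outline` holds four lines: `Guide` unindented, `Install` and `Usage` indented by one unit, and `Options` indented by two.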

Development

👋  If you want to get involved in this project, we have some short, helpful guides below:

If you have any questions or there is anything we did not cover, please raise an issue and we'll be happy to help.

Credits

This package was created with Cookiecutter and Floyd Hightower's Python project template.
