

Democritus Html

The Democritus Project uses semver version 2.0.0 and black to format code. License: LGPL v3.

Democritus functions[1] for working with HTML.

[1] Democritus functions are simple, effective, modular, well-tested, and well-documented Python functions.

We use d8s (pronounced "dee-eights") as an abbreviation for democritus (you can read more about this here).

Installation

pip install d8s-html

Usage

Import the library with:

from d8s_html import *

Once imported, you can use any of the functions listed below.

Functions

  • def html_text(html_content: StringOrBeautifulSoupObject) -> str:
        """."""
    
  • def html_unescape(html_content: StringOrBeautifulSoupObject) -> str:
        """."""
    
  • def html_escape(html_content: StringOrBeautifulSoupObject) -> str:
        """."""
    
  • def html_to_markdown(html_content: StringOrBeautifulSoupObject, **kwargs) -> str:
        """Convert the html string to markdown."""
    
  • def html_find_comments(html_content: StringOrBeautifulSoupObject) -> str:
        """Get a list of all of the comments in the html strings."""
    
  • def html_soupify(html_string: str, parser: str = 'html.parser') -> bs4.BeautifulSoup:
        """Return an instance of beautifulsoup with the html."""
    
  • def html_remove_tags(html_content: StringOrBeautifulSoupObject) -> bs4.BeautifulSoup:
        """."""
    
  • def html_remove_element(html_content: StringOrBeautifulSoupObject, element_tag: str) -> bs4.BeautifulSoup:
        """."""
    
  • def html_find_css_path(html_content: StringOrBeautifulSoupObject, css_path: str) -> ListOfBeautifulSoupTags:
        """Find the given css_path in the html_content."""
    
  • def html_elements_with_class(
        html_content: StringOrBeautifulSoupObject, html_element_class: str
    ) -> ListOfBeautifulSoupTags:
        """Find all elements with the given class from the html string."""
    
  • def html_elements_with_id(html_content: StringOrBeautifulSoupObject, html_element_id: str) -> ListOfBeautifulSoupTags:
        """Find all elements with the given html_element_id from the html_content."""
    
  • def html_elements_with_tag(html_content: StringOrBeautifulSoupObject, tag: str) -> ListOfBeautifulSoupTags:
        """."""
    
  • def html_headings_table_of_contents(html_content: StringOrBeautifulSoupObject) -> ListOfBeautifulSoupTags:
        """."""
    
  • def html_headings_table_of_contents_string(
        html_content: StringOrBeautifulSoupObject, *, indentation: str = '  '
    ) -> str:
        """."""
    
  • def html_headings(html_content: StringOrBeautifulSoupObject) -> ListOfBeautifulSoupTags:
        """."""
    
  • def html_to_json(html_content: StringOrBeautifulSoupObject, *, convert_only_tables: bool = False):
        """Convert the html to json using https://gitlab.com/fhightower/html-to-json."""
    
  • def html_soupify_first_arg_string(func):
        """Return a Beautiful Soup instance of the first argument (if it is a string)."""
    

Development

👋  If you want to get involved in this project, we have some short, helpful guides below:

If you have any questions or there is anything we did not cover, please raise an issue and we'll be happy to help.

Credits

This package was created with Cookiecutter and Floyd Hightower's Python project template.

