htmlparse: A basic HTML parser in Python
Installation
# Linux
python3 -m pip install parser-html
# Windows
python -m pip install parser-html
# Build from source
python -m pip install git+https://github.com/AaravMalani/htmlparse
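Note that the package is installed as parser-html but imported as htmlparse. As a quick sanity check after installing, a sketch like the following can confirm that the module is importable. The helper name is_importable is illustrative and not part of htmlparse.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "htmlparse" is the import name; "parser-html" is only the name used with pip.
    print("htmlparse importable:", is_importable("htmlparse"))
```

If this prints False, the install most likely went into a different Python environment than the one running the script.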
Usage
import htmlparse

with open('index.html', 'r') as f:
    element = htmlparse.parse_html(f.read())

if not element:
    raise ValueError("Parsing failed!")

print(element.children)   # Sub-elements
print(element.innerHTML)  # Data enclosed by the tag
print(element.outerHTML)  # Data enclosed by the tag, plus the tag itself

# Assigning to innerHTML rebuilds this element and updates the innerHTML of all its parent elements
element.innerHTML = 'e'
# Assigning to outerHTML behaves the same way
element.outerHTML = '<div class="black blue"><a href="https://github.com/" id="abc"></a></div>'
# Assigning to element.children is not supported yet

tag = element.getElementById('abc')
print(tag.attrs)     # {"href": "https://github.com/", "id": "abc"}
print(tag.tag_name)  # a
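Because each element exposes children, tag_name and attrs, a parsed document can be traversed recursively. The following is a minimal sketch and not part of htmlparse: walk and find_by_tag are hypothetical helpers, and FakeElement is a stand-in class used only so the example runs without the library installed. It assumes that children is a list whose element entries carry the same three attributes; anything else (for example raw text, if the parser stores it there) is skipped.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator, List


@dataclass
class FakeElement:
    """Stand-in mimicking the attributes shown above (not the real htmlparse class)."""
    tag_name: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def walk(element: Any) -> Iterator[Any]:
    """Yield element and all of its descendants, depth-first, in document order."""
    yield element
    for child in getattr(element, "children", None) or []:
        # Assumption: entries that are not elements lack a tag_name attribute.
        if hasattr(child, "tag_name"):
            yield from walk(child)


def find_by_tag(root: Any, tag_name: str) -> List[Any]:
    """Return every element under root (root included) whose tag_name matches."""
    return [el for el in walk(root) if el.tag_name == tag_name]


link = FakeElement("a", {"href": "https://github.com/", "id": "abc"})
root = FakeElement("div", {"class": "black blue"}, [link, "stray text"])

print([el.tag_name for el in walk(root)])                   # ['div', 'a']
print([el.attrs["href"] for el in find_by_tag(root, "a")])  # ['https://github.com/']
```

If the assumption about children holds, the same helpers should work on the object returned by htmlparse.parse_html.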
ToDo
- Support for CSS styles
- Support for JS scripts
- Support for assignment to the HTMLElement.children list