DOM parser for HTML, XML, and other tag-based markup.
Project description
Spicy is a tag-based text parser.
Formats such as HTML and XML are built from tags, so parsing them is useful when you need to find tags by name and attributes or extract separate parts of a document.
Running
git clone 'https://github.com/michael7nightingale/spicy.git'
Requires Python >= 3.11. No third-party Python libraries need to be installed; everything works out of the box.
"""your parser file"""
from urllib.request import Request, urlopen

from spicy import Spicy

# Download the page to parse
request = Request(url='https://example.com/')
with urlopen(request) as response:
    html_text = response.read().decode(encoding='utf-8')

spicy = Spicy(
    text=html_text,
    doctype='html',  # 'html' is already the default
)
print(spicy.tag)       # html
print(spicy.children)  # ['<Tag: head>', '<Tag: body>']
print(spicy)           # the whole document as a string
head, body = spicy.children
# Build a list: printing a bare generator expression only shows the generator object
print([el.attributes for el in body.children])
Spicy tags and the document object provide several search methods:
- findAll() - returns a list of the tag objects that match the given parameters;
- findIter() - generator version of findAll(), which can reduce memory usage;
- findFirst() - returns the first tag object that matches the given parameters;
- findLast() - returns the last tag object that matches the given parameters;
- getElementById() - returns the tag object with the given id.
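The relationship between these searches can be sketched on a small stand-in tree. The `Node` class and its snake_case methods below are hypothetical and are not Spicy's implementation; the real signatures of findAll() and the other methods are not documented here, so treat this only as an illustration of the list versus generator behaviour.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a parsed tag (not Spicy's own class)."""
    tag: str
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def _matches(self, tag: Optional[str], attrs: dict) -> bool:
        # A node matches when the tag name (if given) and every
        # requested attribute value are equal.
        if tag is not None and self.tag != tag:
            return False
        return all(self.attributes.get(k) == v for k, v in attrs.items())

    def find_iter(self, tag=None, **attrs) -> Iterator["Node"]:
        """Lazily yield matching descendants in document order."""
        for child in self.children:
            if child._matches(tag, attrs):
                yield child
            yield from child.find_iter(tag, **attrs)

    def find_all(self, tag=None, **attrs) -> list:
        # Same search, but materialised into a list.
        return list(self.find_iter(tag, **attrs))

    def find_first(self, tag=None, **attrs):
        # Stops as soon as one match is produced.
        return next(self.find_iter(tag, **attrs), None)

    def find_last(self, tag=None, **attrs):
        last = None
        for node in self.find_iter(tag, **attrs):
            last = node
        return last

    def get_element_by_id(self, id_):
        return self.find_first(id=id_)


body = Node("body", children=[
    Node("div", {"id": "menu"}, [
        Node("a", {"href": "/home"}),
        Node("a", {"href": "/admin/user"}),
    ]),
    Node("a", {"href": "/about"}),
])

links = body.find_all("a")
print([n.attributes["href"] for n in links])  # ['/home', '/admin/user', '/about']
print(body.find_first("a").attributes["href"])  # /home
print(body.find_last("a").attributes["href"])   # /about
print(body.get_element_by_id("menu").tag)       # div
```

In this sketch the generator version does the actual traversal and the other searches are built on top of it, so a "first match" lookup never walks the whole tree. Whether Spicy is structured the same way is an assumption.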
Useful properties:
- tag - the tag name (link, div, html, img, etc.);
- className - the value of the class attribute, if it exists;
- id - the value of the id attribute, if it exists;
- attributes - all tag options (attributes), for example: align=center, href=/admin/user;
- parent - the parent tag node in the DOM;
- children - the list of child tag nodes.
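To show how such properties fit together on a DOM node, here is a minimal sketch built on the standard library's `html.parser`. The `Element` and `TreeBuilder` classes and the `parse` helper are hypothetical names introduced for illustration; they only mirror the property names listed above and say nothing about how Spicy stores or computes them.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Element:
    """Hypothetical node type exposing the properties listed above."""

    def __init__(self, tag, attributes, parent=None):
        self.tag = tag
        self.attributes = attributes
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    @property
    def id(self):
        return self.attributes.get("id")

    @property
    def className(self):
        return self.attributes.get("class")

    def __repr__(self):
        return f"<Tag: {self.tag}>"


class TreeBuilder(HTMLParser):
    """Builds an Element tree from well-formed markup."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []

    def handle_starttag(self, tag, attrs):
        parent = self._stack[-1] if self._stack else None
        node = Element(tag, dict(attrs), parent)
        if self.root is None:
            self.root = node
        if tag not in VOID_TAGS:
            self._stack.append(node)

    def handle_endtag(self, tag):
        # Only pop when the closing tag matches the innermost open tag.
        if self._stack and self._stack[-1].tag == tag:
            self._stack.pop()


def parse(text):
    builder = TreeBuilder()
    builder.feed(text)
    builder.close()
    return builder.root


html = parse(
    '<html><head><link href="/s.css"></head>'
    '<body><div id="main" class="box" align="center">'
    '<img src="a.png"></div></body></html>'
)
head, body = html.children
div = body.children[0]
print(html.tag, html.children)   # html [<Tag: head>, <Tag: body>]
print(div.id, div.className)     # main box
print(div.attributes)            # {'id': 'main', 'class': 'box', 'align': 'center'}
print(div.parent.tag)            # body
```

The sketch assumes well-formed input; real-world HTML with unclosed tags needs more forgiving stack handling than shown here.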
File details
Details for the file spicyy-0.6.tar.gz.
File metadata
- Download URL: spicyy-0.6.tar.gz
- Upload date:
- Size: 13.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 colorama/0.4.4 importlib-metadata/4.6.4 keyring/23.5.0 pkginfo/1.8.2 readme-renderer/34.0 requests-toolbelt/0.9.1 requests/2.25.1 rfc3986/1.5.0 tqdm/4.57.0 urllib3/1.26.5 CPython/3.10.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 065fc263fb9a859206c8a82e14d7c139bebf3bcf5b15a8d67a56f72713489635 |
| MD5 | a61de60414d1593b2d4f9066de907326 |
| BLAKE2b-256 | 5d22f4832438f1f4f6c732cfd20ad0799e6b03e73792ad86cea9ea35b43832e2 |
File details
Details for the file spicyy-0.6-py3-none-any.whl.
File metadata
- Download URL: spicyy-0.6-py3-none-any.whl
- Upload date:
- Size: 18.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 colorama/0.4.4 importlib-metadata/4.6.4 keyring/23.5.0 pkginfo/1.8.2 readme-renderer/34.0 requests-toolbelt/0.9.1 requests/2.25.1 rfc3986/1.5.0 tqdm/4.57.0 urllib3/1.26.5 CPython/3.10.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ab369d8b8c404ad799e3e5f48a2bda0a86c17dc1ed1a5ec73daaa7fd30ad8895 |
| MD5 | 9bf26191745111d7fcd144e65dba00f7 |
| BLAKE2b-256 | 2e35b8b9d1ca4a61281c0aebef2306a87ef49e9e1ae5267c004693bc29683c47 |