DOM parser for HTML, XML, and other tag-based formats.

Project description

Spicy is a tag-based text parser.

Formats such as HTML and XML are built from tags. Parsing them is useful when you need to find tags by name and attributes, or to extract separate parts of a document.

Running

git clone 'https://github.com/michael7nightingale/spicy.git' 

Requires Python >= 3.11. Spicy has no third-party dependencies; everything works out of the box.

"""your parser file"""
from urllib.request import Request, urlopen

from spicy import Spicy

# Download the page to parse.
request = Request(url='https://example.com/')
with urlopen(request) as response:
    html_text = response.read().decode(encoding='utf-8')

spicy = Spicy(
    text=html_text,
    doctype='html'    # 'html' is already the default
)

print(spicy.tag)    # html
print(spicy.children)   # ['<Tag: head>', '<Tag: body>']
print(spicy)    # the whole document as a string

head, body = spicy.children
# Print the attributes of every direct child of <body>.
# (Printing a bare generator expression would only show the generator object.)
print([el.attributes for el in body.children])

Spicy tags and the document object provide rich search methods:

  • findAll() - returns the list of tag objects matching the given parameters;
  • findIter() - generator version of findAll(), which can reduce memory usage;
  • findFirst() - returns the first tag object matching the given parameters;
  • findLast() - returns the last tag object matching the given parameters;
  • getElementById() - returns the tag object with the given id.
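
The difference between these search styles can be illustrated with a small standard-library sketch. This is not Spicy's implementation or its exact API: the `Node` and `TreeBuilder` classes, the snake_case method names, and the keyword-argument matching are all assumptions made for illustration. Check the Spicy source for the real signatures.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import Iterator, Optional


@dataclass(eq=False, repr=False)
class Node:
    """Hypothetical stand-in for a tag object; not Spicy's own class."""
    tag: str
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def find_iter(self, tag: Optional[str] = None, **attrs) -> Iterator["Node"]:
        """Lazily yield matching descendants in document order."""
        for child in self.children:
            name_ok = tag is None or child.tag == tag
            attrs_ok = all(child.attributes.get(k) == v for k, v in attrs.items())
            if name_ok and attrs_ok:
                yield child
            yield from child.find_iter(tag, **attrs)

    def find_all(self, tag: Optional[str] = None, **attrs) -> list:
        """Eager version: collect every match into a list."""
        return list(self.find_iter(tag, **attrs))

    def find_first(self, tag: Optional[str] = None, **attrs) -> Optional["Node"]:
        """Stop at the first match; None if nothing matches."""
        return next(self.find_iter(tag, **attrs), None)

    def find_last(self, tag: Optional[str] = None, **attrs) -> Optional["Node"]:
        """Walk all matches and keep the final one; None if nothing matches."""
        last = None
        for last in self.find_iter(tag, **attrs):
            pass
        return last


class TreeBuilder(HTMLParser):
    """Builds a Node tree from well-formed markup (illustrative only)."""
    VOID = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self):
        super().__init__()
        self.root = Node("document")
        self._stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, dict(attrs))
        self._stack[-1].children.append(node)
        if tag not in self.VOID:  # void elements never get children
            self._stack.append(node)

    def handle_endtag(self, tag):
        if len(self._stack) > 1 and self._stack[-1].tag == tag:
            self._stack.pop()


def parse(text: str) -> Node:
    builder = TreeBuilder()
    builder.feed(text)
    builder.close()
    return builder.root


doc = parse(
    '<html><body>'
    '<a id="first" href="/a">A</a>'
    '<div><a href="/b">B</a></div>'
    '<a id="last" href="/c">C</a>'
    '</body></html>'
)

print([a.attributes["href"] for a in doc.find_all("a")])  # ['/a', '/b', '/c']
print(doc.find_first("a").attributes["href"])             # /a
print(doc.find_last("a").attributes["href"])              # /c
print(doc.find_first(id="last").attributes["href"])       # /c (lookup by id)
```

The generator variant does the traversal work; the list, first, and last variants are thin wrappers over it, which is why a lazy search can stop early and avoid building a full result list.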

Useful properties:

  • tag - the tag name (link, div, html, img, etc.);
  • className - the value of the class attribute, if it exists;
  • id - the value of the id attribute, if it exists;
  • attributes - all attributes of the tag, for example: align=center, href=/admin/user;
  • parent - the parent tag node in the DOM;
  • children - the list of child tag nodes.
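
As a rough picture of how these properties relate to each other, here is a minimal sketch of a tag node. The `Tag` class and its `append()` helper are hypothetical and written only for this example; they are not Spicy's classes. Only the property names and the `<Tag: ...>` representation follow the text above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False, repr=False)
class Tag:
    """Hypothetical tag node mirroring the properties listed above."""
    tag: str
    attributes: dict = field(default_factory=dict)
    parent: Optional["Tag"] = None
    children: list = field(default_factory=list)

    @property
    def className(self) -> Optional[str]:
        # Value of the class attribute, or None if the tag has no class.
        return self.attributes.get("class")

    @property
    def id(self) -> Optional[str]:
        # Value of the id attribute, or None if the tag has no id.
        return self.attributes.get("id")

    def append(self, child: "Tag") -> "Tag":
        # Illustrative helper: link the child to this node in both directions.
        child.parent = self
        self.children.append(child)
        return child

    def __repr__(self) -> str:
        return f"<Tag: {self.tag}>"


html = Tag("html")
body = html.append(Tag("body"))
link = body.append(
    Tag("a", {"href": "/admin/user", "class": "nav", "id": "admin-link"})
)

print(link.tag, link.className, link.id)  # a nav admin-link
print(link.attributes["href"])            # /admin/user
print(link.parent, html.children)         # <Tag: body> [<Tag: body>]
```

Here className and id are just convenience views over the attributes mapping, while parent and children let you walk the tree in either direction.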


Version: 0.6

Download files

Source distribution: spicyy-0.6.tar.gz (13.8 kB)

Built distribution: spicyy-0.6-py3-none-any.whl (18.6 kB, Python 3)

