BeautifulSoup Element Parser for Swarmauri.
Swarmauri Parser Beautifulsoupelement
A specialized parser that utilizes BeautifulSoup to extract specific HTML elements and their content from HTML documents. The parser accepts HTML strings only and produces a list of Document objects that capture both the HTML snippet for each matched element and metadata (the element tag and its index within the input).
Installation
Choose the installation workflow that fits your project:
pip
pip install swarmauri_parser_beautifulsoupelement
Poetry
poetry add swarmauri_parser_beautifulsoupelement
uv
If you have not installed uv yet, grab it with the official installer:
curl -LsSf https://astral.sh/uv/install.sh | sh
Once uv is available, add the parser to your environment:
uv pip install swarmauri_parser_beautifulsoupelement
Usage
The BeautifulSoupElementParser allows you to extract specific HTML elements from HTML content:
from swarmauri_parser_beautifulsoupelement import BeautifulSoupElementParser
# Create a parser instance to extract paragraphs
parser = BeautifulSoupElementParser(element="p")
# HTML content to parse
html_content = "<div><p>First paragraph</p><p>Second paragraph</p></div>"
# Parse the content (input must be a string)
documents = parser.parse(html_content)
# Access the extracted elements and metadata
for doc in documents:
    print(doc.content)   # Prints each paragraph element, including the surrounding <p> tag
    print(doc.metadata)  # {'element': 'p', 'index': 0}, then {'element': 'p', 'index': 1}, ...
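To make the output shape concrete, here is a minimal sketch of the same extraction written directly against BeautifulSoup. It is an illustration under assumptions, not the package's actual implementation: the function name extract_elements is hypothetical, plain dictionaries stand in for Document objects, and the use of the built-in "html.parser" backend is an assumption.

```python
from bs4 import BeautifulSoup


def extract_elements(html: str, element: str) -> list[dict]:
    """Hypothetical helper: one record per matched tag, shaped like the
    content/metadata pairs described above (dicts stand in for Documents)."""
    if not isinstance(html, str):
        # Mirrors the documented behavior of rejecting non-string input.
        raise ValueError("html must be a string")

    # Assumption: the stdlib-backed "html.parser" backend is sufficient here.
    soup = BeautifulSoup(html, "html.parser")

    return [
        # str(tag) serializes the element including its own opening/closing tag.
        {"content": str(tag), "metadata": {"element": element, "index": i}}
        for i, tag in enumerate(soup.find_all(element))
    ]


records = extract_elements(
    "<div><p>First paragraph</p><p>Second paragraph</p></div>", "p"
)
for record in records:
    print(record["content"], record["metadata"])
```

Each record keeps the matched element's markup together with its tag name and its position among the matches, which is the information the parser's Document metadata is described as carrying.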
Note:
BeautifulSoupElementParser.parse raises a ValueError if the provided data argument is not a string. Ensure that you pass HTML content as a text string before invoking the parser.
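When HTML arrives as bytes (for example, from a file opened in binary mode or a raw HTTP response body), it needs to be decoded before it reaches the parser. The helper below is one possible way to do that; the name to_html_text and the UTF-8 default are assumptions for illustration and are not part of the package.

```python
def to_html_text(data, encoding: str = "utf-8") -> str:
    """Hypothetical helper: coerce str/bytes input to a text string.

    Strings pass through unchanged; bytes-like input is decoded.
    Anything else is rejected early with a clear error.
    """
    if isinstance(data, str):
        return data
    if isinstance(data, (bytes, bytearray)):
        # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
        return bytes(data).decode(encoding, errors="replace")
    raise TypeError(f"expected str or bytes, got {type(data).__name__}")


raw = b"<div><p>First paragraph</p></div>"
html_content = to_html_text(raw)
print(type(html_content).__name__)
```

The resulting html_content can then be passed to parser.parse as in the usage example above.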
Want to help?
If you want to contribute to swarmauri-sdk, read up on our guidelines for contributing that will help you get started.
Download files
File details
Details for the file swarmauri_parser_beautifulsoupelement-0.9.0.dev36.tar.gz.
File metadata
- Download URL: swarmauri_parser_beautifulsoupelement-0.9.0.dev36.tar.gz
- Upload date:
- Size: 7.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.4 {"installer":{"name":"uv","version":"0.10.4","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 98db1ec64a27fd74e04f6c5bb9e991f21f654608daf4ee8df2444a78abbd5b53 |
| MD5 | 5fa6dcc9e7274215a796b247e3f98cf8 |
| BLAKE2b-256 | 43f8b744e9f6a1948939a8ee247001fd63c23ab43c99d2349a36916085722e40 |
File details
Details for the file swarmauri_parser_beautifulsoupelement-0.9.0.dev36-py3-none-any.whl.
File metadata
- Download URL: swarmauri_parser_beautifulsoupelement-0.9.0.dev36-py3-none-any.whl
- Upload date:
- Size: 8.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.4 {"installer":{"name":"uv","version":"0.10.4","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c843dab3e5430c097f2126f7c15cc617da2b7517b5d285ccba6b76ad3979210e |
| MD5 | b4c4b838bab1747f69ad7c40ae97a0a6 |
| BLAKE2b-256 | a5c05399803cc9bd76598ab688904a6be32b2c25393ada9ed86978fbd4226672 |