BeautifulSoup interface for lxml
FastSoup
Key features
FAST search in the tree
FAST serialization to str
BeautifulSoup4 interface for interacting with objects:
Search: find, find_all, find_next, find_next_sibling
Text: get_text, string
Tag: name, get, clear, __getitem__, __str__, __repr__, append, new_tag, extract, replace_with
Install
pip install fast-soup==1.1.0
How to use
from fast_soup import FastSoup
content = ... # read some html content
soup = FastSoup(content)
# interact like BS4 object
result = soup.find('a', id='my_link')
# interact like lxml object
el = result.unwrap()
FAQ
Q: BS4 already implements an lxml parser. Why should I use FastSoup?
A: Yes, BS4 can use lxml as its parser, but only to build the tree. Everything after that, such as searching and serialization, runs at “Python speed”. FastSoup uses lxml internally for these operations as well and guarantees “C speed”.
Q: How does the FastSoup speedup work?
A: FastSoup just builds XPath expressions and executes them. An LRU cache is used to avoid rebuilding them.
Q: Why don’t you support the whole interface? Will that come soon?
A: I wrote the functions that speed up parsing in my own projects. Just create an issue or pull request and I think we will find a solution ;)
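The "build XPath, then cache it" idea from the FAQ can be sketched roughly as follows. This is not FastSoup's actual code: `build_xpath` and `find` are hypothetical helpers, and the standard library's `xml.etree.ElementTree` (which supports only a subset of XPath) stands in for lxml so the snippet runs without extra dependencies.

```python
from functools import lru_cache
import xml.etree.ElementTree as ET


@lru_cache(maxsize=128)
def build_xpath(name, attrs):
    """Build an XPath string from a tag name and a tuple of (attr, value) pairs.

    Hypothetical helper: values are not escaped, so this is a sketch only.
    """
    predicates = "".join(f"[@{key}='{value}']" for key, value in attrs)
    return f".//{name}{predicates}"


def find(root, name, **attrs):
    # Sort the attributes so that equal queries map to the same cache key.
    xpath = build_xpath(name, tuple(sorted(attrs.items())))
    return root.find(xpath)


root = ET.fromstring(
    "<div><a id='my_link' href='/x'>hello</a><a id='other'>bye</a></div>"
)
link = find(root, "a", id="my_link")
find(root, "a", id="my_link")  # the second call reuses the cached XPath string
```

After the two calls, `build_xpath.cache_info()` reports one miss and one hit: the expression was built once and reused for the repeated query.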
Miscellaneous
You can get the power of BeautifulSoup by wrapping your lxml objects, e.g.:
import io

import lxml.etree
from fast_soup import Tag

content = ... # some bytes ready to parse
context = lxml.etree.iterparse(
    io.BytesIO(content), ...
)
for event, elem in context:
    # wrap each parsed lxml element to get the BS4-style interface
    tag = Tag(elem)
    tag_text = tag.get_text()
    tag_attr = tag['attribute']
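To illustrate the wrapping pattern itself, here is a minimal sketch that runs with the standard library only. `MiniTag` is a hypothetical stand-in, not FastSoup's `Tag`, and `xml.etree.ElementTree.iterparse` is used in place of `lxml.etree.iterparse`; the sample document is made up for the example.

```python
import io
import xml.etree.ElementTree as ET


class MiniTag:
    """Hypothetical BS4-style wrapper around a parsed element."""

    def __init__(self, elem):
        self._elem = elem

    @property
    def name(self):
        return self._elem.tag

    def get_text(self):
        # Concatenate all text inside the element, descendants included.
        return "".join(self._elem.itertext())

    def __getitem__(self, key):
        # BS4-style attribute access: tag['attribute'].
        return self._elem.attrib[key]

    def unwrap(self):
        # Hand back the underlying element object.
        return self._elem


content = (
    b"<root><item attribute='1'>one <b>bold</b></item>"
    b"<item attribute='2'>two</item></root>"
)
seen = []
for event, elem in ET.iterparse(io.BytesIO(content), events=("end",)):
    if elem.tag == "item":
        tag = MiniTag(elem)
        seen.append((tag["attribute"], tag.get_text()))
```

The wrapper holds a reference to the element rather than copying it, so `unwrap()` returns the same object the parser produced.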