Fast HTML5 CSS selector.
Project description
A fast HTML5 parser with CSS selectors, built on the Modest engine.
Alpha version.
Installation
From PyPI using pip:
pip install selectolax
Development version from GitHub:
git clone --recursive https://github.com/rushter/selectolax
cd selectolax
pip install -r requirements_dev.txt
python setup.py install
Examples
from selectolax.parser import HTMLParser

html = "<div><p id=p1><p id=p2><p id=p3><a>link</a><p id=p4><p id=p5>text<p id=p6></div>"
selector = "div > :nth-child(2n+1):not(:has(a))"

for node in HTMLParser(html).css(selector):
    print(node.attributes, node.text(), node.tag)
    print(node.parent.tag)
    print(node.html)
Simple Benchmark
Average over 10 runs of parsing 800 Google SERP pages and retrieving the URLs from them.
Package | Time | Memory (peak) |
---|---|---|
selectolax | 2.38 sec. | 768.11 MB |
lxml | 18.67 sec. | 769.21 MB |
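The benchmark script itself is not shown here. As a rough sketch of how an "average of 10 runs" timing could be taken, the helper below calls a function repeatedly and averages the wall-clock time. `average_runtime` and `workload` are hypothetical names, and the stand-in workload would be replaced by a function that parses the pages with the library under test. Peak memory is not measured in this sketch.

```python
import time


def average_runtime(func, runs=10):
    """Call func() `runs` times and return the mean wall-clock time in seconds.

    Hypothetical helper for illustration; not the original benchmark code.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)


def workload():
    # Stand-in for "parse every page and collect its URLs".
    return sum(range(100_000))


print(f"{average_runtime(workload):.4f} sec. on average")
```

Running each library through the same helper on the same input keeps the comparison like for like.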
Links
License
Source Distribution
selectolax-0.1.6.tar.gz (1.2 MB)
Built Distributions
selectolax-0.1.6-cp36-cp36m-win32.whl (457.7 kB)