Fast HTML5 parser with CSS selectors.
Project description
A fast HTML5 parser with CSS selectors, built on the Modest engine.
Installation
From PyPI using pip:
pip install selectolax
Development version from GitHub:
git clone --recursive https://github.com/rushter/selectolax
cd selectolax
pip install -r requirements_dev.txt
python setup.py install
How to compile selectolax while developing:
make clean
make dev
Examples
from selectolax.parser import HTMLParser
html = "<div><p id=p1><p id=p2><p id=p3><a>link</a><p id=p4><p id=p5>text<p id=p6></div>"
selector = "div > :nth-child(2n+1):not(:has(a))"
for node in HTMLParser(html).css(selector):
    print(node.attributes, node.text(), node.tag)
    print(node.parent.tag)
    print(node.html)
Simple Benchmark
Average of 10 experiments to parse and retrieve URLs from 800 Google SERP pages.
Package | Time | Memory (peak) |
---|---|---|
selectolax | 2.38 sec. | 768.11 MB |
lxml | 18.67 sec. | 769.21 MB |
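The benchmark script itself is not shown here, so the following is only a rough sketch of how such a measurement could be set up, not the code behind the numbers above. The helper names `average_runtime` and `selectolax_urls` are hypothetical, and the `a` selector is an assumption about what "retrieve URLs" means. Only wall-clock time is measured; peak memory is left out.

```python
import time
from statistics import mean


def average_runtime(extract, pages, runs=10):
    """Return the mean wall-clock seconds over `runs` full passes over `pages`.

    `extract` is any callable that takes one HTML string (hypothetical helper).
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        for page in pages:
            extract(page)
        timings.append(time.perf_counter() - start)
    return mean(timings)


def selectolax_urls(html):
    """Collect href values from all <a> tags (assumed meaning of "retrieve URLs")."""
    # Imported lazily so the timing helper itself needs only the standard library.
    from selectolax.parser import HTMLParser

    return [node.attributes.get("href") for node in HTMLParser(html).css("a")]
```

With a list of page sources loaded into `pages`, `average_runtime(selectolax_urls, pages)` would give a figure comparable in spirit to the Time column. An equivalent lxml-based extractor can be passed in the same way.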
Links
License