Fast HTML5 parser with CSS selectors.
Project description
A fast HTML5 parser with CSS selectors, built on the Modest engine.
Installation
From PyPI using pip:
pip install selectolax
Development version from GitHub:
git clone --recursive https://github.com/rushter/selectolax
cd selectolax
pip install -r requirements_dev.txt
python setup.py install
How to compile selectolax while developing:
make clean
make dev
Basic examples
In [1]: from selectolax.parser import HTMLParser
...:
...: html = """
...: <h1 id="title" data-updated="20201101">Hi there</h1>
...: <div class="post">Lorem Ipsum is simply dummy text of the printing and typesetting industry. </div>
...: <div class="post">Lorem ipsum dolor sit amet, consectetur adipiscing elit.</div>
...: """
...: tree = HTMLParser(html)
In [2]: tree.css_first('h1#title').text()
Out[2]: 'Hi there'
In [3]: tree.css_first('h1#title').attributes
Out[3]: {'id': 'title', 'data-updated': '20201101'}
In [4]: [node.text() for node in tree.css('.post')]
Out[4]:
['Lorem Ipsum is simply dummy text of the printing and typesetting industry. ',
'Lorem ipsum dolor sit amet, consectetur adipiscing elit.']
In [1]: html = "<div><p id=p1><p id=p2><p id=p3><a>link</a><p id=p4><p id=p5>text<p id=p6></div>"
...: selector = "div > :nth-child(2n+1):not(:has(a))"
In [2]: for node in HTMLParser(html).css(selector):
...:     print(node.attributes, node.text(), node.tag)
...:     print(node.parent.tag)
...:     print(node.html)
...:
{'id': 'p1'}  p
div
<p id="p1"></p>
{'id': 'p5'} text p
div
<p id="p5">text</p>
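To see why only p1 and p5 are printed: the HTML5 parser closes each unclosed <p> when the next one opens, so the div ends up with six sibling <p> children. :nth-child(2n+1) keeps the odd positions (1, 3, 5), and :not(:has(a)) then drops p3, which contains the link. The sketch below replays that filtering in plain Python on a hand-written list of children. The `children` list and variable names are illustrative assumptions, and the code does not call selectolax.

```python
# (id, contains an <a> descendant) for each child of the <div>, in document order.
children = [
    ("p1", False),
    ("p2", False),
    ("p3", True),   # <p id=p3><a>link</a>
    ("p4", False),
    ("p5", False),
    ("p6", False),
]

matches = [
    node_id
    for position, (node_id, has_link) in enumerate(children, start=1)
    if position % 2 == 1   # :nth-child(2n+1) -> positions 1, 3, 5
    and not has_link       # :not(:has(a))
]
print(matches)  # ['p1', 'p5']
```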
Simple Benchmark
Average of 10 experiments to parse and retrieve URLs from 800 Google SERP pages.
Package | Time | Memory (peak)
---|---|---
selectolax | 2.38 sec. | 768.11 MB
lxml | 18.67 sec. | 769.21 MB
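The benchmark script itself is not reproduced here. As a rough illustration of the task it describes (extract URLs from each page, average the time over 10 runs), the following is a minimal sketch rather than the author's code. The function names, the sample `pages` list, and the use of the standard library's `html.parser` as a stand-in are all assumptions. The selectolax extractor is defined with a lazy import so the harness runs even where selectolax is not installed. Peak memory is not measured.

```python
import time
from html.parser import HTMLParser as StdlibHTMLParser


def extract_urls_selectolax(html):
    """Collect href values with selectolax (requires `pip install selectolax`)."""
    # Imported lazily so the rest of this sketch runs with the stdlib alone.
    from selectolax.parser import HTMLParser
    return [a.attributes.get("href") for a in HTMLParser(html).css("a[href]")]


class _LinkCollector(StdlibHTMLParser):
    """Stdlib stand-in that records the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.urls.append(href)


def extract_urls_stdlib(html):
    """Collect href values using only the standard library."""
    collector = _LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.urls


def average_seconds(extract, pages, runs=10):
    """Average wall-clock seconds for `extract` to process all pages, over `runs` runs."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        for page in pages:
            extract(page)
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)


# Hypothetical sample input; the original benchmark used 800 Google SERP pages.
pages = ['<div><a href="https://example.com/a">A</a><a href="/b">B</a></div>'] * 100
print(f"stdlib html.parser: {average_seconds(extract_urls_stdlib, pages):.4f} sec.")
```

With selectolax installed, passing `extract_urls_selectolax` to `average_seconds` in place of `extract_urls_stdlib` times the same task on the same input. The absolute numbers will differ from the table above, which was measured on different data and hardware.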