os-netloc-rule
A common library for netloc rule use cases.
Netloc matching is a very common and useful operation when processing URLs. For example, a netloc blacklist is a series of netloc rules, each marked ALLOWED or DISALLOWED:
abc.example.com ALLOWED
.example.com DISALLOWED
You can skip processing http://www.example.com/001.html because it matches the rule .example.com DISALLOWED.
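The blacklist idea above can be sketched with the standard library alone. This is an illustrative sketch, not the library's implementation: the `lookup` function, the `RULES` dict, and the precedence order (an exact rule wins over a suffix rule, longer suffixes before shorter ones) are assumptions made for the example, and ports are not handled.

```python
from urllib.parse import urlsplit

ALLOWED, DISALLOWED = "ALLOWED", "DISALLOWED"

# Rules from the example above; a leading dot marks a suffix rule.
RULES = {
    "abc.example.com": ALLOWED,
    ".example.com": DISALLOWED,
}


def lookup(url, rules=RULES):
    """Return the rule for the URL's netloc, or None if nothing matches.

    Assumption: an exact rule wins over a suffix rule, and longer
    suffixes are tried before shorter ones.
    """
    netloc = urlsplit(url).netloc.lower()
    if netloc in rules:
        return rules[netloc]
    # For "www.example.com", try ".example.com", then ".com".
    labels = netloc.split(".")
    for i in range(1, len(labels)):
        suffix = "." + ".".join(labels[i:])
        if suffix in rules:
            return rules[suffix]
    return None


print(lookup("http://www.example.com/001.html"))  # DISALLOWED
print(lookup("http://abc.example.com/"))          # ALLOWED
```

A URL whose netloc matches no rule returns `None`, leaving the decision to the caller.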
Install
pip install os-netloc-rule
Usage
- There are different implementations of the matcher. According to the benchmark, DictMatcher is much faster and more memory efficient.
  - TreeMatcher, based on a prefix tree:
        from os_netloc_rule import TreeMatcher as Matcher
  - DictMatcher, based on a dict:
        from os_netloc_rule import DictMatcher as Matcher
- Load rules:
      from os_netloc_rule import DictMatcher as Matcher

      rules = [
          ('www.example.com', 1),
          ('abc.example.com', 2),
          ('abc.example.com:8080', 3),
      ]
      matcher = Matcher()
      for netloc, rule in rules:
          matcher.load(netloc, rule)
- Match rules:
      matcher.match('www.example.com')
      matcher.match('abc.example.com:8080')
- If the same netloc is loaded more than once with different rules, the latter overrides the former by default. You can supply your own cmp function when loading rules to decide which one is kept:
      def cmp(former, latter):
          return former if former > latter else latter

      matcher.load(netloc, rule, cmp=cmp)
- Dump rules:
      for netloc, rule in matcher.dump():
          pass
- Delete a rule:
      delete, rule = matcher.delete('www.example.com')
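To make the duplicate-handling behaviour described above concrete, here is a minimal dict-backed sketch. `SketchMatcher` and `keep_larger` are hypothetical names and this is not the os-netloc-rule implementation: it does exact lookups only, and the return values of `match` and `delete` are assumptions inferred from the usage lines above.

```python
class SketchMatcher:
    """Hypothetical dict-backed matcher, for illustration only."""

    def __init__(self):
        self._rules = {}

    def load(self, netloc, rule, cmp=None):
        # On a duplicate netloc, cmp(former, latter) picks the rule to keep;
        # without cmp, the latter simply replaces the former.
        if cmp is not None and netloc in self._rules:
            rule = cmp(self._rules[netloc], rule)
        self._rules[netloc] = rule

    def match(self, netloc):
        # Assumption: exact lookup, None when no rule is stored.
        return self._rules.get(netloc)

    def dump(self):
        # Iterate over a snapshot so callers may modify the matcher meanwhile.
        return iter(list(self._rules.items()))

    def delete(self, netloc):
        # Assumption: returns a (deleted, rule) pair.
        if netloc in self._rules:
            return True, self._rules.pop(netloc)
        return False, None


def keep_larger(former, latter):
    return former if former > latter else latter


matcher = SketchMatcher()
matcher.load("abc.example.com", 2)
matcher.load("abc.example.com", 1)                   # default: latter wins
matcher.load("abc.example.com", 3, cmp=keep_larger)  # 3 > 1, so 3 is kept
matcher.load("abc.example.com", 2, cmp=keep_larger)  # 3 > 2, so 3 is kept
print(matcher.match("abc.example.com"))   # 3
print(matcher.delete("abc.example.com"))  # (True, 3)
```

The point of the `cmp` hook is that the matcher stays agnostic about what a rule means; the caller decides how two conflicting rules are ordered.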
Benchmark
TreeMatcher:
Python version | Operation | Memory (1M rules loaded) | Speed |
---|---|---|---|
2.7.14 | load | 380 MB | 91k/s |
2.7.14 | match | - | 118k/s |
3.6.4 | load | 300 MB | 96k/s |
3.6.4 | match | - | 123k/s |
pypy-5.7.1 | load | 251 MB | 283k/s |
pypy-5.7.1 | match | - | 529k/s |
pypy3.6-7.2.0 | load | 305 MB | 265k/s |
pypy3.6-7.2.0 | match | - | 473k/s |
DictMatcher:
Python version | Operation | Memory (1M rules loaded) | Speed |
---|---|---|---|
2.7.14 | load | 120 MB | 650k/s |
2.7.14 | match | - | 417k/s |
3.6.4 | load | 100 MB | 578k/s |
3.6.4 | match | - | 389k/s |
pypy-5.7.1 | load | 75 MB | 1.14M/s |
pypy-5.7.1 | match | - | 2.4M/s |
pypy3.6-7.2.0 | load | 180 MB | 1.1M/s |
pypy3.6-7.2.0 | match | - | 2M/s |
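Throughput figures like the ones above can be gathered with a simple timing loop. The sketch below is not the project's benchmark script: `ops_per_second` is a hypothetical helper, the netlocs are synthetic, and a plain dict stands in for a matcher, so the numbers it prints are not comparable to the tables.

```python
import time


def ops_per_second(func, items):
    """Call func(item) for every item and return calls per second."""
    start = time.perf_counter()
    for item in items:
        func(item)
    elapsed = time.perf_counter() - start
    return len(items) / elapsed if elapsed > 0 else float("inf")


# Synthetic netlocs; the data set behind the tables above is not specified.
netlocs = ["host%d.example.com" % i for i in range(100000)]

# A plain dict stands in for a matcher here.
table = {}
load_rate = ops_per_second(lambda n: table.__setitem__(n, 1), netlocs)
match_rate = ops_per_second(table.get, netlocs)

print("load:  %.0fk/s" % (load_rate / 1000))
print("match: %.0fk/s" % (match_rate / 1000))
```

To time a real matcher, pass `lambda n: matcher.load(n, 1)` and `matcher.match` in place of the dict operations.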
Unit Tests
tox
License
MIT licensed.