A common library for netloc rule use case.
os-netloc-rule
Netloc matching is a very common and useful operation when processing URLs. For example, a netloc blacklist is a series of netloc rules, each marked ALLOWED or DISALLOWED:
abc.example.com ALLOWED
.example.com DISALLOWED
You can skip processing http://www.example.com/001.html because it matches the rule .example.com DISALLOWED.
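The blacklist example above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the library's implementation: the `RULES` table, `match_netloc`, and `should_skip` are hypothetical names, and the "exact netloc first, then the longest dotted suffix" lookup order is an assumption.

```python
from urllib.parse import urlsplit

# Hypothetical rule table mirroring the two rules above.
RULES = {
    "abc.example.com": "ALLOWED",
    ".example.com": "DISALLOWED",
}


def match_netloc(netloc, rules):
    """Return the most specific rule for netloc, or None (assumed semantics)."""
    # An exact netloc rule is treated as more specific than any suffix rule.
    if netloc in rules:
        return rules[netloc]
    labels = netloc.split(".")
    # Try dotted suffixes from longest to shortest: ".example.com", then ".com".
    for i in range(1, len(labels)):
        suffix = "." + ".".join(labels[i:])
        if suffix in rules:
            return rules[suffix]
    return None


def should_skip(url, rules):
    """Return True when the URL's netloc matches a DISALLOWED rule."""
    return match_netloc(urlsplit(url).netloc, rules) == "DISALLOWED"
```

With this sketch, `should_skip("http://www.example.com/001.html", RULES)` is true because only the `.example.com` suffix rule applies, while a URL on `abc.example.com` is kept because its exact rule is found first.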
Install
pip install os-netloc-rule
Usage
- There are different matcher implementations. According to the benchmark, DictMatcher is much faster and more memory-efficient than TreeMatcher.
  - TreeMatcher, based on a prefix tree:

        from os_netloc_rule import TreeMatcher as Matcher

  - DictMatcher, based on a dict:

        from os_netloc_rule import DictMatcher as Matcher

- Load rules:

      from os_netloc_rule import DictMatcher as Matcher

      rules = [
          ('www.example.com', 1),
          ('abc.example.com', 2),
          ('abc.example.com:8080', 3),
      ]
      matcher = Matcher()
      for netloc, rule in rules:
          matcher.load(netloc, rule)

- Match rules:

      matcher.match('www.example.com')
      matcher.match('abc.example.com:8080')

- If the same netloc is loaded with different rules, the latter overrides the former by default. You can supply your own cmp function when loading rules to change this:

      def cmp(former, latter):
          return former if former > latter else latter

      matcher.load(netloc, rule, cmp=cmp)

- Dump rules:

      for netloc, rule in matcher.dump():
          pass

- Delete a rule:

      delete, rule = matcher.delete('www.example.com')

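To make the conflict handling described above concrete, here is a minimal stand-in that uses a plain dict instead of a real matcher. The `load` helper and the `keep_larger` function are hypothetical and only mimic the documented behaviour (latter wins by default, `cmp(former, latter)` decides otherwise); the library's internals may differ.

```python
def load(table, netloc, rule, cmp=None):
    """Store rule for netloc; on conflict keep cmp(former, latter), else latter."""
    if netloc in table and cmp is not None:
        rule = cmp(table[netloc], rule)
    table[netloc] = rule


def keep_larger(former, latter):
    """Example cmp: keep whichever rule value is larger."""
    return former if former > latter else latter


default_table = {}
custom_table = {}
# The same netloc is loaded twice with different rules.
for netloc, rule in [("abc.example.com", 2), ("abc.example.com", 1)]:
    load(default_table, netloc, rule)                   # latter overrides former
    load(custom_table, netloc, rule, cmp=keep_larger)   # larger rule is kept

print(default_table["abc.example.com"], custom_table["abc.example.com"])
```

The default table ends up with the rule loaded last, while the table loaded with `keep_larger` retains the larger of the two rules regardless of load order.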
Benchmark
TreeMatcher:

| Python version | Operation | Memory | Speed |
|---|---|---|---|
| 2.7.14 | load | 1,000,000 rules, 380 MB | 91k/s |
| 2.7.14 | match | - | 118k/s |
| 3.6.4 | load | 1,000,000 rules, 300 MB | 96k/s |
| 3.6.4 | match | - | 123k/s |
| pypy-5.7.1 | load | 1,000,000 rules, 251 MB | 283k/s |
| pypy-5.7.1 | match | - | 529k/s |
| pypy3.6-7.2.0 | load | 1,000,000 rules, 305 MB | 265k/s |
| pypy3.6-7.2.0 | match | - | 473k/s |

DictMatcher:

| Python version | Operation | Memory | Speed |
|---|---|---|---|
| 2.7.14 | load | 1,000,000 rules, 120 MB | 650k/s |
| 2.7.14 | match | - | 417k/s |
| 3.6.4 | load | 1,000,000 rules, 100 MB | 578k/s |
| 3.6.4 | match | - | 389k/s |
| pypy-5.7.1 | load | 1,000,000 rules, 75 MB | 1.14M/s |
| pypy-5.7.1 | match | - | 2.4M/s |
| pypy3.6-7.2.0 | load | 1,000,000 rules, 180 MB | 1.1M/s |
| pypy3.6-7.2.0 | match | - | 2M/s |

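The tables do not say how the figures were measured. As a rough sketch of how load and match throughput could be timed, the snippet below uses a plain dict as a stand-in for a matcher; the `throughput` helper, the generated netlocs, and the reduced rule count are all assumptions for illustration, so its numbers are not comparable with the tables.

```python
import time


def throughput(func, items):
    """Call func on every item and return the number of calls per second."""
    start = time.perf_counter()
    for item in items:
        func(item)
    elapsed = time.perf_counter() - start
    return len(items) / elapsed if elapsed > 0 else float("inf")


# Stand-in for a matcher: a plain dict keyed by netloc.
table = {}
netlocs = ["host%d.example.com" % i for i in range(100000)]

load_rate = throughput(lambda n: table.__setitem__(n, 1), netlocs)
match_rate = throughput(table.get, netlocs)
print("load: %.0f/s, match: %.0f/s" % (load_rate, match_rate))
```

To time a real matcher the same way, its load and match calls would take the place of the dict operations.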
Unit Tests
tox
License
MIT licensed.