os-netloc-rule

A common library for netloc rule use cases.

Netloc matching is a common and useful operation when processing URLs. For example, a netloc blacklist is a series of netloc rules, each marked ALLOWED or DISALLOWED:

abc.example.com ALLOWED
.example.com DISALLOWED

You can skip processing http://www.example.com/001.html because it matches the rule .example.com DISALLOWED.
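
The matching idea above can be sketched without the library. This is only an illustration, not how os-netloc-rule is implemented: the `RULES` table and the `lookup` function are hypothetical names, and the precedence (exact netloc first, then the longest leading-dot suffix) is an assumption. Ports get no special handling here.

```python
from urllib.parse import urlsplit

# Hypothetical rule table mirroring the two example rules above.
RULES = {
    "abc.example.com": "ALLOWED",
    ".example.com": "DISALLOWED",
}


def lookup(url, rules=RULES):
    """Return the most specific rule for the URL's netloc, or None."""
    netloc = urlsplit(url).netloc.lower()
    # Assumption: an exact netloc rule wins over any domain-suffix rule.
    if netloc in rules:
        return rules[netloc]
    # Otherwise try progressively shorter suffixes, e.g. ".example.com", ".com".
    labels = netloc.split(".")
    for i in range(1, len(labels)):
        suffix = "." + ".".join(labels[i:])
        if suffix in rules:
            return rules[suffix]
    return None


print(lookup("http://www.example.com/001.html"))
print(lookup("http://abc.example.com/index.html"))
```

With the table above, the first URL falls under the `.example.com` rule and the second under the exact `abc.example.com` rule.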

Install

pip install os-netloc-rule

Usage

  • There are different matcher implementations. According to the benchmark, DictMatcher is much faster and more memory efficient.

    • TreeMatcher, based on prefix tree

      from os_netloc_rule import TreeMatcher as Matcher
      
    • DictMatcher, based on dict

      from os_netloc_rule import DictMatcher as Matcher
      
  • load rule

    from os_netloc_rule import DictMatcher as Matcher
    
    rules = [
        ('www.example.com', 1),
        ('abc.example.com', 2),
        ('abc.example.com:8080', 3),
    ]
    
    matcher = Matcher()
    for netloc, rule in rules:
        matcher.load(netloc, rule)
    
  • match rule

    matcher.match('www.example.com')
    matcher.match('abc.example.com:8080')
    
  • If the same netloc is loaded with different rules, the latter replaces the former by default. You can supply your own cmp function when loading rules to change this

    def cmp(former, latter):
        return former if former > latter else latter
        
    matcher.load(netloc, rule, cmp=cmp)
    
  • dump rules

    for netloc, rule in matcher.dump():
        pass
    
  • delete rule

    delete, rule = matcher.delete('www.example.com')
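
The conflict handling described in the list above (latter rule wins by default, or a cmp function picks the survivor) can be illustrated with a plain dict. This is a minimal sketch of the policy only; the `load` helper and `keep_larger` are hypothetical stand-ins and do not show the library's internals.

```python
def load(table, netloc, rule, cmp=None):
    """Store rule under netloc; on a conflict let cmp choose, if given."""
    if netloc in table and cmp is not None:
        # cmp receives the existing (former) and new (latter) rule
        # and returns the one to keep.
        rule = cmp(table[netloc], rule)
    table[netloc] = rule


def keep_larger(former, latter):
    return former if former > latter else latter


table = {}
load(table, "abc.example.com", 2)
load(table, "abc.example.com", 1)                   # default: latter replaces former
load(table, "www.example.com", 2)
load(table, "www.example.com", 1, cmp=keep_larger)  # cmp keeps the larger rule
print(table)
```

After these calls, `abc.example.com` holds rule 1 (the later load) while `www.example.com` still holds rule 2 (the larger value).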
    

Benchmark

TreeMatcher:

python version  operation        memory  speed
2.7.14          load (1M rules)  380M    91k/s
2.7.14          match            -       118k/s
3.6.4           load (1M rules)  300M    96k/s
3.6.4           match            -       123k/s
pypy-5.7.1      load (1M rules)  251M    283k/s
pypy-5.7.1      match            -       529k/s
pypy3.6-7.2.0   load (1M rules)  305M    265k/s
pypy3.6-7.2.0   match            -       473k/s

DictMatcher:

python version  operation        memory  speed
2.7.14          load (1M rules)  120M    650k/s
2.7.14          match            -       417k/s
3.6.4           load (1M rules)  100M    578k/s
3.6.4           match            -       389k/s
pypy-5.7.1      load (1M rules)  75M     1.14M/s
pypy-5.7.1      match            -       2.4M/s
pypy3.6-7.2.0   load (1M rules)  180M    1.1M/s
pypy3.6-7.2.0   match            -       2M/s
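
Throughput figures of this kind can be measured with a small timing loop. The sketch below is not the project's benchmark script: `throughput` and `StubMatcher` are hypothetical, and the stub only stands in for any object with `load` and `match` methods so the example runs without the library installed.

```python
import time


def throughput(func, items):
    """Call func on every item and return calls per second."""
    start = time.perf_counter()
    for item in items:
        func(item)
    elapsed = time.perf_counter() - start
    return len(items) / elapsed if elapsed > 0 else float("inf")


class StubMatcher:
    """Hypothetical dict-backed stand-in exposing load() and match()."""

    def __init__(self):
        self._rules = {}

    def load(self, netloc, rule):
        self._rules[netloc] = rule

    def match(self, netloc):
        return self._rules.get(netloc)


netlocs = ["host%d.example.com" % i for i in range(10000)]
matcher = StubMatcher()
load_rate = throughput(lambda n: matcher.load(n, 1), netlocs)
match_rate = throughput(matcher.match, netlocs)
print("load: %.0f/s, match: %.0f/s" % (load_rate, match_rate))
```

To time a real matcher, replace `StubMatcher()` with the matcher under test; the numbers depend on the machine and interpreter, so they are only comparable within one run.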

Unit Tests

tox

License

MIT licensed.
