os-netloc-rule


A common library for netloc rule use cases.

Netloc matching is a common and useful operation when processing URLs. For example, a netloc blacklist is a series of netloc rules, each marked ALLOWED or DISALLOWED:

abc.example.com ALLOWED
.example.com DISALLOWED

You can skip processing http://www.example.com/001.html because it matches the rule .example.com DISALLOWED.
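
The matching idea above can be sketched in plain Python without the library. This sketch assumes that a leading dot means "this domain and every subdomain", that an exact entry wins over a suffix entry, and that ports are ignored. The names RULES, match_netloc, and should_skip are illustrative only and are not part of os-netloc-rule.

```python
from urllib.parse import urlsplit

ALLOWED, DISALLOWED = "ALLOWED", "DISALLOWED"

# Hypothetical rule table mirroring the example above.
RULES = {
    "abc.example.com": ALLOWED,
    ".example.com": DISALLOWED,
}


def match_netloc(netloc, rules=RULES):
    """Return the rule of the most specific matching entry, or None."""
    netloc = netloc.lower()
    # An exact entry takes priority (assumption of this sketch).
    if netloc in rules:
        return rules[netloc]
    # Otherwise try dot-prefixed suffixes, longest first.
    labels = netloc.split(".")
    for i in range(len(labels)):
        suffix = "." + ".".join(labels[i:])
        if suffix in rules:
            return rules[suffix]
    return None


def should_skip(url, rules=RULES):
    """True when the URL's netloc matches a DISALLOWED rule."""
    return match_netloc(urlsplit(url).netloc, rules) == DISALLOWED
```

With these rules, should_skip('http://www.example.com/001.html') is true, while abc.example.com stays ALLOWED because its exact entry is checked first.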

Install

pip install os-netloc-rule

Usage

  • Several matcher implementations are available. According to the benchmark, DictMatcher is considerably faster and more memory efficient

    • TreeMatcher, based on prefix tree

      from os_netloc_rule import TreeMatcher as Matcher
      
    • DictMatcher, based on dict

      from os_netloc_rule import DictMatcher as Matcher
      
  • load rule

    from os_netloc_rule import DictMatcher as Matcher
    
    rules = [
        ('www.example.com', 1),
        ('abc.example.com', 2),
        ('abc.example.com:8080', 3),
    ]
    
    matcher = Matcher()
    for netloc, rule in rules:
        matcher.load(netloc, rule)
    
  • match rule

    matcher.match('www.example.com')
    matcher.match('abc.example.com:8080')
    
  • if the same netloc is loaded with different rules, the latter overrides the former by default. You can supply your own cmp function when loading rules to change this

    def cmp(former, latter):
        return former if former > latter else latter
        
    matcher.load(netloc, rule, cmp=cmp)
    
  • dump rules

    for netloc, rule in matcher.dump():
        pass
    
  • delete rule

    delete, rule = matcher.delete('www.example.com')
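
The effect of the cmp argument described above can be illustrated with a small stand-in class. ToyMatcher is hypothetical: it only does exact lookups in a dict and is not the library's DictMatcher. It assumes, as the text states, that cmp receives the former and latter rule and returns the one to keep.

```python
class ToyMatcher:
    """Hypothetical exact-match store, used only to show cmp semantics."""

    def __init__(self):
        self._rules = {}

    def load(self, netloc, rule, cmp=None):
        # Without cmp, the latter rule simply overrides the former.
        if cmp is not None and netloc in self._rules:
            rule = cmp(self._rules[netloc], rule)
        self._rules[netloc] = rule

    def match(self, netloc):
        return self._rules.get(netloc)


def keep_larger(former, latter):
    """Keep whichever rule value is larger."""
    return former if former > latter else latter


matcher = ToyMatcher()

# Default behaviour: the second load replaces the first.
matcher.load("www.example.com", 2)
matcher.load("www.example.com", 1)
default_result = matcher.match("www.example.com")

# Custom behaviour: the larger rule survives regardless of load order.
matcher.load("abc.example.com", 5, cmp=keep_larger)
matcher.load("abc.example.com", 3, cmp=keep_larger)
custom_result = matcher.match("abc.example.com")
```

Here default_result is the rule loaded last, while custom_result is the larger of the two rules because keep_larger decided the conflict.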
    

Benchmark

TreeMatcher:

python version   operation   memory                   speed
2.7.14           load        1,000,000 rules, 380M    91k/s
2.7.14           match       -                        118k/s
3.6.4            load        1,000,000 rules, 300M    96k/s
3.6.4            match       -                        123k/s
pypy-5.7.1       load        1,000,000 rules, 251M    283k/s
pypy-5.7.1       match       -                        529k/s
pypy3.6-7.2.0    load        1,000,000 rules, 305M    265k/s
pypy3.6-7.2.0    match       -                        473k/s

DictMatcher:

python version   operation   memory                   speed
2.7.14           load        1,000,000 rules, 120M    650k/s
2.7.14           match       -                        417k/s
3.6.4            load        1,000,000 rules, 100M    578k/s
3.6.4            match       -                        389k/s
pypy-5.7.1       load        1,000,000 rules, 75M     1.14m/s
pypy-5.7.1       match       -                        2.4m/s
pypy3.6-7.2.0    load        1,000,000 rules, 180M    1.1m/s
pypy3.6-7.2.0    match       -                        2m/s

Unit Tests

tox

License

MIT licensed.

