Fast Unicode mapping and character blacklists using a Python C extension.
turboguard
A Python C extension that validates and sanitizes user input using a character blacklist and a character map.
Install
pip install turboguard
Quickstart
Create an instance of the Sanitizer class as shown below.
The Sanitizer.__enter__ method returns a callable(str) -> str, which lets you call it many times without worrying about performance or memory leaks.
from turboguard import Sanitizer, BlacklistedError

blacklist = [
    ('\U0001d100', '\U0001d1ff'),  # Blacklist Unicode range
    ('\u0600', '\u0610'),          # Blacklist Unicode range
    '\u0635',                      # Blacklist single character
]

replace = [
    ('\u0636', '\u0637'),  # Replace \u0636 by \u0637
    ('b', 'B'),            # Replace b by B
]

with Sanitizer(blacklist, replace) as sanitize:  # Loading (slow) part
    try:
        # Fast calls
        assert sanitize('foo bar') == 'foo Bar'  # 'b' is mapped to 'B'
        assert sanitize(None) is None
    except BlacklistedError:
        print('Validation failed!')
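To make the blacklist and replace semantics concrete, here is a minimal pure-Python sketch of the behaviour described above. It is not turboguard's implementation: build_sanitizer and the local BlacklistedError class are hypothetical stand-ins, and the sketch assumes that ranges are inclusive and that the blacklist is checked before replacements are applied.

```python
class BlacklistedError(ValueError):
    """Hypothetical local stand-in for turboguard.BlacklistedError."""


def build_sanitizer(blacklist, replace):
    """Return a callable(str) -> str mimicking the documented behaviour."""
    # Normalise single characters to (start, end) code point ranges.
    ranges = []
    for item in blacklist:
        start, end = (item, item) if isinstance(item, str) else item
        ranges.append((ord(start), ord(end)))

    # str.translate accepts a dict mapping code point -> replacement string.
    table = {ord(src): dst for src, dst in replace}

    def sanitize(text):
        if text is None:
            return None  # None passes through, as in the quickstart
        for char in text:
            code = ord(char)
            # Assumption: both ends of a range are inclusive.
            if any(low <= code <= high for low, high in ranges):
                raise BlacklistedError(f'U+{code:04X} is blacklisted')
        return text.translate(table)

    return sanitize


sanitize = build_sanitizer([('\u0600', '\u0610'), '\u0635'], [('b', 'B')])
print(sanitize('foo bar'))  # foo Bar
```

The sketch scans every string linearly in Python, so it only illustrates the semantics; the C extension exists precisely to avoid this per-call cost.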
Contribution
The turboguard/core.c file contains all the logic for memory allocation and cleanup, as well as the core_sanitize function, which is the only function that uses the given database.
turboguard/__init__.py contains only the Python wrapper around the C module; it forces initialization and cleanup through Python's context manager (the with syntax).
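The wrapper pattern described here, where setup happens on entering the with block and cleanup is guaranteed on leaving it, can be sketched in plain Python. This is an illustration only, not the contents of turboguard/__init__.py: ManagedResource and its attributes are hypothetical, and a dict stands in for memory that the C module would own.

```python
class ManagedResource:
    """Hypothetical wrapper: allocate on enter, release on exit."""

    def __init__(self, mapping):
        self._mapping = mapping
        self._table = None  # stands in for memory owned by a C module

    def __enter__(self):
        # Slow part: build the lookup table once.
        self._table = {ord(src): dst for src, dst in self._mapping}
        # Hand back a plain callable, so callers never touch the table.
        return self._call

    def __exit__(self, exc_type, exc_value, traceback):
        # Always release, even if the body of the with block raised.
        self._table = None
        return False  # do not swallow exceptions

    def _call(self, text):
        if self._table is None:
            raise RuntimeError('used outside of the with block')
        return text.translate(self._table)


with ManagedResource([('b', 'B')]) as convert:
    print(convert('abc'))  # aBc
```

Returning a callable from __enter__ keeps the expensive setup and the guaranteed cleanup in one place, while the calls inside the block stay cheap.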
What to do after fork:
Setup development environment
It's highly recommended to create a virtual environment first.
Install the project in editable mode: pip install -e .
make env
Build the C extension
make build
Test your environment
make cover
What to do after edit:
Lint code using:
make lint
Pass tests:
make clean build cover
Submit a pull request.