4 projects
fpscan
An extremely fast first-order parallel scan for PyTorch (CUDA).
entropy-hash
EntropyHash: a near-duplicate document detection algorithm
maryam
OWASP Maryam is a modular/optional open-source framework based on OSINT and data gathering.
doc2term
A fast NLP tokenizer that detects tokens and removes duplicates and punctuation.