# cNilsimsa

A C implementation of Nilsimsa for Python.

```shell
$ pip install cnilsimsa
```

We are building this module one piece at a time. So far, only `compare_hexdigests` is implemented, because the need for a faster way to compare digests was the primary motivation for starting this project.

```python
from cnilsimsa import compare_hexdigests
```

It works exactly like the function of the same name in pynilsimsa but is more than an order of magnitude faster. If you need to deduplicate a large corpus of documents from Python by comparing Nilsimsa hex digests, this should be helpful.
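To make the comparison concrete, here is a minimal pure-Python sketch of what such a comparison computes. It assumes the pynilsimsa convention that the score is 128 minus the number of bits that differ between two 64-character hex digests, so identical digests score 128. The names `compare_hexdigests_reference` and `find_near_duplicates`, and the `threshold` value, are hypothetical and are not part of cnilsimsa; this is an illustration of the idea, not the C implementation.

```python
def compare_hexdigests_reference(digest1, digest2):
    """Hypothetical reference scorer, assuming score = 128 - differing bits."""
    if len(digest1) != 64 or len(digest2) != 64:
        raise ValueError("expected two 64-character hex digests")
    # XOR the two 256-bit values; each set bit marks a position that differs.
    differing_bits = bin(int(digest1, 16) ^ int(digest2, 16)).count("1")
    return 128 - differing_bits


def find_near_duplicates(digests, threshold=100):
    """Return index pairs whose score is at least `threshold`.

    The default threshold is an arbitrary illustrative cutoff, not a
    recommendation from the cnilsimsa or pynilsimsa authors.
    """
    pairs = []
    for i in range(len(digests)):
        for j in range(i + 1, len(digests)):
            score = compare_hexdigests_reference(digests[i], digests[j])
            if score >= threshold:
                pairs.append((i, j))
    return pairs


# Three made-up digests: the first two differ in exactly one bit.
sample = ["0" * 64, "0" * 63 + "1", "f" * 64]
print(compare_hexdigests_reference(sample[0], sample[0]))  # 128
print(compare_hexdigests_reference(sample[0], sample[1]))  # 127
print(find_near_duplicates(sample))  # [(0, 1)]
```

The pairwise loop performs one comparison per pair of documents, so the number of comparisons grows quadratically with corpus size. That is why a faster comparison function matters for large-scale deduplication.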

The longer-term goal is to build out the rest of the methods for representing and cooking LSHs, providing a full drop-in replacement for pynilsimsa.

```python
import cnilsimsa as nilsimsa
```

The more complete pure Python implementation is here:

https://code.google.com/p/py-nilsimsa/

Thanks to the authors of the Ruby/C implementation from which our `fillpopcount()` function is taken.

https://github.com/jwilkins/nilsimsa

Thanks to the Perl/C implementation that inspired both predecessors.

http://ixazon.dynip.com/~cmeclax/nilsimsa.html

Contributions welcome.
