A set of command-line tools and a Python library for cheminformatics fingerprint creation and similarity search.
chemfp is a set of command-line tools for generating cheminformatics fingerprints and searching those fingerprints by Tanimoto similarity, as well as a Python library that can be used to build new tools.
The algorithms are designed for the dense, 100 to 10,000 bit fingerprints that occur in small-molecule/pharmaceutical chemistry. The Tanimoto search algorithms are implemented in C for performance and support both threshold and k-nearest searches.
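
To make the two search modes concrete, here is a minimal pure-Python sketch of Tanimoto similarity with a threshold search and a k-nearest search. It is not chemfp's C implementation or its API: the function names (`tanimoto`, `threshold_search`, `knearest_search`), the use of Python integers as bit strings, and the choice to score two all-zero fingerprints as 0.0 are all assumptions made for illustration.

```python
import heapq


def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity of two fingerprints stored as Python ints."""
    union = bin(a | b).count("1")
    if union == 0:
        # Assumed convention for two all-zero fingerprints.
        return 0.0
    return bin(a & b).count("1") / union


def threshold_search(query, targets, threshold):
    """Return (id, score) pairs scoring at least `threshold`, best first."""
    scored = ((tid, tanimoto(query, fp)) for tid, fp in targets)
    hits = [(tid, score) for tid, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def knearest_search(query, targets, k):
    """Return the k highest-scoring (id, score) pairs, best first."""
    scored = ((tid, tanimoto(query, fp)) for tid, fp in targets)
    return heapq.nlargest(k, scored, key=lambda hit: hit[1])


# Tiny 8-bit fingerprints, purely for demonstration.
targets = [("a", 0b11110000), ("b", 0b11000011), ("c", 0b00000011)]
query = 0b11111100

threshold_hits = threshold_search(query, targets, 0.5)
nearest_hits = knearest_search(query, targets, 2)
```

A threshold search returns every target at or above the cutoff, so the result size depends on the data. A k-nearest search returns at most k targets regardless of how low their scores are.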
Fingerprints can be generated either by extracting existing fingerprint data from an SD file or by using an existing chemistry toolkit. chemfp supports the Python libraries of the Open Babel, OpenEye, and RDKit toolkits.
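
The SD file route can be sketched as follows, assuming each record stores its fingerprint as a single line of text under a data tag. This is a simplified reader written for illustration, not chemfp's own extraction code: the function name `read_sd_tag`, the tag name `FP`, and the hex encoding of the value are hypothetical, and the parser only handles single-line tag values.

```python
def read_sd_tag(sd_text: str, tag: str):
    """Yield (title, value) for each SD record that contains `tag`.

    Records are assumed to end with a line containing "$$$$", and the
    tag value is assumed to be the single line after the "> <tag>" header.
    """
    for record in sd_text.split("$$$$\n"):
        lines = record.splitlines()
        if not lines:
            continue  # nothing after the final record terminator
        title = lines[0].strip()
        for i, line in enumerate(lines):
            if line.startswith(">") and f"<{tag}>" in line:
                if i + 1 < len(lines):
                    yield title, lines[i + 1].strip()
                break


# Two minimal records with no atoms; "FP" is a hypothetical tag name.
SAMPLE = (
    "mol1\n  sketch\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\n"
    "M  END\n> <FP>\n0f3a\n\n$$$$\n"
    "mol2\n  sketch\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\n"
    "M  END\n> <FP>\n00ff\n\n$$$$\n"
)

records = list(read_sd_tag(SAMPLE, "FP"))
# Decode the assumed hex encoding into raw fingerprint bytes.
fingerprints = {title: bytes.fromhex(value) for title, value in records}
```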