Improved fingerprints for the OpenEye Toolkits

OEFP

High-performance molecular fingerprints for the OpenEye Toolkits.

OEFP generates RDKit-compatible Morgan and Atom Pair fingerprints from OpenEye molecules, stores them in compact C++ containers, and compares them with fast scalar and batch kernels. Python bindings are built with SWIG, so openeye.oechem molecules pass directly into C++ without serialization.

OEFP currently supports dense binary, sparse binary, and sparse counted fingerprint containers; scalar comparison; query-to-batch comparison; cdist; and SciPy-compatible condensed pdist.

Try it out:

pip install oefp

Usage

Here are a few examples of using oefp.

Python

from openeye import oechem
import oefp

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "CC(=O)OC1=CC=CC=C1C(=O)O")  # aspirin

# Generate an RDKit-compatible Morgan fingerprint.
fp = oefp.morgan_fingerprint(mol, radius=2, num_bits=2048)
print(fp.popcount)
print(fp.words[:4])

# Compare fingerprints.
score = oefp.compare(fp, fp, oefp.Metric.tanimoto())
print(score)
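
For readers unfamiliar with how a Tanimoto score falls out of a dense binary fingerprint, here is a minimal plain-Python sketch. It assumes the fingerprint is stored as a list of 64-bit words, which `fp.words` suggests but the text above does not confirm. `pack_bits`, `popcount`, and `tanimoto` are hypothetical helpers for illustration, not part of oefp, and returning 0.0 for two empty fingerprints is a choice made for this sketch only.

```python
# Sketch only: a plain-Python stand-in for a dense binary fingerprint.
# None of these helpers belong to oefp; they illustrate the arithmetic.
WORD_BITS = 64


def pack_bits(on_bits, num_bits=2048):
    """Pack an iterable of bit indices into a list of 64-bit words."""
    words = [0] * ((num_bits + WORD_BITS - 1) // WORD_BITS)
    for bit in on_bits:
        if not 0 <= bit < num_bits:
            raise ValueError(f"bit {bit} is outside 0..{num_bits - 1}")
        words[bit // WORD_BITS] |= 1 << (bit % WORD_BITS)
    return words


def popcount(words):
    """Count the set bits across all words."""
    return sum(bin(word).count("1") for word in words)


def tanimoto(a, b):
    """Tanimoto similarity: |A and B| / |A or B| over the packed words."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same number of words")
    both = sum(bin(x & y).count("1") for x, y in zip(a, b))
    either = sum(bin(x | y).count("1") for x, y in zip(a, b))
    # Assumption of this sketch: two empty fingerprints score 0.0.
    return both / either if either else 0.0


fp_a = pack_bits({1, 5, 70})
fp_b = pack_bits({5, 70, 900})
print(popcount(fp_a))        # 3
print(tanimoto(fp_a, fp_b))  # 0.5 (2 shared bits out of 4 distinct bits)
```

Working on whole words rather than individual bits is what makes this kind of comparison cheap: each step is one AND, one OR, and two popcounts per word.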

Use reusable generators when applying the same options to many molecules:

from openeye import oechem
import oefp

smiles = ["c1ccccc1", "c1ccc(O)cc1", "CC(=O)O"]
mols = []
for smi in smiles:
    mol = oechem.OEGraphMol()
    oechem.OESmilesToMol(mol, smi)
    mols.append(mol)

generator = oefp.MorganGenerator(radius=2, num_bits=2048)
fps = [generator.fingerprint(mol) for mol in mols]

batch = oefp.OEFPBatch.from_fingerprints(fps)
distances = oefp.pdist(batch, oefp.Metric.jaccard())
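
The condensed layout that SciPy uses for `pdist` output can be sketched without any dependencies. The example below represents each fingerprint as a plain set of on-bit indices and computes Jaccard distance as one minus the Tanimoto similarity. `jaccard_distance`, `condensed_pdist`, and `condensed_index` are hypothetical names for illustration, and the sketch says nothing about how oefp computes the values internally.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two sets of on-bit indices."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # choice for this sketch: two empty sets are identical
    return 1.0 - len(a & b) / len(union)


def condensed_pdist(fps):
    """Distances for pairs (0,1), (0,2), ..., (n-2,n-1) in row-major order."""
    return [jaccard_distance(a, b) for a, b in combinations(fps, 2)]


def condensed_index(i, j, n):
    """Position of the pair (i, j) in a condensed vector for n items."""
    if i == j:
        raise ValueError("the diagonal is not stored in condensed form")
    if i > j:
        i, j = j, i
    return n * i - i * (i + 1) // 2 + (j - i - 1)


fps = [{1, 2, 3}, {2, 3, 4}, {7}]
distances = condensed_pdist(fps)
print(distances)                                    # [0.5, 1.0, 1.0]
print(distances[condensed_index(2, 0, len(fps))])   # 1.0
```

A condensed vector stores n * (n - 1) / 2 values instead of n * n, which is why it is the usual interchange format for hierarchical clustering routines.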

Generate sparse and counted fingerprints:

# `mol` is an OEGraphMol prepared as in the first example.
folded_count = oefp.morgan_count_fingerprint(mol)
sparse_binary = oefp.morgan_sparse_fingerprint(mol)
atom_pair_count = oefp.atom_pair_sparse_count_fingerprint(mol)

print(folded_count.indices[:5])
print(folded_count.counts[:5])
print(sparse_binary.indices[:5])
print(atom_pair_count.total_count)
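
The relationship between sparse and folded count fingerprints can be illustrated in a few lines. The sketch below assumes folding by taking the feature identifier modulo the fingerprint size, which is the common convention. Whether oefp folds exactly this way is not stated above. `fold_counts` and `to_indices_and_counts` are hypothetical helpers, and the feature identifiers are made up.

```python
from collections import Counter


def fold_counts(sparse_counts, num_bits=2048):
    """Fold {feature_id: count} into num_bits slots, summing collisions."""
    folded = Counter()
    for feature_id, count in sparse_counts.items():
        folded[feature_id % num_bits] += count
    return folded


def to_indices_and_counts(folded):
    """Split a folded counter into parallel, index-sorted lists."""
    indices = sorted(folded)
    return indices, [folded[index] for index in indices]


# Made-up sparse feature identifiers; 5, 2053, and 4101 collide modulo 2048.
sparse = {5: 2, 2053: 1, 4101: 3, 9: 1}
indices, counts = to_indices_and_counts(fold_counts(sparse))
print(indices)      # [5, 9]
print(counts)       # [6, 1]
print(sum(counts))  # 7, the total count is unchanged by folding
```

Folding trades identity for a fixed size: colliding features become indistinguishable, but the total count is preserved.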

Inspect Morgan bit provenance:

# `mol` is an OEGraphMol prepared as in the first example.
result = oefp.morgan_fingerprint_with_mapping(mol)
print(result.fingerprint.popcount)
print(result.mapping.bit_info())
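
A bit-provenance mapping is often easier to use once inverted. The sketch below assumes the mapping has the RDKit-style shape `{bit: [(atom_index, radius), ...]}`. The actual return type of `bit_info()` is not shown above, so treat that shape as an assumption. `bits_for_atom` is a hypothetical helper and the data is invented.

```python
def bits_for_atom(bit_info):
    """Invert {bit: [(atom_index, radius), ...]} into {atom_index: [(bit, radius), ...]}."""
    inverse = {}
    for bit, environments in bit_info.items():
        for atom_index, radius in environments:
            inverse.setdefault(atom_index, []).append((bit, radius))
    return inverse


# Invented example data in the assumed RDKit-style shape.
bit_info = {
    33: [(0, 0)],
    807: [(1, 0), (1, 1)],
    1057: [(0, 1), (2, 0)],
}

by_atom = bits_for_atom(bit_info)
print(by_atom[0])  # [(33, 0), (1057, 1)]
print(by_atom[1])  # [(807, 0), (807, 1)]
```

The inverted view answers "which bits does this atom contribute to", which is the question usually asked when highlighting substructures.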

Import and export OpenEye fingerprints:

from openeye import oechem, oegraphsim
import oefp

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "CCO")

oe_fp = oegraphsim.OEFingerPrint()
oegraphsim.OEMakeCircularFP(oe_fp, mol)

fp = oefp.from_openeye_fingerprint(oe_fp)
round_tripped = oefp.to_openeye_fingerprint(fp)
print(oegraphsim.OETanimoto(oe_fp, round_tripped))

C++

#include <oefp/oefp.h>
#include <oechem.h>
#include <iostream>

int main() {
    OEChem::OEGraphMol mol_a;
    OEChem::OEGraphMol mol_b;
    OEChem::OESmilesToMol(mol_a, "c1ccccc1");
    OEChem::OESmilesToMol(mol_b, "c1ccc(O)cc1");

    OEFP::MorganGenerator generator;
    OEFP::OEFP fp_a = generator.Fingerprint(mol_a);
    OEFP::OEFP fp_b = generator.Fingerprint(mol_b);

    double score = OEFP::Compare(fp_a, fp_b, OEFP::Metric::Tanimoto());
    std::cout << score << "\n";

    return 0;
}

Supported Fingerprints

| Family | Outputs | Notes |
| --- | --- | --- |
| Morgan | Folded binary, folded count, sparse binary, sparse count | Bit mapping is available for all Morgan outputs |
| Atom Pair | Folded binary, folded count, sparse binary, sparse count | Count simulation is enabled by default for binary output |
| OpenEye | OEFingerPrint import/export | Numeric type metadata is preserved when available |
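
Count simulation lets a binary fingerprint retain coarse count information by giving each feature several bits, one per count threshold. The sketch below follows the general scheme used by RDKit-style generators. The thresholds (1, 2, 4, 8) and the slot layout are assumptions made for illustration, not a statement of what oefp does. `simulate_counts` is a hypothetical helper.

```python
from collections import Counter

# Assumed thresholds, chosen to mirror commonly used defaults.
COUNT_BOUNDS = (1, 2, 4, 8)


def simulate_counts(sparse_counts, num_bits=2048, bounds=COUNT_BOUNDS):
    """Return the set of on-bits for a count-simulated binary fingerprint."""
    if num_bits % len(bounds):
        raise ValueError("num_bits must be a multiple of the number of bounds")
    slots = num_bits // len(bounds)

    # Fold features into slots first so colliding features add up.
    folded = Counter()
    for feature_id, count in sparse_counts.items():
        folded[feature_id % slots] += count

    # Each slot owns len(bounds) consecutive bits, one per threshold reached.
    on_bits = set()
    for slot, count in folded.items():
        base = slot * len(bounds)
        for offset, bound in enumerate(bounds):
            if count >= bound:
                on_bits.add(base + offset)
    return on_bits


# Tiny 16-bit example: 4 slots of 4 bits each.
print(sorted(simulate_counts({1: 1, 6: 3, 11: 9}, num_bits=16)))
# [4, 8, 9, 12, 13, 14, 15]
```

The cost of this scheme is that the effective number of feature slots drops by a factor equal to the number of thresholds, so collisions become more likely at a given fingerprint size.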

The current conformance scope is deliberately limited: requesting Morgan chirality, Atom Pair chirality, or Atom Pair 3D-distance generation raises ValueError until those paths have dedicated RDKit parity coverage.

Installation

Install OpenEye Toolkits first:

pip install --extra-index-url https://pypi.anaconda.org/openeye/simple openeye-toolkits

Install OEFP:

pip install oefp

Build from Source

Set the OpenEye C++ SDK path:

export OPENEYE_ROOT=/path/to/openeye/sdk

Build the C++ library and Python bindings:

cmake --preset debug
cmake --build build-debug

Install the Python package in editable mode:

pip install --config-settings editable_mode=compat -e python/

The editable_mode=compat flag keeps the package on a traditional editable path that works with compiled SWIG extension modules.

Tests

C++ tests:

cmake --build build-debug --target oefp_tests
ctest --test-dir build-debug --output-on-failure

Python tests:

PYTHONPATH=python python -m pytest tests/python -q

RDKit is required for conformance tests but is not a runtime dependency.

Documentation

Build the Sphinx documentation:

python -m pip install -r docs/requirements.txt
make -C docs html

Open the local build:

open docs/_build/html/index.html

The documentation includes installation, quickstart, Python API notes, C++ API reference generation through Doxygen, and release build guidance.

Benchmarks

Run the RDKit generation and dense pdist benchmark:

PYTHONPATH=python python benchmarks/benchmark_rdkit_generation.py \
  --max-mols 1500 \
  --trials 7 \
  --warmup 1 \
  --pdist-size 400 \
  --generation-max-ratio 1.10 \
  --atom-pair-generation-max-ratio 1.10
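
The benchmark script itself is not reproduced here, so the following is only a sketch of how a ratio guardrail of this kind could work. It assumes that `--warmup` runs are discarded, that `--trials` timed runs are summarized by their median, and that the `--*-max-ratio` flags cap the candidate time divided by the baseline time. `median_seconds` and `within_ratio` are hypothetical helpers, not functions from the benchmark.

```python
import statistics
import time


def median_seconds(func, trials=7, warmup=1, clock=time.perf_counter):
    """Median wall-clock time of func over `trials` runs after `warmup` runs."""
    for _ in range(warmup):
        func()  # warm caches and lazy imports; not timed
    samples = []
    for _ in range(trials):
        start = clock()
        func()
        samples.append(clock() - start)
    return statistics.median(samples)


def within_ratio(candidate_seconds, baseline_seconds, max_ratio=1.10):
    """True if the candidate is at most max_ratio times the baseline."""
    if baseline_seconds <= 0:
        raise ValueError("baseline time must be positive")
    return candidate_seconds / baseline_seconds <= max_ratio


# Placeholder workloads standing in for the two generators being compared.
baseline = median_seconds(lambda: sum(range(20_000)))
candidate = median_seconds(lambda: sum(range(20_000)))
print(f"ratio: {candidate / baseline:.2f}")
```

Using the median rather than the mean keeps a single slow run, for example one interrupted by the scheduler, from failing the guardrail.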

Run the optional C++ guardrail against a local oecluster checkout:

cmake -S . -B build-bench \
  -DOEFP_BUILD_BENCHMARKS=ON \
  -DOEFP_OECLUSTER_SOURCE_DIR=/path/to/oecluster
cmake --build build-bench --target oefp_oecluster_fingerprint_benchmark
./build-bench/benchmarks/oefp_oecluster_fingerprint_benchmark 512 0 256

Tools

| Tool | Purpose |
| --- | --- |
| CMake | C++ build system |
| SWIG | Python bindings |
| scikit-build-core | Python wheel build backend |
| cmake-openeye | OpenEye CMake discovery and SWIG helpers |
| vrzn | Version synchronization |
| pytest | Python tests |
| Sphinx | Documentation |

License

MIT

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

| File | Size | Tags |
| --- | --- | --- |
| oefp-0.2.4-cp310-abi3-win_amd64.whl | 5.6 MB | CPython 3.10+, Windows x86-64 |
| oefp-0.2.4-cp310-abi3-manylinux_2_34_aarch64.whl | 1.0 MB | CPython 3.10+, manylinux: glibc 2.34+ ARM64 |
| oefp-0.2.4-cp310-abi3-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl | 995.0 kB | CPython 3.10+, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64 |
| oefp-0.2.4-cp310-abi3-macosx_15_0_arm64.whl | 349.9 kB | CPython 3.10+, macOS 15.0+ ARM64 |
