Package calculating the approximate graph edit distance between chemicals

ChemGED

ChemGED is a Python package for computing the approximate graph edit distance (GED) between chemicals. Computing the exact GED is an NP-hard problem, so ChemGED uses heuristics to approximate it in a reasonable time.

Installation

You can install ChemGED using pip:

pip install chemged

Usage

To use ChemGED, create an ApproximateChemicalGED object and call its compute_ged method with two chemical structures. Each structure can be a SMILES string or an RDKit Mol object.

from chemged import ApproximateChemicalGED
ged_calc = ApproximateChemicalGED()

chemical1 = "CCO"  # SMILES of the first chemical
chemical2 = "CCN"  # SMILES of the second chemical

# you can use SMILES strings directly
approx_ged = ged_calc.compute_ged(chemical1, chemical2)

# you can also use RDKit Mol objects
from rdkit.Chem import MolFromSmiles
approx_ged = ged_calc.compute_ged(MolFromSmiles(chemical1), MolFromSmiles(chemical2))

ChemGED also implements pdist and cdist functions to compute pairwise distances between sets of chemicals. Both return NumPy arrays.

[!NOTE] pdist returns the distances as a vector-form (condensed) distance vector, while cdist returns a square-form distance matrix. Use scipy.spatial.distance.squareform to convert the vector-form distance vector to a square-form distance matrix.

from chemged import pdist, cdist

# Create a list of chemicals
chemicals = ["CCO", "CCN", "CC", "C"]

# Compute pairwise distances
distances_vector = pdist(chemicals)
print(distances_vector)  # Vector form

# Compute all-vs-all distances
distances_matrix = cdist(chemicals, chemicals)
print(distances_matrix)  # Square form
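
The conversion mentioned in the note can be sketched as follows. To keep the example independent of ChemGED, it uses a made-up vector-form distance vector for four chemicals rather than real pdist output; the values are illustrative only, and the sketch assumes the vector follows SciPy's usual ordering of pairs.

```python
import numpy as np
from scipy.spatial.distance import squareform

# Hypothetical vector-form distances for 4 chemicals (values are made up).
# Assumed pair order, as in SciPy: (0,1), (0,2), (0,3), (1,2), (1,3), (2,3)
condensed = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Expand to a symmetric 4x4 matrix with zeros on the diagonal
square = squareform(condensed)
print(square)

# squareform also converts a square-form matrix back to vector form
back = squareform(square)
```

Here square[i, j] holds the distance between chemical i and chemical j, so the matrix can be indexed directly by position in the input list.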

Documentation

You can read more detailed documentation in the docs folder.

Implementation

The approach used here is based on bipartite graph matching[1], and most of its Python implementation is based on scripts from https://github.com/priba/aproximated_ged/tree/master. ChemGED uses RDKit to handle chemicals in Python.
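
To give a feel for the bipartite matching idea, here is a deliberately simplified sketch. It is not ChemGED's implementation: the function name bipartite_node_cost, the unit costs, and the use of plain atom-symbol lists are all assumptions made for illustration, and it only matches nodes, ignoring the edge (bond) costs that the full method in [1] folds into the cost matrix. It builds the square cost matrix of substitutions, deletions, and insertions and solves the assignment with scipy.optimize.linear_sum_assignment.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def bipartite_node_cost(labels1, labels2, sub_cost=1.0, indel_cost=1.0):
    """Node-only assignment cost between two labelled node lists.

    Illustrative sketch only: edge costs are ignored and the cost
    values are assumptions, not ChemGED's defaults.
    """
    n, m = len(labels1), len(labels2)
    big = 1e9  # finite stand-in for a forbidden assignment
    cost = np.zeros((n + m, n + m))

    # Upper-left (n x m): substitute node i of graph 1 by node j of graph 2
    for i, a in enumerate(labels1):
        for j, b in enumerate(labels2):
            cost[i, j] = 0.0 if a == b else sub_cost

    # Upper-right (n x n): deleting node i is only allowed on the diagonal
    cost[:n, m:] = big
    cost[np.arange(n), m + np.arange(n)] = indel_cost

    # Lower-left (m x m): inserting node j is only allowed on the diagonal
    cost[n:, :m] = big
    cost[n + np.arange(m), np.arange(m)] = indel_cost

    # Lower-right (m x n) stays zero: matching dummy to dummy is free

    rows, cols = linear_sum_assignment(cost)
    return float(cost[rows, cols].sum())


# Atom symbols of ethanol (CCO) and ethylamine (CCN), written out by hand
print(bipartite_node_cost(["C", "C", "O"], ["C", "C", "N"]))
```

Because the assignment is solved in polynomial time, this kind of matching trades exactness for speed, which is the trade-off the approximation relies on.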

References

[1] Riesen, Kaspar, and Horst Bunke. "Approximate graph edit distance computation by means of bipartite graph matching." Image and Vision Computing 27.7 (2009): 950-959.
