ChemGED

ChemGED is a Python package for computing the approximate graph edit distance (GED) between chemicals. Computing the exact GED is NP-hard, so ChemGED uses heuristics to approximate it in a reasonable time.

Installation

You can install ChemGED using pip:

pip install chemged

Usage

To use ChemGED, create an ApproximateChemicalGED object and call its compute_ged method with two chemical structures. Each structure can be given as a SMILES string or an RDKit Mol object.

from chemged import ApproximateChemicalGED
ged_calc = ApproximateChemicalGED()

chemical1 = "CCO"  # SMILES of the first chemical
chemical2 = "CCN"  # SMILES of the second chemical

# you can use SMILES strings directly
approx_ged = ged_calc.compute_ged(chemical1, chemical2)

# you can also use RDKit Mol objects
from rdkit.Chem import MolFromSmiles
approx_ged = ged_calc.compute_ged(MolFromSmiles(chemical1), MolFromSmiles(chemical2))

ChemGED also implements pdist and cdist functions to compute pairwise distances between sets of chemicals. Both return NumPy arrays.

Note: pdist returns the distances in condensed (vector) form, while cdist returns a square-form distance matrix. Use scipy.spatial.distance.squareform to convert the condensed vector into a square-form distance matrix.

from chemged import pdist, cdist

# Create a list of chemicals
chemicals = ["CCO", "CCN", "CC", "C"]

# Compute pairwise distances
distances_vector = pdist(chemicals)
print(distances_vector)  # Vector form

# Compute all-vs-all distances
distances_matrix = cdist(chemicals, chemicals)
print(distances_matrix)  # Square form
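
The conversion between the two forms can be sketched with SciPy alone. The distance values below are made up for illustration and are not real ChemGED output, and condensed_index is a hypothetical helper (not part of ChemGED) that follows the index convention documented for SciPy's own pdist.

```python
import numpy as np
from scipy.spatial.distance import squareform

# Made-up condensed distances for 4 items, ordered as the pairs
# (0,1), (0,2), (0,3), (1,2), (1,3), (2,3). Not real ChemGED output.
condensed = np.array([1.0, 2.0, 3.0, 2.0, 3.0, 1.0])

# Expand to a 4x4 symmetric matrix with a zero diagonal
square = squareform(condensed)


def condensed_index(i: int, j: int, n: int) -> int:
    """Hypothetical helper: position of pair (i, j), i < j, in the condensed vector."""
    return n * i + j - ((i + 2) * (i + 1)) // 2


# Look up the distance between items 1 and 2 in both forms
print(square[1, 2], condensed[condensed_index(1, 2, 4)])
```

Calling squareform on the square matrix converts it back to the condensed vector, so the same function works in both directions.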

Documentation

You can read more detailed documentation in the docs folder.

Implementation

The approach used here is based on bipartite graph matching[1], and most of its Python implementation is adapted from the scripts at https://github.com/priba/aproximated_ged/tree/master. ChemGED uses RDKit to handle chemicals in Python.
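
To give a feel for the bipartite matching idea in [1], here is a minimal sketch that is not ChemGED's actual code. It assumes unit costs, compares atoms by element symbol only, and ignores bonds entirely, whereas the method in [1] also folds local edge structure into the costs. The function name node_assignment_cost and its parameters are hypothetical. It builds the (n+m) x (n+m) cost matrix of substitutions, deletions, and insertions, and solves it with scipy.optimize.linear_sum_assignment.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def node_assignment_cost(labels_a, labels_b, sub_cost=1.0, indel_cost=1.0):
    """Hypothetical sketch: cheapest node-level assignment between two label lists."""
    n, m = len(labels_a), len(labels_b)
    big = 1e9  # large finite value standing in for "forbidden" assignments
    cost = np.zeros((n + m, n + m))

    # Upper-left (n x m): substituting node i of A with node j of B
    for i, a in enumerate(labels_a):
        for j, b in enumerate(labels_b):
            cost[i, j] = 0.0 if a == b else sub_cost

    # Upper-right (n x n): deleting node i of A is only allowed on the diagonal
    cost[:n, m:] = big
    cost[np.arange(n), m + np.arange(n)] = indel_cost

    # Lower-left (m x m): inserting node j of B is only allowed on the diagonal
    cost[n:, :m] = big
    cost[n + np.arange(m), np.arange(m)] = indel_cost

    # Lower-right (m x n) stays zero: pairing two dummy nodes costs nothing
    rows, cols = linear_sum_assignment(cost)
    return float(cost[rows, cols].sum())


# Atom symbols of ethanol (CCO) and ethylamine (CCN), written out by hand
print(node_assignment_cost(["C", "C", "O"], ["C", "C", "N"]))
```

Because the assignment problem can be solved in polynomial time, this kind of matching gives an approximation of the GED rather than the exact value.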

References

[1] Riesen, Kaspar, and Horst Bunke. "Approximate graph edit distance computation by means of bipartite graph matching." Image and Vision Computing 27.7 (2009): 950-959.
