ChemGED
ChemGED is a Python package for computing the approximate graph edit distance (GED) between chemicals. Computing the exact GED is an NP-hard problem, so ChemGED uses heuristics to approximate it in a reasonable time.
Installation
You can install ChemGED using pip:
pip install chemged
Usage
To use ChemGED, create an ApproximateChemicalGED object and call its
compute_ged method with two chemical structures, given either as SMILES
strings or as RDKit Mol objects.
from chemged import ApproximateChemicalGED
ged_calc = ApproximateChemicalGED()
chemical1 = "CCO" # SMILES of the first chemical
chemical2 = "CCN" # SMILES of the second chemical
# you can use SMILES strings directly
approx_ged = ged_calc.compute_ged(chemical1, chemical2)
# you can also use RDKit Mol objects
from rdkit.Chem import MolFromSmiles
approx_ged = ged_calc.compute_ged(MolFromSmiles(chemical1), MolFromSmiles(chemical2))
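A common next step is to rank a set of candidates against one query structure. The sketch below is a minimal, library-independent helper: nearest_neighbor and its distance parameter are hypothetical names introduced here for illustration, not part of ChemGED. Any callable that takes two structures and returns a number can be passed in.

```python
def nearest_neighbor(query, candidates, distance):
    """Return (candidate, distance) for the candidate closest to `query`.

    `distance` is any callable taking two structures and returning a number.
    This helper is illustrative and not part of ChemGED.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    # Score every candidate, then keep the one with the smallest distance.
    scored = [(distance(query, candidate), candidate) for candidate in candidates]
    best_distance, best_candidate = min(scored, key=lambda pair: pair[0])
    return best_candidate, best_distance
```

With ChemGED installed, passing ged_calc.compute_ged as the distance argument should give the candidate with the smallest approximate GED to the query.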
ChemGED also implements pdist and cdist functions to compute pairwise distances between
sets of chemicals. Both return NumPy arrays.
[!NOTE]
pdist will return the vector-form distance vector, while cdist will return a square-form distance matrix. scipy.spatial.distance.squareform can be used to convert the vector-form distance vector to a square-form distance matrix.
from chemged import pdist, cdist
# Create a list of chemicals
chemicals = ["CCO", "CCN", "CC", "C"]
# Compute pairwise distances
distances_vector = pdist(chemicals)
print(distances_vector) # Vector form
# Compute all-vs-all distances
distances_matrix = cdist(chemicals, chemicals)
print(distances_matrix) # Square form
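To make the relationship between the two output shapes concrete, here is a small pure-Python sketch of the conversion that scipy.spatial.distance.squareform performs on a vector-form result. It assumes the vector follows the SciPy ordering (the upper triangle read row by row: (0,1), (0,2), ..., (1,2), ...); that ChemGED's pdist uses this ordering is an assumption based on the note above. The function name condensed_to_square is hypothetical.

```python
import math


def condensed_to_square(condensed):
    """Expand a vector-form distance vector into a symmetric square matrix."""
    m = len(condensed)
    # A vector for n items holds n * (n - 1) / 2 distances; solve for n.
    n = (1 + math.isqrt(1 + 8 * m)) // 2
    if n * (n - 1) // 2 != m:
        raise ValueError("length is not a valid vector-form size")
    square = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Each distance fills two mirrored cells; the diagonal stays 0.
            square[i][j] = square[j][i] = condensed[k]
            k += 1
    return square
```

For four chemicals the vector has six entries, so the entry at position 0 is the distance between chemicals 0 and 1, and the entry at position 5 is the distance between chemicals 2 and 3.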
Documentation
You can read more detailed documentation in the docs folder.
Implementation
The approach used here is based on bipartite graph matching [1], and most of its Python implementation is adapted from the scripts at https://github.com/priba/aproximated_ged/tree/master. ChemGED uses RDKit to handle chemicals in Python.
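The idea behind bipartite matching for GED can be sketched as follows. This is a simplified illustration under stated assumptions, not ChemGED's actual code: graphs are plain atom-label lists plus edge pairs rather than RDKit molecules, the node costs (1 for a label mismatch, plus the difference in node degree as a rough stand-in for edge edits) are made up for the example, and the assignment is solved by exhaustive search instead of the Hungarian algorithm, so it only suits very small graphs. All function names are hypothetical.

```python
import math
from itertools import permutations


def degrees(n_nodes, edges):
    """Count how many edges touch each node."""
    deg = [0] * n_nodes
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


def build_cost_matrix(labels1, edges1, labels2, edges2):
    """Build the (n + m) x (n + m) matrix of node edit costs."""
    n, m = len(labels1), len(labels2)
    deg1, deg2 = degrees(n, edges1), degrees(m, edges2)
    # Start from zeros; the bottom-right block (empty-to-empty) stays 0.
    cost = [[0.0] * (n + m) for _ in range(n + m)]
    for i in range(n):
        for j in range(m):
            # Top-left block: substitute node i of graph 1 with node j of graph 2.
            cost[i][j] = (labels1[i] != labels2[j]) + abs(deg1[i] - deg2[j])
        for k in range(n):
            # Top-right block: deleting node i is only allowed on the diagonal.
            cost[i][m + k] = 1 + deg1[i] if k == i else math.inf
    for j in range(m):
        for k in range(m):
            # Bottom-left block: inserting node j is only allowed on the diagonal.
            cost[n + j][k] = 1 + deg2[j] if k == j else math.inf
    return cost


def approx_ged(labels1, edges1, labels2, edges2):
    """Cost of the cheapest row-to-column assignment (brute force)."""
    cost = build_cost_matrix(labels1, edges1, labels2, edges2)
    return min(
        sum(cost[row][col] for row, col in enumerate(perm))
        for perm in permutations(range(len(cost)))
    )


# Heavy-atom graphs for ethanol (CCO) and ethylamine (CCN).
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
ethylamine = (["C", "C", "N"], [(0, 1), (1, 2)])
print(approx_ged(*ethanol, *ethylamine))  # 1.0: one O -> N substitution
```

Because each node's cost only looks at its local neighbourhood, the result approximates the true edit distance rather than reproducing it exactly; in exchange, the assignment problem can be solved in polynomial time with a proper solver.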
References
[1] Riesen, Kaspar, and Horst Bunke. "Approximate graph edit distance computation by means of bipartite graph matching." Image and Vision Computing 27.7 (2009): 950-959.