chemap - Mapping chemical space
Library for computing molecular-fingerprint-based similarities and dimensionality-reduction-based visualizations of chemical space.
Fingerprint computations
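Fingerprints can be computed using generators from RDKit or scikit-fingerprints. The following example computes four folded fingerprints and returns them either as dense NumPy arrays or as SciPy CSR matrices: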
import numpy as np
import scipy.sparse as sp
from rdkit.Chem import rdFingerprintGenerator
from skfp.fingerprints import MAPFingerprint, AtomPairFingerprint
from chemap import compute_fingerprints, DatasetLoader, FingerprintConfig
ds_loader = DatasetLoader()
smiles = ds_loader.load("tests/data/smiles.csv")
# ----------------------------
# RDKit: Morgan (folded, dense)
# ----------------------------
morgan = rdFingerprintGenerator.GetMorganGenerator(radius=3, fpSize=4096)
X_morgan = compute_fingerprints(
    smiles,
    morgan,
    config=FingerprintConfig(
        count=False,
        folded=True,
        return_csr=False,  # dense numpy
        invalid_policy="raise",
    ),
)
print("RDKit Morgan:", X_morgan.shape, X_morgan.dtype)
# -----------------------------------
# RDKit: RDKitFP (folded, CSR sparse)
# -----------------------------------
rdkitfp = rdFingerprintGenerator.GetRDKitFPGenerator(fpSize=4096)
X_rdkitfp_csr = compute_fingerprints(
    smiles,
    rdkitfp,
    config=FingerprintConfig(
        count=False,
        folded=True,
        return_csr=True,  # SciPy CSR
        invalid_policy="raise",
    ),
)
assert sp.issparse(X_rdkitfp_csr)
print("RDKit RDKitFP (CSR):", X_rdkitfp_csr.shape, X_rdkitfp_csr.dtype, "nnz=", X_rdkitfp_csr.nnz)
# --------------------------------------------------
# scikit-fingerprints: MAPFingerprint (folded, dense)
# --------------------------------------------------
# MAPFingerprint is a MinHash-like fingerprint (different from MAP4 lib).
map_fp = MAPFingerprint(fp_size=4096, count=False, sparse=False)
X_map = compute_fingerprints(
    smiles,
    map_fp,
    config=FingerprintConfig(
        count=False,
        folded=True,
        return_csr=False,
        invalid_policy="raise",
    ),
)
print("skfp MAPFingerprint:", X_map.shape, X_map.dtype)
# ----------------------------------------------------
# scikit-fingerprints: AtomPairFingerprint (folded, CSR)
# ----------------------------------------------------
atom_pair = AtomPairFingerprint(fp_size=4096, count=False, sparse=False, use_3D=False)
X_ap_csr = compute_fingerprints(
    smiles,
    atom_pair,
    config=FingerprintConfig(
        count=False,
        folded=True,
        return_csr=True,
        invalid_policy="raise",
    ),
)
assert sp.issparse(X_ap_csr)
print("skfp AtomPair (CSR):", X_ap_csr.shape, X_ap_csr.dtype, "nnz=", X_ap_csr.nnz)
# (Optional) convert CSR -> dense if you need a NumPy array downstream:
X_ap = X_ap_csr.toarray().astype(np.float32, copy=False)
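Once a dense binary fingerprint matrix such as `X_morgan` is available, pairwise similarities can be derived from it. The sketch below computes Tanimoto (Jaccard) similarities with plain NumPy. It is only an illustration: `tanimoto_matrix` is a hypothetical helper written for this example and is not part of the chemap API, and the small `X_toy` array stands in for real fingerprints.

```python
import numpy as np


def tanimoto_matrix(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Pairwise Tanimoto (Jaccard) similarity between rows of two binary matrices.

    Hypothetical helper for illustration; not taken from chemap.
    """
    # Treat any non-zero entry as an "on" bit.
    A = (np.asarray(X) > 0).astype(np.float64)
    B = (np.asarray(Y) > 0).astype(np.float64)

    # Number of shared on-bits for every pair of rows.
    intersection = A @ B.T

    # |A| + |B| - |A and B| gives the size of the union of on-bits.
    counts_a = A.sum(axis=1)[:, None]
    counts_b = B.sum(axis=1)[None, :]
    union = counts_a + counts_b - intersection

    # Two all-zero fingerprints have an empty union; report 0.0 for those pairs.
    return np.divide(
        intersection,
        union,
        out=np.zeros_like(intersection),
        where=union > 0,
    )


# Toy stand-in for a dense binary fingerprint matrix such as X_morgan.
X_toy = np.array(
    [
        [1, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 0, 0, 0],
    ],
    dtype=np.uint8,
)

S = tanimoto_matrix(X_toy, X_toy)
print(S)
```

Returning 0.0 for pairs of empty fingerprints is a convention chosen for this sketch; other conventions (for example 1.0 or NaN) are equally defensible depending on the downstream use.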