Essential tools for analyzing the chemical space of metal complex ligands
cTopo
cTopo is a small Python package for analyzing the chemical space of multidentate ligands (and, optionally, coordination complexes). It focuses on concepts that matter for coordination chemistry—donor atoms, ligand skeletons, and reduced topologies—and provides tooling to:
- organize ligand datasets hierarchically (denticity → topology → skeleton → ligand),
- visualize these abstractions (SVG/SMILES),
- compute role-aware fingerprints using donor/skeleton/substituent atom typing,
- extract ligands from datasets of metal complexes.
The core idea: classic “organic” chemical space maps (2D embeddings of ECFP) overlook that ligand behavior in complexes is heavily controlled by the coordination cage formed by the metal and the ligand skeleton. cTopo makes that cage explicit and measurable.
Key concepts
Donor atoms
A ligand is defined by a set of donor atom indices (e.g., N/O/S/P). In cTopo, donors are either:
- explicitly provided (via ligand_from_mol(..., donor_atoms=...)), or
- marked in SMILES using atom-map numbers (ligand_from_smiles).
Skeleton
The ligand skeleton is the donor atoms plus the atoms that connect donors to each other—formally, the union of atoms on shortest paths between all donor pairs.
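The definition above can be sketched without RDKit on a plain adjacency mapping. This is an illustration only, not cTopo's implementation: the function names are hypothetical, and it assumes that "shortest paths" means all shortest paths between a donor pair, so both arcs of a symmetric ring are kept.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adjacency, start):
    """Return hop counts from `start` to every reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for nbr in adjacency[atom]:
            if nbr not in dist:
                dist[nbr] = dist[atom] + 1
                queue.append(nbr)
    return dist


def skeleton_atoms(adjacency, donors):
    """Union of atoms lying on any shortest path between two donors."""
    dist = {d: bfs_distances(adjacency, d) for d in donors}
    skeleton = set(donors)
    for a, b in combinations(sorted(donors), 2):
        if b not in dist[a]:
            continue  # the two donors sit in disconnected fragments
        for atom in adjacency:
            if atom not in dist[a] or atom not in dist[b]:
                continue
            # An atom is on a shortest a-b path iff the two legs add up exactly.
            if dist[a][atom] + dist[b][atom] == dist[a][b]:
                skeleton.add(atom)
    return skeleton


# Hand-written heavy-atom graph of N-methylethylenediamine: C0-N1-C2-C3-N4
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(sorted(skeleton_atoms(adjacency, donors={1, 4})))  # [1, 2, 3, 4]
```

The methyl carbon (atom 0) is not on any donor-to-donor path, so it falls outside the skeleton and would count as a substituent.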
Topology
The topology is a reduced representation of the skeleton where donor-to-donor paths are contracted to the shortest non-reducible form (removing purely “length” information while keeping branching/connection patterns).
For many denticities this produces only a few common topologies (e.g. for tridentates: “linear” vs “tripod”), which makes it ideal for dataset overview.
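One possible reading of this contraction, shown here as a rough sketch rather than cTopo's actual algorithm, is to repeatedly remove non-donor skeleton atoms that have exactly two neighbours and join those neighbours directly, stopping when a removal would duplicate an existing edge. The function names and the hand-written graphs below are hypothetical.

```python
def contract_paths(adjacency, donors):
    """Remove degree-2 non-donor atoms, joining their two neighbours."""
    adj = {atom: set(nbrs) for atom, nbrs in adjacency.items()}
    changed = True
    while changed:
        changed = False
        for atom in sorted(adj):
            if atom in donors or len(adj[atom]) != 2:
                continue
            left, right = sorted(adj[atom])
            if right in adj[left]:
                continue  # contraction would duplicate an edge (three-membered ring)
            adj[left].discard(atom)
            adj[right].discard(atom)
            adj[left].add(right)
            adj[right].add(left)
            del adj[atom]
            changed = True
            break  # restart the scan on the modified graph
    return adj


def degree_sequence(adj):
    return sorted(len(nbrs) for nbrs in adj.values())


# Linear tridentate skeleton: N0-C1-C2-N3-C4-C5-N6
linear = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
# Tripodal tridentate skeleton: central C0 with three C-N arms
tripod = {0: [1, 2, 3], 1: [0, 4], 2: [0, 5], 3: [0, 6], 4: [1], 5: [2], 6: [3]}

print(degree_sequence(contract_paths(linear, donors={0, 3, 6})))  # [1, 1, 2]
print(degree_sequence(contract_paths(tripod, donors={4, 5, 6})))  # [1, 1, 1, 3]
```

Under this reading the linear ligand collapses to a three-donor chain, while the tripod keeps its branching carbon, so the two topologies remain distinguishable after chain lengths are discarded.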
Installation
With pip (from PyPI)
pip install ctopo
RDKit note
ctopo depends on RDKit and NetworkX. If RDKit isn’t already available in your environment, the most reliable route is Conda:
conda create -n ctopo python=3.10 -y
conda activate ctopo
conda install -c conda-forge rdkit networkx -y
pip install ctopo
Development install
pip install -e .
Quickstart
1) Build a ligand (donors from atom-map numbers)
ligand_from_smiles treats atoms with :1, :2, … as donors.
from ctopo import ligand_from_smiles
# Ethylenediamine (bidentate) with both nitrogens marked as donors
lig = ligand_from_smiles("[NH2:1]CC[NH2:2]")
print(lig.denticity) # 2
print(sorted(lig.donor_atoms)) # donor atom indices in the RDKit molecule
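To make the marking convention concrete, the atom-map numbers can be read straight from the SMILES text. The helper below is a hypothetical, text-level sketch: it only lists the map numbers found in bracket atoms and does not reproduce the RDKit atom indices that cTopo reports.

```python
import re

# A bracket atom that ends in ":<digits>]" carries an atom-map number.
ATOM_MAP = re.compile(r"\[[^\]]*?:(\d+)\]")


def donor_map_numbers(smiles):
    """Return the atom-map numbers found in a SMILES string, in order."""
    return [int(number) for number in ATOM_MAP.findall(smiles)]


print(donor_map_numbers("[NH2:1]CC[NH2:2]"))          # [1, 2]
print(donor_map_numbers("[NH2:1]CC[NH:2]CC[NH2:3]"))  # [1, 2, 3]
```

The length of the returned list matches the number of marked donors, which is what lig.denticity is expected to report for the same input.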
2) Visualize ligand / skeleton / topology (SVG)
from ctopo import ligand_from_smiles
lig = ligand_from_smiles("[NH2:1]CC[NH:2]CC[NH2:3]") # diethylenetriamine (example)
v_lig = lig.visualize_ligand()
v_skel = lig.visualize_skeleton()
v_topo = lig.visualize_topology()
# In a notebook you can do:
# from IPython.display import SVG, display
# display(SVG(v_topo.svg))
Each visualize_*() returns a simple container with:
- smiles (for that abstraction),
- svg (a depiction suitable for HTML reports or dataset browsers).
Dataset “chemical space” as a hierarchy
Build a tree grouped by abstraction levels and export it to a single self-contained HTML.
from pathlib import Path
from ctopo import ligand_from_smiles
from ctopo.trees import build_ligand_tree, tree_to_html
ligands = [
ligand_from_smiles("[NH2:1]CC[NH2:2]"),
ligand_from_smiles("[NH2:1]CC[NH:2]CC[NH2:3]"),
ligand_from_smiles("[O-:1]C(=O)CC(=O)[O-:2]"), # example bidentate carboxylate
]
tree = build_ligand_tree(ligands)
html = tree_to_html(tree)
Path("ligand_tree.html").write_text(html, encoding="utf-8")
print("Wrote ligand_tree.html")
Default grouping levels are:
denticity → topology → skeleton → skeleton+bonds → skeleton+donors+bonds → ligand
This is designed to answer: “What does my dataset actually contain?” in a coordination-chemistry-relevant way.
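The grouping idea itself is simple nested bucketing. The sketch below is not build_ligand_tree; it uses hand-written records with hypothetical label strings for the topology and skeleton levels, only to show how a level list turns a flat dataset into a tree.

```python
def build_tree(records, levels):
    """Nest records by the given keys; leaves are lists of ligand names."""
    tree = {}
    for record in records:
        node = tree
        for level in levels[:-1]:
            node = node.setdefault(record[level], {})
        node.setdefault(record[levels[-1]], []).append(record["name"])
    return tree


# Hand-written example records; the labels are illustrative, not cTopo output.
records = [
    {"name": "ethylenediamine", "denticity": 2, "topology": "chain", "skeleton": "NCCN"},
    {"name": "N-methylethylenediamine", "denticity": 2, "topology": "chain", "skeleton": "NCCN"},
    {"name": "1,3-diaminopropane", "denticity": 2, "topology": "chain", "skeleton": "NCCCN"},
    {"name": "diethylenetriamine", "denticity": 3, "topology": "linear", "skeleton": "NCCNCCN"},
]

tree = build_tree(records, levels=["denticity", "topology", "skeleton"])
print(tree[2]["chain"]["NCCN"])  # ['ethylenediamine', 'N-methylethylenediamine']
```

Ligands that differ only in substituents land in the same skeleton bucket, which is the behaviour the hierarchy is meant to expose.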
Complexes: encoding convention (dative bonds)
cTopo can also construct a Complex if coordination is represented using RDKit dative bonds:
- coordination bonds must be dative,
- each dative bond must be oriented donor → metal (metal is the end atom),
- a metal center must not have non-dative bonds to non-metals (metal–metal bonds may exist but must be non-dative).
from ctopo import complex_from_smiles
# Minimal sketch example (exact SMILES depends on your RDKit encoding)
# Donor -> metal must be a dative bond.
cx = complex_from_smiles("[NH3]->[Cu+2]<-[NH3]")
print(cx.metal_atoms)
print(cx.donor_atoms)
If you already have an RDKit molecule and metal indices, you can also build directly via complex_from_mol(mol, metal_atoms=...).
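Before building complexes from a large dataset, it can help to check the three encoding rules listed above. The sketch below works on a simplified bond list of (begin, end, kind) tuples rather than an RDKit molecule; the function name and data layout are assumptions made for illustration.

```python
def encoding_problems(bonds, metals):
    """Return human-readable violations of the dative-bond convention."""
    problems = []
    for begin, end, kind in bonds:
        begin_metal, end_metal = begin in metals, end in metals
        if not (begin_metal or end_metal):
            continue  # purely organic bond: the rules do not constrain it
        if begin_metal and end_metal:
            if kind == "dative":
                problems.append(f"metal-metal bond {begin}-{end} must be non-dative")
        elif kind != "dative":
            problems.append(f"metal-ligand bond {begin}-{end} must be dative")
        elif begin_metal:
            problems.append(f"dative bond {begin}->{end} must point donor -> metal")
    return problems


# Atoms 0 and 2 are nitrogen donors, atom 1 is the metal.
good = [(0, 1, "dative"), (2, 1, "dative")]
bad = [(1, 0, "dative"), (2, 1, "single")]

print(encoding_problems(good, metals={1}))  # []
for problem in encoding_problems(bad, metals={1}):
    print(problem)
```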
Role-aware fingerprints (donor / skeleton / substituent)
cTopo assigns each atom an AtomType such as:
- DONOR
- SKELETON
- SUBSTITUENT
- (and for complexes: CENTER for metals)
You can then compute fingerprints that focus on what matters (e.g. skeleton-only with skeleton bond types preserved).
from ctopo import ligand_from_smiles
from ctopo.descriptors import MorganSpec, make_fingerprinter, DEFAULT_PROPERTIES
from ctopo.distances import tanimoto_similarity_bits
lig1 = ligand_from_smiles("[NH2:1]CC[NH2:2]")
lig2 = ligand_from_smiles("[NH2:1]CCC[NH2:2]")
fp = make_fingerprinter(
kind="morgan",
spec=MorganSpec(radius=2, use_chirality=False),
atomic_properties=DEFAULT_PROPERTIES,
graph_view="skeleton", # focus on the cage-defining part
bond_mode="skeleton_only", # keep bond types only in skeleton
output="bits",
fp_size=2048,
)
f1 = fp(lig1)
f2 = fp(lig2)
print(tanimoto_similarity_bits(f1, f2))
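For reference, the Tanimoto similarity of two bit fingerprints is the number of shared on-bits divided by the number of bits set in either. The sketch below is the textbook formula applied to sets of on-bit positions; it is not taken from ctopo.distances, and returning 1.0 for two empty fingerprints is a convention chosen for this example.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity of two collections of on-bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 1.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


print(tanimoto({1, 2, 3}, {2, 3, 4}))  # 0.5
print(tanimoto({1, 2}, {1, 2}))        # 1.0
```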
Extract ligands from complexes
To go from a complex dataset to unique ligands:
from ctopo import complex_from_smiles
from ctopo.fragments import ligands_from_complex
cx = complex_from_smiles("[NH3]->[Cu+2]<-[NH3]") # example sketch
ligs = ligands_from_complex(cx)
print(len(ligs))
Note: bridging ligands are not handled specially in v1; removing metals may split a bridging ligand into multiple fragments.
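The extraction step can be pictured as deleting the metal atoms and collecting the connected pieces that remain. Whether cTopo does exactly this internally is an assumption; the function below is a hypothetical, RDKit-free sketch on a hand-written adjacency mapping.

```python
def ligand_fragments(adjacency, metals):
    """Connected components of the molecular graph after removing metals."""
    remaining = set(adjacency) - set(metals)
    seen = set()
    fragments = []
    for start in sorted(remaining):
        if start in seen:
            continue
        fragment = set()
        stack = [start]
        while stack:
            atom = stack.pop()
            if atom in fragment:
                continue
            fragment.add(atom)
            # Follow bonds only to non-metal atoms not yet collected.
            stack.extend(n for n in adjacency[atom] if n in remaining and n not in fragment)
        seen |= fragment
        fragments.append(fragment)
    return fragments


# Cu (atom 4) bound to ethylenediamine (N0-C1-C2-N3) and ammonia (N5)
adjacency = {
    0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4],
    4: [0, 3, 5], 5: [4],
}
print(ligand_fragments(adjacency, metals={4}))  # [{0, 1, 2, 3}, {5}]
```

Each fragment corresponds to one ligand, and the atoms that were bonded to a removed metal are its donors.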
Project status
cTopo is a research-oriented library aimed at coordination-chemistry workflows. The API is compact and intended to support:
- ligand dataset browsing and reporting,
- reproducible topology/skeleton extraction,
- feature engineering for ML on complexes/ligands.
License
MIT License (see LICENSE.md).
Citing
If you use cTopo in academic work, please cite the associated paper (citation details to be added once finalized).