Essential tools for analyzing the chemical space of metal complex ligands

cTopo

cTopo is a small Python package for analyzing the chemical space of multidentate ligands (and, optionally, coordination complexes). It focuses on concepts that matter for coordination chemistry—donor atoms, ligand skeletons, and reduced topologies—and provides tooling to:

  • organize ligand datasets hierarchically (denticity → topology → skeleton → ligand),
  • visualize these abstractions (SVG/SMILES),
  • compute role-aware fingerprints using donor/skeleton/substituent atom typing,
  • extract ligands from datasets of metal complexes.

The core idea: classic “organic” chemical space maps (for example, 2D embeddings of ECFP fingerprints) treat a ligand as a free molecule, whereas ligand behavior in complexes is largely controlled by the coordination cage formed by the metal and the ligand skeleton. cTopo makes that cage explicit and measurable.


Key concepts

Donor atoms

A ligand is a molecule together with a set of donor atom indices, i.e. the atoms that bind the metal (typically N, O, S, or P). In cTopo, donors are either:

  • explicitly provided (via ligand_from_mol(..., donor_atoms=...)), or
  • marked in SMILES using atom-map numbers (ligand_from_smiles).
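
As a rough illustration of the atom-map convention, the sketch below pulls the marked atoms out of a SMILES string with a regular expression. This is not how cTopo or RDKit parse SMILES: the helper name `marked_donors` is hypothetical, and the pattern deliberately ignores isotopes, chirality tags, and other bracket-atom features.

```python
import re

# Matches a bracket atom that carries an atom-map number, e.g. [NH2:1] or [O-:2].
# Simplified on purpose: no isotope prefix, no chirality tags.
_MAPPED_ATOM = re.compile(r"\[([A-Za-z][a-z]?)[^\]:]*:(\d+)\]")


def marked_donors(smiles: str) -> list[tuple[int, str]]:
    """Return (map number, element symbol) for each mapped bracket atom."""
    return [(int(num), sym) for sym, num in _MAPPED_ATOM.findall(smiles)]


print(marked_donors("[NH2:1]CC[NH2:2]"))         # ethylenediamine
print(marked_donors("[O-:1]C(=O)CC(=O)[O-:2]"))  # malonate
```

For real work, let `ligand_from_smiles` do the parsing; the regex only shows which atoms the `:n` marks select.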

Skeleton

The ligand skeleton is the donor atoms plus the atoms that connect donors to each other—formally, the union of atoms on shortest paths between all donor pairs.
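
The definition above can be sketched in plain Python on an adjacency dictionary. This is an illustrative reading, not cTopo's implementation: it assumes that every shortest path between a donor pair counts (not just one of them), and the function names are made up for this example.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, start):
    """Breadth-first distances (in bonds) from start to every reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def skeleton_atoms(adj, donors):
    """Union of atoms lying on any shortest path between any two donors."""
    dist = {d: bfs_distances(adj, d) for d in donors}
    keep = set(donors)
    for a, b in combinations(sorted(donors), 2):
        if b not in dist[a]:
            continue  # donors in disconnected fragments share no path
        for atom in adj:
            if atom not in dist[a] or atom not in dist[b]:
                continue
            # atom is on a shortest a-b path iff the two legs add up exactly
            if dist[a][atom] + dist[b][atom] == dist[a][b]:
                keep.add(atom)
    return keep


# Heavy-atom graph of diethylenetriamine with an extra methyl (atom 7) on atom 1:
# N0-C1-C2-N3-C4-C5-N6, plus C1-C7
adj = {0: {1}, 1: {0, 2, 7}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5}, 7: {1}}
print(sorted(skeleton_atoms(adj, donors={0, 3, 6})))  # [0, 1, 2, 3, 4, 5, 6]
```

The methyl carbon (atom 7) is dropped because it lies on no shortest donor-to-donor path; it would be classed as a substituent.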

Topology

The topology is a reduced representation of the skeleton where donor-to-donor paths are contracted to the shortest non-reducible form (removing purely “length” information while keeping branching/connection patterns).

For many denticities this produces only a few common topologies (e.g. for tridentates: “linear” vs “tripod”), which makes it ideal for dataset overview.
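
One plausible way to picture this reduction is to contract every non-donor atom that has exactly two neighbors, repeating until nothing changes. The sketch below does that on a plain adjacency dictionary. It is an assumption about the general idea, not cTopo's actual rule (which may treat rings or bond orders differently), and `reduce_topology` is a hypothetical name.

```python
def reduce_topology(adj, donors):
    """Contract non-donor atoms of degree 2 until none can be removed."""
    g = {u: set(nbrs) for u, nbrs in adj.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        for v in sorted(g):
            if v in donors or len(g[v]) != 2:
                continue
            a, b = sorted(g[v])
            if b in g[a]:
                continue  # contracting would duplicate an existing bond
            g[a].discard(v)
            g[b].discard(v)
            g[a].add(b)
            g[b].add(a)
            del g[v]
            changed = True
            break  # restart the scan on the modified graph
    return g


def edges(g):
    """Sorted edge list of an undirected adjacency dictionary."""
    return sorted((u, v) for u in g for v in g[u] if u < v)


# "Linear" tridentate skeleton: N0-C1-C2-N3-C4-C5-N6
linear = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5}}
print(edges(reduce_topology(linear, donors={0, 3, 6})))  # [(0, 3), (3, 6)]

# "Tripod" tridentate skeleton: central C0 with three C-N arms
tripod = {0: {1, 3, 5}, 1: {0, 2}, 2: {1}, 3: {0, 4}, 4: {3}, 5: {0, 6}, 6: {5}}
print(edges(reduce_topology(tripod, donors={2, 4, 6})))  # [(0, 2), (0, 4), (0, 6)]
```

In the linear case the donors end up chained directly to each other; in the tripod case a single branching atom survives because it has three neighbors.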


Installation

With pip (from PyPI)

pip install ctopo

RDKit note

ctopo depends on RDKit and NetworkX. If RDKit isn’t already available in your environment, the most reliable route is Conda:

conda create -n ctopo python=3.10 -y
conda activate ctopo
conda install -c conda-forge rdkit networkx -y
pip install ctopo
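
Before installing, it can help to check which of the two dependencies are already importable in the active environment. The snippet below is a small standard-library sketch; `missing_dependencies` is a hypothetical helper, not part of ctopo.

```python
from importlib.util import find_spec


def missing_dependencies(names=("rdkit", "networkx")):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_dependencies()
print("missing:", ", ".join(missing) if missing else "none")
```

If `rdkit` shows up as missing, the Conda route above is likely the simpler fix.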

Development install

pip install -e .

Quickstart

1) Build a ligand (donors from atom-map numbers)

ligand_from_smiles treats atoms that carry atom-map numbers (:1, :2, …) as donors.

from ctopo import ligand_from_smiles

# Ethylenediamine (bidentate) with both nitrogens marked as donors
lig = ligand_from_smiles("[NH2:1]CC[NH2:2]")

print(lig.denticity)           # 2
print(sorted(lig.donor_atoms)) # donor atom indices in the RDKit molecule

2) Visualize ligand / skeleton / topology (SVG)

from ctopo import ligand_from_smiles

lig = ligand_from_smiles("[NH2:1]CC[NH:2]CC[NH2:3]")  # diethylenetriamine (example)

v_lig = lig.visualize_ligand()
v_skel = lig.visualize_skeleton()
v_topo = lig.visualize_topology()

# In a notebook you can do:
# from IPython.display import SVG, display
# display(SVG(v_topo.svg))

Each visualize_*() returns a simple container with:

  • smiles (for that abstraction),
  • svg (a depiction suitable for HTML reports or dataset browsers).
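
Because the container exposes just those two attributes, embedding a depiction in a report takes only a few lines. The sketch below uses a `SimpleNamespace` with placeholder values as a stand-in for the returned object, so it runs without cTopo; `view_to_html` is a hypothetical helper.

```python
from html import escape
from types import SimpleNamespace


def view_to_html(view, title):
    """Wrap an object exposing .smiles and .svg in a small HTML fragment."""
    return (
        "<figure>\n"
        f"{view.svg}\n"
        f"<figcaption>{escape(title)}: <code>{escape(view.smiles)}</code></figcaption>\n"
        "</figure>"
    )


# Stand-in for the result of a visualize_*() call; both values are placeholders.
fake_view = SimpleNamespace(
    smiles="NNN",
    svg='<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10"></svg>',
)
print(view_to_html(fake_view, "topology"))
```

The SVG is inserted verbatim while the SMILES and title are escaped, since SMILES strings can contain characters such as `<` or `&` in some dialects.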

Dataset “chemical space” as a hierarchy

Build a tree grouped by abstraction levels and export it to a single self-contained HTML file.

from pathlib import Path
from ctopo import ligand_from_smiles
from ctopo.trees import build_ligand_tree, tree_to_html

ligands = [
    ligand_from_smiles("[NH2:1]CC[NH2:2]"),
    ligand_from_smiles("[NH2:1]CC[NH:2]CC[NH2:3]"),
    ligand_from_smiles("[O-:1]C(=O)CC(=O)[O-:2]"),  # malonate (bidentate dicarboxylate)
]

tree = build_ligand_tree(ligands)
html = tree_to_html(tree)

Path("ligand_tree.html").write_text(html, encoding="utf-8")
print("Wrote ligand_tree.html")

Default grouping levels are:

denticity → topology → skeleton → skeleton+bonds → skeleton+donors+bonds → ligand

This is designed to answer: “What does my dataset actually contain?” in a coordination-chemistry-relevant way.
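
The grouping idea itself is simple nested bucketing. The sketch below shows it on hand-written records with only three levels; the labels ("chain", "linear") and the `build_tree` helper are invented for illustration and do not reflect the keys or data structures that `build_ligand_tree` actually uses.

```python
from collections import defaultdict


def build_tree(records, levels):
    """Group records into nested dicts keyed by each level in turn."""
    if not levels:
        return list(records)
    groups = defaultdict(list)
    for rec in records:
        groups[rec[levels[0]]].append(rec)
    return {key: build_tree(members, levels[1:]) for key, members in groups.items()}


# Hand-written labels; in practice they would be derived from the ligand objects.
records = [
    {"name": "ethylenediamine", "denticity": 2, "topology": "chain", "skeleton": "NCCN"},
    {"name": "1,3-diaminopropane", "denticity": 2, "topology": "chain", "skeleton": "NCCCN"},
    {"name": "diethylenetriamine", "denticity": 3, "topology": "linear", "skeleton": "NCCNCCN"},
]

tree = build_tree(records, ["denticity", "topology", "skeleton"])
for denticity, topologies in tree.items():
    for topology, skeletons in topologies.items():
        print(denticity, topology, sorted(skeletons))
```

Each extra level in the list adds one more layer of nesting, which is why coarse keys (denticity, topology) come first and fine-grained ones last.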


Complexes: encoding convention (dative bonds)

cTopo can also construct a Complex if coordination is represented using RDKit dative bonds:

  • coordination bonds must be dative,
  • each dative bond must be oriented donor → metal (metal is the end atom),
  • a metal center must not have non-dative bonds to non-metals (metal–metal bonds may exist but must be non-dative).

from ctopo import complex_from_smiles

# Minimal sketch example (exact SMILES depends on your RDKit encoding)
# Donor -> metal must be a dative bond.
cx = complex_from_smiles("[NH3]->[Cu+2]<-[NH3]")

print(cx.metal_atoms)
print(cx.donor_atoms)

If you already have an RDKit molecule and metal indices, you can also build directly via complex_from_mol(mol, metal_atoms=...).
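
The encoding rules in this section are easy to check on a plain bond list before handing data to cTopo. The sketch below is a standalone illustration, not the library's own validation: the `Bond` tuple and `convention_violations` function are hypothetical, and a dative bond is assumed to be stored as (donor, acceptor).

```python
from typing import NamedTuple


class Bond(NamedTuple):
    begin: int    # for a dative bond: the donating atom
    end: int      # for a dative bond: the accepting atom
    dative: bool


def convention_violations(bonds, metals):
    """List human-readable violations of the donor -> metal dative convention."""
    problems = []
    for bond in bonds:
        begin_is_metal = bond.begin in metals
        end_is_metal = bond.end in metals
        if bond.dative:
            if begin_is_metal and end_is_metal:
                problems.append(f"{bond.begin}-{bond.end}: metal-metal bond must be non-dative")
            elif begin_is_metal:
                problems.append(f"{bond.begin}-{bond.end}: dative bond points away from the metal")
        elif begin_is_metal != end_is_metal:
            problems.append(f"{bond.begin}-{bond.end}: metal bonded to a non-metal must use a dative bond")
    return problems


# [NH3]->[Cu+2]<-[NH3] with atoms N0, Cu1, N2: both bonds run donor -> metal.
good = [Bond(0, 1, True), Bond(2, 1, True)]
print(convention_violations(good, metals={1}))  # []

# One reversed dative bond and one ordinary single bond to the metal.
bad = [Bond(1, 0, True), Bond(2, 1, False)]
for problem in convention_violations(bad, metals={1}):
    print(problem)
```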


Role-aware fingerprints (donor / skeleton / substituent)

cTopo assigns each atom an AtomType such as:

  • DONOR
  • SKELETON
  • SUBSTITUENT (and for complexes: CENTER for metals)
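
The role assignment can be pictured as a simple priority rule: metal first, then donor, then skeleton, and everything else is a substituent. The sketch below uses its own `Role` enum and an `assign_roles` helper, both hypothetical; cTopo's `AtomType` may be defined and computed differently.

```python
from enum import Enum


class Role(Enum):
    CENTER = "center"
    DONOR = "donor"
    SKELETON = "skeleton"
    SUBSTITUENT = "substituent"


def assign_roles(atoms, donors, skeleton, metals=frozenset()):
    """Map each atom index to a role, checking the most specific role first."""
    roles = {}
    for atom in atoms:
        if atom in metals:
            roles[atom] = Role.CENTER
        elif atom in donors:
            roles[atom] = Role.DONOR
        elif atom in skeleton:
            roles[atom] = Role.SKELETON
        else:
            roles[atom] = Role.SUBSTITUENT
    return roles


# Diethylenetriamine N0-C1-C2-N3-C4-C5-N6 with a methyl group (atom 7) on C1.
roles = assign_roles(
    atoms=range(8),
    donors={0, 3, 6},
    skeleton={0, 1, 2, 3, 4, 5, 6},
)
for atom, role in roles.items():
    print(atom, role.name)
```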

You can then compute fingerprints that focus on what matters (e.g. skeleton-only with skeleton bond types preserved).

from ctopo import ligand_from_smiles
from ctopo.descriptors import MorganSpec, make_fingerprinter, DEFAULT_PROPERTIES
from ctopo.distances import tanimoto_similarity_bits

lig1 = ligand_from_smiles("[NH2:1]CC[NH2:2]")
lig2 = ligand_from_smiles("[NH2:1]CCC[NH2:2]")

fp = make_fingerprinter(
    kind="morgan",
    spec=MorganSpec(radius=2, use_chirality=False),
    atomic_properties=DEFAULT_PROPERTIES,
    graph_view="skeleton",        # focus on the cage-defining part
    bond_mode="skeleton_only",    # keep bond types only in skeleton
    output="bits",
    fp_size=2048,
)

f1 = fp(lig1)
f2 = fp(lig2)

print(tanimoto_similarity_bits(f1, f2))
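
For reference, Tanimoto similarity on bit fingerprints is the number of bits set in both divided by the number set in either. The sketch below computes it on plain sets of on-bit indices. It is a generic illustration; the input type that `tanimoto_similarity_bits` expects is not shown here, and the empty-fingerprint convention is a choice made for this example.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity of two collections of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 1.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# 2 shared bits out of 5 distinct bits
print(tanimoto({1, 5, 9, 42}, {1, 5, 7}))  # 0.4
```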

Extract ligands from complexes

To go from a complex dataset to unique ligands:

from ctopo import complex_from_smiles
from ctopo.fragments import ligands_from_complex

cx = complex_from_smiles("[NH3]->[Cu+2]<-[NH3]")  # example sketch
ligs = ligands_from_complex(cx)

print(len(ligs))

Note: bridging ligands are not handled specially in v1; removing metals may split a bridging ligand into multiple fragments.
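
Conceptually, extraction amounts to deleting the metal atoms from the molecular graph and collecting the connected pieces that remain. The sketch below shows that step on an adjacency dictionary. It is a simplified picture, not the code behind `ligands_from_complex`, and it omits deduplication of identical ligands.

```python
def ligand_fragments(adj, metals):
    """Connected components of the graph once metal atoms are deleted."""
    seen = set(metals)  # treating metals as visited removes them from the search
    fragments = []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, component = [start], set()
        seen.add(start)
        while stack:
            u = stack.pop()
            component.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        fragments.append(component)
    return fragments


# Cu0 bound to ethylenediamine (N1-C2-C3-N4) and one ammonia (N5).
adj = {0: {1, 4, 5}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {0, 3}, 5: {0}}
print(ligand_fragments(adj, metals={0}))  # [{1, 2, 3, 4}, {5}]
```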


Project status

cTopo is a research-oriented library aimed at coordination-chemistry workflows. The API is compact and intended to support:

  • ligand dataset browsing and reporting,
  • reproducible topology/skeleton extraction,
  • feature engineering for ML on complexes/ligands.

License

MIT License (see LICENSE.md).


Citing

If you use cTopo in academic work, please cite the associated paper (citation details to be added once finalized).
