SMSD -- Substructure & MCS search for chemical graphs

SMSD Python Bindings

Python interface for SMSD (Small Molecule Substructure Detector) -- a high-performance library for substructure search, Maximum Common Substructure (MCS), and molecular similarity.

Features

  • SMILES parsing and writing -- built-in OpenSMILES parser, no RDKit or CDK required
  • Substructure search -- VF2++ subgraph isomorphism
  • MCS search -- McSplit with seed-and-extend, orbit pruning, coverage-driven termination
  • Tautomer-aware matching -- keto/enol, amide, imidazole tautomer equivalence
  • RASCAL screening -- O(V+E) Tanimoto-like similarity upper bound
  • Fingerprints -- path-based and MCS-aware fingerprints for pre-screening

Requirements

  • Python >= 3.8
  • C++17 compiler (GCC 7+, Clang 5+, MSVC 2019+)
  • CMake >= 3.15
  • pybind11 >= 2.12

Installation

From source (recommended)

cd python/
pip install .

Development install

pip install -e ".[dev]"

Build with a specific compiler

CMAKE_ARGS="-DCMAKE_CXX_COMPILER=g++-13" pip install .

Quick Start

from smsd import parse_smiles, find_mcs, is_substructure, similarity

# Parse SMILES strings
benzene = parse_smiles("c1ccccc1")
phenol  = parse_smiles("c1ccc(O)cc1")

# Substructure search
assert is_substructure(benzene, phenol)  # benzene is in phenol

# Maximum Common Substructure
mcs = find_mcs(benzene, phenol)
print(f"MCS size: {len(mcs)} atoms")  # 6

# Similarity
sim = similarity(benzene, phenol)
print(f"Similarity: {sim:.3f}")  # ~0.857
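The ~0.857 in the comment above is consistent with a Tanimoto coefficient over heavy-atom counts: benzene has 6 atoms, phenol has 7, and their MCS has 6. The plain-Python sketch below reproduces that arithmetic. It assumes this atom-count definition, which the text does not confirm, and `atom_tanimoto` is a hypothetical helper for illustration, not part of smsd.

```python
def atom_tanimoto(n_a: int, n_b: int, n_common: int) -> float:
    """Tanimoto coefficient over atom counts: common / (n_a + n_b - common).

    Hypothetical helper for illustration only; not an smsd function.
    """
    if n_common < 0 or n_common > min(n_a, n_b):
        raise ValueError("common part must fit inside both molecules")
    union = n_a + n_b - n_common
    # Two empty molecules are treated as identical by convention here.
    return n_common / union if union else 1.0


# benzene (6 heavy atoms) vs phenol (7 heavy atoms), MCS of 6 atoms
print(f"{atom_tanimoto(6, 7, 6):.3f}")  # 0.857
```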

API Reference

SMILES Parsing

from smsd import parse_smiles, to_smiles

mol = parse_smiles("c1ccccc1")
print(mol.n)              # 6 atoms
print(mol.atomic_num)     # [6, 6, 6, 6, 6, 6]
print(mol.aromatic)       # [True, True, True, True, True, True]

smi = to_smiles(mol)      # canonical SMILES string

Substructure Search

from smsd import parse_smiles, is_substructure, find_substructure, ChemOptions

query  = parse_smiles("c1ccccc1")
target = parse_smiles("c1ccc(O)cc1")

# Boolean check
if is_substructure(query, target):
    print("Query is a substructure of target")

# Get atom mapping
mapping = find_substructure(query, target)
# Returns list of (query_atom, target_atom) pairs
for qi, ti in mapping:
    print(f"  query atom {qi} -> target atom {ti}")

# Custom options
opts = ChemOptions()
opts.ring_matches_ring_only = True
is_substructure(query, target, opts=opts, timeout_ms=5000)
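A list of (query_atom, target_atom) pairs is easy to post-process in plain Python, for example to find which target atoms the query did not cover. The sketch below assumes only the pair-list shape described above. The helper name and the hand-written mapping are hypothetical, and real atom indices depend on how the library numbers atoms.

```python
from typing import List, Sequence, Tuple


def unmatched_target_atoms(
    mapping: Sequence[Tuple[int, int]], n_target: int
) -> List[int]:
    """Return target atom indices that no query atom maps onto.

    Hypothetical helper; `mapping` is a list of (query_atom, target_atom) pairs.
    """
    query_atoms = [q for q, _ in mapping]
    target_atoms = [t for _, t in mapping]
    # A valid substructure mapping is one-to-one in both directions.
    if len(set(query_atoms)) != len(mapping) or len(set(target_atoms)) != len(mapping):
        raise ValueError("mapping is not one-to-one")
    matched = set(target_atoms)
    return [t for t in range(n_target) if t not in matched]


# Hand-written example: six ring atoms mapped into a 7-atom target,
# leaving target atom 4 (imagined here as the hydroxyl oxygen) uncovered.
example = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 5), (5, 6)]
print(unmatched_target_atoms(example, 7))  # [4]
```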

MCS Search

from smsd import parse_smiles, find_mcs, ChemOptions, McsOptions

g1 = parse_smiles("CC(=O)Oc1ccccc1C(=O)O")   # aspirin
g2 = parse_smiles("CC(=O)Nc1ccc(O)cc1")       # acetaminophen

# Default MCS
mapping = find_mcs(g1, g2)
print(f"MCS size: {len(mapping)}")

# Tautomer-aware MCS
taut = ChemOptions.tautomer_profile()
mapping = find_mcs(g1, g2, chem=taut)

# With MCS options
mcs_opts = McsOptions()
mcs_opts.timeout_ms = 5000
mcs_opts.connected_only = True
mapping = find_mcs(g1, g2, opts=mcs_opts)

# Convenience wrapper (accepts SMILES strings directly)
from smsd import mcs
mapping = mcs("c1ccccc1", "Cc1ccccc1", tautomer_aware=True)

Similarity and Screening

from smsd import parse_smiles, similarity_upper_bound, screen_targets, similarity

g1 = parse_smiles("c1ccccc1")
g2 = parse_smiles("Cc1ccccc1")

# Single pair
sim = similarity_upper_bound(g1, g2)
print(f"Similarity: {sim:.3f}")

# Convenience wrapper (accepts SMILES)
sim = similarity("c1ccccc1", "Cc1ccccc1")

# Batch screening
smiles_list = ["Cc1ccccc1", "c1ccc(O)cc1", "CCO"]  # example input; replace with your own library
library = [parse_smiles(s) for s in smiles_list]
query = parse_smiles("c1ccccc1")
hits = screen_targets(query, library, threshold=0.5)
# Returns indices of molecules with similarity >= 0.5
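To show why an upper bound is useful for screening, the sketch below computes a deliberately crude bound from element counts alone and filters a library with it. This is not the RASCAL bound that SMSD computes. It is a simpler stand-in under the assumption that similarity is an atom-count Tanimoto, and `label_upper_bound` and `screen` are hypothetical names. Because no common substructure can contain more atoms of an element than both molecules have, any pair the bound rejects can safely skip the exact MCS search.

```python
from collections import Counter
from typing import List, Sequence


def label_upper_bound(a: Sequence[str], b: Sequence[str]) -> float:
    """Upper bound on atom-count Tanimoto using element counts only.

    Hypothetical stand-in for a screening bound; not SMSD's algorithm.
    """
    # Multiset intersection: the most atoms any common substructure could have.
    common = sum((Counter(a) & Counter(b)).values())
    union = len(a) + len(b) - common
    return common / union if union else 1.0


def screen(query: Sequence[str], library: Sequence[Sequence[str]],
           threshold: float) -> List[int]:
    """Indices of library entries whose bound reaches the threshold."""
    return [i for i, target in enumerate(library)
            if label_upper_bound(query, target) >= threshold]


benzene = ["C"] * 6
library = [
    ["C"] * 7,            # toluene heavy atoms
    ["C"] * 6 + ["O"],    # phenol heavy atoms
    ["N", "O"],           # no carbon at all
    ["C", "C", "O"],      # ethanol heavy atoms
]
print(screen(benzene, library, 0.5))  # [0, 1]
```

Only the surviving indices would then need the exact, more expensive comparison.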

Fingerprints

from smsd import (
    path_fingerprint, mcs_fingerprint,
    fingerprint_subset, analyze_fp_quality,
    fingerprint, tanimoto, parse_smiles,
)

mol = parse_smiles("c1ccccc1")

# Path fingerprint (returns set bit positions)
fp = path_fingerprint(mol, path_length=7, fp_size=2048)

# MCS-aware fingerprint
fp_mcs = mcs_fingerprint(mol, path_length=7, fp_size=2048)

# Subset check (for substructure pre-screening)
query_fp = path_fingerprint(parse_smiles("c1ccccc1"))
target_fp = path_fingerprint(parse_smiles("c1ccc(O)cc1"))
assert fingerprint_subset(query_fp, target_fp)

# Quality analysis
quality = analyze_fp_quality(fp)
print(quality)  # e.g. {'set_bits': 12, 'density': 0.006, ...}

# Convenience wrappers
fp = fingerprint("CCO", kind="mcs")
sim = tanimoto(fp, fingerprint("CCCO"))
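If a fingerprint is represented as a collection of set bit positions, as the comment on `path_fingerprint` suggests, the subset check, Tanimoto coefficient, and bit density reduce to set arithmetic. The sketch below illustrates those three operations in plain Python. The function names and the bit positions are hypothetical, and the library's own implementations may differ in details such as how empty fingerprints are handled.

```python
from typing import Iterable


def is_bit_subset(query_bits: Iterable[int], target_bits: Iterable[int]) -> bool:
    """True if every bit set in the query is also set in the target."""
    return set(query_bits) <= set(target_bits)


def tanimoto_bits(a: Iterable[int], b: Iterable[int]) -> float:
    """Tanimoto coefficient: shared set bits / bits set in either fingerprint."""
    sa, sb = set(a), set(b)
    union = len(sa | sb)
    # Convention chosen for this sketch: two empty fingerprints score 1.0.
    return len(sa & sb) / union if union else 1.0


def bit_density(bits: Iterable[int], fp_size: int) -> float:
    """Fraction of the fp_size positions that are set."""
    return len(set(bits)) / fp_size


# Made-up bit positions for illustration.
query_bits = [3, 17, 42]
target_bits = [3, 17, 42, 256, 1999]
print(is_bit_subset(query_bits, target_bits))           # True
print(f"{tanimoto_bits(query_bits, target_bits):.2f}")  # 0.60
print(f"{bit_density(range(12), 2048):.3f}")            # 0.006
```

A failed subset check rules out a substructure match, while a passing one only means the exact search is still needed.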

MolGraph Builder

Build molecules directly without SMILES:

from smsd import MolGraphBuilder

builder = MolGraphBuilder(6)  # 6 atoms
for i in range(6):
    builder.atom(i, 6, charge=0, aromatic=True, in_ring=True)
for i in range(6):
    builder.bond(i, (i + 1) % 6, order=1, in_ring=True, aromatic=True)
benzene = builder.build()
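The `(i + 1) % 6` expression in the bond loop wraps the last atom back to atom 0 and so closes the ring. The plain-Python sketch below generalizes that index pattern to a ring of any size and checks that every atom ends up with exactly two ring bonds. `ring_bonds` is a hypothetical helper, not part of smsd.

```python
from collections import Counter
from typing import List, Tuple


def ring_bonds(n: int) -> List[Tuple[int, int]]:
    """Atom index pairs for a simple ring of n atoms (hypothetical helper)."""
    if n < 3:
        raise ValueError("a ring needs at least 3 atoms")
    # The modulo wraps the final atom back to atom 0, closing the ring.
    return [(i, (i + 1) % n) for i in range(n)]


bonds = ring_bonds(6)
print(bonds)  # [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]

# Every atom in a simple ring takes part in exactly two ring bonds.
degree = Counter(atom for bond in bonds for atom in bond)
print(all(degree[i] == 2 for i in range(6)))  # True
```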

Configuration

from smsd import ChemOptions, McsOptions, BondOrderMode, RingFusionMode

# ChemOptions controls atom/bond matching
chem = ChemOptions()
chem.match_atom_type = True
chem.match_formal_charge = True
chem.tautomer_aware = True
chem.complete_rings_only = True
chem.match_bond_order = BondOrderMode.LOOSE
chem.ring_fusion_mode = RingFusionMode.STRICT

# Named profiles
chem = ChemOptions.tautomer_profile()   # tautomer-aware defaults
chem = ChemOptions.profile("strict")    # strict matching

# McsOptions controls MCS algorithm behavior
opts = McsOptions()
opts.connected_only = True
opts.timeout_ms = 10000
opts.maximize_bonds = True  # MCES mode

Running Tests

cd python/
pip install -e ".[dev]"
pytest tests/ -v

License

Apache 2.0. Copyright (c) 2018-2026 BioInception PVT LTD. Algorithm Copyright (c) 2009-2026 Syed Asad Rahman.
