SMSD -- Substructure & MCS search for chemical graphs

SMSD Python Bindings

Python interface for SMSD (Small Molecule Substructure Detector) -- a high-performance library for substructure search, Maximum Common Substructure (MCS), and molecular similarity.

Features

  • SMILES parsing and writing -- built-in OpenSMILES parser, no RDKit or CDK required
  • Substructure search -- VF2++ subgraph isomorphism
  • MCS search -- McSplit with seed-and-extend, orbit pruning, coverage-driven termination
  • Tautomer-aware matching -- keto/enol, amide, imidazole tautomer equivalence
  • RASCAL screening -- O(V+E) Tanimoto-like similarity upper bound
  • Fingerprints -- path-based and MCS-aware fingerprints for pre-screening

Requirements

  • Python >= 3.8
  • C++17 compiler (GCC 7+, Clang 5+, MSVC 2019+)
  • CMake >= 3.15
  • pybind11 >= 2.12

Installation

From source (recommended)

cd python/
pip install .

Development install

pip install -e ".[dev]"

Build with specific compiler

CMAKE_ARGS="-DCMAKE_CXX_COMPILER=g++-13" pip install .

Quick Start

from smsd import parse_smiles, find_mcs, is_substructure, similarity

# Parse SMILES strings
benzene = parse_smiles("c1ccccc1")
phenol  = parse_smiles("c1ccc(O)cc1")

# Substructure search
assert is_substructure(benzene, phenol)  # benzene is in phenol

# Maximum Common Substructure
mcs = find_mcs(benzene, phenol)
print(f"MCS size: {len(mcs)} atoms")  # 6

# Similarity
sim = similarity(benzene, phenol)
print(f"Similarity: {sim:.3f}")  # ~0.857
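
The `~0.857` in the comment above is consistent with an atom-count Tanimoto coefficient over the MCS (6 shared atoms, 6 atoms in benzene, 7 heavy atoms in phenol). The exact formula used by `similarity` is not stated in this README, so the following is only a sketch under that assumption; `mcs_tanimoto` is a hypothetical helper, not part of `smsd`.

```python
def mcs_tanimoto(n_common: int, n_a: int, n_b: int) -> float:
    """Tanimoto coefficient from atom counts: common / (a + b - common)."""
    union = n_a + n_b - n_common
    return n_common / union if union else 1.0

# benzene (6 atoms) vs phenol (7 heavy atoms), sharing a 6-atom ring
score = mcs_tanimoto(n_common=6, n_a=6, n_b=7)
print(f"{score:.3f}")  # 0.857
```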

API Reference

SMILES Parsing

from smsd import parse_smiles, to_smiles

mol = parse_smiles("c1ccccc1")
print(mol.n)              # 6 atoms
print(mol.atomic_num)     # [6, 6, 6, 6, 6, 6]
print(mol.aromatic)       # [True, True, True, True, True, True]

smi = to_smiles(mol)      # canonical SMILES string

Substructure Search

from smsd import parse_smiles, is_substructure, find_substructure, ChemOptions

query  = parse_smiles("c1ccccc1")
target = parse_smiles("c1ccc(O)cc1")

# Boolean check
if is_substructure(query, target):
    print("Query is a substructure of target")

# Get atom mapping
mapping = find_substructure(query, target)
# Returns list of (query_atom, target_atom) pairs
for qi, ti in mapping:
    print(f"  query atom {qi} -> target atom {ti}")

# Custom options
opts = ChemOptions()
opts.ring_matches_ring_only = True
is_substructure(query, target, opts=opts, timeout_ms=5000)
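
To make the `(query_atom, target_atom)` pair format concrete, here is a minimal pure-Python backtracking matcher. It is an illustration only and not the VF2++ implementation that SMSD uses: it compares element symbols and connectivity, ignores bond order, aromaticity, and ring constraints, and all names in it (`Graph`, `ring`, `find_mapping`) are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# (element symbol per atom, adjacency sets keyed by atom index)
Graph = Tuple[List[str], Dict[int, Set[int]]]

def ring(n: int, symbol: str = "C") -> Graph:
    """Return an n-membered ring of identical atoms."""
    adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    return [symbol] * n, adj

def find_mapping(query: Graph, target: Graph) -> Optional[List[Tuple[int, int]]]:
    """Return (query_atom, target_atom) pairs, or None if no match exists."""
    q_atoms, q_adj = query
    t_atoms, t_adj = target
    assigned: Dict[int, int] = {}

    def extend(qi: int) -> bool:
        if qi == len(q_atoms):
            return True  # every query atom has been placed
        for ti in range(len(t_atoms)):
            if ti in assigned.values() or q_atoms[qi] != t_atoms[ti]:
                continue
            # Each already-mapped query neighbour must map to a target neighbour.
            if all(assigned[qn] in t_adj[ti] for qn in q_adj[qi] if qn in assigned):
                assigned[qi] = ti
                if extend(qi + 1):
                    return True
                del assigned[qi]  # backtrack
        return False

    return sorted(assigned.items()) if extend(0) else None

benzene = ring(6)
atoms, adj = ring(6)
atoms.append("O")      # phenol: oxygen (atom 6) attached to ring atom 3
adj[6] = {3}
adj[3].add(6)
phenol = (atoms, adj)

mapping = find_mapping(benzene, phenol)
print(mapping)                         # six (query, target) pairs
print(find_mapping(phenol, benzene))   # None: benzene has no oxygen
```

The search is exponential in the worst case, which is why production matchers add candidate ordering and pruning on top of the same basic idea.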

MCS Search

from smsd import parse_smiles, find_mcs, ChemOptions, McsOptions

g1 = parse_smiles("CC(=O)Oc1ccccc1C(=O)O")   # aspirin
g2 = parse_smiles("CC(=O)Nc1ccc(O)cc1")       # acetaminophen

# Default MCS
mapping = find_mcs(g1, g2)
print(f"MCS size: {len(mapping)}")

# Tautomer-aware MCS
taut = ChemOptions.tautomer_profile()
mapping = find_mcs(g1, g2, chem=taut)

# With MCS options
mcs_opts = McsOptions()
mcs_opts.timeout_ms = 5000
mcs_opts.connected_only = True
mapping = find_mcs(g1, g2, opts=mcs_opts)

# Convenience wrapper (accepts SMILES strings directly)
from smsd import mcs
mapping = mcs("c1ccccc1", "Cc1ccccc1", tautomer_aware=True)

Similarity and Screening

from smsd import parse_smiles, similarity_upper_bound, screen_targets, similarity

g1 = parse_smiles("c1ccccc1")
g2 = parse_smiles("Cc1ccccc1")

# Single pair
sim = similarity_upper_bound(g1, g2)
print(f"Similarity: {sim:.3f}")

# Convenience wrapper (accepts SMILES)
sim = similarity("c1ccccc1", "Cc1ccccc1")

# Batch screening
smiles_list = ["Cc1ccccc1", "c1ccc(O)cc1", "CCO"]  # example library
library = [parse_smiles(s) for s in smiles_list]
query = parse_smiles("c1ccccc1")
hits = screen_targets(query, library, threshold=0.5)
# Returns indices of molecules with similarity >= 0.5
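
The idea behind upper-bound screening can be sketched without the library. The version below uses element counts only, which is cruder than the RASCAL bound SMSD describes (it ignores bonds entirely), but it shows why a cheap bound is safe to filter on: no MCS can contain more atoms of an element than both molecules have, and the Tanimoto score grows with MCS size. `upper_bound` and `screen` are hypothetical names, and molecules are represented as plain lists of heavy-atom symbols.

```python
from collections import Counter
from typing import List, Sequence

def upper_bound(a: Sequence[str], b: Sequence[str]) -> float:
    """Tanimoto-style upper bound computed from element counts alone."""
    common = sum((Counter(a) & Counter(b)).values())  # most atoms any MCS could hold
    union = len(a) + len(b) - common
    return common / union if union else 1.0

def screen(query: Sequence[str], library: List[Sequence[str]], threshold: float) -> List[int]:
    """Return indices of library entries whose bound reaches the threshold."""
    return [i for i, mol in enumerate(library) if upper_bound(query, mol) >= threshold]

benzene = ["C"] * 6
library = [
    ["C"] * 7,          # toluene (heavy atoms only)
    ["C", "C", "O"],    # ethanol
    ["N"] * 4,          # shares no element with the query
]
hits = screen(benzene, library, threshold=0.5)
print(hits)  # [0]
```

Entries that fail the bound can be skipped without running an exact MCS search; entries that pass still need the exact computation.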

Fingerprints

from smsd import (
    parse_smiles,
    path_fingerprint, mcs_fingerprint,
    fingerprint_subset, analyze_fp_quality,
    fingerprint, tanimoto,
)

mol = parse_smiles("c1ccccc1")

# Path fingerprint (returns set bit positions)
fp = path_fingerprint(mol, path_length=7, fp_size=2048)

# MCS-aware fingerprint
fp_mcs = mcs_fingerprint(mol, path_length=7, fp_size=2048)

# Subset check (for substructure pre-screening)
query_fp = path_fingerprint(parse_smiles("c1ccccc1"))
target_fp = path_fingerprint(parse_smiles("c1ccc(O)cc1"))
assert fingerprint_subset(query_fp, target_fp)

# Quality analysis
quality = analyze_fp_quality(fp)
print(quality)  # {'set_bits': 12, 'density': 0.006, ...}

# Convenience wrappers
fp = fingerprint("CCO", kind="mcs")
sim = tanimoto(fp, fingerprint("CCCO"))
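
If fingerprints are treated as collections of set bit positions, as the comment on `path_fingerprint` suggests, the subset check and the Tanimoto score reduce to ordinary set operations. This sketch assumes Python `set` objects and made-up bit positions; the concrete return type of the `smsd` functions may differ, and `is_subset` and `tanimoto_bits` are hypothetical stand-ins.

```python
from typing import Set

def is_subset(query_bits: Set[int], target_bits: Set[int]) -> bool:
    """A query can only match a target whose fingerprint covers all its bits."""
    return query_bits <= target_bits

def tanimoto_bits(a: Set[int], b: Set[int]) -> float:
    """Shared bits divided by the bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0

query_bits = {3, 17, 256}            # illustrative positions, not real output
target_bits = {3, 17, 256, 901}

print(is_subset(query_bits, target_bits))      # True
print(tanimoto_bits(query_bits, target_bits))  # 0.75
```

A failed subset check rules a target out immediately; a passed check only means the full substructure search is still worth running, since different paths can hash to the same bit.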

MolGraph Builder

Build molecules directly without SMILES:

from smsd import MolGraphBuilder

builder = MolGraphBuilder(6)  # 6 atoms
for i in range(6):
    builder.atom(i, 6, charge=0, aromatic=True, in_ring=True)
for i in range(6):
    builder.bond(i, (i + 1) % 6, order=1, in_ring=True, aromatic=True)
benzene = builder.build()

Configuration

from smsd import ChemOptions, McsOptions, BondOrderMode, RingFusionMode

# ChemOptions controls atom/bond matching
chem = ChemOptions()
chem.match_atom_type = True
chem.match_formal_charge = True
chem.tautomer_aware = True
chem.complete_rings_only = True
chem.match_bond_order = BondOrderMode.LOOSE
chem.ring_fusion_mode = RingFusionMode.STRICT

# Named profiles
chem = ChemOptions.tautomer_profile()   # tautomer-aware defaults
chem = ChemOptions.profile("strict")    # strict matching

# McsOptions controls MCS algorithm behavior
opts = McsOptions()
opts.connected_only = True
opts.timeout_ms = 10000
opts.maximize_bonds = True  # MCES mode

Running Tests

cd python/
pip install -e ".[dev]"
pytest tests/ -v

License

Apache 2.0. Copyright (c) 2018-2026 BioInception PVT LTD. Algorithm Copyright (c) 2009-2026 Syed Asad Rahman.
