SMSD -- Substructure & MCS search for chemical graphs

SMSD Python Bindings

Python interface for SMSD (Small Molecule Substructure Detector) -- a high-performance library for substructure search, Maximum Common Substructure (MCS), and molecular similarity.

Features

  • SMILES parsing and writing -- built-in OpenSMILES parser, no RDKit or CDK required
  • Substructure search -- VF2++ subgraph isomorphism
  • MCS search -- McSplit with seed-and-extend, orbit pruning, coverage-driven termination
  • Tautomer-aware matching -- keto/enol, amide, imidazole tautomer equivalence
  • RASCAL screening -- O(V+E) Tanimoto-like similarity upper bound
  • Fingerprints -- path-based and MCS-aware fingerprints for pre-screening

Requirements

  • Python >= 3.8
  • C++17 compiler (GCC 7+, Clang 5+, MSVC 2019+)
  • CMake >= 3.15
  • pybind11 >= 2.12

Installation

From source (recommended)

cd python/
pip install .

Development install

pip install -e ".[dev]"

Build with specific compiler

CMAKE_ARGS="-DCMAKE_CXX_COMPILER=g++-13" pip install .

Quick Start

from smsd import parse_smiles, find_mcs, is_substructure, similarity

# Parse SMILES strings
benzene = parse_smiles("c1ccccc1")
phenol  = parse_smiles("c1ccc(O)cc1")

# Substructure search
assert is_substructure(benzene, phenol)  # benzene is in phenol

# Maximum Common Substructure
mcs = find_mcs(benzene, phenol)
print(f"MCS size: {len(mcs)} atoms")  # 6

# Similarity
sim = similarity(benzene, phenol)
print(f"Similarity: {sim:.3f}")  # ~0.857

API Reference

SMILES Parsing

from smsd import parse_smiles, to_smiles

mol = parse_smiles("c1ccccc1")
print(mol.n)              # 6 atoms
print(mol.atomic_num)     # [6, 6, 6, 6, 6, 6]
print(mol.aromatic)       # [True, True, True, True, True, True]

smi = to_smiles(mol)      # canonical SMILES string

Substructure Search

from smsd import parse_smiles, is_substructure, find_substructure, ChemOptions

query  = parse_smiles("c1ccccc1")
target = parse_smiles("c1ccc(O)cc1")

# Boolean check
if is_substructure(query, target):
    print("Query is a substructure of target")

# Get atom mapping
mapping = find_substructure(query, target)
# Returns list of (query_atom, target_atom) pairs
for qi, ti in mapping:
    print(f"  query atom {qi} -> target atom {ti}")

# Custom options
opts = ChemOptions()
opts.ring_matches_ring_only = True
is_substructure(query, target, opts=opts, timeout_ms=5000)
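
To make the shape of the returned atom mapping concrete, here is a minimal pure-Python sketch of subgraph matching by plain backtracking. It is not SMSD's VF2++ implementation and does not use the smsd package; the names `ring` and `naive_substructure`, and the label/adjacency representation, are assumptions made for illustration only.

```python
def ring(n):
    """Adjacency sets for an n-membered ring: atom i bonds to i-1 and i+1 (mod n)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def naive_substructure(q_labels, q_adj, t_labels, t_adj):
    """Return [(query_atom, target_atom), ...] for the first match, or [] if none."""
    mapping = {}   # query atom -> target atom
    used = set()   # target atoms already taken

    def match(qi):
        if qi == len(q_labels):
            return True  # every query atom has been placed
        for ti in range(len(t_labels)):
            if ti in used or q_labels[qi] != t_labels[ti]:
                continue
            # Every already-mapped query neighbour must be bonded to ti in the target.
            if all(mapping[n] in t_adj[ti] for n in q_adj[qi] if n in mapping):
                mapping[qi] = ti
                used.add(ti)
                if match(qi + 1):
                    return True
                del mapping[qi]
                used.discard(ti)
        return False

    return sorted(mapping.items()) if match(0) else []


# Benzene: six carbons in a ring. Phenol: the same ring plus an oxygen on atom 0.
benzene_labels, benzene_adj = ["C"] * 6, ring(6)
phenol_labels = ["C"] * 6 + ["O"]
phenol_adj = ring(6)
phenol_adj[0].add(6)
phenol_adj[6] = {0}

pairs = naive_substructure(benzene_labels, benzene_adj, phenol_labels, phenol_adj)
for qi, ti in pairs:
    print(f"query atom {qi} -> target atom {ti}")
```

The sketch compares element labels only and ignores bond orders, aromaticity, and timeouts, all of which the library handles through its own options.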

MCS Search

from smsd import parse_smiles, find_mcs, ChemOptions, McsOptions

g1 = parse_smiles("CC(=O)Oc1ccccc1C(=O)O")   # aspirin
g2 = parse_smiles("CC(=O)Nc1ccc(O)cc1")       # acetaminophen

# Default MCS
mapping = find_mcs(g1, g2)
print(f"MCS size: {len(mapping)}")

# Tautomer-aware MCS
taut = ChemOptions.tautomer_profile()
mapping = find_mcs(g1, g2, chem=taut)

# With MCS options
mcs_opts = McsOptions()
mcs_opts.timeout_ms = 5000
mcs_opts.connected_only = True
mapping = find_mcs(g1, g2, opts=mcs_opts)

# Convenience wrapper (accepts SMILES strings directly)
from smsd import mcs
mapping = mcs("c1ccccc1", "Cc1ccccc1", tautomer_aware=True)
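
For intuition about what an MCS mapping contains, the following sketch finds a maximum common induced subgraph by exhaustive search on very small graphs. It is not McSplit and not SMSD's algorithm, it enforces no connectivity constraint, and it is only practical for a handful of atoms; `max_common_subgraph` and the input representation are hypothetical names chosen for this example.

```python
def max_common_subgraph(labels1, adj1, labels2, adj2):
    """Exhaustive maximum common induced subgraph; returns {g1_atom: g2_atom}."""
    best = {}

    def extend(i, mapping, used):
        nonlocal best
        # Prune: even mapping every remaining atom cannot beat the best so far.
        if len(mapping) + (len(labels1) - i) <= len(best):
            return
        if i == len(labels1):
            best = dict(mapping)
            return
        for j in range(len(labels2)):
            if j in used or labels1[i] != labels2[j]:
                continue
            # Bonded/non-bonded status must agree with every pair mapped so far.
            if all((p in adj1[i]) == (q in adj2[j]) for p, q in mapping.items()):
                mapping[i] = j
                used.add(j)
                extend(i + 1, mapping, used)
                del mapping[i]
                used.discard(j)
        extend(i + 1, mapping, used)  # alternative: leave atom i unmapped

    extend(0, {}, set())
    return best


# Ethanol (C-C-O) against propanol (C-C-C-O), heavy atoms only.
ethanol = (["C", "C", "O"], {0: {1}, 1: {0, 2}, 2: {1}})
propanol = (["C", "C", "C", "O"], {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}})

common = max_common_subgraph(*ethanol, *propanol)
print(f"MCS size: {len(common)}")  # 3
```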

Similarity and Screening

from smsd import parse_smiles, similarity_upper_bound, screen_targets, similarity

g1 = parse_smiles("c1ccccc1")
g2 = parse_smiles("Cc1ccccc1")

# Single pair (fast upper bound, not an exact MCS-based similarity)
sim = similarity_upper_bound(g1, g2)
print(f"Similarity upper bound: {sim:.3f}")

# Convenience wrapper (accepts SMILES)
sim = similarity("c1ccccc1", "Cc1ccccc1")

# Batch screening
smiles_list = ["c1ccccc1", "Cc1ccccc1", "CCO"]  # example library; substitute your own SMILES
library = [parse_smiles(s) for s in smiles_list]
query = parse_smiles("c1ccccc1")
hits = screen_targets(query, library, threshold=0.5)
# Returns indices of molecules with similarity >= 0.5
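
The idea behind a linear-time similarity upper bound can be sketched without the library: count atom and bond labels in each molecule, take the overlap of the two multisets, and plug it into a Tanimoto-style ratio. This is a simplified illustration under that assumption, not SMSD's RASCAL screening code; `label_counts`, `upper_bound`, `screen`, and the tuple-based molecule representation are hypothetical.

```python
from collections import Counter


def label_counts(atoms, bonds):
    """Multiset of atom symbols plus bond labels (sorted end symbols and bond type)."""
    counts = Counter(atoms)
    for i, j, order in bonds:
        a, b = sorted((atoms[i], atoms[j]))
        counts[(a, b, order)] += 1
    return counts


def upper_bound(mol1, mol2):
    """Tanimoto-style bound from label overlap; one pass over atoms and bonds."""
    c1, c2 = label_counts(*mol1), label_counts(*mol2)
    common = sum((c1 & c2).values())  # Counter & Counter keeps the minimum counts
    total = sum(c1.values()) + sum(c2.values()) - common
    return common / total if total else 0.0


def screen(query, library, threshold):
    """Indices of library molecules whose bound reaches the threshold."""
    return [k for k, mol in enumerate(library) if upper_bound(query, mol) >= threshold]


ring6 = [(i, (i + 1) % 6, "ar") for i in range(6)]
benzene = (["C"] * 6, ring6)
toluene = (["C"] * 7, ring6 + [(0, 6, "-")])
ethanol = (["C", "C", "O"], [(0, 1, "-"), (1, 2, "-")])

print(round(upper_bound(benzene, toluene), 3))                       # 0.857
print(screen(benzene, [benzene, toluene, ethanol], threshold=0.5))   # [0, 1]
```

Because no common substructure can contain more matching labels than the two molecules share, a molecule that fails this cheap test can be skipped before any expensive MCS search.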

Fingerprints

from smsd import (
    parse_smiles,
    path_fingerprint, mcs_fingerprint,
    fingerprint_subset, analyze_fp_quality,
    fingerprint, tanimoto,
)

mol = parse_smiles("c1ccccc1")

# Path fingerprint (returns set bit positions)
fp = path_fingerprint(mol, path_length=7, fp_size=2048)

# MCS-aware fingerprint
fp_mcs = mcs_fingerprint(mol, path_length=7, fp_size=2048)

# Subset check (for substructure pre-screening)
query_fp = path_fingerprint(parse_smiles("c1ccccc1"))
target_fp = path_fingerprint(parse_smiles("c1ccc(O)cc1"))
assert fingerprint_subset(query_fp, target_fp)

# Quality analysis
quality = analyze_fp_quality(fp)
print(quality)  # {'set_bits': 12, 'density': 0.006, ...}

# Convenience wrappers
fp = fingerprint("CCO", kind="mcs")
sim = tanimoto(fp, fingerprint("CCCO"))
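
Since the fingerprints are described as lists of set bit positions, the three operations used above (Tanimoto, subset check, density) reduce to set arithmetic. The sketch below shows that arithmetic in plain Python; the function names are assumptions for illustration and are not the library's implementation.

```python
def tanimoto_bits(a, b):
    """Tanimoto coefficient of two fingerprints given as set bit positions."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Convention chosen here: two empty fingerprints score 0.0.
    return len(a & b) / union if union else 0.0


def is_subset(query_bits, target_bits):
    """True if every bit set in the query is also set in the target."""
    return set(query_bits) <= set(target_bits)


def density(bits, fp_size=2048):
    """Fraction of the fingerprint that is set."""
    return len(set(bits)) / fp_size


print(tanimoto_bits([1, 2, 3], [2, 3, 4]))   # 0.5
print(is_subset([2, 3], [1, 2, 3]))          # True
print(round(density(range(12)), 3))          # 0.006
```

The subset test is what makes a fingerprint usable for substructure pre-screening: if the query sets a bit that the target does not, the target can be rejected without running a full match.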

MolGraph Builder

Build molecules directly without SMILES:

from smsd import MolGraphBuilder

builder = MolGraphBuilder(6)  # 6 atoms
for i in range(6):
    builder.atom(i, 6, charge=0, aromatic=True, in_ring=True)
for i in range(6):
    builder.bond(i, (i + 1) % 6, order=1, in_ring=True, aromatic=True)
benzene = builder.build()
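
The bond loop above relies on the index pattern (i, (i + 1) % n) to close the ring. A small standalone check of that pattern, independent of the smsd package and using a hypothetical helper name, looks like this:

```python
def ring_bonds(n):
    """Bond list for an n-membered ring: atom i is bonded to atom (i + 1) % n."""
    return [(i, (i + 1) % n) for i in range(n)]


bonds = ring_bonds(6)

# Count how many bonds touch each atom; in a simple ring every atom has exactly two.
degree = [0] * 6
for a, b in bonds:
    degree[a] += 1
    degree[b] += 1

print(bonds[-1])  # (5, 0): the modulo wraps the last atom back to the first
print(degree)     # [2, 2, 2, 2, 2, 2]
```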

Configuration

from smsd import ChemOptions, McsOptions, BondOrderMode, RingFusionMode

# ChemOptions controls atom/bond matching
chem = ChemOptions()
chem.match_atom_type = True
chem.match_formal_charge = True
chem.tautomer_aware = True
chem.complete_rings_only = True
chem.match_bond_order = BondOrderMode.LOOSE
chem.ring_fusion_mode = RingFusionMode.STRICT

# Named profiles
chem = ChemOptions.tautomer_profile()   # tautomer-aware defaults
chem = ChemOptions.profile("strict")    # strict matching

# McsOptions controls MCS algorithm behavior
opts = McsOptions()
opts.connected_only = True
opts.timeout_ms = 10000
opts.maximize_bonds = True  # MCES mode
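
As a rough mental model of how matching flags might gate atom compatibility, consider the sketch below. It is an assumption about the general technique, not a description of how SMSD evaluates ChemOptions; `Atom`, `MatchFlags`, and `atoms_compatible` are hypothetical stand-ins, and the ring rule shown (ring status must agree) is one plausible reading of `ring_matches_ring_only`.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    atomic_num: int
    charge: int = 0
    in_ring: bool = False


@dataclass
class MatchFlags:
    """Hypothetical stand-in for a few matching switches (not smsd.ChemOptions)."""
    match_atom_type: bool = True
    match_formal_charge: bool = True
    ring_matches_ring_only: bool = False


def atoms_compatible(q, t, flags):
    """Return True if query atom q may be mapped onto target atom t."""
    if flags.match_atom_type and q.atomic_num != t.atomic_num:
        return False
    if flags.match_formal_charge and q.charge != t.charge:
        return False
    if flags.ring_matches_ring_only and q.in_ring != t.in_ring:
        return False
    return True


amine = Atom(atomic_num=7)
ammonium = Atom(atomic_num=7, charge=1)

print(atoms_compatible(amine, ammonium, MatchFlags()))                           # False
print(atoms_compatible(amine, ammonium, MatchFlags(match_formal_charge=False)))  # True
```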

Running Tests

cd python/
pip install -e ".[dev]"
pytest tests/ -v

License

Apache 2.0. Copyright (c) 2009-2026 Syed Asad Rahman, BioInception Labs.

Download files

Source Distribution

  • smsd-5.2.1.tar.gz (363.5 kB): Source

Built Distributions

  • smsd-5.2.1-cp313-cp313-win_amd64.whl (367.8 kB): CPython 3.13, Windows x86-64
  • smsd-5.2.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (418.5 kB): CPython 3.13, manylinux: glibc 2.17+ x86-64
  • smsd-5.2.1-cp313-cp313-macosx_11_0_arm64.whl (368.6 kB): CPython 3.13, macOS 11.0+ ARM64
  • smsd-5.2.1-cp312-cp312-win_amd64.whl (367.9 kB): CPython 3.12, Windows x86-64
  • smsd-5.2.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (418.5 kB): CPython 3.12, manylinux: glibc 2.17+ x86-64
  • smsd-5.2.1-cp312-cp312-macosx_11_0_arm64.whl (368.5 kB): CPython 3.12, macOS 11.0+ ARM64
  • smsd-5.2.1-cp311-cp311-win_amd64.whl (366.8 kB): CPython 3.11, Windows x86-64
  • smsd-5.2.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (418.7 kB): CPython 3.11, manylinux: glibc 2.17+ x86-64
  • smsd-5.2.1-cp311-cp311-macosx_11_0_arm64.whl (368.5 kB): CPython 3.11, macOS 11.0+ ARM64
  • smsd-5.2.1-cp310-cp310-win_amd64.whl (365.9 kB): CPython 3.10, Windows x86-64
  • smsd-5.2.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (417.3 kB): CPython 3.10, manylinux: glibc 2.17+ x86-64
  • smsd-5.2.1-cp310-cp310-macosx_11_0_arm64.whl (367.4 kB): CPython 3.10, macOS 11.0+ ARM64

File details

Details for the file smsd-5.2.1.tar.gz.

File metadata

  • Download URL: smsd-5.2.1.tar.gz
  • Size: 363.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for smsd-5.2.1.tar.gz
Algorithm Hash digest
SHA256 a90409e9572443a44d0497661e5859ba7e23249de845e882af32dc1c12bd9c17
MD5 73265479a59732489c9871575fb1f7b2
BLAKE2b-256 8da750119d573a5aa31ae0b67428ee5c5e760df5b327c5ba47c9b93e1a41f3cb
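
A downloaded file can be checked against the SHA256 digest listed above with Python's standard hashlib module. The sketch below is one way to do that; the helper names and the local file path in the commented usage line are assumptions.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hex SHA256 digest of a file, read in chunks so large files stay out of memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, assuming the archive was saved to the current directory:
# matches_published(
#     "smsd-5.2.1.tar.gz",
#     "a90409e9572443a44d0497661e5859ba7e23249de845e882af32dc1c12bd9c17",
# )
```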

Provenance

The following attestation bundles were made for smsd-5.2.1.tar.gz:

Publisher: python-publish.yml on asad/SMSD

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file smsd-5.2.1-cp313-cp313-win_amd64.whl.

File metadata

  • Download URL: smsd-5.2.1-cp313-cp313-win_amd64.whl
  • Size: 367.8 kB
  • Tags: CPython 3.13, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for smsd-5.2.1-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 627b0287db69edf0819dc7a3329a9e966d593efff7c475b03c0d60b11c06b3bd
MD5 feea6838fe072f8bdfa2ed89de222ff3
BLAKE2b-256 cdbccb4b941009eeabab6db22250bbf4c076e8dc1a2a35308c501a72c481e06b

File details

Details for the file smsd-5.2.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for smsd-5.2.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 14ce523b981875610628af2d00e3b1caf748533c64f3bf1e60e81da0250f336a
MD5 5f10b481280143185b2d8001173be14c
BLAKE2b-256 34041538e3bab6260e9f08328889b825d4ea0e558d69332bb4aa4a42e9f030af

File details

Details for the file smsd-5.2.1-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for smsd-5.2.1-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 2628dd3df1464d9251a28465389ee865ba8bb686e056ca5fbde0b50928989411
MD5 ad772b334013983e73e370cdf0c5b390
BLAKE2b-256 a58988779d12bd185f72424a351cfee36a9b31eb39d7c93b3f778700daa3d3d6

File details

Details for the file smsd-5.2.1-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: smsd-5.2.1-cp312-cp312-win_amd64.whl
  • Size: 367.9 kB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for smsd-5.2.1-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 34622335dd8f2f6a48a65cc589067b81942603db0f19f31a625a12793cda834f
MD5 cad2e5bbcab4978ae23667f374e923ee
BLAKE2b-256 600bfa96450e7c867a33478cd6080184c21545b15c27e3bd64b6f0b0e3d22c7b

File details

Details for the file smsd-5.2.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for smsd-5.2.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 cb66389f772d62e273feca9495e9d97b19784bacb9b17bcd49db22714fc0b9e3
MD5 9a99da9fd0a843a32a5c7df61e36a5b1
BLAKE2b-256 27cc3132aee49af60f18fed199547d51d4c29cd588e703ac1339c279d97b4b9f

File details

Details for the file smsd-5.2.1-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for smsd-5.2.1-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 003b7d0b072648d7086b57c02a3289d03f9700254d9585df8909450f8dacfd47
MD5 82866bd267f4687ac9e36f806c7fcaba
BLAKE2b-256 bb47d526348db47124c1526a3568534328e960b76cdf4329d5242ab258803dce

File details

Details for the file smsd-5.2.1-cp311-cp311-win_amd64.whl.

File metadata

  • Download URL: smsd-5.2.1-cp311-cp311-win_amd64.whl
  • Size: 366.8 kB
  • Tags: CPython 3.11, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for smsd-5.2.1-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 8b166f412f98e7e80b2b9451c3271b7236ac8da3cfdce713c3b7265837170bc9
MD5 d8bff9a3b368d1443186ae5f0fa43156
BLAKE2b-256 b1823fff7cc18018df718df30aeb5683b258744864afcca91c2288ea20868af5

File details

Details for the file smsd-5.2.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for smsd-5.2.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 cb45025b59bf4a2c69f6facc4382906dc9c7b230a67603fae1b810daefde09a2
MD5 afa6909e9c13a552a9198aa5743ffb5e
BLAKE2b-256 82a884cc3a6eceda360a54769a9f0d4a040519db6339d29081fe5da4bed445f5

File details

Details for the file smsd-5.2.1-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for smsd-5.2.1-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 683c8d2f21355ccca1b8eec7266cce5ac3ed6b11970689dbaea910abe562da0f
MD5 54c66d3eadbdcb80c0bb907f202401fc
BLAKE2b-256 0aca1e9fd4d44d9b5d4ee116a2562e59cf6c0d5418b87dbb88ff1c15da7b5911

File details

Details for the file smsd-5.2.1-cp310-cp310-win_amd64.whl.

File metadata

  • Download URL: smsd-5.2.1-cp310-cp310-win_amd64.whl
  • Size: 365.9 kB
  • Tags: CPython 3.10, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for smsd-5.2.1-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 b60bc79159859425a3621380ed5cf0632316271e30d92b07ec84f96c8259f6dc
MD5 54844488404a114a4566722019fc5c32
BLAKE2b-256 1c1fb81f87c900625d69b905ddc8a8a65949522011f295c975503cbca0424d32

File details

Details for the file smsd-5.2.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for smsd-5.2.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 fe08f57f53cfcb4f87efaea5145e48ab15e211ad2c5ef983bbe3ae0e9b2361ef
MD5 9dd2f8a268b02d4f6c294d6014c1d844
BLAKE2b-256 06f61e9a904f1ae202ad31e4c7e0d66d6958d695eb8d4ef2c42be125d3eab085

File details

Details for the file smsd-5.2.1-cp310-cp310-macosx_11_0_arm64.whl.

File hashes

Hashes for smsd-5.2.1-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 5185dd860bf598e988548faa7621c079ac50d4815549fad3c4a3146306c307af
MD5 38e99858ade380e2be6e7cbaaf4c3808
BLAKE2b-256 415a30258ee283817ce364abcc823dc42b109f3eedb61874923ac1caa5b127a1
