SMSD -- Substructure & MCS search for chemical graphs

SMSD Python Bindings

Python interface for SMSD (Small Molecule Substructure Detector) -- a high-performance library for substructure search, Maximum Common Substructure (MCS), and molecular similarity.

Features

  • SMILES parsing and writing -- built-in OpenSMILES parser, no RDKit or CDK required
  • Substructure search -- VF2++ subgraph isomorphism
  • MCS search -- McSplit with seed-and-extend, orbit pruning, coverage-driven termination
  • Tautomer-aware matching -- keto/enol, amide, imidazole tautomer equivalence
  • RASCAL screening -- O(V+E) Tanimoto-like similarity upper bound
  • Fingerprints -- path-based and MCS-aware fingerprints for pre-screening

Requirements

  • Python >= 3.8
  • C++17 compiler (GCC 7+, Clang 5+, MSVC 2019+)
  • CMake >= 3.15
  • pybind11 >= 2.12

Installation

From source (recommended)

cd python/
pip install .

Development install

pip install -e ".[dev]"

Build with specific compiler

CMAKE_ARGS="-DCMAKE_CXX_COMPILER=g++-13" pip install .

Quick Start

from smsd import parse_smiles, find_mcs, is_substructure, similarity

# Parse SMILES strings
benzene = parse_smiles("c1ccccc1")
phenol  = parse_smiles("c1ccc(O)cc1")

# Substructure search
assert is_substructure(benzene, phenol)  # benzene is in phenol

# Maximum Common Substructure
mcs = find_mcs(benzene, phenol)
print(f"MCS size: {len(mcs)} atoms")  # 6

# Similarity
sim = similarity(benzene, phenol)
print(f"Similarity: {sim:.3f}")  # ~0.857

API Reference

SMILES Parsing

from smsd import parse_smiles, to_smiles

mol = parse_smiles("c1ccccc1")
print(mol.n)              # 6 atoms
print(mol.atomic_num)     # [6, 6, 6, 6, 6, 6]
print(mol.aromatic)       # [True, True, True, True, True, True]

smi = to_smiles(mol)      # canonical SMILES string

Substructure Search

from smsd import parse_smiles, is_substructure, find_substructure, ChemOptions

query  = parse_smiles("c1ccccc1")
target = parse_smiles("c1ccc(O)cc1")

# Boolean check
if is_substructure(query, target):
    print("Query is a substructure of target")

# Get atom mapping
mapping = find_substructure(query, target)
# Returns list of (query_atom, target_atom) pairs
for qi, ti in mapping:
    print(f"  query atom {qi} -> target atom {ti}")

# Custom options
opts = ChemOptions()
opts.ring_matches_ring_only = True
is_substructure(query, target, opts=opts, timeout_ms=5000)
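
The mapping above is described as a list of (query_atom, target_atom) pairs. As a minimal sketch of how such a list might be post-processed in plain Python, the helper below reports which target atoms were left unmatched. The name `unmatched_target_atoms` and the hand-written example mapping are illustrative assumptions, not part of the SMSD API.

```python
from typing import Iterable, List, Tuple


def unmatched_target_atoms(
    mapping: Iterable[Tuple[int, int]], n_target_atoms: int
) -> List[int]:
    """Return target atom indices that no query atom maps onto."""
    pairs = list(mapping)
    q_to_t = dict(pairs)
    # A valid substructure mapping is one-to-one: no two query atoms
    # may land on the same target atom.
    if len(set(q_to_t.values())) != len(pairs):
        raise ValueError("mapping is not one-to-one")
    matched = set(q_to_t.values())
    return [t for t in range(n_target_atoms) if t not in matched]


# Hypothetical mapping of benzene onto phenol, c1ccc(O)cc1 (7 heavy atoms),
# written out by hand; atom indices are assumed to follow SMILES order.
example_mapping = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 5), (5, 6)]
print(unmatched_target_atoms(example_mapping, 7))  # [4]
```

Here the only unmatched target atom is index 4, the hydroxyl oxygen.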

MCS Search

from smsd import parse_smiles, find_mcs, ChemOptions, McsOptions

g1 = parse_smiles("CC(=O)Oc1ccccc1C(=O)O")   # aspirin
g2 = parse_smiles("CC(=O)Nc1ccc(O)cc1")       # acetaminophen

# Default MCS
mapping = find_mcs(g1, g2)
print(f"MCS size: {len(mapping)}")

# Tautomer-aware MCS
taut = ChemOptions.tautomer_profile()
mapping = find_mcs(g1, g2, chem=taut)

# With MCS options
mcs_opts = McsOptions()
mcs_opts.timeout_ms = 5000
mcs_opts.connected_only = True
mapping = find_mcs(g1, g2, opts=mcs_opts)

# Convenience wrapper (accepts SMILES strings directly)
from smsd import mcs
mapping = mcs("c1ccccc1", "Cc1ccccc1", tautomer_aware=True)
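
One common way to turn an MCS size into a score is the atom-level Tanimoto coefficient, |MCS| / (|A| + |B| - |MCS|). The sketch below assumes that `len(mapping)` gives the number of matched atoms and that atom counts are available (for example via `mol.n`). The function `mcs_tanimoto` is a hypothetical helper, and this is not necessarily the formula SMSD uses internally.

```python
def mcs_tanimoto(n_common: int, n_a: int, n_b: int) -> float:
    """Atom-level Tanimoto coefficient from an MCS size and two atom counts."""
    if n_common < 0 or n_common > min(n_a, n_b):
        raise ValueError("MCS size must lie between 0 and the smaller molecule")
    union = n_a + n_b - n_common
    # Two empty molecules are treated as identical by convention here.
    return n_common / union if union else 1.0


# Benzene (6 heavy atoms) vs. phenol (7 heavy atoms) with a 6-atom MCS
print(f"{mcs_tanimoto(6, 6, 7):.3f}")  # 0.857
```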

Similarity and Screening

from smsd import parse_smiles, similarity_upper_bound, screen_targets, similarity

g1 = parse_smiles("c1ccccc1")
g2 = parse_smiles("Cc1ccccc1")

# Single pair
sim = similarity_upper_bound(g1, g2)
print(f"Similarity: {sim:.3f}")

# Convenience wrapper (accepts SMILES)
sim = similarity("c1ccccc1", "Cc1ccccc1")

# Batch screening
smiles_list = ["c1ccccc1", "Cc1ccccc1", "CCO"]  # example library
library = [parse_smiles(s) for s in smiles_list]
query = parse_smiles("c1ccccc1")
hits = screen_targets(query, library, threshold=0.5)
# Returns indices of molecules with similarity >= 0.5
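
To illustrate why an upper bound is useful for screening, here is a minimal pure-Python sketch of a much simpler bound than RASCAL: it compares element histograms only, so it runs in time linear in the atom count. It is a generic illustration and not the bound SMSD implements. The names `element_count_upper_bound` and `screen_by_bound` are hypothetical, and the inputs are plain lists of atomic numbers such as those shown for `mol.atomic_num`.

```python
from collections import Counter
from typing import List, Sequence


def element_count_upper_bound(a: Sequence[int], b: Sequence[int]) -> float:
    """Upper bound on atom-level Tanimoto from element histograms alone."""
    count_a, count_b = Counter(a), Counter(b)
    # No common substructure can contain more atoms of an element
    # than the poorer of the two molecules has.
    max_common = sum((count_a & count_b).values())
    union = len(a) + len(b) - max_common
    return max_common / union if union else 1.0


def screen_by_bound(
    query: Sequence[int], library: Sequence[Sequence[int]], threshold: float
) -> List[int]:
    """Indices of library entries whose bound reaches the threshold."""
    return [
        i for i, target in enumerate(library)
        if element_count_upper_bound(query, target) >= threshold
    ]


benzene = [6] * 6
library = [[6] * 7, [6, 6, 8]]  # toluene, ethanol (heavy atoms only)
print(screen_by_bound(benzene, library, threshold=0.5))  # [0]
```

Because the bound never underestimates the true score, any molecule it rejects can be skipped safely before running the more expensive MCS search.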

Fingerprints

from smsd import (
    parse_smiles,
    path_fingerprint, mcs_fingerprint,
    fingerprint_subset, analyze_fp_quality,
    fingerprint, tanimoto,
)

mol = parse_smiles("c1ccccc1")

# Path fingerprint (returns set bit positions)
fp = path_fingerprint(mol, path_length=7, fp_size=2048)

# MCS-aware fingerprint
fp_mcs = mcs_fingerprint(mol, path_length=7, fp_size=2048)

# Subset check (for substructure pre-screening)
query_fp = path_fingerprint(parse_smiles("c1ccccc1"))
target_fp = path_fingerprint(parse_smiles("c1ccc(O)cc1"))
assert fingerprint_subset(query_fp, target_fp)

# Quality analysis
quality = analyze_fp_quality(fp)
print(quality)  # {'set_bits': 12, 'density': 0.006, ...}

# Convenience wrappers
fp = fingerprint("CCO", kind="mcs")
sim = tanimoto(fp, fingerprint("CCCO"))
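
The subset check and Tanimoto score above can be sketched in plain Python, assuming a fingerprint is simply a collection of set-bit positions (as the `path_fingerprint` comment suggests). The functions `is_bit_subset` and `tanimoto_bits` and the bit positions are made up for illustration; they are not the SMSD implementations.

```python
from typing import Iterable


def is_bit_subset(query_bits: Iterable[int], target_bits: Iterable[int]) -> bool:
    """True if every bit set in the query is also set in the target."""
    return set(query_bits) <= set(target_bits)


def tanimoto_bits(a: Iterable[int], b: Iterable[int]) -> float:
    """Tanimoto coefficient: shared set bits over bits set in either."""
    sa, sb = set(a), set(b)
    union = len(sa | sb)
    # Two empty fingerprints are scored 1.0 here; conventions differ.
    return len(sa & sb) / union if union else 1.0


query_bits = [3, 17, 42]            # made-up bit positions
target_bits = [3, 8, 17, 42, 99]
print(is_bit_subset(query_bits, target_bits))           # True
print(f"{tanimoto_bits(query_bits, target_bits):.3f}")  # 0.600
```

A failed subset check rules out a substructure match without running the full search. A passing check only means the match is still possible.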

MolGraph Builder

Build molecules directly without SMILES:

from smsd import MolGraphBuilder

builder = MolGraphBuilder(6)  # 6 atoms
for i in range(6):
    builder.atom(i, 6, charge=0, aromatic=True, in_ring=True)
for i in range(6):
    builder.bond(i, (i + 1) % 6, order=1, in_ring=True, aromatic=True)
benzene = builder.build()
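
The `(i + 1) % 6` expression in the loop above is what closes the ring: the last atom wraps around to atom 0. A small standalone sketch of that index pattern, with `ring_bonds` as a hypothetical helper that does not touch the SMSD builder:

```python
from typing import List, Tuple


def ring_bonds(n_atoms: int) -> List[Tuple[int, int]]:
    """Bond index pairs for a simple cycle of n_atoms atoms."""
    if n_atoms < 3:
        raise ValueError("a ring needs at least 3 atoms")
    # The modulo wraps the final atom back to atom 0, closing the cycle.
    return [(i, (i + 1) % n_atoms) for i in range(n_atoms)]


print(ring_bonds(6))  # [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
```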

Configuration

from smsd import ChemOptions, McsOptions, BondOrderMode, RingFusionMode

# ChemOptions controls atom/bond matching
chem = ChemOptions()
chem.match_atom_type = True
chem.match_formal_charge = True
chem.tautomer_aware = True
chem.complete_rings_only = True
chem.match_bond_order = BondOrderMode.LOOSE
chem.ring_fusion_mode = RingFusionMode.STRICT

# Named profiles
chem = ChemOptions.tautomer_profile()   # tautomer-aware defaults
chem = ChemOptions.profile("strict")    # strict matching

# McsOptions controls MCS algorithm behavior
opts = McsOptions()
opts.connected_only = True
opts.timeout_ms = 10000
opts.maximize_bonds = True  # MCES mode

Running Tests

cd python/
pip install -e ".[dev]"
pytest tests/ -v

License

Apache 2.0. Copyright (c) 2009-2026 Syed Asad Rahman, BioInception Labs.

Download files

Source Distribution

  • smsd-5.5.1.tar.gz (428.9 kB)

Built Distributions

  • smsd-5.5.1-cp313-cp313-win_amd64.whl (426.5 kB) -- CPython 3.13, Windows x86-64
  • smsd-5.5.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (480.0 kB) -- CPython 3.13, manylinux (glibc 2.17+), x86-64
  • smsd-5.5.1-cp313-cp313-macosx_11_0_arm64.whl (429.5 kB) -- CPython 3.13, macOS 11.0+, ARM64
  • smsd-5.5.1-cp312-cp312-win_amd64.whl (426.5 kB) -- CPython 3.12, Windows x86-64
  • smsd-5.5.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (480.0 kB) -- CPython 3.12, manylinux (glibc 2.17+), x86-64
  • smsd-5.5.1-cp312-cp312-macosx_11_0_arm64.whl (429.4 kB) -- CPython 3.12, macOS 11.0+, ARM64
  • smsd-5.5.1-cp311-cp311-win_amd64.whl (425.4 kB) -- CPython 3.11, Windows x86-64
  • smsd-5.5.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (479.4 kB) -- CPython 3.11, manylinux (glibc 2.17+), x86-64
  • smsd-5.5.1-cp311-cp311-macosx_11_0_arm64.whl (429.3 kB) -- CPython 3.11, macOS 11.0+, ARM64
  • smsd-5.5.1-cp310-cp310-win_amd64.whl (424.6 kB) -- CPython 3.10, Windows x86-64
  • smsd-5.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (478.3 kB) -- CPython 3.10, manylinux (glibc 2.17+), x86-64
  • smsd-5.5.1-cp310-cp310-macosx_11_0_arm64.whl (428.1 kB) -- CPython 3.10, macOS 11.0+, ARM64
