SMSD -- Substructure & MCS search for chemical graphs

SMSD Python Bindings

Python interface for SMSD (Small Molecule Substructure Detector) -- a high-performance library for substructure search, Maximum Common Substructure (MCS), and molecular similarity.

Features

  • SMILES parsing and writing -- built-in OpenSMILES parser, no RDKit or CDK required
  • Substructure search -- VF2++ subgraph isomorphism
  • MCS search -- McSplit with seed-and-extend, orbit pruning, coverage-driven termination
  • Tautomer-aware matching -- keto/enol, amide, imidazole tautomer equivalence
  • RASCAL screening -- O(V+E) Tanimoto-like similarity upper bound
  • Fingerprints -- path-based and MCS-aware fingerprints for pre-screening

Requirements

  • Python >= 3.8
  • C++17 compiler (GCC 7+, Clang 5+, MSVC 2019+)
  • CMake >= 3.15
  • pybind11 >= 2.12

Installation

From source (recommended)

cd python/
pip install .

Development install

pip install -e ".[dev]"

Build with specific compiler

CMAKE_ARGS="-DCMAKE_CXX_COMPILER=g++-13" pip install .

Quick Start

from smsd import parse_smiles, find_mcs, is_substructure, similarity

# Parse SMILES strings
benzene = parse_smiles("c1ccccc1")
phenol  = parse_smiles("c1ccc(O)cc1")

# Substructure search
assert is_substructure(benzene, phenol)  # benzene is in phenol

# Maximum Common Substructure
mcs = find_mcs(benzene, phenol)
print(f"MCS size: {len(mcs)} atoms")  # 6

# Similarity
sim = similarity(benzene, phenol)
print(f"Similarity: {sim:.3f}")  # ~0.857
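The ~0.857 in the comment above is consistent with a Tanimoto-style ratio over atom counts: benzene has 6 heavy atoms, phenol has 7, and the MCS covers 6. The README does not state the exact formula that `similarity` uses, so the helper below (`tanimoto_from_counts`, a hypothetical name that is not part of smsd) is only a sketch of that arithmetic.

```python
def tanimoto_from_counts(n_a: int, n_b: int, n_common: int) -> float:
    """Tanimoto-style ratio: shared atoms over atoms present in either molecule."""
    union = n_a + n_b - n_common
    return n_common / union if union else 0.0


# benzene (6 atoms), phenol (7 heavy atoms), MCS of 6 atoms -> 6 / 7
score = tanimoto_from_counts(6, 7, 6)
print(f"{score:.3f}")  # 0.857
```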

API Reference

SMILES Parsing

from smsd import parse_smiles, to_smiles

mol = parse_smiles("c1ccccc1")
print(mol.n)              # 6 atoms
print(mol.atomic_num)     # [6, 6, 6, 6, 6, 6]
print(mol.aromatic)       # [True, True, True, True, True, True]

smi = to_smiles(mol)      # canonical SMILES string

Substructure Search

from smsd import parse_smiles, is_substructure, find_substructure, ChemOptions

query  = parse_smiles("c1ccccc1")
target = parse_smiles("c1ccc(O)cc1")

# Boolean check
if is_substructure(query, target):
    print("Query is a substructure of target")

# Get atom mapping
mapping = find_substructure(query, target)
# Returns list of (query_atom, target_atom) pairs
for qi, ti in mapping:
    print(f"  query atom {qi} -> target atom {ti}")

# Custom options
opts = ChemOptions()
opts.ring_matches_ring_only = True
is_substructure(query, target, opts=opts, timeout_ms=5000)
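Since `find_substructure` is described above as returning (query_atom, target_atom) pairs, plain Python is enough to post-process a match. The sketch below uses a hand-written mapping in that shape (an assumed example, not real library output, and it assumes atom indices follow SMILES order) together with a hypothetical helper, `unmatched_target_atoms`, to list the target atoms the query did not cover.

```python
from typing import Iterable, List, Tuple


def unmatched_target_atoms(mapping: Iterable[Tuple[int, int]], n_target: int) -> List[int]:
    """Return target atom indices that no query atom was mapped onto."""
    matched = {ti for _, ti in mapping}
    return [i for i in range(n_target) if i not in matched]


# Assumed mapping of benzene onto phenol "c1ccc(O)cc1":
# ring carbons are target atoms 0-3 and 5-6, the oxygen is target atom 4.
example_mapping = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 5), (5, 6)]

query_to_target = dict(example_mapping)  # lookup by query atom index
print(unmatched_target_atoms(example_mapping, n_target=7))  # [4]
```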

MCS Search

from smsd import parse_smiles, find_mcs, ChemOptions, McsOptions

g1 = parse_smiles("CC(=O)Oc1ccccc1C(=O)O")   # aspirin
g2 = parse_smiles("CC(=O)Nc1ccc(O)cc1")       # acetaminophen

# Default MCS
mapping = find_mcs(g1, g2)
print(f"MCS size: {len(mapping)}")

# Tautomer-aware MCS
taut = ChemOptions.tautomer_profile()
mapping = find_mcs(g1, g2, chem=taut)

# With MCS options
mcs_opts = McsOptions()
mcs_opts.timeout_ms = 5000
mcs_opts.connected_only = True
mapping = find_mcs(g1, g2, opts=mcs_opts)

# Convenience wrapper (accepts SMILES strings directly)
from smsd import mcs
mapping = mcs("c1ccccc1", "Cc1ccccc1", tautomer_aware=True)

Similarity and Screening

from smsd import parse_smiles, similarity_upper_bound, screen_targets, similarity

g1 = parse_smiles("c1ccccc1")
g2 = parse_smiles("Cc1ccccc1")

# Single pair
sim = similarity_upper_bound(g1, g2)
print(f"Similarity: {sim:.3f}")

# Convenience wrapper (accepts SMILES)
sim = similarity("c1ccccc1", "Cc1ccccc1")

# Batch screening
smiles_list = ["Cc1ccccc1", "c1ccc(O)cc1", "CCO"]  # example library; substitute your own SMILES
library = [parse_smiles(s) for s in smiles_list]
query = parse_smiles("c1ccccc1")
hits = screen_targets(query, library, threshold=0.5)
# Returns indices of molecules with similarity >= 0.5
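To illustrate why an O(V+E) upper bound is useful for screening, here is a minimal pure-Python sketch. It is not the smsd implementation: it assumes molecules are given as plain lists of atom labels and canonical bond labels, and it bounds a RASCAL-style score (common atoms plus common bonds, squared, over the product of the graph sizes) by counting shared labels. All names (`label_overlap`, `upper_bound_sketch`, `screen_sketch`) are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Mol = Tuple[Sequence[str], Sequence[str]]  # (atom labels, bond labels)


def label_overlap(a: Sequence[str], b: Sequence[str]) -> int:
    """Size of the multiset intersection of two label lists."""
    return sum((Counter(a) & Counter(b)).values())


def upper_bound_sketch(mol_a: Mol, mol_b: Mol) -> float:
    """Label-count bound: no common subgraph can share more labels than this."""
    atoms_a, bonds_a = mol_a
    atoms_b, bonds_b = mol_b
    size_a = len(atoms_a) + len(bonds_a)
    size_b = len(atoms_b) + len(bonds_b)
    if size_a == 0 or size_b == 0:
        return 0.0
    common = label_overlap(atoms_a, atoms_b) + label_overlap(bonds_a, bonds_b)
    return common * common / (size_a * size_b)


def screen_sketch(query: Mol, library: List[Mol], threshold: float) -> List[int]:
    """Keep indices whose bound reaches the threshold; the rest cannot pass."""
    return [i for i, mol in enumerate(library)
            if upper_bound_sketch(query, mol) >= threshold]


# Toy molecules; bond labels must be written in one canonical form per bond type.
benzene = (["c"] * 6, ["c:c"] * 6)
toluene = (["c"] * 6 + ["C"], ["c:c"] * 6 + ["C-c"])
ethanol = (["C", "C", "O"], ["C-C", "C-O"])

hits = screen_sketch(benzene, [toluene, ethanol], threshold=0.5)
print(hits)  # [0]
```

Molecules rejected by the bound never need the more expensive MCS computation; only the surviving indices would go on to an exact comparison.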

Fingerprints

from smsd import (
    path_fingerprint, mcs_fingerprint,
    fingerprint_subset, analyze_fp_quality,
    fingerprint, tanimoto, parse_smiles,
)

mol = parse_smiles("c1ccccc1")

# Path fingerprint (returns set bit positions)
fp = path_fingerprint(mol, path_length=7, fp_size=2048)

# MCS-aware fingerprint
fp_mcs = mcs_fingerprint(mol, path_length=7, fp_size=2048)

# Subset check (for substructure pre-screening)
query_fp = path_fingerprint(parse_smiles("c1ccccc1"))
target_fp = path_fingerprint(parse_smiles("c1ccc(O)cc1"))
assert fingerprint_subset(query_fp, target_fp)

# Quality analysis
quality = analyze_fp_quality(fp)
print(quality)  # {'set_bits': 12, 'density': 0.006, ...}

# Convenience wrappers
fp = fingerprint("CCO", kind="mcs")
sim = tanimoto(fp, fingerprint("CCCO"))
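The subset check and Tanimoto score above can be pictured as set operations on bit positions. The sketch below assumes fingerprints are plain collections of set-bit indices, as the path fingerprint comment suggests; the helper names `is_subset_sketch` and `tanimoto_sketch` are hypothetical, and the library's own types and edge-case conventions may differ.

```python
from typing import Iterable


def is_subset_sketch(query_bits: Iterable[int], target_bits: Iterable[int]) -> bool:
    """A query can only be a substructure if every query bit is also set in the target."""
    return set(query_bits) <= set(target_bits)


def tanimoto_sketch(bits_a: Iterable[int], bits_b: Iterable[int]) -> float:
    """Shared bits divided by bits set in either fingerprint."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    # Two empty fingerprints score 0.0 here; that is a convention chosen for this sketch.
    return len(a & b) / union if union else 0.0


query_bits = [1, 5]
target_bits = [1, 3, 5, 9]
print(is_subset_sketch(query_bits, target_bits))  # True
print(tanimoto_sketch(query_bits, target_bits))   # 0.5
```

A failed subset check rules a target out without running the full subgraph search; a passed check still needs the exact search, since different paths can hash to the same bit.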

MolGraph Builder

Build molecules directly without SMILES:

from smsd import MolGraphBuilder

builder = MolGraphBuilder(6)  # 6 atoms
for i in range(6):
    builder.atom(i, 6, charge=0, aromatic=True, in_ring=True)
for i in range(6):
    builder.bond(i, (i + 1) % 6, order=1, in_ring=True, aromatic=True)
benzene = builder.build()

Configuration

from smsd import ChemOptions, McsOptions, BondOrderMode, RingFusionMode

# ChemOptions controls atom/bond matching
chem = ChemOptions()
chem.match_atom_type = True
chem.match_formal_charge = True
chem.tautomer_aware = True
chem.complete_rings_only = True
chem.match_bond_order = BondOrderMode.LOOSE
chem.ring_fusion_mode = RingFusionMode.STRICT

# Named profiles
chem = ChemOptions.tautomer_profile()   # tautomer-aware defaults
chem = ChemOptions.profile("strict")    # strict matching

# McsOptions controls MCS algorithm behavior
opts = McsOptions()
opts.connected_only = True
opts.timeout_ms = 10000
opts.maximize_bonds = True  # maximize matched bonds (MCES, maximum common edge subgraph, mode)

Running Tests

cd python/
pip install -e ".[dev]"
pytest tests/ -v

License

Apache 2.0. Copyright (c) 2018-2026 BioInception PVT LTD. Algorithm Copyright (c) 2009-2026 Syed Asad Rahman.

Download files

Source distribution

  • smsd-5.7.1.tar.gz (447.8 kB)

Built distributions

  • smsd-5.7.1-cp313-cp313-win_amd64.whl (444.1 kB) -- CPython 3.13, Windows x86-64
  • smsd-5.7.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (504.7 kB) -- CPython 3.13, manylinux (glibc 2.17+) x86-64
  • smsd-5.7.1-cp313-cp313-macosx_11_0_arm64.whl (453.8 kB) -- CPython 3.13, macOS 11.0+ ARM64
  • smsd-5.7.1-cp312-cp312-win_amd64.whl (444.2 kB) -- CPython 3.12, Windows x86-64
  • smsd-5.7.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (504.6 kB) -- CPython 3.12, manylinux (glibc 2.17+) x86-64
  • smsd-5.7.1-cp312-cp312-macosx_11_0_arm64.whl (453.8 kB) -- CPython 3.12, macOS 11.0+ ARM64
  • smsd-5.7.1-cp311-cp311-win_amd64.whl (442.6 kB) -- CPython 3.11, Windows x86-64
  • smsd-5.7.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (503.2 kB) -- CPython 3.11, manylinux (glibc 2.17+) x86-64
  • smsd-5.7.1-cp311-cp311-macosx_11_0_arm64.whl (453.6 kB) -- CPython 3.11, macOS 11.0+ ARM64
  • smsd-5.7.1-cp310-cp310-win_amd64.whl (441.9 kB) -- CPython 3.10, Windows x86-64
  • smsd-5.7.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (502.1 kB) -- CPython 3.10, manylinux (glibc 2.17+) x86-64
  • smsd-5.7.1-cp310-cp310-macosx_11_0_arm64.whl (452.4 kB) -- CPython 3.10, macOS 11.0+ ARM64
