Open, Reproducible Assembly Index Calculations

assembly-theory Python Library

This is a Python library for computing the assembly index of molecules with high-performance Rust-based calculations. It integrates with RDKit and is built using maturin.

Installation

First, create and activate a virtual environment:

Windows:

python -m venv at_env
at_env\Scripts\activate

macOS & Unix:

python -m venv at_env
source at_env/bin/activate

Next, install maturin and build the library:

pip install maturin
maturin develop
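
After the build finishes, it can help to confirm that the package and its RDKit dependency are visible from the active virtual environment. The snippet below is a minimal sketch of such a check; the helper name missing_modules is hypothetical and not part of assembly-theory, and the module names are the import names used in the examples in this README.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names in `names` that cannot be located."""
    # find_spec returns None when a top-level module is not importable.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # assembly_theory is the import name used in this README's examples;
    # rdkit is its runtime dependency.
    missing = missing_modules(["assembly_theory", "rdkit"])
    if missing:
        print("Not importable in this environment:", ", ".join(missing))
    else:
        print("assembly_theory and rdkit are both importable.")
```

If either name is reported as missing, the virtual environment is probably not activated or the build step did not complete.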

Running Tests

To run the test suite, install pytest and execute the tests from the top-level assembly-theory directory:

pip install pytest
pytest test

Example Usage

assembly-theory computes the assembly index of a molecule passed as an RDKit Mol object. Here's a basic example:

import assembly_theory as at
from rdkit import Chem

anthracene = Chem.MolFromSmiles("c1ccc2cc3ccccc3cc2c1")
at.molecular_assembly(anthracene)  # 6

Core Functions

assembly-theory provides three main functions:

  • molecular_assembly(mol: Chem.Mol, bounds: set[str] = None, no_bounds: bool = False, timeout: int = None, serial: bool = False) -> int
    Computes the assembly index of a given molecule.

    • timeout (in seconds) sets a limit on computation time, raising a TimeoutError if exceeded.
    • serial=True forces a serial execution mode, mainly useful for debugging.
  • molecular_assembly_verbose(mol: Chem.Mol, bounds: set[str] = None, no_bounds: bool = False, timeout: int = None, serial: bool = False) -> dict
    Returns a dict containing the assembly index (index) along with additional details: the number of duplicated isomorphic subgraphs (duplicates) and the size of the search space (space).

    • timeout (in seconds) sets a limit on computation time, raising a TimeoutError if exceeded.
    • serial=True forces a serial execution mode, mainly useful for debugging.
  • molecule_info(mol: Chem.Mol) -> str
    Returns a string representation of the molecule’s atom and bond structure for debugging.
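
The timeout behaviour described above can be combined with a check for unparsable input, as in the following sketch. The wrapper assembly_index_from_smiles and its parse and compute parameters are hypothetical and not part of assembly-theory. They default to RDKit's Chem.MolFromSmiles and to molecular_assembly, and exist only so the wrapper can be exercised without the compiled extension. The sketch assumes, as stated above, that a TimeoutError is raised when the limit is exceeded.

```python
def assembly_index_from_smiles(smiles, timeout=None, parse=None, compute=None):
    """Return the assembly index for a SMILES string, or None on timeout.

    `parse` and `compute` are hypothetical injection points; when omitted,
    they are bound to Chem.MolFromSmiles and at.molecular_assembly.
    """
    if parse is None:
        from rdkit import Chem
        parse = Chem.MolFromSmiles
    if compute is None:
        import assembly_theory as at
        compute = at.molecular_assembly

    mol = parse(smiles)
    if mol is None:
        # RDKit returns None for SMILES strings it cannot parse.
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    try:
        return compute(mol, timeout=timeout)
    except TimeoutError:
        # The time limit (in seconds) was exceeded; report "no result".
        return None


# With rdkit and assembly_theory installed, a call might look like:
# assembly_index_from_smiles("c1ccc2cc3ccccc3cc2c1", timeout=60)
```

Returning None on timeout while raising on bad input keeps the two failure modes distinguishable; callers that prefer a single error path could re-raise the TimeoutError instead.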

Search Strategy Options

Both molecular_assembly and molecular_assembly_verbose support optional parameters for controlling the search strategy in the branch-and-bound algorithm:

  • bounds: set[str] – Specifies heuristic bounds used to optimize the search.

    • Options: {"log"}, {"intchain"}, or {"log", "intchain"}.
    • Defaults to the best-performing option when not specified.
  • no_bounds: bool – If True, disables all bounds, forcing an exhaustive search of all pathways.

The effect of these options can be observed using molecular_assembly_verbose:

from rdkit import Chem
import assembly_theory as at

anthracene = Chem.MolFromSmiles("c1ccc2cc3ccccc3cc2c1")

at.molecular_assembly_verbose(anthracene, bounds={"log"})
# {'index': 6, 'duplicates': 418, 'space': 40507}

at.molecular_assembly_verbose(anthracene, bounds={"intchain"})
# {'index': 6, 'duplicates': 418, 'space': 3484}

at.molecular_assembly_verbose(anthracene, bounds={"intchain", "log"})
# {'index': 6, 'duplicates': 418, 'space': 3081}

at.molecular_assembly_verbose(anthracene, no_bounds=True)
# {'index': 6, 'duplicates': 418, 'space': 129409}

Due to multiprocessing, the reported space values may vary slightly between runs. More details can be found in Seet et al. 2024 (TODO: JOSS Link).
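
The comparison above can also be collected programmatically. The following is a minimal sketch that loops over the four documented configurations and records the space value reported for each. The helper compare_search_space and its verbose_fn parameter are hypothetical and not part of assembly-theory; verbose_fn defaults to molecular_assembly_verbose and is injectable only so the loop can be tried without the compiled extension.

```python
def compare_search_space(mol, verbose_fn=None):
    """Map each search configuration label to its reported search-space size."""
    if verbose_fn is None:
        import assembly_theory as at
        verbose_fn = at.molecular_assembly_verbose

    # Keyword arguments for each configuration shown in this README.
    configs = {
        "log": {"bounds": {"log"}},
        "intchain": {"bounds": {"intchain"}},
        "log+intchain": {"bounds": {"log", "intchain"}},
        "no bounds": {"no_bounds": True},
    }

    results = {}
    for label, kwargs in configs.items():
        details = verbose_fn(mol, **kwargs)
        # 'space' is the search-space size key in the returned dict.
        results[label] = details["space"]
    return results


# With rdkit and assembly_theory installed, usage might look like:
# from rdkit import Chem
# mol = Chem.MolFromSmiles("c1ccc2cc3ccccc3cc2c1")
# for label, space in compare_search_space(mol).items():
#     print(f"{label:>12}: {space}")
```

Because space can vary slightly from run to run, small differences between configurations are better judged over several runs than from a single call.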

Cross-Platform Support

assembly-theory leverages Rust, maturin, and cargo for robust cross-platform support. However, since assembly-theory depends on RDKit, it is only available on platforms where RDKit is supported via PyPI, including:

  • windows-x64
  • macos-x86
  • macos-aarch64
  • ubuntu-x86
  • ubuntu-aarch64

If you are using a different platform and have RDKit installed (e.g., via conda), assembly-theory may work, but we do not guarantee compatibility.
