Purchasable SMILES filter

molbloom

Can I buy this molecule? molbloom returns an answer in about 500 ns and consumes about 100 MB of RAM (or 2 GB if using all of ZINC20).

pip install molbloom

from molbloom import buy
buy('CCCO')
# True
buy('ONN1CCCC1')
# False

If buy returns True, the molecule may be purchasable; the measured error (false positive) rate is 0.0003. If it returns False, the molecule is not purchasable. The catalog information is from ZINC20. Add canonicalize=True if your SMILES are not canonicalized (requires installing rdkit).
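The asymmetry between True and False follows from how a Bloom filter works. The sketch below is a toy filter written for illustration only; it is not molbloom's implementation, and the class name `ToyBloom`, the SHA-256 hashing scheme, and the sizes are all assumptions chosen to keep the example short.

```python
import hashlib


class ToyBloom:
    """Illustrative Bloom filter (hypothetical; not molbloom's implementation)."""

    def __init__(self, n_bits: int, n_hashes: int):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive n_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item: str) -> None:
        # Set every bit that the item hashes to.
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str) -> bool:
        # Present only if all of the item's bits are set.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


catalog = ["CCCO", "CCO", "c1ccccc1"]
bf = ToyBloom(n_bits=1024, n_hashes=7)
for smi in catalog:
    bf.add(smi)

# Every added molecule is reported as present: there are no false negatives.
print(all(smi in bf for smi in catalog))
```

An item that was added always has all of its bits set, so a False answer is definite. An item that was never added can still find all of its bits set by other items, which is why a True answer is only probable.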

Other catalogs are available; list them with molbloom.catalogs(). Most catalogs require an initial download. The catalog used in buy('CCCO', catalog='zinc-instock-mini') is included in the package and requires no download. It is useful for testing, but has a high false positive rate of 1%.

Simple Reagents

By default, buy first checks against common organic reagents such as water and ether. You can disable this check by adding check_common=False.

Querying Small World

Just because buy returns True doesn't mean you can buy the molecule. You should follow up with a real query at ZINC, or you can use the search feature in SmallWorld to find similar purchasable molecules.

from smallworld_api import SmallWorld
sw = SmallWorld()

aspirin = 'O=C(C)Oc1ccccc1C(=O)O'
results = sw.search(aspirin, dist=5, db=sw.REAL_dataset)

This will query ZINC Small World.

Custom Filter

Do you have your own list of SMILES? There are two ways to build a filter. You can use a C tool that is very fast (1M / s) if your SMILES are in a file and already canonical, or you can use the Python API to programmatically build a filter and canonicalize as you go. Both are described below.

Once your custom filter is built:

from molbloom import BloomFilter
bf = BloomFilter('myfilter.bloom')
# usage:
'CCCO' in bf
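Because membership is tested with the `in` operator, screening a list of SMILES reduces to a list comprehension. The helper below is a minimal sketch: the function name `screen` is hypothetical, and a plain Python set stands in for the loaded filter so the example runs without a .bloom file.

```python
from typing import Container, Iterable, List


def screen(smiles: Iterable[str], bf: Container[str]) -> List[str]:
    """Return the SMILES that the filter reports as possibly present."""
    return [s for s in smiles if s in bf]


# Stand-in for a loaded filter; any object supporting `in` works here.
stand_in = {"CCCO", "CCO"}
hits = screen(["CCCO", "ONN1CCCC1", "CCO"], stand_in)
print(hits)  # ['CCCO', 'CCO']
```

With a real filter the returned list can include false positives, so treat it as a shortlist rather than a confirmed set.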

Build with C Tool

You can build your own filter using the code in the tool/ directory.

cd tool
make
./molbloom-bloom <MB of final filter> <filter name> <approx number of compounds> <input file 1> <input file 2> ...

where each input file has one SMILES per line in the first column and is already canonicalized. The larger the filter in MB, the lower the rate of false positives. If you want to choose the false positive rate rather than the size, you can use the equation:

$$ M = - \frac{N \ln \epsilon}{(\ln 2)^2} $$

where $M$ is the size in bits, $N$ is the number of compounds, and $\epsilon$ is the false positive rate.
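As a worked example, the equation can be evaluated directly. This is a small sketch with hypothetical helper names (`bloom_size_bits`, `bits_to_mb`); the conversion assumes 1 MB = 10^6 bytes, which may not match how the C tool interprets its MB argument.

```python
import math


def bloom_size_bits(n_items: int, fp_rate: float) -> int:
    """Filter size M in bits for N items and false positive rate epsilon."""
    if n_items <= 0 or not 0.0 < fp_rate < 1.0:
        raise ValueError("need n_items > 0 and 0 < fp_rate < 1")
    # M = -N ln(epsilon) / (ln 2)^2, rounded up to a whole bit.
    return math.ceil(-n_items * math.log(fp_rate) / math.log(2) ** 2)


def bits_to_mb(bits: int) -> float:
    """Convert bits to megabytes, assuming 1 MB = 10**6 bytes."""
    return bits / 8 / 1e6


bits = bloom_size_bits(n_items=1_000_000, fp_rate=0.0003)
print(f"{bits} bits, about {bits_to_mb(bits):.2f} MB")
```

At a false positive rate of 0.0003 the formula gives roughly 17 bits per compound, so the required size grows linearly with the number of compounds.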

Build with Python

You can also build a filter using Python as follows:

from molbloom import CustomFilter, canon
bf = CustomFilter(size=100, n=1000, name='myfilter')
bf.add('CCCO')
# canonicalize one record
s = canon("CCCOC")
bf.add(s)
# finalize filter into a file
bf.save('test.bloom')

Citation

@article{medina2023bloom,
  title={Bloom filters for molecules},
  author={Medina, Jorge and White, Andrew D},
  journal={Journal of Cheminformatics},
  volume={15},
  number={1},
  pages={95},
  year={2023},
  publisher={Springer}
}
