Purchasable SMILES filter

molbloom

Can I buy this molecule? molbloom answers in about 500 ns and consumes about 100 MB of RAM (or 2 GB if using all of ZINC20).

pip install molbloom
from molbloom import buy
buy('CCCO')
# True
buy('ONN1CCCC1')
# False

If buy returns True, the molecule may be purchasable; the measured false positive rate is 0.0003. If it returns False, the molecule is not in the catalog and so is not purchasable. The catalog information is from ZINC20. Add canonicalize=True if your SMILES are not canonicalized (requires installing rdkit).
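The asymmetry between True and False comes from how a Bloom filter works. Below is a toy sketch in pure Python that shows the property; the class name TinyBloom, the salted SHA-256 hashing, and the bit and hash counts are all assumptions made for illustration and are not molbloom's actual implementation.

```python
import hashlib


class TinyBloom:
    """Toy Bloom filter for illustration only (not molbloom's implementation)."""

    def __init__(self, n_bits: int, n_hashes: int) -> None:
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive n_hashes bit positions by salting a single hash function.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item: str) -> None:
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True only if every derived bit is set; other items may have set them too.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


catalog = ["CCCO", "CCO", "c1ccccc1"]
bf = TinyBloom(n_bits=1024, n_hashes=7)
for smiles in catalog:
    bf.add(smiles)

# Every added string is always found: a Bloom filter has no false negatives.
print(all(s in bf for s in catalog))
```

A string that was never added can still test True if other entries happened to set all of its bits, which is why a True answer is only probabilistic while a False answer is definitive.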

Other catalogs are available; list them with molbloom.catalogs(). Most catalogs require an initial download. buy('CCCO', catalog='zinc-instock-mini') does not require a download because that catalog is included in the package. It is useful for testing, but has a high false positive rate of 1%.
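To get a feel for what these false positive rates mean in practice, here is a rough Bayes-rule estimate of how often a True answer is correct. The function name and the prior of 1% are hypothetical choices for illustration; the calculation assumes the filter has no false negatives and that the quoted false positive rates apply to your query set.

```python
def prob_purchasable_given_hit(prior: float, false_positive_rate: float) -> float:
    """P(in catalog | filter says True), assuming no false negatives."""
    # A hit is either a true member or a non-member that collided.
    hit_rate = prior + (1.0 - prior) * false_positive_rate
    return prior / hit_rate


# Compare the two false positive rates quoted above for a hypothetical
# query set in which 1% of the molecules are really in the catalog.
for fpr in (0.0003, 0.01):
    print(fpr, prob_purchasable_given_hit(prior=0.01, false_positive_rate=fpr))
```

When few of your queries are really in the catalog, even a 1% false positive rate can make a large share of the True answers wrong, which is why the bundled mini catalog is better suited to testing than to screening.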

Simple Reagents

By default, buy first checks against common organic reagents such as water and ether. You can disable this check by adding check_common=False.

Querying Small World

Just because buy returns True doesn't mean you can buy it. You should follow up with a real query at ZINC, or you can use the search feature in SmallWorld to find similar purchasable molecules.

from smallworld_api import SmallWorld
sw = SmallWorld()

aspirin = 'O=C(C)Oc1ccccc1C(=O)O'
results = sw.search(aspirin, dist=5, db=sw.REAL_dataset)

This will query ZINC Small World.

Custom Filter

Do you have your own list of SMILES? There are two ways to build a filter. You can use a C tool that is very fast (about 1M SMILES per second) if your SMILES are in a file and already canonical, or you can use the Python API to programmatically build a filter and canonicalize as you go. Both are described below.

Once your custom filter is built:

from molbloom import BloomFilter
bf = BloomFilter('myfilter.bloom')
# usage:
'CCCO' in bf

Build with C Tool

You can build your own filter using the code in the tool/ directory.

cd tool
make
./molbloom-bloom <MB of final filter> <filter name> <approx number of compounds> <input file 1> <input file 2> ...

where each input file has one SMILES per line in the first column and is already canonicalized. The larger the filter in MB, the lower the false positive rate. If you want to choose the false positive rate rather than the size, you can use the equation:

$$ M = - \frac{N \ln \epsilon}{(\ln 2)^2} $$

where $M$ is the size in bits, $N$ is the number of compounds, and $\epsilon$ is the false positive rate.
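Since the tool takes its size in MB while the equation gives bits, a small helper can do the conversion. This is a minimal sketch: the function names are hypothetical, and it assumes 1 MB = 10^6 bytes, which may differ from how the tool interprets its MB argument.

```python
import math


def filter_size_bits(n_compounds: int, false_positive_rate: float) -> int:
    """Bits needed: M = -N ln(eps) / (ln 2)^2, rounded up."""
    if not 0.0 < false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be strictly between 0 and 1")
    return math.ceil(
        -n_compounds * math.log(false_positive_rate) / math.log(2) ** 2
    )


def bits_to_megabytes(n_bits: int) -> float:
    # Assumption: 1 MB = 10**6 bytes; the C tool may define MB differently.
    return n_bits / 8 / 1e6


bits = filter_size_bits(n_compounds=1_000_000, false_positive_rate=0.001)
print(f"{bits} bits, about {bits_to_megabytes(bits):.2f} MB")
```

Round the result up to the next whole MB when passing it to the tool, since a larger filter only lowers the false positive rate.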

Build with Python

You can also build a filter using Python as follows:

from molbloom import CustomFilter, canon
bf = CustomFilter(size=100, n=1000, name='myfilter')
bf.add('CCCO')
# canonicalize one record
s = canon("CCCOC")
bf.add(s)
# finalize filter into a file
bf.save('test.bloom')

Citation

@article{medina2023bloom,
  title={Bloom filters for molecules},
  author={Medina, Jorge and White, Andrew D},
  journal={Journal of Cheminformatics},
  volume={15},
  number={1},
  pages={95},
  year={2023},
  publisher={Springer}
}


Download files

Source Distribution

  • molbloom-3.1.0.tar.gz (9.3 MB): Source

Built Distributions

  • molbloom-3.1.0-cp313-cp313-win_amd64.whl (9.2 MB): CPython 3.13, Windows x86-64
  • molbloom-3.1.0-cp313-cp313-win32.whl (9.2 MB): CPython 3.13, Windows x86
  • molbloom-3.1.0-cp313-cp313-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.4 MB): CPython 3.13, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64
  • molbloom-3.1.0-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (9.3 MB): CPython 3.13, manylinux: glibc 2.17+ i686, manylinux: glibc 2.5+ i686
  • molbloom-3.1.0-cp313-cp313-macosx_11_0_arm64.whl (9.2 MB): CPython 3.13, macOS 11.0+ ARM64
  • molbloom-3.1.0-cp312-cp312-win_amd64.whl (9.2 MB): CPython 3.12, Windows x86-64
  • molbloom-3.1.0-cp312-cp312-win32.whl (9.2 MB): CPython 3.12, Windows x86
  • molbloom-3.1.0-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.4 MB): CPython 3.12, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64
  • molbloom-3.1.0-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (9.4 MB): CPython 3.12, manylinux: glibc 2.17+ i686, manylinux: glibc 2.5+ i686
  • molbloom-3.1.0-cp312-cp312-macosx_11_0_arm64.whl (9.2 MB): CPython 3.12, macOS 11.0+ ARM64
  • molbloom-3.1.0-cp311-cp311-win_amd64.whl (9.2 MB): CPython 3.11, Windows x86-64
  • molbloom-3.1.0-cp311-cp311-win32.whl (9.2 MB): CPython 3.11, Windows x86
  • molbloom-3.1.0-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.4 MB): CPython 3.11, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64
  • molbloom-3.1.0-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (9.3 MB): CPython 3.11, manylinux: glibc 2.17+ i686, manylinux: glibc 2.5+ i686
  • molbloom-3.1.0-cp311-cp311-macosx_11_0_arm64.whl (9.2 MB): CPython 3.11, macOS 11.0+ ARM64

File details

Details for the file molbloom-3.1.0.tar.gz.

File metadata

  • Download URL: molbloom-3.1.0.tar.gz
  • Size: 9.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for molbloom-3.1.0.tar.gz
Algorithm Hash digest
SHA256 8f6125a6f02571a75f965a101ad52fc064e5e09b517076cf7d930080f2b91ef6
MD5 abaa39313564fce644fa66c65420ffe9
BLAKE2b-256 f6cd4f8b3ea24b08063ddf897e3c8032ab2f45fdd12a0045ea63a7d376758eea

See more details on using hashes here.

File details

Details for the file molbloom-3.1.0-cp313-cp313-win_amd64.whl.

File metadata

  • Download URL: molbloom-3.1.0-cp313-cp313-win_amd64.whl
  • Size: 9.2 MB
  • Tags: CPython 3.13, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for molbloom-3.1.0-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 2ee9bad9061673f9b6fcd37bdb5644e735e664326520c6dcdb5483666b5ad907
MD5 15622d1090437d2e9e4745f256b221c9
BLAKE2b-256 459c7e2818f9a4d4f4d2d069f79e410048372a701b41e21b471666a4a753c1ec

File details

Details for the file molbloom-3.1.0-cp313-cp313-win32.whl.

File metadata

  • Download URL: molbloom-3.1.0-cp313-cp313-win32.whl
  • Size: 9.2 MB
  • Tags: CPython 3.13, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for molbloom-3.1.0-cp313-cp313-win32.whl
Algorithm Hash digest
SHA256 dac33694c946a9035e4786ea50d9ec4859ef767dd95fff248fdc148566b4afee
MD5 52c9848ce582404a03d45b3ff55c6433
BLAKE2b-256 4ab2c993fd3bb4c75ca0aec8dcc18ac09c78074769c1589f8d2cb9c2e1092d95

File details

Details for the file molbloom-3.1.0-cp313-cp313-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for molbloom-3.1.0-cp313-cp313-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 702ca43b4397f2d9bfa522b294c802c8c92d1453464524978f83826864c0912c
MD5 aff86d87c9cc339a38a74d2cd220ced9
BLAKE2b-256 5df80563b0f78070f1e25f4d70eed9a8add83002497bc76259683df0cd7ee2ca

File details

Details for the file molbloom-3.1.0-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl.

File hashes

Hashes for molbloom-3.1.0-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl
Algorithm Hash digest
SHA256 3305978203f8facc9a230a7676e5d41116dfeec5b91c8e05feb908742753d0a7
MD5 aa9924d7c8e991bb049378baa48b9ad1
BLAKE2b-256 6c92918622294dd07ccc3e4f0cb00e81fce7943a94fbec270b2ba0bff28a7be3

File details

Details for the file molbloom-3.1.0-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for molbloom-3.1.0-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 d507b9ceb91b553ebc040a8296433ef3d876dbdcf2fd3ab0744b9c8138dcdd11
MD5 b61abfddead040c97476e0e499756d26
BLAKE2b-256 1c52ff3bd9f9674f90da7d553a956eafb0c86c1c16a19d009cd482bcad9b0aab

File details

Details for the file molbloom-3.1.0-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: molbloom-3.1.0-cp312-cp312-win_amd64.whl
  • Size: 9.2 MB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for molbloom-3.1.0-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 3c614c3590c362dde3136e4e80a17a954f6a994376351fee5e04c90ca605ae49
MD5 283d3ff50a2f3911f1eb276c0aeb9fb9
BLAKE2b-256 e5e1ae44cb31ab9e77f8fe7a1fa972adb06aa2e125869370370487aa8472a824

File details

Details for the file molbloom-3.1.0-cp312-cp312-win32.whl.

File metadata

  • Download URL: molbloom-3.1.0-cp312-cp312-win32.whl
  • Size: 9.2 MB
  • Tags: CPython 3.12, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for molbloom-3.1.0-cp312-cp312-win32.whl
Algorithm Hash digest
SHA256 3cc78218b06bda64a30871632fe49f278ea019c42b026087224d00979a59859b
MD5 070777ee43a174c6bee6f7001916151a
BLAKE2b-256 19b9cb70ab199ad8479b1941e7c1c0e6cdea0d8df55ddbc26afa23308e29d82c

File details

Details for the file molbloom-3.1.0-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for molbloom-3.1.0-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 c4edf006dab0a6014e8961e846d375f669bb84c0deb6cb72fa373ad82d6e042d
MD5 f51a574ef6e9a27b9b030e188a0df15a
BLAKE2b-256 ba96c7a59230d4ba6060b1a330bbe6510ef926ad30e1364217503763bb342df1

File details

Details for the file molbloom-3.1.0-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl.

File hashes

Hashes for molbloom-3.1.0-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl
Algorithm Hash digest
SHA256 4abca32ebd9e556c257e0650d98bc0b0a57e9dcb93e096a548674c8614008f5d
MD5 01d8a32ffca6938d2027a9b158129fc3
BLAKE2b-256 005e90e9d026e985a8deafd1f22b6398656ce4da37f40465b5aeca3952b7084a

File details

Details for the file molbloom-3.1.0-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for molbloom-3.1.0-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 883e1c238bbe03da98704261181b27af440afc82753089bd1c1c9ade2209266f
MD5 c85107d3b16d639c146e1425ea3835be
BLAKE2b-256 fcb3ab07e8805b0713feab4797047ffcba56a862947e15d5567c26ef2094d4f8

File details

Details for the file molbloom-3.1.0-cp311-cp311-win_amd64.whl.

File metadata

  • Download URL: molbloom-3.1.0-cp311-cp311-win_amd64.whl
  • Size: 9.2 MB
  • Tags: CPython 3.11, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for molbloom-3.1.0-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 d954e4c0f14c7dfc0e55915021d8502264932d5bf29dc24215c8d2efb83c5e72
MD5 2c56e1fa7c9afa6a814126c508ce07ea
BLAKE2b-256 04d3b02f3c99c1d9ba16421a2ebe5ffe1eb7bebd9868ae3c5e7581686d7e1508

File details

Details for the file molbloom-3.1.0-cp311-cp311-win32.whl.

File metadata

  • Download URL: molbloom-3.1.0-cp311-cp311-win32.whl
  • Size: 9.2 MB
  • Tags: CPython 3.11, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for molbloom-3.1.0-cp311-cp311-win32.whl
Algorithm Hash digest
SHA256 140bfccbcf2ec5bb625fe1a8b1ee5fd0075f9e5725ba4933d033d05ce0256bdd
MD5 12ce5b17040f0a4498345d503b1f740a
BLAKE2b-256 4a8313fbfe797a5b63cb101764ad391c1aeb30e3fa39d56ecc49f586f43e9d0b

File details

Details for the file molbloom-3.1.0-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for molbloom-3.1.0-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 8ab1b12ce7821f16d0b5e22ec4193566d4244747c1666128908ab3f102ae3972
MD5 ebf009815d123abd9a0eeb449aff0a4c
BLAKE2b-256 4a5869f79338ee0d272e6b5fbf3c91976b59b9b72db4638b36ed631e1a9c39e7

File details

Details for the file molbloom-3.1.0-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl.

File hashes

Hashes for molbloom-3.1.0-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl
Algorithm Hash digest
SHA256 e7b03269ec878ee70ddad0b05076cae24d767c7b4778e91472d196bdfa18ded5
MD5 9e939158625d153fb30795874c8a5260
BLAKE2b-256 b7e784d271277a9192a34e2773e3500abd7e042ec8acbf958838a6582a813bc1

File details

Details for the file molbloom-3.1.0-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for molbloom-3.1.0-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 3f8394854151a0cf2b05637eebd41123f99fe11488e11377e3d0c60dfd80fb82
MD5 20cc26eba79d18d3ec79151770ed70e3
BLAKE2b-256 be43d1e53b480781c910178a12664dcebd3275f48b858d5a1178f8851a88b67e
