
A Python library to featurize molecules.

Project description



Molfeat is a Python library that simplifies molecular featurization. It supports a wide variety of molecular featurizers out of the box and can easily be extended with your own.

  • Simple Pythonic API.
  • Fast and efficient featurization.
  • Unifies pre-trained embeddings and hand-crafted featurizers in a single package.
  • Easily extend Molfeat with your own featurizers through plugins.
  • Benefit from increased performance through a trouble-free caching system.

Visit our website at https://molfeat.datamol.io.

Installation

Installing Molfeat

Use mamba:

mamba install -c conda-forge molfeat

Tip: You can replace mamba with conda.

Note: We highly recommend using a Conda Python distribution to install Molfeat. The package is also pip installable if you need it: pip install molfeat.
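To confirm that the installation succeeded, a quick check from Python can help. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of Molfeat.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("molfeat")
    print(f"molfeat {version}" if version else "molfeat is not installed")
```

The check reads package metadata only, so it does not import Molfeat or any of its dependencies.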

Installing Plugins

The functionality of Molfeat can be extended through plugins. The usage of a plugin system ensures that the core package remains easy to install and as light as possible, while making it easy to extend its functionality with plug-and-play components. Additionally, it ensures that plugins can be developed independently from the core package, removing the bottleneck of a central party that reviews and approves new plugins. Consult the Molfeat documentation for more details on how to create your own plugins.

As a consequence, installation steps differ from plugin to plugin. Please consult the documentation of each plugin to learn how to install it.
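Plug-and-play extension of this kind is commonly built on Python entry points, where an installed package advertises its components under a named group. The sketch below lists the names registered under a group using only the standard library. Whether Molfeat relies on this exact mechanism, and under which group names, is an assumption here: `example.plugins` is a placeholder and `list_entry_points` is a hypothetical helper. The Molfeat documentation remains the reference.

```python
from importlib import metadata
from typing import List


def list_entry_points(group: str) -> List[str]:
    """Return the sorted names of all entry points registered under `group`."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: the result supports filtering with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8 and 3.9: the result is a dict keyed by group name.
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


if __name__ == "__main__":
    # "example.plugins" is a placeholder group name, not a documented Molfeat group.
    print(list_entry_points("example.plugins"))
```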

Optional dependencies

Not all featurizers in the Molfeat core package work out of the box: some require additional dependencies. If you try to use one of these featurizers without its dependencies, Molfeat raises an error that tells you which dependencies are missing and how to install them.

API tour

import datamol as dm
from molfeat.calc import FPCalculator
from molfeat.trans import MoleculeTransformer
from molfeat.store.modelstore import ModelStore

# Load some dummy data
data = dm.data.freesolv().sample(500).smiles.values

# Featurize a single molecule
calc = FPCalculator("ecfp")
calc(data[0])

# Define a parallelized featurization pipeline
trans = MoleculeTransformer(calc, n_jobs=-1)
trans(data)

# Easily save and load featurizers
trans.to_state_yaml_file("state_dict.yml")
trans = MoleculeTransformer.from_state_yaml_file("state_dict.yml")
trans(data)

# List all available featurizers
store = ModelStore()
store.available_models

# Find a featurizer and learn how to use it
model_card = store.search(name="DeepChem-ChemBERTa-77M-MLM")[0]
model_card.usage()

# Load a featurizer through the store
trans, model_info = store.load(model_card)
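When the input is too large to featurize comfortably in one call, a featurizer can be applied chunk by chunk. The sketch below assumes only that the featurizer is a callable that takes a sequence of SMILES strings and returns one feature vector per molecule, as `trans(data)` does in the tour above. `featurize_in_batches` is a hypothetical helper, not part of Molfeat, and the stub featurizer exists only so the example runs without Molfeat installed.

```python
from typing import Any, Callable, List, Sequence


def featurize_in_batches(
    featurizer: Callable[[Sequence[str]], Sequence[Any]],
    smiles: Sequence[str],
    batch_size: int = 128,
) -> List[Any]:
    """Apply `featurizer` to consecutive slices of `smiles` and concatenate the results."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    features: List[Any] = []
    for start in range(0, len(smiles), batch_size):
        batch = smiles[start:start + batch_size]
        features.extend(featurizer(batch))
    return features


def stub_featurizer(batch: Sequence[str]) -> List[List[int]]:
    # Toy stand-in: one "feature" per molecule, the length of its SMILES string.
    return [[len(s)] for s in batch]


if __name__ == "__main__":
    print(featurize_in_batches(stub_featurizer, ["C", "CC", "CCO"], batch_size=2))
```

In practice, `stub_featurizer` would be replaced by a transformer such as the `trans` object built above.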

How to cite

Please cite Molfeat if you use it in your research: DOI.

Changelogs

See the latest changelogs at CHANGELOG.rst.

License

Under the Apache-2.0 license. See LICENSE.

Authors

See AUTHORS.rst.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

molfeat-0.8.0.tar.gz (240.9 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

molfeat-0.8.0-py3-none-any.whl (158.8 kB)

Uploaded Python 3

File details

Details for the file molfeat-0.8.0.tar.gz.

File metadata

  • Download URL: molfeat-0.8.0.tar.gz
  • Upload date:
  • Size: 240.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.11.2

File hashes

Hashes for molfeat-0.8.0.tar.gz
Algorithm Hash digest
SHA256 6226622d54b8836dc7364ae79077c25cda680d667967d986ced3f40909c44c7e
MD5 b330fe49d2e1a5f8b776066f1f96334c
BLAKE2b-256 856186b6b90782899092130a33a52e469babac5c8e7b308086bc34c77e080ef0


File details

Details for the file molfeat-0.8.0-py3-none-any.whl.

File metadata

  • Download URL: molfeat-0.8.0-py3-none-any.whl
  • Upload date:
  • Size: 158.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.11.2

File hashes

Hashes for molfeat-0.8.0-py3-none-any.whl
Algorithm Hash digest
SHA256 dc7d559507794c1115ddd0b0ea52e4c9a4296abaaec6fbc66a4b9cd688411be7
MD5 4909c5fa19d53ccc012b88934a00d7b7
BLAKE2b-256 7cd33749e04c3a9e3a891afbfe50fecd477c0b2819ce1275ba08282f1e4175ea

