Filesystem handling utilities

CARATE

License: GPL v3. Code style: Black.

Bert goes into the karate club

Why

Molecular representation is wrecked. Seriously! We chemists have talked for decades in an ancient language about something we cannot comprehend with that language. We have to stop it, now!

What

The success of transformer models is evident. Applied to molecules, this calls for a graph-based transformer. Such a model can learn hidden representations that are better suited to describing a molecule.

For a chemist it is quite intuitive, but it is seldom modelled as such: a molecule exhibits its properties through its combined electronic and structural features.
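
To make the idea concrete, a molecule can be written down as a graph whose nodes carry atomic features and whose edges carry bonds. The sketch below uses only the standard library and hand-written data for ethanol (heavy atoms only). The names `Atom` and `MoleculeGraph` are illustrative assumptions and are not part of CARATE.

```python
from dataclasses import dataclass, field


@dataclass
class Atom:
    symbol: str          # element symbol, e.g. "C"
    atomic_number: int   # a very simple "electronic" node feature


@dataclass
class MoleculeGraph:
    atoms: list[Atom]
    # Each bond is (index_a, index_b, bond_order): the "structural" part.
    bonds: list[tuple[int, int, int]] = field(default_factory=list)

    def neighbors(self, index: int) -> list[int]:
        """Return the indices of the atoms bonded to the atom at `index`."""
        result = []
        for a, b, _order in self.bonds:
            if a == index:
                result.append(b)
            elif b == index:
                result.append(a)
        return result

    def edge_index(self) -> list[list[int]]:
        """Return both directions of every bond as [sources, targets]."""
        sources, targets = [], []
        for a, b, _order in self.bonds:
            sources += [a, b]
            targets += [b, a]
        return [sources, targets]


# Ethanol, heavy atoms only: C-C-O (hand-written illustrative data).
ethanol = MoleculeGraph(
    atoms=[Atom("C", 6), Atom("C", 6), Atom("O", 8)],
    bonds=[(0, 1, 1), (1, 2, 1)],
)
```

The two-row edge list returned by `edge_index()` has the same layout that PyTorch Geometric uses for its `edge_index` tensor, which is one reason this form is convenient for graph-based models.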

Scope

The aim is to implement the algorithm in a reusable way, e.g. for the chembee pattern. In fact, this project mimics the chembee pattern in order to provide a standalone tool. The overall structure of the program is reusable for other deep-learning projects and will be moved into its own project, which should work similarly to opinionated frameworks.

Installation on CPU

Prepare system

sudo apt-get install python3-dev libffi-dev
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu 
pip install torch-scatter torch-sparse torch-geometric rdkit-pypi "networkx[default]" matplotlib
pip install torch-cluster 
pip install torch-spline-conv 
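
After running the commands above, it can help to confirm that every dependency is importable before starting a training run. The following is a minimal sketch using only the standard library. The helper name `missing_modules` is an assumption made for illustration. The module names are the import names of the packages installed above (for example, `rdkit-pypi` is imported as `rdkit`).

```python
from importlib.util import find_spec

# Import names corresponding to the pip packages listed above.
REQUIRED_MODULES = [
    "torch",
    "torchvision",
    "torchaudio",
    "torch_scatter",
    "torch_sparse",
    "torch_geometric",
    "rdkit",
    "networkx",
    "matplotlib",
    "torch_cluster",
    "torch_spline_conv",
]


def missing_modules(names):
    """Return the names that cannot be located, without importing them."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

`find_spec` only locates a top-level module and does not execute it, so the check is fast, but it will not catch a package that is present and fails at import time.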

Usage

The program is meant to be run as a simple CLI. You can specify the configuration via a JSON file and use the program as a microservice, or you can run it locally from the command line. It is up to you.

bash install.sh
carate -c path_to_config_file.py

Examples of config.py files are given in config_files.

Alternatively, the tutorial.ipynb in notebooks shows how to use the package with a .json file.
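
As a rough illustration of the JSON route, the sketch below writes a configuration dictionary to a JSON file and builds the argument list for the `carate -c` call shown above. The configuration keys are hypothetical placeholders, not CARATE's real schema; the files in config_files and the tutorial notebook are the authoritative reference. Whether `-c` accepts a JSON file directly is also an assumption here.

```python
import json
import tempfile
from pathlib import Path


def write_config(config: dict, directory: Path) -> Path:
    """Write `config` as JSON into `directory` and return the file path."""
    path = directory / "config.json"
    path.write_text(json.dumps(config, indent=2))
    return path


def build_command(config_path: Path) -> list[str]:
    """Build the argument list for a `carate -c <file>` call."""
    return ["carate", "-c", str(config_path)]


# Hypothetical keys, for illustration only; see config_files for real ones.
example_config = {"dataset_name": "EXAMPLE", "num_epoch": 10}

with tempfile.TemporaryDirectory() as tmp:
    config_path = write_config(example_config, Path(tmp))
    command = build_command(config_path)
    # Read the file back to confirm it round-trips as valid JSON.
    loaded = json.loads(config_path.read_text())
```

Once the package is installed, the resulting `command` list could be passed to `subprocess.run(command, check=True)`.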

Training results

Most of the training results are saved in pairs. The reason for this data structure is simply that the training can be interrupted at any time and for any reason, yet the current result can still be saved or sent across a network.

As a result, downstream ETL or data processing need not be affected by an interruption on the training machine.
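
One common way to get this kind of interruption tolerance is to write the accumulated results to disk after every epoch, using a temporary file and an atomic rename. The sketch below shows that general technique with the standard library only. It does not reproduce CARATE's actual result format or its pair structure, and `save_result`, `train`, and the placeholder loss are assumptions made for illustration.

```python
import json
import os
import tempfile
from pathlib import Path


def save_result(path: Path, result: dict) -> None:
    """Atomically write `result` as JSON to `path`."""
    # Write to a temporary file in the same directory, then rename it over
    # the target, so a reader never sees a half-written file.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(result, handle)
    os.replace(tmp_name, path)


def train(num_epochs: int, out_dir: Path) -> None:
    """Stand-in training loop that saves the history after every epoch."""
    history = {"epoch": [], "loss": []}
    for epoch in range(num_epochs):
        loss = 1.0 / (epoch + 1)  # placeholder value, not a real loss
        history["epoch"].append(epoch)
        history["loss"].append(loss)
        save_result(out_dir / "history.json", history)


with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)
    train(3, out_dir)
    saved = json.loads((out_dir / "history.json").read_text())
```

If the loop is interrupted partway through, `history.json` still holds every completed epoch, so a downstream job can read it at any time.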

Results

In case you can't wait for the picky scientist in me, you can still build on my intermediate results. You can find them in the following locations

Support the development

If you are happy about substantial progress in chemistry and the life sciences that is citizen-first rather than commercial-first, then just

Buy Me A Coffee

Cite

There is a preprint available on bioRxiv.
