Graph-based encoder algorithm

CARATE

License: GPL v3. Code style: Black.

Why

Molecular representation is wrecked. Seriously! For decades, we chemists have used an ancient language to talk about something that language cannot capture. It has to stop.

What

The success of transformer models is evident. Applied to molecules, this calls for a graph-based transformer. Such a model can learn hidden representations that are better suited to describing a molecule.

For a chemist this is quite intuitive, but it is seldom modelled as such: a molecule exhibits its properties through its combined electronic and structural features.
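
One way to make this concrete is to treat atoms as nodes that carry electronic features and bonds as edges that carry structure. The sketch below does this in plain Python for ethanol. The `Atom` and `MoleculeGraph` names and the choice of features are illustrative assumptions, not part of CARATE's API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Atom:
    """A node: one atom with a few simple electronic descriptors."""
    symbol: str
    atomic_number: int
    formal_charge: int = 0


@dataclass
class MoleculeGraph:
    """Hypothetical container: atoms are nodes, bonds are edges."""
    atoms: list[Atom] = field(default_factory=list)
    # Each bond is (atom index i, atom index j, bond order).
    bonds: list[tuple[int, int, int]] = field(default_factory=list)

    def node_features(self) -> list[list[int]]:
        """One feature row per atom: [atomic number, formal charge]."""
        return [[a.atomic_number, a.formal_charge] for a in self.atoms]

    def edge_index(self) -> list[list[int]]:
        """Edge list as [sources, targets]; each bond appears in both directions."""
        sources, targets = [], []
        for i, j, _order in self.bonds:
            sources += [i, j]
            targets += [j, i]
        return [sources, targets]


# Heavy-atom graph of ethanol (CH3-CH2-OH); hydrogens are left implicit.
ethanol = MoleculeGraph(
    atoms=[Atom("C", 6), Atom("C", 6), Atom("O", 8)],
    bonds=[(0, 1, 1), (1, 2, 1)],
)
```

The two-row edge list follows the same COO layout that PyTorch Geometric uses for `edge_index`, so a structure like this could be converted to tensors for a graph model.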

Scope

The aim is to implement the algorithm in a reusable way, e.g. for the chembee pattern. In fact, this project mimics the chembee pattern to provide a standalone tool. The overall structure of the program is reusable for other deep-learning projects and will be transferred to its own project, which should work similarly to opinionated frameworks.

Installation on CPU

Prepare system

sudo apt-get install python3-dev libffi-dev
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
pip3 install torch-scatter torch-sparse torch-geometric rdkit-pypi 'networkx[default]' matplotlib
pip3 install torch-cluster
pip3 install torch-spline-conv

Usage

bash install.sh
carate -c path_to_config_file.py

Example when the CLI is not working

python mcf.py

Training results

Most of the training results are saved in pairs. The reason for this data structure is simply that training can be interrupted at any time and for any reason, yet the current result can still be saved or sent across a network.

Therefore, any ETL or downstream data processing should not be affected by an interruption on the training machine.
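
The idea can be sketched as follows, assuming that "pairs" means (epoch, value) records and that results are written to a JSON file. The `save_results` and `train` helpers and the file layout are hypothetical and are not taken from CARATE's code. Writing to a temporary file and then renaming it means a reader never sees a half-written file, even if training stops in the middle of a save.

```python
import json
import os
import tempfile


def save_results(path: str, results: list[tuple[int, float]]) -> None:
    """Atomically write (epoch, value) pairs to `path` as JSON."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then swap it in.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(results, handle)
        os.replace(tmp_path, path)  # atomic when both paths share a filesystem
    except BaseException:
        os.remove(tmp_path)  # do not leave a partial file behind
        raise


def train(path: str, epochs: int) -> list[tuple[int, float]]:
    """Toy loop that persists all results so far after every epoch."""
    results = []
    for epoch in range(epochs):
        loss = 1.0 / (epoch + 1)  # placeholder for a real training step
        results.append((epoch, loss))
        save_results(path, results)
    return results
```

If the loop is interrupted after any epoch, the file on disk still holds every pair recorded up to the last completed save, so a downstream job can read it at any time.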

Outlook

The program is meant to be run as a simple CLI, but it is not quite there yet.

Cite

A preprint is available on bioRxiv.
