Sandbox for Computational Protein Design

Project description

                          _____________________.___.____    .____     
                          \__    ___/\______   \   |    |   |    |    
                            |    |    |       _/   |    |   |    |    
                            |    |    |    |   \   |    |___|    |___ 
                            |____|    |____|_  /___|_______ \_______ \
                                             \/            \/       \/

Intro

TRILL (TRaining and Inference using the Language of Life) is a sandbox for creative protein engineering and discovery. As a bioengineer myself, I find deep-learning-based approaches to protein design and analysis of great interest. However, many of these deep-learning models are unwieldy because of their sheer size, especially for non-ML practitioners. TRILL lets researchers run inference on their proteins of interest with a variety of models, and it also democratizes the efficient fine-tuning of large language models. Whether using Google Colab with one GPU or a supercomputer with many, TRILL empowers scientists to leverage models with millions to billions of parameters without worrying (too much) about hardware constraints. As of v1.9.0, TRILL supports the following models:

Breakdown of TRILL's Commands

| Command | Function | Available Models |
|---|---|---|
| Embed | Generates numerical representations, or "embeddings", of biological sequences for quantitative analysis and comparison. Inputs can be small-molecule SMILES, RNA, DNA, or proteins, depending on the model. | ESM2, MMELLON, MolT5, ProtT5-XL, ProstT5, Ankh, CaLM, mRNA-FM/RNA-FM, SaProt, SELFIES-TED, SMI-TED |
| Visualize | Creates interactive 2D visualizations of embeddings for exploratory data analysis. | PCA, t-SNE, UMAP |
| Finetune | Finetunes protein language models for specific tasks. | ESM2, ProtGPT2, ZymCTRL, ProGen2 |
| Language Model Protein Generation | Generates proteins using pretrained language models. | ESM2, ProtGPT2, ZymCTRL, ProGen2 |
| Inverse Folding Protein Generation | Designs proteins to fold into specific 3D structures. | ESM-IF1, LigandMPNN, ProstT5 |
| Diffusion Based Protein Generation | Uses denoising diffusion models to generate proteins. | Genie2, RFDiffusion |
| Fold | Predicts 3D protein structures. | ESMFold, ProstT5, Chai-1, Boltz-2 |
| Dock | Simulates protein-ligand interactions. | DiffDock-L, Smina, Autodock Vina, Gnina, Lightdock, GeoDock |
| Classify | Predicts properties with pretrained models or trains custom classifiers. | CataPro, CatPred, M-Ionic, PSICHIC, PSALM, TemStaPro, EpHod, ECPICK, LightGBM, XGBoost, Isolation Forest, End-to-End Finetuning of ESM2 with a Multilayer perceptron head |
| Regress | Trains custom regression models. | LightGBM, Linear |
| Simulate | Uses molecular dynamics to simulate biomolecular interactions, followed by automated scoring. | OpenMM, MMGBSA, ProLIF |
| Score | Uses ESM1v or ESM2 to score protein sequences, or ProteinMPNN/LigandMPNN/SCASA to score protein structures/complexes, in a zero-shot manner. | COMPSS, SC |
| Workflow | Automated protein design workflows. | Foldtuning |
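
The Embed command is described above as producing embeddings for quantitative comparison. As a rough illustration of what such a comparison can look like, here is a minimal sketch using cosine similarity. It does not use TRILL's API: the function names and the tiny four-dimensional vectors are hypothetical stand-ins, and real embeddings from the models listed above would be much longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def most_similar(query_id, embeddings):
    """Return (id, similarity) of the entry closest to `query_id`."""
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != query_id
    ]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical toy embeddings (assumption: made-up values, not model output).
embeddings = {
    "seq_A": [0.9, 0.1, 0.3, 0.0],
    "seq_B": [0.8, 0.2, 0.4, 0.1],
    "seq_C": [-0.5, 0.9, -0.2, 0.7],
}

nearest_id, similarity = most_similar("seq_A", embeddings)
print(f"Nearest to seq_A: {nearest_id} (cosine similarity {similarity:.3f})")
```

Cosine similarity ignores vector magnitude, which is one common reason to prefer it over Euclidean distance when comparing embeddings; whether it is the right metric depends on the model and the downstream task.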

Documentation

Check out the documentation and examples at https://trill.readthedocs.io/en/latest/index.html

Download files

Source Distribution

trill_proteins-1.9.0.tar.gz (11.1 MB)

Uploaded Source

Built Distribution

trill_proteins-1.9.0-py3-none-any.whl (11.1 MB)

Uploaded Python 3

File details

Details for the file trill_proteins-1.9.0.tar.gz.

File metadata

  • Download URL: trill_proteins-1.9.0.tar.gz
  • Upload date:
  • Size: 11.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.11.0 Linux/6.11.0-1015-azure

File hashes

Hashes for trill_proteins-1.9.0.tar.gz
Algorithm Hash digest
SHA256 73a1db21f208e408755240c02b1135654a9f709ecd3c1e4d873c25ab941154bb
MD5 e0c2ca4e372dd8bafae21b0b7f86357f
BLAKE2b-256 154ec264820c036b9a6dac6870df95aa9795e712abcb37ab283877a81cd73bb2

File details

Details for the file trill_proteins-1.9.0-py3-none-any.whl.

File metadata

  • Download URL: trill_proteins-1.9.0-py3-none-any.whl
  • Upload date:
  • Size: 11.1 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.11.0 Linux/6.11.0-1015-azure

File hashes

Hashes for trill_proteins-1.9.0-py3-none-any.whl
Algorithm Hash digest
SHA256 05091e741f03ec9399031a1284d29f02d933215c192a1cd13cfc338fcee627c5
MD5 8c2172129d1d02d8bdc9d9d18acff219
BLAKE2b-256 f09b9309f4c346ff07700d30965cb7e4e642ac3ef808fc51d19afdd69f599b73
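
The SHA256 digests listed above can be used to check that a downloaded file is intact. The following is a minimal sketch, assuming the file has already been downloaded locally under its original name; the helper names `sha256_of` and `verify` are illustrative and are not part of TRILL or PyPI tooling.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the "File hashes" sections above.
EXPECTED_SHA256 = {
    "trill_proteins-1.9.0.tar.gz":
        "73a1db21f208e408755240c02b1135654a9f709ecd3c1e4d873c25ab941154bb",
    "trill_proteins-1.9.0-py3-none-any.whl":
        "05091e741f03ec9399031a1284d29f02d933215c192a1cd13cfc338fcee627c5",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Return True if the file's SHA256 matches the digest listed for its name."""
    path = Path(path)
    expected = EXPECTED_SHA256.get(path.name)
    if expected is None:
        raise KeyError(f"no known digest for {path.name}")
    return sha256_of(path) == expected
```

For example, `verify("trill_proteins-1.9.0.tar.gz")` should return `True` for an unmodified copy of the source distribution and `False` if the file was corrupted or altered in transit.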
