Frame2seq for protein sequence design
Frame2seq
Official repository for Frame2seq, a structure-conditioned masked language model for protein sequence design, as described in our preprint, "Structure-conditioned masked language models for protein sequence design generalize beyond the native sequence space".
Colab notebook
Colab notebook for generating sequences with Frame2seq:
Setup
To use Frame2seq, install via pip:
pip install frame2seq
Usage
Sequence design
To generate sequences with Frame2seq, use the design function.
from frame2seq import Frame2seqRunner

# Create a runner, then sample sequences for one chain of the given PDB file.
runner = Frame2seqRunner()
runner.design(pdb_file, chain_id, temperature, num_samples)
Arguments
pdb_file: Path to PDB file.
chain_id: Chain ID of protein.
temperature: Sampling temperature.
num_samples: Number of sequences to sample.
save_neg_pll: Whether to save the per-residue negative log-likelihoods of the sampled sequences.
verbose: Whether to print the sampled sequences and time taken for sampling.
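For intuition about the temperature argument, here is a generic sketch of temperature-scaled sampling from a list of logits. It is not Frame2seq's internal code. The function names and example logits are hypothetical, and it assumes the usual convention that logits are divided by the temperature before the softmax, so lower values concentrate probability on the top-scoring choice.

```python
import math
import random


def temperature_probs(logits, temperature):
    """Return softmax probabilities of logits scaled by 1/temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_index(logits, temperature, rng=random):
    """Draw one index according to the temperature-scaled probabilities."""
    probs = temperature_probs(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits for three candidate residues at one position.
example_logits = [2.0, 1.0, 0.0]
warm = temperature_probs(example_logits, 1.0)
cold = temperature_probs(example_logits, 0.1)
picked = sample_index(example_logits, 1.0, random.Random(0))
```

In this sketch, cold puts more of its probability mass on the first candidate than warm does, which is why lower temperatures tend to give less diverse samples.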
Outputs (.fasta of sampled sequence)
>pdbid=2fra chain_id=A recovery=62.67% score=0.83 temperature=1.0
PPSSVDWRDLGCITDVLDMGGCGACWAFSAVGALEARTTQKTGELTRLSAQDLVDCAREKYGNEGCDGGRMKSSFQFIIDKNGIDSHQAYPFTASDQECLYNSKYKAATCTDYTVLPEGDEDKLREAVSNVGPVAVGIDATHPEFRNFKSGVYHDPKCTTETNHGVLVVGYGTLKGKRFYKVKTCWGTYFGEDGFIRVAKNQGNHCGISTDPSYPEM
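The header line above carries its metadata as space-separated key=value pairs, so it can be read back without extra dependencies. The following is a minimal sketch that assumes every record follows the format of the example shown. The helper names parse_design_header and read_designs are hypothetical and are not part of the frame2seq package.

```python
def parse_design_header(header):
    """Parse a header such as '>pdbid=2fra chain_id=A recovery=62.67%' into a dict."""
    fields = {}
    for token in header.strip().lstrip(">").split():
        key, _, value = token.partition("=")
        fields[key] = value
    # Convert the numeric fields seen in the example; other fields stay strings.
    if "recovery" in fields:
        fields["recovery"] = float(fields["recovery"].rstrip("%"))
    for key in ("score", "temperature"):
        if key in fields:
            fields[key] = float(fields[key])
    return fields


def read_designs(fasta_text):
    """Yield (metadata, sequence) pairs from FASTA-formatted text."""
    meta, seq_lines = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if meta is not None:
                yield meta, "".join(seq_lines)
            meta, seq_lines = parse_design_header(line), []
        else:
            seq_lines.append(line)
    if meta is not None:
        yield meta, "".join(seq_lines)
```

With the text of an output file in hand, list(read_designs(text)) gives one (metadata, sequence) pair per sampled sequence, which makes it easy to filter or sort designs by recovery or score.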
Citing this work
@article{akpinaroglu2023structure,
title={Structure-conditioned masked language models for protein sequence design generalize beyond the native sequence space},
author={Akpinaroglu, Deniz and Seki, Kosuke and Guo, Amy and Zhu, Eleanor and Kelly, Mark JS and Kortemme, Tanja},
journal={bioRxiv},
pages={2023--12},
year={2023},
publisher={Cold Spring Harbor Laboratory}
}