Frame2seq for protein sequence design

Frame2seq

Official repository for Frame2seq, a structure-conditioned masked language model for protein sequence design, as described in our preprint "Structure-conditioned masked language models for protein sequence design generalize beyond the native sequence space".

Colab notebook

Colab notebook for generating sequences with Frame2seq: Open In Colab

Setup

To use Frame2seq, install via pip:

pip install frame2seq

Usage

Sequence design

To generate sequences with Frame2seq, call the design method of a Frame2seqRunner instance.

from frame2seq import Frame2seqRunner

runner = Frame2seqRunner()
runner.design(pdb_file, chain_id, temperature, num_samples)
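The call above uses placeholder names. The following is a minimal sketch of the same call with concrete values, assuming a local structure file. The path `2fra.pdb`, the default values, and the helpers `check_design_inputs` and `design_sequences` are illustrative names introduced here, not part of Frame2seq. The sanity checks are assumptions about reasonable inputs rather than documented requirements.

```python
from pathlib import Path


def check_design_inputs(pdb_file, chain_id, temperature, num_samples):
    """Sanity-check inputs before running a design (hypothetical helper)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if num_samples < 1:
        raise ValueError("num_samples must be at least 1")
    if not chain_id:
        raise ValueError("chain_id must be a non-empty string, e.g. 'A'")
    if not Path(pdb_file).is_file():
        raise FileNotFoundError(f"PDB file not found: {pdb_file}")


def design_sequences(pdb_file, chain_id="A", temperature=1.0, num_samples=10):
    """Validate the inputs, then call Frame2seq as shown in the usage snippet."""
    check_design_inputs(pdb_file, chain_id, temperature, num_samples)
    # Imported lazily so the checks above run even without frame2seq installed.
    from frame2seq import Frame2seqRunner

    runner = Frame2seqRunner()
    # Positional arguments follow the order shown in the usage snippet.
    runner.design(pdb_file, chain_id, temperature, num_samples)


# Example call (requires frame2seq and a real PDB file at this hypothetical path):
# design_sequences("2fra.pdb", chain_id="A", temperature=1.0, num_samples=10)
```

Checking the inputs first means a wrong path or a non-positive temperature is reported before the model is loaded.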

Arguments

  • pdb_file: Path to PDB file.
  • chain_id: Chain ID of protein.
  • temperature: Sampling temperature.
  • num_samples: Number of sequences to sample.
  • save_neg_pll: Whether to save the per-residue negative log-likelihoods of the sampled sequences.
  • verbose: Whether to print the sampled sequences and time taken for sampling.
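To give some intuition for the temperature argument, here is a generic sketch of temperature-scaled softmax sampling over per-position scores. It illustrates the common technique only and is not taken from Frame2seq's implementation. The function name and the toy logits are assumptions made for this example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, dividing by the temperature first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Toy scores for three candidate amino acids at one position (made-up values).
toy_logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(toy_logits, 0.5)  # low temperature
flat = softmax_with_temperature(toy_logits, 2.0)   # high temperature
```

In this scheme a lower temperature concentrates probability on the top-scoring residue, while a higher temperature spreads it out and yields more diverse samples.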

Outputs (.fasta of sampled sequence)

>pdbid=2fra chain_id=A recovery=62.67% score=0.83 temperature=1.0
PPSSVDWRDLGCITDVLDMGGCGACWAFSAVGALEARTTQKTGELTRLSAQDLVDCAREKYGNEGCDGGRMKSSFQFIIDKNGIDSHQAYPFTASDQECLYNSKYKAATCTDYTVLPEGDEDKLREAVSNVGPVAVGIDATHPEFRNFKSGVYHDPKCTTETNHGVLVVGYGTLKGKRFYKVKTCWGTYFGEDGFIRVAKNQGNHCGISTDPSYPEM
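The header line of the output packs several `key=value` fields. Below is a minimal sketch of reading them back, assuming the header always follows the space-separated layout shown above. The function names `parse_header` and `read_fasta` are introduced here for illustration and are not part of Frame2seq.

```python
def parse_header(header):
    """Split a '>key=value key=value ...' header into a dict."""
    fields = {}
    for token in header.lstrip(">").split():
        key, _, value = token.partition("=")
        fields[key] = value
    # Convert the numeric fields when present; recovery carries a '%' suffix.
    if "recovery" in fields:
        fields["recovery"] = float(fields["recovery"].rstrip("%"))
    for key in ("score", "temperature"):
        if key in fields:
            fields[key] = float(fields[key])
    return fields


def read_fasta(text):
    """Return a list of (header_fields, sequence) pairs from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((parse_header(header), "".join(chunks)))
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        records.append((parse_header(header), "".join(chunks)))
    return records


example = ">pdbid=2fra chain_id=A recovery=62.67% score=0.83 temperature=1.0\nPPSSVDWRDLG\n"
records = read_fasta(example)
```

Each record can then be filtered or sorted, for instance by `recovery` or `score`, before further analysis.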

Citing this work

@article{akpinaroglu2023structure,
  title={Structure-conditioned masked language models for protein sequence design generalize beyond the native sequence space},
  author={Akpinaroglu, Deniz and Seki, Kosuke and Guo, Amy and Zhu, Eleanor and Kelly, Mark JS and Kortemme, Tanja},
  journal={bioRxiv},
  pages={2023--12},
  year={2023},
  publisher={Cold Spring Harbor Laboratory}
}
