Optimizer for degenerate codon use in phage library generation

Phagetrix

A powerful codon optimization tool for phage display library generation and protein engineering.

Phagetrix helps researchers design optimal degenerate codon libraries for phage display, directed evolution, and synthetic biology applications. Maximize your library diversity while staying within experimental constraints.

Key Features

  • Intelligent codon optimization - Automatically selects the best degenerate codons for your amino acid requirements
  • Library statistics - Calculate theoretical diversity and material requirements
  • Multi-vendor support - Compatible with IDT, Eurofins, and NEB degenerate codon sets
  • Species-specific - Supports codon usage tables for multiple organisms
  • Easy to use - Simple file format and command-line interface
  • Python integration - Use as a library in your bioinformatics pipelines

Use Cases

  • Phage display library design - Optimize antibody/peptide libraries
  • Directed evolution - Design mutagenesis libraries for protein engineering
  • Synthetic biology - Create diverse protein variants for screening
  • Molecular biology research - Plan degenerate PCR experiments

The Library Diversity Problem

When creating phage display libraries, you're limited by experimental constraints:

  • 1 liter of phage solution ≈ 10¹² different sequences
  • Random mutagenesis: 20⁹ ≈ 10¹² permutations (only ~9 variable positions)
  • Smart degenerate codons: 6¹⁵ ≈ 10¹² permutations (~15 variable positions!)

Phagetrix maximizes your library diversity by intelligently selecting degenerate codons from manufacturers like IDT, Eurofins, and NEB, allowing you to target more positions with rational amino acid choices.
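
The arithmetic behind these numbers can be checked directly. The sketch below assumes a budget of 10¹² sequences and treats each variable position as an independent choice among a fixed number of amino acids (20 for full randomization, 6 as an illustrative figure for a well-chosen degenerate codon). The function name is made up for this example and is not part of Phagetrix.

```python
import math

def max_variable_positions(budget: float, choices_per_position: int) -> int:
    """Largest n such that choices_per_position ** n stays within the budget."""
    return math.floor(math.log(budget) / math.log(choices_per_position))

BUDGET = 1e12  # assumed number of distinct sequences the library can hold

for label, k in [("all 20 amino acids", 20), ("about 6 amino acids", 6)]:
    n = max_variable_positions(BUDGET, k)
    print(f"{label}: {n} positions, {k ** n:.2e} variants")

# all 20 amino acids: 9 positions, 5.12e+11 variants
# about 6 amino acids: 15 positions, 4.70e+11 variants
```

Both cases land just under the 10¹² budget, but restricting each position to a handful of amino acids leaves room for six more variable positions.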

Quick Start

Python Library (Recommended)

import phagetrix

# Optimize degenerate codons for your sequence
result = phagetrix.optimize_codons(
    sequence="VLAYMVAQVQ",
    variations={3: "AGVIL", 4: "YFW", 7: "AVIL"}
)

print("Optimized DNA sequence:", result["final_sequence"])
print("Efficiency per position:", result["efficiency"])

Command Line Interface

Create a simple text file specifying your target sequence and desired variations:

VLAYMVAQVQ
A3AGVIL
Y4YFW
A7AVIL

Run Phagetrix:

phagetrix input.txt

Output

   1   2   3   4   5   6   7   8   9  10
   V   L   A   Y   M   V   A   Q   V   Q
 GTT CTT VBA TDK ATG GTT VYA CAG GTT CAG   degenerate codons
          56  50          67               percentage on target
  1V  1L  1V  1Y  1M  1V  1V  1Q  1V  1Q
          1L  1W          1L
          1I  1F          1I
          1G  --          1A
          1A  1L          --
          --  1C          1T
          2R  1*          1P

Final sequence: GTTCTTVBATDKATGGTTVYACAGGTTCAG

Output includes:

  • Degenerate codons (VBA, TDK, etc.) optimized for your requirements
  • Efficiency percentages showing on-target vs off-target products
  • Amino acid breakdown for each position
  • Ready-to-order sequence for DNA synthesis
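
The on-target percentages in the table can be reproduced by expanding each degenerate codon with the IUPAC nucleotide codes and translating the results with the standard genetic code. The following is a minimal sketch, not Phagetrix's implementation: it weights every expanded codon equally, and the helper names (`expand`, `on_target_percent`) are hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def expand(degenerate_codon):
    """List every concrete codon a degenerate codon can produce."""
    return ["".join(c) for c in product(*(IUPAC[b] for b in degenerate_codon))]

def on_target_percent(degenerate_codon, wanted):
    """Percentage of expanded codons that encode one of the wanted amino acids."""
    codons = expand(degenerate_codon)
    hits = sum(CODON_TABLE[c] in wanted for c in codons)
    return round(100 * hits / len(codons))

for codon, wanted in [("VBA", "AGVIL"), ("TDK", "YFW"), ("VYA", "AVIL")]:
    print(codon, on_target_percent(codon, wanted))

# VBA 56
# TDK 50
# VYA 67
```

For example, TDK expands to six codons encoding Y, F, W, L, C, and a stop, so three of six products are on target.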

Installation

Using pip (recommended)

pip install phagetrix

Using Poetry (for development)

git clone https://github.com/retospect/phagetrix.git
cd phagetrix
poetry install
poetry run phagetrix --help

Try Online

Open in Google Colab

Try Phagetrix interactively in your browser with comprehensive examples!

Requirements: Python 3.10 or higher

Library Usage

Common Functions

import phagetrix

# Get available companies and species
companies = phagetrix.get_available_companies()
species = phagetrix.get_available_species()

# Parse existing Phagetrix files
seq, variations, config = phagetrix.parse_phagetrix_file("input.phagetrix")

# Calculate library statistics
stats = phagetrix.calculate_library_stats("ACDEF", {1: "AG", 3: "DEF"})
print(f"Library diversity: {stats['diversity']:,} variants")

# Compare different companies
for company in ["IDT", "Eurofins", "NEB"]:
    result = phagetrix.optimize_codons("ACDEF", {1: "AG"}, company=company)
    print(f"{company}: {result['final_sequence']}")
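
As a rough cross-check on library size that does not depend on Phagetrix, the number of distinct DNA sequences encoded by a degenerate oligo is the product of the degeneracy of each IUPAC letter. This sketch counts DNA variants only, which may differ from the figure `calculate_library_stats` reports (for instance, if that counts protein variants). The function name is an assumption made for this example.

```python
import math

# Number of concrete bases each IUPAC nucleotide code can stand for.
DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

def dna_variants(oligo: str) -> int:
    """Count the distinct DNA sequences a degenerate oligo can yield."""
    return math.prod(DEGENERACY[base] for base in oligo.upper())

# Final sequence from the Quick Start output: VBA, TDK and VYA are degenerate.
print(dna_variants("GTTCTTVBATDKATGGTTVYACAGGTTCAG"))  # 324
```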

Batch Processing

import phagetrix

# Process multiple sequences: (name, sequence, {position: allowed amino acids})
sequences = [
    ("CDR1", "RASQSISSWLA", {4: "QE", 6: "ST"}),
    ("CDR2", "AASSLQS", {3: "ST", 5: "LI"}),
    ("CDR3", "QQSYSTPLT", {3: "ST", 7: "PT"})
]

for name, seq, variations in sequences:
    # Named `variations` so the built-in vars() is not shadowed
    result = phagetrix.optimize_codons(seq, variations)
    print(f"{name}: {result['final_sequence']}")

For complete examples, see the Library Usage Guide or run the bundled example script:

python examples/library_examples.py

Advanced Features

Custom Numbering

Add position offsets for working with longer sequences:

# offset = 20
VLAYMVAQVQ
A23AGVIL  # Position 23 in the full protein
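
To make the numbering concrete, here is a minimal sketch of how such a file could be read, assuming the format shown above: an optional `# offset = N` line, the sequence on the first non-comment line, then one line per variable position giving the original residue, its number, and the allowed amino acids. The function `parse_spec` is hypothetical and written only for illustration; the library's own entry point is `phagetrix.parse_phagetrix_file`.

```python
import re

def parse_spec(text):
    """Return (sequence, {1-based local position: allowed amino acids})."""
    offset = 0
    sequence = None
    variations = {}
    for raw in text.splitlines():
        # An "# offset = N" line shifts the numbering used by later lines.
        directive = re.fullmatch(r"#\s*offset\s*=\s*(-?\d+)", raw.strip())
        if directive:
            offset = int(directive.group(1))
            continue
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        if sequence is None:
            sequence = line.upper()
            continue
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z]+)", line)
        if not match:
            raise ValueError(f"Cannot parse variation line: {raw!r}")
        original, number, allowed = match.group(1), int(match.group(2)), match.group(3)
        local = number - offset  # position within the sequence given in the file
        if not 1 <= local <= len(sequence) or sequence[local - 1] != original:
            raise ValueError(f"{original}{number} does not match the sequence")
        variations[local] = allowed
    return sequence, variations

example = """# offset = 20
VLAYMVAQVQ
A23AGVIL  # Position 23 in the full protein
"""
print(parse_spec(example))  # ('VLAYMVAQVQ', {3: 'AGVIL'})
```

With an offset of 20, `A23` refers to the third residue of the fragment, which is the same position addressed as `A3` in the Quick Start file.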

Multiple Vendors

Choose your preferred DNA synthesis company:

phagetrix --company IDT input.txt      # Default
phagetrix --company Eurofins input.txt
phagetrix --company NEB input.txt

Species-Specific Codon Usage

Optimize for different organisms:

phagetrix --species e_coli input.txt           # Default
phagetrix --species h_sapiens_9606 input.txt  # Human
phagetrix --species s_cerevisiae_4932 input.txt  # Yeast

Citation

If you use Phagetrix in your research, please cite:

@software{phagetrix,
  title = {Phagetrix: Codon optimization for phage display libraries},
  author = {Stamm, Reto},
  doi = {10.5281/zenodo.7676572},
  url = {https://github.com/retospect/phagetrix}
}

Related Tools

  • varVAMP - Primers for highly variable genomes
  • Biopython - Python bioinformatics toolkit

Acknowledgments

This package has been enhanced and maintained with assistance from Windsurf, an AI-powered development environment that helped implement modern development practices, comprehensive testing, type safety, security scanning, and automated CI/CD workflows.

License

This project is licensed under the GPL-3.0-or-later License - see the LICENSE file for details.
