Automated annotation of engineered plasmids using sequence similarity searches
This project has been archived.
The maintainers of this project have marked this project as archived. No new releases are expected.
pLannotate-python
Automated annotation of engineered plasmids
pLannotate-python is a Python package for automatically annotating engineered plasmids using sequence similarity searches against curated databases. It runs the searches in parallel and can set up its databases automatically. The package is a Python-friendly wrapper around command-line tools, so those tools must be installed, and the databases they search must be available, before annotation can run.
Features
- Fast, parallel annotation: Uses Diamond, BLAST, and Infernal concurrently
- Multiple databases: Protein (fpbase, swissprot), nucleotide (snapgene), RNA (Rfam)
- Circular plasmid support: Handles origin-crossing features
- Automatic database setup: Downloads and configures databases (~900MB)
- Flexible output: GenBank files, CSV reports, or pandas DataFrames
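To make "origin-crossing" concrete, here is a minimal sketch of the coordinate arithmetic involved. It is not the package's own code: it assumes a common approach in which a circular sequence is searched as a doubled string and hit coordinates are wrapped back with a modulo. The function name `wrap_hit` and the half-open, 0-based convention are assumptions made for this illustration.

```python
def wrap_hit(start: int, end: int, plasmid_len: int) -> tuple[int, int, bool]:
    """Map a hit on a doubled sequence back onto the circular plasmid.

    `start` is inclusive and `end` exclusive, both 0-based on the doubled
    sequence. Returns (start, end, crosses_origin) in plasmid coordinates.
    """
    if not 0 <= start < end <= 2 * plasmid_len:
        raise ValueError("hit lies outside the doubled sequence")
    if end - start > plasmid_len:
        raise ValueError("hit is longer than the plasmid")
    wrapped_start = start % plasmid_len
    # An end that lands exactly on the origin maps to plasmid_len, not 0.
    wrapped_end = end % plasmid_len or plasmid_len
    crosses_origin = wrapped_start >= wrapped_end
    return wrapped_start, wrapped_end, crosses_origin


seq = "ACGTACGTAC"  # 10 bp toy plasmid (hypothetical)
start, end, crosses = wrap_hit(8, 13, len(seq))
# A feature that crosses the origin is the tail of the sequence plus its head.
feature = seq[start:] + seq[:end] if crosses else seq[start:end]
# feature == "ACACG"
```

A feature that wraps is reported with a start greater than its end, which is why it has to be reassembled from two slices.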
Installation
# Install with uv (recommended)
uv add plannotate-python
# Or with pip
pip install plannotate-python
External Tools Required
# macOS (Homebrew)
brew install diamond blast infernal ripgrep
# Linux (conda/mamba)
conda install -c bioconda diamond blast infernal ripgrep
# Ubuntu/Debian
sudo apt install diamond-aligner ncbi-blast+ infernal ripgrep
SSL Certificate Fix (macOS)
If you encounter SSL certificate errors during database download:
# Replace X.Y with your Python version (e.g., 3.11)
open "/Applications/Python X.Y/Install Certificates.command"
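Before running the fix, it can help to see which certificate locations Python is actually using. The sketch below relies only on the standard library's `ssl.get_default_verify_paths()`; the helper name `ca_bundle_report` is hypothetical and is not part of pLannotate-python.

```python
import os
import ssl


def ca_bundle_report():
    """Summarise where this Python interpreter looks for CA certificates."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,                  # None if no usable file was found
        "capath": paths.capath,                  # None if no usable directory was found
        "openssl_cafile": paths.openssl_cafile,  # compiled-in default location
        "openssl_cafile_exists": os.path.exists(paths.openssl_cafile),
    }


for key, value in ca_bundle_report().items():
    print(f"{key}: {value}")
```

If both `cafile` and `capath` come back as `None`, Python has no default certificate store to verify against, which is consistent with the download errors described above.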
"Quick" Start
Automatic Database Setup:
import os
os.environ["PLANNOTATE_AUTO_DOWNLOAD"] = "1" # Enable auto-download of databases
from plannotate.annotate import annotate
# First run will download databases (~900MB with progress bars)
>>> sequence="tgaccaggcatcaaataaaacgaaaggctcagtcgaaagactgggcctttcgttttatctgttgtttgtcggtgaacgctctctactagagtcacactggctcaccttcgggtgggcctttctgcgtttataggtctcaatccacgggtacgggtatggagaaacagtagagagttgcgataaaaagcgtcaggtagtatccgctaatcttatggataaaaatgctatggcatagcaaagtgtgacgccgtgcaaataatcaatgtggacttttctgccgtgattatagacacttttgttacgcgtttttgtcatggctttggtcccgctttgttacagaatgcttttaataagcggggttaccggtttggttagcgagaagagccagtaaaagacgcagtgacggcaatgtctgatgcaatatggacaattggtttcttgtaatcgttaatccgcaaataacgtaaaaacccgcttcggcgggtttttttatggggggagtttagggaaagagcatttgtcatttgtttatttttctaaatacattcaaatatgtatccgctcatgagacaataaccctgataaatgcttcaataatattgaaaaaggaagagtatgagtattcaacatttccgtgtcgcccttattcccttttttgcgg" # Your plasmid sequence
>>> result = annotate(sequence, linear=False) # False for circular plasmids
>>> result
qstart qend sseqid pident slen qseq length ... wiggle wstart wend kind qstart_dup qend_dup fragment
0 523 615 AmpR_promoter_(5) 100.000 92 TTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAG... 92 ... 13 536 601 1 523 614 False
1 11 83 rrnB_T1_terminator 100.000 72 CAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTA... 72 ... 10 21 72 1 11 82 False
2 155 440 araBAD_promoter 99.649 285 ATGGAGAAACAGTAGAGAGTTGCGATAAAAAGCGTCAGGTAGTATC... 285 ... 42 197 397 1 816 1100 False
3 98 126 T7Te_terminator 100.000 28 GGCTCACCTTCGGGTGGGCCTTTCTGCG 28 ... 4 102 121 1 98 125 False
4 615 661 AmpR 100.000 861 ATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGG 46 ... 6 621 654 1 615 660 True
[5 rows x 28 columns]
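The result can be filtered and exported like any other table. The sketch below works on plain records so it can be tried without the databases installed; with a real result, the records could be obtained with pandas' `result.to_dict("records")`. The column names come from the sample output above, the row values are copied from it, and the reading of `fragment` as "partial match to a longer feature" is an assumption. The helper name `hits_to_csv` is hypothetical.

```python
import csv
import io


def hits_to_csv(records, min_identity=95.0, keep_fragments=False):
    """Return CSV text for hits that pass the identity and fragment filters."""
    columns = ["sseqid", "qstart", "qend", "pident", "fragment"]
    kept = [
        r for r in records
        if r["pident"] >= min_identity and (keep_fragments or not r["fragment"])
    ]
    kept.sort(key=lambda r: r["qstart"])  # order hits along the sequence
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(kept)
    return buffer.getvalue()


# Three rows copied from the sample output above (selected columns only).
records = [
    {"sseqid": "AmpR_promoter_(5)", "qstart": 523, "qend": 615, "pident": 100.0, "fragment": False},
    {"sseqid": "rrnB_T1_terminator", "qstart": 11, "qend": 83, "pident": 100.0, "fragment": False},
    {"sseqid": "AmpR", "qstart": 615, "qend": 661, "pident": 100.0, "fragment": True},
]
print(hits_to_csv(records))
```

Passing `keep_fragments=True` keeps the partial AmpR hit in the report.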
Manual Database Setup:
from plannotate.resources import download_db
download_db() # Downloads with progress bars and SSL error handling
Generate GenBank Files:
from plannotate.resources import get_gbk
gbk_content = get_gbk(result, sequence, is_linear=False)
with open("my_plasmid.gbk", "w") as f:
    f.write(gbk_content)
Configuration
Environment Variables:
- PLANNOTATE_AUTO_DOWNLOAD=1: Auto-download databases without prompting
- PLANNOTATE_DB_DIR=/path: Custom database directory
- PLANNOTATE_SKIP_DB_DOWNLOAD=1: Skip database downloads entirely
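These variables can also be set from Python, as the Quick Start does for `PLANNOTATE_AUTO_DOWNLOAD`. The following is a minimal sketch: the helper `configure_plannotate` is hypothetical, the directory `~/plannotate_db` is a placeholder, and treating auto-download and skip-download as mutually exclusive is an assumption rather than documented behaviour.

```python
import os


def configure_plannotate(db_dir=None, auto_download=False, skip_download=False):
    """Set the documented pLannotate-python environment variables for this process."""
    if auto_download and skip_download:
        # Assumption: requesting both is treated as a configuration mistake.
        raise ValueError("auto_download and skip_download are mutually exclusive")
    if db_dir is not None:
        # Expand "~" and make the path absolute before handing it over.
        os.environ["PLANNOTATE_DB_DIR"] = os.path.abspath(os.path.expanduser(db_dir))
    if auto_download:
        os.environ["PLANNOTATE_AUTO_DOWNLOAD"] = "1"
    if skip_download:
        os.environ["PLANNOTATE_SKIP_DB_DOWNLOAD"] = "1"


configure_plannotate(db_dir="~/plannotate_db", auto_download=True)
# Import only after the environment is configured, following the Quick Start order:
# from plannotate.annotate import annotate
```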
Core Functions:
- annotate(sequence, linear=False): Annotate DNA sequence
- get_gbk(annotations, sequence): Generate GenBank file
- download_db(): Download databases with progress bars
Troubleshooting
SSL Certificate Errors: Run the SSL certificate fix command above
Empty Results: Sequence may not match database features
Tool Errors: Ensure external tools are installed and in PATH
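A quick way to check the PATH requirement is to look the executables up from Python. This is a sketch using the standard library's `shutil.which`. The executable names (`diamond`, `blastn` from BLAST+, `cmscan` from Infernal, `rg` from ripgrep) are the usual names these packages install and are an assumption about what the package invokes. The helper `missing_tools` is hypothetical.

```python
import shutil

# Assumed executable names for Diamond, BLAST+, Infernal, and ripgrep.
REQUIRED_TOOLS = ("diamond", "blastn", "cmscan", "rg")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the executables from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All external tools found.")
```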
Citation
If you use pLannotate-python in your research, please cite the original pLannotate paper:
McGuffin, M.J., Thiel, M.C., Pineda, D.L. et al. pLannotate: engineered plasmid annotation. Nucleic Acids Research (2021).
License
This project is licensed under the GPL v3 License - see the LICENSE file for details.
Links
- Original pLannotate: https://github.com/mmcguffi/pLannotate
- Web server: http://plannotate.barricklab.org/
- This Fork: https://github.com/McClain-Thiel/pLannotate
- Issues: https://github.com/McClain-Thiel/pLannotate/issues
File details
Details for the file plannotate_python-1.2.8.tar.gz.
File metadata
- Download URL: plannotate_python-1.2.8.tar.gz
- Upload date:
- Size: 27.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 57913e27892fe71addf5ad0994a7888afeba13e8c64f792ccbca2c0326cef7c6 |
| MD5 | 2d904f8026cabf720519f35a06f7eac6 |
| BLAKE2b-256 | da62b93b9815b66b7db04fb534e002f620681fc6e258ea5cb335f72628d8a41d |
File details
Details for the file plannotate_python-1.2.8-py3-none-any.whl.
File metadata
- Download URL: plannotate_python-1.2.8-py3-none-any.whl
- Upload date:
- Size: 27.2 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e149cfcf56b31e48cf9dbb4f8f4803aa445d2078eef30880fd69df05adaa7023 |
| MD5 | ebd3b8155f1d28a4b4b08ea1362358dd |
| BLAKE2b-256 | e1816a946bef5dec4d8f40c488a9bd1850c037b282fa1918fe40083b1bd3d54e |
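A downloaded file can be checked against the SHA256 digest listed for it. The sketch below uses only `hashlib` from the standard library; the helper name `sha256_of` is hypothetical, and the comparison is left commented out because it assumes the wheel has been saved to the current directory under its original file name.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 digest listed for plannotate_python-1.2.8-py3-none-any.whl.
EXPECTED_WHEEL_SHA256 = "e149cfcf56b31e48cf9dbb4f8f4803aa445d2078eef30880fd69df05adaa7023"

# Assumes the wheel is in the current directory:
# ok = sha256_of("plannotate_python-1.2.8-py3-none-any.whl") == EXPECTED_WHEEL_SHA256
```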