Open Reading Frame finder natively coded in Python.
Project description
ORFFinder
ORFFinder in Python. Inspired by NCBI's version: https://www.ncbi.nlm.nih.gov/orffinder/
Finds open reading frames (six-frame scan) in a given 5' to 3' nucleotide sequence.
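To illustrate what a six-frame scan covers, here is a minimal standard-library sketch. It is not orffinder's own code; the helper names `reverse_complement` and `six_frames` are hypothetical and exist only for this example. The idea is that a sequence is read in three codon offsets on the given strand and three more on its reverse complement.

```python
# Illustrative sketch only; these helpers are not part of orffinder.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def six_frames(seq):
    """Yield (strand, offset, codons) for each of the six reading frames."""
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            # Only complete codons are kept; a trailing partial codon is dropped.
            codons = [s[i:i + 3] for i in range(offset, len(s) - 2, 3)]
            yield strand, offset, codons


for strand, offset, codons in six_frames("ATGAAATAGC"):
    print(strand, offset, codons)
```

Each frame is then searched for a start codon followed by an in-frame stop codon.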
Installation:
pip3 install orffinder
Terminal Usage
Three command-line executable commands are available: orffinder-to-gtf, orffinder-to-sequence and orffinder-to-gff3.
Documentation for these commands can be retrieved by specifying <command> -h.
API Usage
Import the package
IMPORTANT: Always supply your DNA/RNA strand in the 5' to 3' direction!
from Bio import SeqIO
from orffinder import orffinder
sequence = SeqIO.read("gene.fasta", "fasta")
orffinder.getORFs(sequence, minimum_length=75, remove_nested=True)
Documentation
getORFs()
Returns the loci of discovered ORFs in a dictionary format.
sequence: sequence in Biopython Seq or String format.
minimum_length: minimum size of ORF in nucleotides. Default: 75
start_codons: recognised 3-base-pair codons for initialisation. Default: ["ATG"]
stop_codons: recognised 3-base-pair codons for termination. Default: ["TAA", "TAG", "TGA"]
remove_nested: remove all ORFs completely encased in another. Default: False
trim_trailing: remove ORFs at the edge of the sequence that do not have a defined stop codon. Default: False
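The way these parameters interact can be sketched in plain Python. This is an assumption-laden illustration, not orffinder's implementation: the function name `find_orfs`, the dictionary keys (`start`, `end`, `trailing`), the 0-based end-exclusive coordinates, and the restriction to the three forward frames are all choices made for this example.

```python
def find_orfs(seq, start_codons=("ATG",), stop_codons=("TAA", "TAG", "TGA"),
              minimum_length=75, remove_nested=False, trim_trailing=False):
    """Scan the three forward frames of `seq` and return a list of ORF dicts.

    Coordinates are 0-based and end-exclusive, and include the stop codon.
    The dict keys are hypothetical, chosen for this sketch only.
    """
    seq = str(seq).upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the start codon of the ORF being read
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in start_codons:
                start = i
            elif start is not None and codon in stop_codons:
                orfs.append({"start": start, "end": i + 3, "trailing": False})
                start = None
        if start is not None:
            # The sequence ended before an in-frame stop codon was seen.
            orfs.append({"start": start, "end": len(seq), "trailing": True})

    # minimum_length: drop ORFs shorter than the threshold (in nucleotides).
    orfs = [o for o in orfs if o["end"] - o["start"] >= minimum_length]
    # trim_trailing: drop ORFs that have no defined stop codon.
    if trim_trailing:
        orfs = [o for o in orfs if not o["trailing"]]
    # remove_nested: drop ORFs completely contained in another ORF.
    if remove_nested:
        orfs = [o for o in orfs
                if not any(p is not o
                           and p["start"] <= o["start"] and o["end"] <= p["end"]
                           for p in orfs)]
    return orfs


seq = "ATGGATGCCCTGAAATAAATGCC"
print(find_orfs(seq, minimum_length=3))
print(find_orfs(seq, minimum_length=3, remove_nested=True, trim_trailing=True))
```

In this toy sequence the frame-1 ORF sits entirely inside the frame-0 ORF, so `remove_nested=True` discards it, and the final `ATG` with no stop codon is discarded by `trim_trailing=True`.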
getORFNucleotides()
Returns a list of Biopython Seq objects or, when return_loci is set, the loci of discovered ORFs together with their Biopython Seq objects in a dictionary format.
sequence: sequence in Biopython Seq or String format.
return_loci: return the loci together with the nucleotide sequences. Default: False
minimum_length: minimum size of ORF in nucleotides. Default: 75
start_codons: recognised 3-base-pair codons for initialisation. Default: ["ATG"]
stop_codons: recognised 3-base-pair codons for termination. Default: ["TAA", "TAG", "TGA"]
remove_nested: remove all ORFs completely encased in another. Default: False
trim_trailing: remove ORFs at the edge of the sequence that do not have a defined stop codon. Default: False
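Conceptually, turning loci into nucleotide sequences is a slice of the input, reverse-complemented for antisense ORFs. The sketch below assumes a hypothetical locus format (`start`, `end`, `sense`, with 0-based end-exclusive coordinates on the input strand); the real coordinate convention and key names used by the package may differ, so check the output of getORFs() before relying on this.

```python
# Illustrative sketch; the locus keys and coordinate convention are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def orf_nucleotides(seq, loci):
    """Return the nucleotide string of each locus, read 5' to 3'."""
    seq = str(seq).upper()
    fragments = []
    for locus in loci:
        fragment = seq[locus["start"]:locus["end"]]
        if locus["sense"] == "-":
            # Antisense ORFs are read from the reverse complement.
            fragment = fragment.translate(COMPLEMENT)[::-1]
        fragments.append(fragment)
    return fragments


seq = "CCATGAAATAGTTACATCC"
loci = [
    {"start": 2, "end": 11, "sense": "+"},
    {"start": 11, "end": 17, "sense": "-"},
]
print(orf_nucleotides(seq, loci))
```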
getORFProteins()
Returns a list of Biopython Seq objects or, when return_loci is set, the loci of discovered ORFs together with their Biopython Seq objects in a dictionary format.
sequence: sequence in Biopython Seq or String format.
translation_table: translation table as per BioPython. Default: 1
return_loci: return the loci together with the protein sequences. Default: False
minimum_length: minimum size of ORF in nucleotides. Default: 75
start_codons: recognised 3-base-pair codons for initialisation. Default: ["ATG"]
stop_codons: recognised 3-base-pair codons for termination. Default: ["TAA", "TAG", "TGA"]
remove_nested: remove all ORFs completely encased in another. Default: False
trim_trailing: remove ORFs at the edge of the sequence that do not have a defined stop codon. Default: False
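Translation table 1 is the NCBI Standard genetic code. As a rough illustration of what translating an ORF with that table involves, the standard-library sketch below builds the table by hand. The names `STANDARD_TABLE` and `translate` are hypothetical; in practice Biopython's `Seq.translate(table=...)` does this work, and how orffinder calls it internally is not shown here.

```python
from itertools import product

# NCBI Standard genetic code (table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf, to_stop=True):
    """Translate a nucleotide ORF into a one-letter protein string.

    Unknown codons become "X"; a stop codon is written as "*" unless
    to_stop is True, in which case translation ends before it.
    """
    orf = str(orf).upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(orf) - 2, 3):
        amino_acid = STANDARD_TABLE.get(orf[i:i + 3], "X")
        if amino_acid == "*" and to_stop:
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGATGCCCTGAAATAA"))
```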
Dependencies
Biopython (https://biopython.org/)
File details
Details for the file orffinder-1.8.tar.gz.
File metadata
- Download URL: orffinder-1.8.tar.gz
- Upload date:
- Size: 5.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.23.4 CPython/3.6.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1c5a6290122d7806049d57d9637711ab24c64f7a67405a6c61d1295bad237f4c
MD5 | 94a861b04c55c65f3005530ad72bde65
BLAKE2b-256 | 620e61528cf7210ce12e84a23ba62f7ef4cffd34bb346752fc6b71019352920c
File details
Details for the file orffinder-1.8-py3-none-any.whl.
File metadata
- Download URL: orffinder-1.8-py3-none-any.whl
- Upload date:
- Size: 8.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.23.4 CPython/3.6.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2ae64e3cb7a214c0ee4214fa9bf1f69b8c886da7cd53cbda4b3b5f74efcac229
MD5 | ba14cbc949d8acd444eb37bd4565c596
BLAKE2b-256 | 05947f3c6146f4c20377c07ef09a44f9237067adff1e37e444a5b5b0b28bdb23