Extract genome feature sequences for biologists
Project description
Overview
featurExtract is a Python package for bioinformatics.
The package contains two executable command-line programs.
The first program, featurExtract, includes ten subroutines:
create, gene, promoter, UTR, uORF, CDS, dORF, exon, intron
and intergenic. The create subroutine builds the annotation
database. The promoter subroutine extracts promoter sequences.
The uORF subroutine extracts upstream open reading frame
sequences. The UTR subroutine extracts untranslated region
sequences. The CDS subroutine extracts coding sequences. The
intergenic subroutine extracts the intergenic sequence between
two genes. The second program, genBankExtract, includes four
subroutines: gene, CDS, rRNA and tRNA.
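To make the promoter subroutine's task concrete, here is a minimal sketch of how a promoter window can be cut from a chromosome sequence in a strand-aware way. It is an illustration only and not featurExtract's own implementation: the function names and the `upstream` and `downstream` parameters are hypothetical, and coordinates are assumed to be 1-based and inclusive, as in GFF.

```python
# Illustrative sketch only; this is NOT featurExtract's implementation.
# Assumes 1-based, inclusive gene coordinates (the GFF convention).

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_sequence(chrom_seq, gene_start, gene_end, strand,
                      upstream=200, downstream=100):
    """Return `upstream` bases before the gene start plus the first
    `downstream` bases of the gene, read in the gene's orientation.

    `upstream` and `downstream` are hypothetical parameter names.
    The window is clipped at the chromosome ends.
    """
    if strand == "+":
        tss = gene_start
        lo = max(1, tss - upstream)
        hi = min(len(chrom_seq), tss + downstream - 1)
        return chrom_seq[lo - 1:hi]
    if strand == "-":
        tss = gene_end  # on the minus strand the gene starts at its right end
        lo = max(1, tss - downstream + 1)
        hi = min(len(chrom_seq), tss + upstream)
        return reverse_complement(chrom_seq[lo - 1:hi])
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    chrom = "AAAACCCCGGGGTTTT"
    # Plus-strand gene at 9..12: 4 bases upstream plus 2 bases of the gene.
    print(promoter_sequence(chrom, 9, 12, "+", upstream=4, downstream=2))
```

For a minus-strand gene the window lies to the right of the gene on the reference, so the slice is reverse-complemented before it is returned.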
Brief introduction of featurExtract package
Install
featurExtract can be installed in two ways.
# install from PyPI
pip install featurExtract
# or install from source
git clone https://github.com/SitaoZ/featurExtract.git
cd featurExtract
python setup.py install
Requirements
python >= 3.7.6
pandas >= 1.2.4
gffutils >= 0.10.1
setuptools >= 49.2.0
biopython >= 1.78
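The minimum versions above can be checked against the current environment with a short script. This is a sketch under stated assumptions: it uses the standard library's `importlib.metadata`, which is available from Python 3.8 onward, and the helper names (`version_tuple`, `meets_minimum`, `check_requirements`) are hypothetical. The version comparison is deliberately simple and only looks at the leading digits of each dotted component.

```python
# Sketch: compare installed package versions with the minimums listed above.
# Helper names are hypothetical; importlib.metadata needs Python 3.8+.
from importlib import metadata
from itertools import takewhile

MINIMUMS = {
    "pandas": "1.2.4",
    "gffutils": "0.10.1",
    "setuptools": "49.2.0",
    "biopython": "1.78",
}


def version_tuple(version):
    """Turn '0.10.1' into (0, 10, 1), keeping only leading digits per part."""
    parts = []
    for piece in version.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break  # stop at the first non-numeric component
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_requirements(minimums=MINIMUMS):
    """Map each package name to (installed version or None, requirement met)."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
        else:
            report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_requirements().items():
        status = "ok" if ok else "missing or too old"
        print(f"{name}: {installed} ({status})")
```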
Usage
featurExtract is designed for GFF and GTF files,
and genBankExtract is suited for GenBank files.
featurExtract
# gff or gtf database
which featurExtract
featurExtract -h
featurExtract create -h
featurExtract promoter -h
featurExtract UTR -h
featurExtract uORF -h
featurExtract CDS -h
featurExtract dORF -h
featurExtract exon -h
featurExtract intron -h
featurExtract intergenic -h
genBankExtract
# GenBank database
which genBankExtract
genBankExtract -h
genBankExtract gene -h
genBankExtract CDS -h
genBankExtract rRNA -h
genBankExtract tRNA -h
Examples
featurExtract
# step 1: create the database
featurExtract create -f GFF -g ath.gff3 -o ath
# step 2: run an extraction subroutine
# promoters for the whole genome
featurExtract promoter -d ath.GFF -f ath.fa -l 200 -u 100 -o promoter.csv --output_format fasta
# promoter of one gene, printed to stdout
featurExtract promoter -d ath.GFF -f ath.fa -l 200 -u 100 -g AT1G01010 -p --output_format fasta
featurExtract UTR -d ath.GFF -f ath.fa -o UTR.csv -s GFF
featurExtract uORF -d ath.GFF -f ath.fa -o uORF.csv -s GFF
featurExtract CDS -d ath.GFF -f ath.fa -o CDS.csv -s GFF
featurExtract mRNA -d ath.GFF -f ath.fa -o mRNA.fasta -s GFF --output_format fasta
featurExtract exon -d ath.GFF -f ath.fa -t AT1G01010.1 -p -s GFF
featurExtract intron -d ath.GFF -f ath.fa -t AT1G01010.1 -p -s GFF
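The exon and intron commands are closely related: for a single transcript, introns are the gaps between consecutive exons. The sketch below shows that relationship with plain coordinate tuples. It is not taken from featurExtract's source; the function name `introns_from_exons` is hypothetical, and exon coordinates are assumed to be 1-based and inclusive, as in GFF.

```python
# Sketch: derive intron intervals from the exon intervals of one transcript.
# Not featurExtract's code; the function name is hypothetical.
# Coordinates are assumed 1-based and inclusive (GFF convention).


def introns_from_exons(exons):
    """Return the (start, end) gaps between consecutive exons.

    `exons` is a list of (start, end) tuples in any order.
    Adjacent or overlapping exons produce no intron between them.
    """
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start - prev_end > 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


if __name__ == "__main__":
    exons = [(300, 400), (100, 200), (500, 600)]
    print(introns_from_exons(exons))  # [(201, 299), (401, 499)]
```

Because the intervals are sorted by reference coordinate, the same function works for both strands; a strand-aware tool would additionally reverse-complement minus-strand sequences and reverse the intron order.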
genBankExtract
# GenBank step 3
genBankExtract gene -g NC_000932.gb -f dna -p
genBankExtract CDS -g NC_000932.gb -f dna -p
genBankExtract rRNA -g NC_000932.gb -f dna -p
genBankExtract tRNA -g NC_000932.gb -f dna -p
Project details
Download files
Source Distribution
File details
Details for the file featurExtract-0.2.4.6.tar.gz.
File metadata
- Download URL: featurExtract-0.2.4.6.tar.gz
- Upload date:
- Size: 18.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.6.1 pkginfo/1.5.0.1 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 04103d5534b3c32f40969ca0767b55037bf1e91dd452b19925d86b788002251f
MD5 | 43aef2300edc7c15d40e9efb64adc5df
BLAKE2b-256 | 393e466b52c1c2f5be442f2672889850aa55995a9e37ee6bfed3e7f09b52f22e