Region-aware GFF annotation integration toolkit
gffkit
gffkit is a lightweight toolkit for region-aware GFF/GTF annotation integration.
It combines three utilities:
- detect-bridge: detect suspicious merged-gene artifacts caused by bridge transcripts.
- complement: complement/merge annotations, with optional region-swap mode.
- add-utr: reconstruct five_prime_UTR and three_prime_UTR features from exon/CDS coordinates.
Installation
pip install gffkit
Quick start
Full integration pipeline
gffkit integrate \
--annotation-a EviAnn.gff3 \
--annotation-b ANNEVO.gff3 \
--outdir gffkit_out \
--prefix sample
Outputs:
- gffkit_out/sample.suspicious.tsv
- gffkit_out/sample.merged.gff3
- gffkit_out/sample.final.withUTR.gff3
Step-by-step usage
# 1. Detect suspicious merged genes in Annotation A
gffkit detect-bridge -i EviAnn.gff3 -o suspicious.tsv
# 2. Use A as the global reference, but switch to B in suspicious regions
gffkit complement \
--ref EviAnn.gff3 \
--add ANNEVO.gff3 \
--swap_region_tsv suspicious.tsv \
--swap_region_flank 100 \
--output merged.gff3
# 3. Add UTR features
gffkit add-utr -i merged.gff3 -o final.annotation.withUTR.gff3
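To illustrate what step 2 is doing conceptually, here is a minimal Python sketch of a region swap with a flank. It is not gffkit's code: the `Region` tuple, the function names `expand`, `overlaps`, and `swap_regions`, and the choice of "any overlap" as the selection test are all assumptions made for illustration. It also covers only the swap, not any gap-filling that `complement` may perform outside suspicious regions.

```python
from typing import List, Tuple

# Hypothetical representation: (seqid, start, end), 1-based inclusive as in GFF.
Region = Tuple[str, int, int]


def expand(region: Region, flank: int) -> Region:
    """Widen a region by `flank` bases on both sides, clamping the start at 1."""
    seqid, start, end = region
    return (seqid, max(1, start - flank), end + flank)


def overlaps(a: Region, b: Region) -> bool:
    """True if two regions lie on the same sequence and share at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def swap_regions(
    ref_genes: List[Region],
    add_genes: List[Region],
    suspicious: List[Region],
    flank: int = 100,
) -> List[Region]:
    """Keep reference genes outside the flanked suspicious windows and
    take genes from the second annotation inside them (assumed semantics)."""
    windows = [expand(r, flank) for r in suspicious]

    def in_window(gene: Region) -> bool:
        return any(overlaps(gene, w) for w in windows)

    kept_ref = [g for g in ref_genes if not in_window(g)]
    taken_add = [g for g in add_genes if in_window(g)]
    return sorted(kept_ref + taken_add)


if __name__ == "__main__":
    ref = [("chr1", 100, 900), ("chr1", 5000, 9000)]  # second gene is a suspected merge
    add = [("chr1", 4950, 6500), ("chr1", 7000, 9050), ("chr1", 20000, 21000)]
    print(swap_regions(ref, add, suspicious=[("chr1", 5000, 9000)], flank=100))
```

In this toy input, the flank lets the two replacement genes be picked up even though they extend slightly past the suspicious gene's own boundaries.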
Command overview
gffkit --help
gffkit detect-bridge --help
gffkit complement --help
gffkit add-utr --help
gffkit integrate --help
Annotation integration strategy
- Annotation A, for example EviAnn/RNA-seq-supported GFF, is used as the global primary reference.
- Annotation B, for example ANNEVO/deep-learning GFF, is used as the local primary reference only in suspicious merged-gene regions.
- UTR features are reconstructed after merging using an exon-minus-CDS strategy.
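The exon-minus-CDS idea in the last point can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not gffkit's implementation: the helper names `subtract` and `infer_utrs` are hypothetical, coordinates are assumed to be 1-based inclusive as in GFF, and each transcript is assumed to have a single contiguous coding span.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive


def subtract(exon: Interval, cds: List[Interval]) -> List[Interval]:
    """Return the parts of `exon` that are not covered by any CDS interval."""
    pieces = []
    cursor = exon[0]
    for c_start, c_end in sorted(cds):
        if c_end < cursor or c_start > exon[1]:
            continue  # this CDS segment does not touch the remaining exon
        if c_start > cursor:
            pieces.append((cursor, c_start - 1))
        cursor = max(cursor, c_end + 1)
    if cursor <= exon[1]:
        pieces.append((cursor, exon[1]))
    return pieces


def infer_utrs(
    exons: List[Interval], cds: List[Interval], strand: str
) -> List[Tuple[str, int, int]]:
    """Label non-coding exon pieces as 5' or 3' UTR according to strand."""
    if not cds:
        return []  # non-coding transcript: nothing to label
    cds_start = min(s for s, _ in cds)
    cds_end = max(e for _, e in cds)
    utrs = []
    for exon in sorted(exons):
        for start, end in subtract(exon, cds):
            left_of_cds = end < cds_start
            right_of_cds = start > cds_end
            if not (left_of_cds or right_of_cds):
                continue  # piece inside the coding span; ignored in this sketch
            is_five_prime = left_of_cds if strand == "+" else right_of_cds
            kind = "five_prime_UTR" if is_five_prime else "three_prime_UTR"
            utrs.append((kind, start, end))
    return utrs


if __name__ == "__main__":
    exons = [(100, 200), (300, 400), (500, 600)]
    cds = [(150, 200), (300, 400), (500, 550)]
    print(infer_utrs(exons, cds, "+"))
    # [('five_prime_UTR', 100, 149), ('three_prime_UTR', 551, 600)]
```

On the minus strand the labels flip, because the 5' end of the transcript is at the higher genomic coordinate.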
License
MIT License.