GFFtk: genome annotation GFF3 tool kit
GFFtk is a toolkit for working with genome annotation files in GFF3, GTF, and TBL formats. It converts between these formats and filters, sorts, sanitizes, and renames the features they contain.
Features
- Format Conversion: Convert between GFF3, GTF, TBL, and GenBank formats
- Combined GFF3+FASTA: Support for combined files containing both annotations and sequences
- Sequence Extraction: Extract protein and transcript sequences from annotations
- Advanced Filtering: Filter annotations using flexible regex patterns
- Consensus Models: Generate consensus gene models from multiple sources
- Non-Standard Features: Support for intron, noncoding_exon, five_prime_UTR_intron, and pseudogenic_exon features
- File Manipulation: Sort, sanitize, and rename features in annotation files
Installation
To install a release version, use the pip package manager:
python -m pip install gfftk
To install the latest development code from the repository's default branch, run:
python -m pip install git+https://github.com/nextgenusfs/gfftk.git
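After either install, it can be useful to confirm which version ended up in the active environment. The snippet below is a minimal sketch using only the Python standard library; the helper name `installed_version` is made up for this example and is not part of GFFtk.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; not part of the GFFtk API.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("gfftk")
if found is None:
    print("gfftk is not installed in this environment")
else:
    print(f"gfftk {found} is installed")
```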
Quick Start
Basic Format Conversion
# Convert GFF3 to GTF
gfftk convert -i input.gff3 -f genome.fasta -o output.gtf
# Extract protein sequences
gfftk convert -i input.gff3 -f genome.fasta -o proteins.faa --output-format proteins
Combined GFF3+FASTA Format
# Create a combined file from separate GFF3 and FASTA files
gfftk convert -i input.gff3 -f genome.fasta -o combined.gff --output-format combined
# Read a combined file (no separate FASTA file needed)
gfftk convert -i combined.gff -o output.gff3 --output-format gff3
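For context, the GFF3 specification lets a single file carry annotation lines first, then a `##FASTA` directive, then the sequences in FASTA format. The sketch below illustrates that layout by splitting such a file into its two parts. It is not GFFtk's own parser, and the function name `split_combined_gff3` is hypothetical.

```python
def split_combined_gff3(text):
    """Split combined GFF3+FASTA text into (annotation_lines, sequences).

    Hypothetical helper for illustration; not part of the GFFtk API.
    """
    annotation_lines = []
    chunks = {}          # sequence ID -> list of sequence line fragments
    in_fasta = False
    current_id = None
    for line in text.splitlines():
        if not in_fasta:
            # Everything before the ##FASTA directive is annotation.
            if line.strip() == "##FASTA":
                in_fasta = True
            else:
                annotation_lines.append(line)
            continue
        if line.startswith(">"):
            # The sequence ID is the first whitespace-delimited token.
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            chunks[current_id] = []
        elif current_id is not None and line.strip():
            chunks[current_id].append(line.strip())
    sequences = {seq_id: "".join(frags) for seq_id, frags in chunks.items()}
    return annotation_lines, sequences


example = (
    "##gff-version 3\n"
    "chr1\t.\tgene\t1\t9\t.\t+\t.\tID=gene1\n"
    "##FASTA\n"
    ">chr1 example contig\n"
    "ACGT\n"
    "ACGTA\n"
)
annotations, sequences = split_combined_gff3(example)
print(len(annotations), "annotation lines;", sequences)
```

Because the sequences travel with the annotations, a reader of this layout needs no separate FASTA file, which is why the second command above omits `-f`.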
Advanced Filtering
# Keep only kinase genes
gfftk convert -i input.gff3 -f genome.fasta -o kinases.gff3 --grep product:kinase
# Remove augustus predictions
gfftk convert -i input.gff3 -f genome.fasta -o filtered.gff3 --grepv source:augustus
# Case-insensitive filtering with regex
gfftk convert -i input.gff3 -f genome.fasta -o results.gff3 --grep product:KINASE:i
# Combined filtering
gfftk convert -i input.gff3 -f genome.fasta -o filtered.gff3 \
--grep product:kinase --grepv source:augustus
Filter Pattern Syntax
- key:pattern - Basic string matching
- key:pattern:i - Case-insensitive matching
- key:regex - Regular expression patterns
- Multiple --grep or --grepv flags for complex filtering
Common filter keys: product, source, name, note, contig, strand, type, db_xref, go_terms
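To make the pattern syntax concrete, here is a rough Python sketch of how a `key:pattern[:i]` filter might be parsed and applied. It is only an illustration and not GFFtk's implementation. The function names, the dictionary-based records, and the rule that every `--grep` pattern must match (logical AND) while any `--grepv` match excludes a record are all assumptions.

```python
import re


def parse_filter(spec):
    """Parse 'key:pattern' or 'key:pattern:i' into (key, compiled regex).

    Hypothetical helper for illustration; not part of the GFFtk API.
    """
    key, _, pattern = spec.partition(":")
    if not key or not pattern:
        raise ValueError(f"expected key:pattern, got {spec!r}")
    flags = 0
    if pattern.endswith(":i"):
        # A trailing ':i' requests case-insensitive matching.
        pattern = pattern[:-2]
        flags = re.IGNORECASE
    return key, re.compile(pattern, flags)


def apply_filters(records, grep=(), grepv=()):
    """Keep records matching all grep filters and none of the grepv filters.

    Assumption: multiple grep filters combine with AND.
    """
    keep = [parse_filter(spec) for spec in grep]
    drop = [parse_filter(spec) for spec in grepv]
    selected = []
    for record in records:
        matches_all = all(rx.search(str(record.get(k, ""))) for k, rx in keep)
        matches_drop = any(rx.search(str(record.get(k, ""))) for k, rx in drop)
        if matches_all and not matches_drop:
            selected.append(record)
    return selected


# Made-up records standing in for parsed gene models.
records = [
    {"product": "protein kinase", "source": "augustus"},
    {"product": "Protein Kinase C", "source": "funannotate"},
    {"product": "hypothetical protein", "source": "funannotate"},
]
case_sensitive = apply_filters(records, grep=["product:kinase"])
combined = apply_filters(
    records, grep=["product:KINASE:i"], grepv=["source:augustus"]
)
```

In this sketch, `case_sensitive` holds only the first record, while `combined` holds only the second, mirroring the combined `--grep` and `--grepv` example above.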
For more examples and detailed documentation, see the tutorial.
Development
Code Formatting
This project uses pre-commit to ensure code quality and consistency. The pre-commit hooks run Black (code formatter), isort (import sorter), and flake8 (linter).
To set up pre-commit:
- Install pre-commit:
pip install pre-commit
- Install the git hooks:
pre-commit install
- (Optional) Run against all files:
pre-commit run --all-files
After installation, the pre-commit hooks will run automatically on each commit to ensure your code follows the project's style guidelines.