Genome annotation refinement using RNA-seq data

AnnoRefine

High-performance genome annotation refinement toolkit using RNA-seq data

AnnoRefine is a Rust-based toolkit for refining genome annotations and generating gene prediction hints from RNA-seq evidence. It provides both command-line tools and Python bindings for seamless integration into bioinformatics pipelines.

📖 Full Documentation

Features

  • 🔧 UTR Refinement - Extend and trim UTRs based on RNA-seq coverage
  • 🔀 Splice Site Refinement - Adjust intron boundaries using junction evidence
  • 🆕 Novel Gene Detection - Discover new genes from RNA-seq data
  • 🎯 Hint Generation - Convert BAM alignments to Augustus/GeneMark hints
  • 📊 Hint Processing - Join and filter hints from multiple sources
  • ⚡ High Performance - Multi-threaded Rust implementation
  • 🐍 Python Bindings - Easy integration into Python workflows
  • 🧭 Strand-Aware - Supports all RNA-seq library types (FR, RF, UU)
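
The library types above control how a read's alignment orientation maps to the transcript strand. The following is a minimal sketch, assuming the common paired-end convention (FR: read 1 aligns in the transcript's orientation; RF: read 1 is the reverse complement, as in dUTP protocols; UU: unstranded). The function name `transcript_strand` is hypothetical and is not part of the AnnoRefine API; the tool's own handling may differ.

```python
def transcript_strand(library_type: str, is_read1: bool, is_reverse: bool) -> str:
    """Infer the transcript strand ('+', '-', or '.') for one aligned read.

    Assumed convention (not taken from AnnoRefine itself):
      FR - read 1 is in transcript orientation, read 2 is opposite
      RF - read 1 is opposite, read 2 is in transcript orientation
      UU - unstranded, no strand can be inferred
    """
    if library_type == "UU":
        return "."
    if library_type not in ("FR", "RF"):
        raise ValueError(f"unknown library type: {library_type!r}")
    # Does this mate share the transcript's orientation under the protocol?
    same_as_transcript = is_read1 if library_type == "FR" else not is_read1
    maps_forward = not is_reverse
    # A mate in transcript orientation that maps forward implies '+';
    # every other combination flips the strand accordingly.
    return "+" if same_as_transcript == maps_forward else "-"
```

For example, under RF a forward-mapped read 1 implies a transcript on the minus strand, while under FR the same read implies the plus strand.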

Installation

Python Package (Recommended):

pip install annorefine

Standalone Binary: Download from GitHub Releases

Build from Source:

git clone https://github.com/nextgenusfs/annorefine.git
cd annorefine
cargo build --release

See the Installation Guide for detailed instructions.

Quick Start

Python API:

import annorefine

# Refine annotations
result = annorefine.refine(
    fasta_file="genome.fa",
    gff3_file="annotations.gff3",
    bam_file="alignments.bam",
    output_file="refined.gff3"
)

# Generate hints for gene prediction
result = annorefine.bam2hints(
    bam_file="alignments.bam",
    output_file="hints.gff",
    library_type="RF",
    contig_map={"NC_000001.11": "chr1"}  # Optional: rename contigs
)

# Join hints from multiple sources
result = annorefine.join_hints(
    input_files=["bam_hints.gff", "protein_hints.gff"],
    output_file="joined_hints.gff"
)
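
Conceptually, joining hints means collapsing identical records from several files into one. The sketch below illustrates the idea and is not AnnoRefine's implementation. It assumes Augustus-style hints: nine tab-separated GFF columns with `src=` and an optional `mult=` attribute in the last column. The helper names `parse_attrs` and `join_hint_lines` are hypothetical.

```python
def parse_attrs(field: str) -> dict:
    """Parse a 'key=value;key=value' attribute column into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def join_hint_lines(lines):
    """Merge identical hints and sum their multiplicities (mult=)."""
    merged = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed hint record
        attrs = parse_attrs(cols[8])
        # Hints are treated as identical when location, type, strand and source match.
        key = (cols[0], cols[2], int(cols[3]), int(cols[4]), cols[6], attrs.get("src", ""))
        mult = int(attrs.get("mult", 1))  # a hint without mult= counts once
        if key in merged:
            merged[key]["mult"] += mult
        else:
            merged[key] = {"cols": cols[:8], "attrs": attrs, "mult": mult}
    joined = []
    for key in sorted(merged):
        record = merged[key]
        attrs = dict(record["attrs"], mult=str(record["mult"]))
        attr_field = ";".join(f"{k}={v}" for k, v in attrs.items())
        joined.append("\t".join(record["cols"] + [attr_field]))
    return joined
```

Two intron hints covering the same interval, one with `mult=2` and one without a `mult` attribute, would come out as a single record with `mult=3`.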

Command Line:

# Refine annotations
annorefine utrs \
    --fasta genome.fa \
    --gff3 annotations.gff3 \
    --bam alignments.bam \
    --output refined.gff3

# Generate hints
annorefine bam2hints \
    --in alignments.bam \
    --out hints.gff \
    --stranded RF

# Join hints
annorefine join-hints \
    --input bam_hints.gff protein_hints.gff \
    --output joined_hints.gff
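
To call these commands from a workflow script, one option is to build the argument lists in Python and pass them to `subprocess`. This is a minimal sketch that uses only the subcommands and flags shown above. The helper names (`bam2hints_cmd`, `join_hints_cmd`, `run`) are hypothetical, and the `dry_run` default means nothing is executed unless requested.

```python
import shlex
import subprocess


def bam2hints_cmd(bam, out, stranded="RF"):
    """Argument list for `annorefine bam2hints`, using the flags shown above."""
    return ["annorefine", "bam2hints", "--in", bam, "--out", out, "--stranded", stranded]


def join_hints_cmd(inputs, out):
    """Argument list for `annorefine join-hints` with one or more input files."""
    return ["annorefine", "join-hints", "--input", *inputs, "--output", out]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        # check=True raises CalledProcessError on a non-zero exit status
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(bam2hints_cmd("alignments.bam", "hints.gff"))
    run(join_hints_cmd(["hints.gff", "protein_hints.gff"], "joined_hints.gff"))
```

Passing a list rather than a shell string avoids quoting problems with file names that contain spaces.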

See the User Guide for more examples.

Use Cases

  • Annotation Refinement - Improve existing gene models with RNA-seq evidence
  • Augustus Gene Prediction - Generate hints for ab initio gene prediction
  • GeneMark-ETP - Create intron-only hints for GeneMark
  • funannotate2 Integration - Seamless integration with gene prediction pipelines
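
Intron hints come from spliced alignments: each `N` operation in a read's CIGAR string marks a gap on the reference that spans an intron. The sketch below shows that general technique and is not AnnoRefine's code. The function names are hypothetical, and the output line follows an Augustus-style hint layout with an assumed `b2h` source label and `src=E` attribute.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance the reference position


def introns_from_cigar(pos, cigar):
    """Return 1-based inclusive (start, end) intron intervals for one alignment.

    pos is the 1-based leftmost reference position of the alignment.
    """
    introns = []
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "N":
            introns.append((ref, ref + n - 1))
        if op in REF_CONSUMING:
            ref += n
    return introns


def intron_hint(seqid, start, end, strand="."):
    """Format one intron as a GFF-style hint line (assumed layout)."""
    return "\t".join([seqid, "b2h", "intron", str(start), str(end), "0", strand, ".", "src=E"])


if __name__ == "__main__":
    for start, end in introns_from_cigar(100, "50M200N50M"):
        print(intron_hint("chr1", start, end, "+"))
```

Here the read covers reference positions 100-149 and 350-399, so the single intron spans 150-349.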

Performance

  • Multi-threaded - Parallel processing with Rust backend
  • Memory efficient - Streaming BAM processing
  • Scalable - Handles mammalian-sized genomes efficiently

Typical performance:

  • Human genome (~20K genes): 10-30 minutes on 8 cores
  • Memory usage: 2-8 GB depending on genome size

Citation

Palmer, J. (2025). AnnoRefine: High-performance genome annotation refinement using RNA-seq data.
GitHub: https://github.com/nextgenusfs/annorefine

License

MIT License - see LICENSE file for details.


Built with ❤️ in Rust

Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • annorefine-2026.2.22-cp313-cp313-manylinux_2_28_x86_64.whl (7.7 MB) - CPython 3.13, manylinux: glibc 2.28+ x86-64
  • annorefine-2026.2.22-cp313-cp313-macosx_11_0_arm64.whl (7.2 MB) - CPython 3.13, macOS 11.0+ ARM64
  • annorefine-2026.2.22-cp313-cp313-macosx_10_12_x86_64.whl (6.6 MB) - CPython 3.13, macOS 10.12+ x86-64
  • annorefine-2026.2.22-cp312-cp312-manylinux_2_28_x86_64.whl (7.7 MB) - CPython 3.12, manylinux: glibc 2.28+ x86-64
  • annorefine-2026.2.22-cp312-cp312-macosx_11_0_arm64.whl (7.2 MB) - CPython 3.12, macOS 11.0+ ARM64
  • annorefine-2026.2.22-cp312-cp312-macosx_10_12_x86_64.whl (6.6 MB) - CPython 3.12, macOS 10.12+ x86-64
  • annorefine-2026.2.22-cp311-cp311-manylinux_2_28_x86_64.whl (7.7 MB) - CPython 3.11, manylinux: glibc 2.28+ x86-64
  • annorefine-2026.2.22-cp311-cp311-macosx_11_0_arm64.whl (7.2 MB) - CPython 3.11, macOS 11.0+ ARM64
  • annorefine-2026.2.22-cp311-cp311-macosx_10_12_x86_64.whl (6.6 MB) - CPython 3.11, macOS 10.12+ x86-64
  • annorefine-2026.2.22-cp310-cp310-manylinux_2_28_x86_64.whl (7.7 MB) - CPython 3.10, manylinux: glibc 2.28+ x86-64
  • annorefine-2026.2.22-cp310-cp310-macosx_11_0_arm64.whl (7.2 MB) - CPython 3.10, macOS 11.0+ ARM64
  • annorefine-2026.2.22-cp310-cp310-macosx_10_12_x86_64.whl (6.6 MB) - CPython 3.10, macOS 10.12+ x86-64
  • annorefine-2026.2.22-cp39-cp39-manylinux_2_28_x86_64.whl (7.7 MB) - CPython 3.9, manylinux: glibc 2.28+ x86-64
  • annorefine-2026.2.22-cp39-cp39-macosx_11_0_arm64.whl (7.2 MB) - CPython 3.9, macOS 11.0+ ARM64
  • annorefine-2026.2.22-cp39-cp39-macosx_10_12_x86_64.whl (6.6 MB) - CPython 3.9, macOS 10.12+ x86-64
