helixerlite: simplified genome annotation with Helixer

Helixerlite: Simplified Gene Prediction using Helixer and HelixerPost

This is a lightweight, predict-only version of Helixer and HelixerPost. Helixer is written in Python and includes many utilities for training models that end users who only want to predict genes in a genome do not need. For smaller eukaryotic genomes, a GPU is not necessary for prediction: on an average-sized Ascomycete fungal genome (~30 Mb), helixerlite should take less than 20 minutes to run.

HelixerPost is written in Rust and lives in a separate repository, which makes installing the two as a single tool cumbersome. helixerlite uses maturin and PyO3 to wrap the Rust code in a Python package, so everything runs as a single command-line tool.

Features

  • Convert FASTA files to HDF5 format for Helixer
  • Run gene prediction using a pre-trained Helixer model
  • Convert predictions to GFF3 format
  • Lightweight and easy to install
  • No GPU required for smaller genomes
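Since the "smaller genomes" guidance depends on assembly size, it can help to measure a FASTA file before deciding where to run it. The following is a minimal sketch in plain Python, not part of helixerlite; the `genome_size` helper and the `genome.fasta` path are hypothetical, and the project does not state a specific size cutoff beyond the ~30 Mb fungal example.

```python
from pathlib import Path


def genome_size(fasta_path):
    """Return the total number of sequence characters in a FASTA file."""
    total = 0
    with open(fasta_path) as handle:
        for line in handle:
            # Header lines start with '>' and carry no sequence.
            if not line.startswith(">"):
                total += len(line.strip())
    return total


if __name__ == "__main__":
    path = Path("genome.fasta")  # hypothetical input file
    if path.exists():
        print(f"{path}: {genome_size(path) / 1e6:.1f} Mb")
```

The count includes every non-header character, so soft-masked bases and runs of N are counted as part of the assembly size.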

Installation

Install helixerlite with pip or any other tool that installs from PyPI, such as uv:

python -m pip install helixerlite

Usage

Command-line Interface

HelixerLite provides a simple command-line interface:

# Run prediction
helixerlite --fasta genome.fasta --lineage fungi --out output.gff3 
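To call the command above from a Python script (for example, to loop over several genomes), a thin `subprocess` wrapper is usually enough. This is a sketch that uses only the three flags shown above; `build_command` and `run_helixerlite` are hypothetical helper names, not part of helixerlite.

```python
import shutil
import subprocess


def build_command(fasta, lineage, out):
    """Assemble the helixerlite call using only the flags shown above."""
    return ["helixerlite", "--fasta", str(fasta), "--lineage", lineage, "--out", str(out)]


def run_helixerlite(fasta, lineage, out):
    """Run helixerlite; raises CalledProcessError on a non-zero exit code."""
    if shutil.which("helixerlite") is None:
        raise FileNotFoundError("helixerlite executable not found on PATH")
    subprocess.run(build_command(fasta, lineage, out), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(build_command("genome.fasta", "fungi", "output.gff3")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.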

Python API

You can also use HelixerLite as a Python library:

from helixerlite import fasta2hdf5, preds2gff3
from helixerlite.hybrid_model import HybridModel

# Convert FASTA to HDF5
fasta2hdf5("genome.fasta", "genome.h5")

# Run prediction
model = HybridModel(["--load-model-path", "path/to/model",
                     "--test-data", "genome.h5",
                     "--prediction-output-path", "predictions.h5"])
model.run()
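The two steps above can be wrapped in one function so that intermediate file names are derived from the input. This is a sketch under stated assumptions: it uses only the `fasta2hdf5` and `HybridModel` calls shown above, while the `prediction_args` and `predict` helpers and the output naming scheme are hypothetical. The `preds2gff3` step is left out because its arguments are not shown here.

```python
from pathlib import Path


def prediction_args(model_path, hdf5_input, predictions_out):
    """Build the argument list passed to HybridModel in the example above."""
    return [
        "--load-model-path", str(model_path),
        "--test-data", str(hdf5_input),
        "--prediction-output-path", str(predictions_out),
    ]


def predict(fasta, model_path, workdir="."):
    """Convert a FASTA file to HDF5, run prediction, and return the predictions path."""
    # Imported lazily so this module loads even when helixerlite is not installed.
    from helixerlite import fasta2hdf5
    from helixerlite.hybrid_model import HybridModel

    workdir = Path(workdir)
    stem = Path(fasta).stem
    hdf5_input = workdir / f"{stem}.h5"                 # hypothetical naming scheme
    predictions = workdir / f"{stem}_predictions.h5"    # hypothetical naming scheme

    fasta2hdf5(str(fasta), str(hdf5_input))
    HybridModel(prediction_args(model_path, hdf5_input, predictions)).run()
    return predictions
```

Keeping the argument list in its own function makes it easy to inspect or log exactly what is handed to the model.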

Requirements

  • Python 3.8 or higher
  • TensorFlow 2.10 or higher
  • h5py
  • pyfastx
  • gfftk
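A quick way to check an environment against this list is to ask Python whether each package can be found, without importing it. The sketch below assumes each package's import name matches the name listed above; it checks the Python version but does not check the TensorFlow version.

```python
import sys
from importlib.util import find_spec

# Import names assumed to match the package names listed above.
REQUIRED_MODULES = ["tensorflow", "h5py", "pyfastx", "gfftk"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names of modules that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    if sys.version_info < (3, 8):
        print("Python 3.8 or higher is required")
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```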

Development

Setting up a development environment

# Clone the repository
git clone https://github.com/nextgenusfs/helixerlite.git
cd helixerlite

# Create a conda environment
conda create -n helixerlite python=3.10
conda activate helixerlite

# Install development dependencies
pip install -e ".[dev]"

Running tests

python -m pytest

Citation

If you use helixerlite, please cite the original Helixer authors and manuscript:

Felix Holst, Anthony Bolger, Christopher Günther, Janina Maß, Sebastian Triesch, Felicitas Kindel, Niklas Kiel, Nima Saadat, Oliver Ebenhöh, Björn Usadel, Rainer Schwacke, Marie Bolger, Andreas P.M. Weber, Alisandra K. Denton. Helixer—de novo Prediction of Primary Eukaryotic Gene Models Combining Deep Learning and a Hidden Markov Model. bioRxiv 2023.02.06.527280; doi: https://doi.org/10.1101/2023.02.06.527280

Download files

Source Distribution

  • helixerlite-25.5.27.tar.gz (108.9 kB): source

Built Distributions

  • helixerlite-25.5.27-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.8 MB): CPython 3.11, manylinux (glibc 2.17+), x86-64
  • helixerlite-25.5.27-cp311-cp311-macosx_11_0_arm64.whl (1.3 MB): CPython 3.11, macOS 11.0+, ARM64
  • helixerlite-25.5.27-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.8 MB): CPython 3.10, manylinux (glibc 2.17+), x86-64
  • helixerlite-25.5.27-cp310-cp310-macosx_11_0_arm64.whl (1.3 MB): CPython 3.10, macOS 11.0+, ARM64
  • helixerlite-25.5.27-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.8 MB): CPython 3.9, manylinux (glibc 2.17+), x86-64
  • helixerlite-25.5.27-cp39-cp39-macosx_11_0_arm64.whl (1.3 MB): CPython 3.9, macOS 11.0+, ARM64
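The wheel file names encode which interpreter and platform each build targets. As an illustration, the sketch below splits a wheel name into its components following the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag); `parse_wheel_name` is a hypothetical helper, not part of helixerlite or pip.

```python
def parse_wheel_name(filename):
    """Split a wheel file name into name, version, and tag components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Five fields, or six when an optional build tag follows the version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        # A compressed platform tag lists several platforms separated by '.'.
        "platforms": platform_tag.split("."),
    }


if __name__ == "__main__":
    info = parse_wheel_name("helixerlite-25.5.27-cp311-cp311-macosx_11_0_arm64.whl")
    print(info["python"], info["platforms"])
```

For real compatibility checks, pip performs this matching itself; the sketch is only meant to make the file names readable.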

File details

Details for the file helixerlite-25.5.27.tar.gz.

File metadata

  • Size: 108.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing: Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

  • SHA256: f326eb7b4bd6d6575bd9170bf2823205a27b71e5d97b340f412ab669e8e04a43
  • MD5: 0547369404504a4eba847b4fc9f695c8
  • BLAKE2b-256: 30553af5225e0c10580904cc8bc8d2dc2b4c7fad2beac4b09a8cc85cda0130bb

Provenance

  • Publisher: release.yml on nextgenusfs/helixerlite

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
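A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below is a generic example, not a helixerlite feature; `sha256_of` is a hypothetical helper, and the local file name is assumed to match the source distribution.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for helixerlite-25.5.27.tar.gz.
EXPECTED_SHA256 = "f326eb7b4bd6d6575bd9170bf2823205a27b71e5d97b340f412ab669e8e04a43"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    archive = Path("helixerlite-25.5.27.tar.gz")  # assumed local download path
    if archive.exists():
        print("match" if sha256_of(archive) == EXPECTED_SHA256 else "MISMATCH")
```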

File details

Details for the file helixerlite-25.5.27-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

  • SHA256: 27bb1dcfe839d7abbc9e6cc008331a6e3887fe23e9fa9108675ae2eee592f346
  • MD5: 003c1e88763028fb6349d7250c31d512
  • BLAKE2b-256: 31f865d52c1383a1832761f80bad6206de6e9e675adb67aaa8673cc709fe84c2

Provenance

  • Publisher: release.yml on nextgenusfs/helixerlite

File details

Details for the file helixerlite-25.5.27-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

  • SHA256: d1029a41ae346f95994263911169174495c1177215888db744006a0d2e1ea5bb
  • MD5: 8fcb263b34d1318f2bf5c2da7076df5f
  • BLAKE2b-256: 510279a58cfc45dd2fdc4747fbdb63304920faa095e456391e182972993ab075

Provenance

  • Publisher: release.yml on nextgenusfs/helixerlite

File details

Details for the file helixerlite-25.5.27-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

  • SHA256: 909ec8a5271765de548dbc6eb2eea9db8a7f6a618de4261b80925b8d9569bc21
  • MD5: 1fb295ac1bba0f9b62690ed234b5b696
  • BLAKE2b-256: a2982f815fd3b5c21f9963d2cdf2049a311a4452aea7c75b92d9538ce136721f

Provenance

  • Publisher: release.yml on nextgenusfs/helixerlite

File details

Details for the file helixerlite-25.5.27-cp310-cp310-macosx_11_0_arm64.whl.

File hashes

  • SHA256: a0bf8e5e882ae856c094a6844c5703daad48f5ffeb4588a0066551b27ae462ff
  • MD5: 3ee1cbc0dae37a69cc29543bbd323d6a
  • BLAKE2b-256: 60e5aba0b39d232efad29756171daf50eea55d0d905a976c34640b8fb339918d

Provenance

  • Publisher: release.yml on nextgenusfs/helixerlite

File details

Details for the file helixerlite-25.5.27-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

  • SHA256: 1a62c3518cb7ff3f5f24ac064a0c85fc4b6606fdf77441856b08a7f2b0305fff
  • MD5: fb02e9eb127a2eced59839ab5030ace3
  • BLAKE2b-256: f27d0a5858b15775205901389d009cdfd0409a258c7832d7d9f1864ee2720434

Provenance

  • Publisher: release.yml on nextgenusfs/helixerlite

File details

Details for the file helixerlite-25.5.27-cp39-cp39-macosx_11_0_arm64.whl.

File hashes

  • SHA256: 32a7f158982a581ec1e29c452b6388efaddf2119effa4f759be23f151d7db66b
  • MD5: cda28be02c1cf3aee1d3536c2e1b0f73
  • BLAKE2b-256: 3b5b832f2ee52c44bd38702859e6093f3eb4064557ffa2e97277e7ece1ec0c38

Provenance

  • Publisher: release.yml on nextgenusfs/helixerlite
