Python bindings for the mehari variant annotator

mehari

Python bindings for the mehari Rust library.

Features

  • Single variants: Annotate a single variant using a format string (chr:pos:ref:alt) or keyword arguments.
  • DataFrames: Process batches of variants by passing a polars.DataFrame.
  • LazyFrames: Support for polars.LazyFrame to process large datasets (like Parquet files) without loading everything into memory.

Usage

Initialize SeqvarsAnnotator with the path to your transcript database (see mehari-data-tx) and the path to a reference genome, which must be an uncompressed, indexed FASTA file:

from mehari import SeqvarsAnnotator

annotator = SeqvarsAnnotator(
    transcript_db_paths=["path/to/txs.bin.zst"],
    reference_path="path/to/reference.fa"
)

To annotate a single variant, use either a colon-separated format string or keyword arguments:

result1 = annotator.annotate("17:41197701:G:C")
result2 = annotator.annotate(chromosome="3", position=193332511, reference="G", alternative="T")
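Both call styles carry the same four fields. As a rough illustration of how the colon-separated string maps onto the keyword arguments, here is a minimal sketch. `Variant` and `parse_variant` are hypothetical helpers written for this example, not part of mehari, and mehari's own parsing may be stricter or more lenient.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical container mirroring the keyword arguments of annotate()."""

    chromosome: str
    position: int
    reference: str
    alternative: str


def parse_variant(spec: str) -> Variant:
    """Split a 'chr:pos:ref:alt' string into its four fields."""
    parts = spec.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected chr:pos:ref:alt, got {spec!r}")
    chromosome, position, reference, alternative = parts
    # The position is the only numeric field; int() raises ValueError otherwise.
    return Variant(chromosome, int(position), reference, alternative)


# Dictionary whose keys match the keyword form shown above.
kwargs = parse_variant("17:41197701:G:C")._asdict()
```

Because the keys match the keyword form, such a dictionary could be forwarded as `annotator.annotate(**kwargs)`.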

To annotate a batch of variants, pass a polars.DataFrame or polars.LazyFrame:

import polars as pl

df = pl.DataFrame(
    {
        "chromosome": ["17", "3"],
        "position": [41197701, 193332511],
        "reference": ["G", "G"],
        "alternative": ["C", "T"],
    },
    schema={
        "chromosome": pl.Categorical, "position": pl.Int32,
        "reference": pl.String, "alternative": pl.String
    }
)

annotated_df = annotator.annotate(df)

Schemas and types

Enums

Mehari exports its internal enums to Python so you can use them for filtering or comparisons:

from mehari import ConsequenceEnum, ImpactEnum

DataFrame Schema

When annotating a DataFrame or LazyFrame, mehari appends an "annotation" column. This column is a polars List(Struct) with the following fields:

  • allele: String
  • consequences: List(ConsequenceEnum)
  • putative_impact: ImpactEnum
  • gene_symbol: String
  • gene_id: String
  • feature_type: String
  • feature_id: String
  • feature_biotype: List(String)
  • feature_tags: List(String)
  • rank: Struct(ord: Int32, total: Int32)
  • cdna_pos: Struct(ord: Int32, total: Int32)
  • cds_pos: Struct(ord: Int32, total: Int32)
  • protein_pos: Struct(ord: Int32, total: Int32)
  • hgvs_g: String
  • hgvs_n: String
  • hgvs_c: String
  • hgvs_p: String
  • distance: Int32
  • strand: Int32
  • messages: List(String)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mehari-0.41.1.tar.gz (585.2 kB)

Uploaded Source

Built Distributions

mehari-0.41.1-cp313-cp313-manylinux_2_28_x86_64.whl (6.8 MB)

Uploaded CPython 3.13, manylinux: glibc 2.28+ x86-64

mehari-0.41.1-cp312-cp312-manylinux_2_28_x86_64.whl (6.8 MB)

Uploaded CPython 3.12, manylinux: glibc 2.28+ x86-64

mehari-0.41.1-cp311-cp311-manylinux_2_28_x86_64.whl (6.8 MB)

Uploaded CPython 3.11, manylinux: glibc 2.28+ x86-64

File details

Details for the file mehari-0.41.1.tar.gz.

File metadata

  • Download URL: mehari-0.41.1.tar.gz
  • Upload date:
  • Size: 585.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: maturin/1.12.6

File hashes

Hashes for mehari-0.41.1.tar.gz
Algorithm Hash digest
SHA256 be8d961b902537ae28f324e16875a69e54f5082a51a7ccbb0347f93e7fd5908f
MD5 61f285e39c093933e354ed94249faf48
BLAKE2b-256 f19cc7eb179c02cb1930e513808915abd79d533575d3ed34b259dbc50211ab11

See more details on using hashes here.

File details

Details for the file mehari-0.41.1-cp313-cp313-manylinux_2_28_x86_64.whl.

File hashes

Hashes for mehari-0.41.1-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 13411bab23bd5efb56084acd0847ac59cfc10fbe8b5aba5ed190b517df94928f
MD5 917ad8bf3dbc5113bec0a814fce369ad
BLAKE2b-256 602d9087542b0dac6a84f166c705f375c0399681171d9d6044b21eb23bbc28b5

File details

Details for the file mehari-0.41.1-cp312-cp312-manylinux_2_28_x86_64.whl.

File hashes

Hashes for mehari-0.41.1-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 98fc44f29d26cd543f29d030e3396aa5da1643fdadb41f7bee20541ebdc4cd23
MD5 ff07724e38f9077aa91c7125d7ddf18c
BLAKE2b-256 6244d6772e7896d4e3eb07634aeb51dbf218a39432c9224ab299913a37e43cbe

File details

Details for the file mehari-0.41.1-cp311-cp311-manylinux_2_28_x86_64.whl.

File hashes

Hashes for mehari-0.41.1-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 8e6eec8b72e9e0184fdb48fa64a36d501352727f005ac0742466021d27f08e75
MD5 2ddbb9679bd037fadb3273be9c5c9239
BLAKE2b-256 9da7f1e4ce4da61e908ff25859cabd67c248092834ffc5cef605a317731e60fc
