Pydantic-backed Metric.

fgmetric

Type-validated Python models for delimited data files.

Visit us at Fulcrum Genomics to learn more about how we can power your Bioinformatics with fgmetric and beyond.

Overview

fgmetric lets you define Python classes ("Metrics") that map directly to rows in CSV/TSV files. It handles parsing, type coercion (strings → int, float, bool), and validation automatically using Pydantic.

Installation

Requires Python 3.12 or later.

pip install fgmetric

Or with uv:

uv add fgmetric

Why fgmetric?

If you're a bioinformatician or data engineer processing delimited files in Python, you've probably written code like this:

import csv

with open("metrics.tsv") as f:
    reader = csv.DictReader(f, delimiter="\t")
    for row in reader:
        quality = int(row["mapping_quality"])
        is_duplicate = row["is_duplicate"].lower() in ("true", "1", "yes")
        if row["score"]:  # handle empty strings
            score = float(row["score"])
        # ... repeat for every field

fgmetric replaces this with:

for metric in AlignmentMetric.read(path):
    # metric.mapping_quality is already an int
    # metric.is_duplicate is already a bool
    # metric.score is already Optional[float]
    ...

How it compares:

  • vs. csv + dataclasses: Automatic type coercion and validation without boilerplate. Because fgmetric is built on Pydantic, custom validators and serializers can be added readily.
  • vs. pandas: fgmetric processes records lazily, so you can handle files larger than memory. Metrics are also type-validated and can be made immutable, making them safe to pass between functions without defensive copying.
  • vs. Pydantic alone: fgmetric handles CSV/TSV specifics (header parsing, delimiter configuration) and provides out-of-the-box features like empty value handling and Counter field pivoting.
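
The lazy-processing point can be illustrated without fgmetric at all. The sketch below uses only the standard library. `iter_rows` is a hypothetical helper, not part of fgmetric's API, and it shows only the general generator pattern of yielding one parsed row at a time instead of loading the whole file.

```python
import csv
import io
from collections.abc import Iterator
from typing import TextIO


def iter_rows(handle: TextIO, delimiter: str = "\t") -> Iterator[dict[str, str]]:
    """Yield one row at a time; only the current row is held in memory."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    for row in reader:
        yield row


# An in-memory stand-in for a large TSV file.
data = io.StringIO("read_name\tmapping_quality\nread1\t60\nread2\t30\n")

# Aggregate over the stream without ever building a list of rows.
total_quality = sum(int(row["mapping_quality"]) for row in iter_rows(data))
print(total_quality)  # prints 90
```

Because the rows are consumed as they are produced, memory use stays roughly constant regardless of file size, which is the property the pandas comparison refers to.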

Quick Start

Define a class to represent each row:

from pathlib import Path
from fgmetric import Metric, MetricWriter


class AlignmentMetric(Metric):
    read_name: str
    mapping_quality: int
    is_duplicate: bool = False

Then read or write:

# Reading
for metric in AlignmentMetric.read(Path("alignments.tsv")):
    print(f"{metric.read_name}: MQ={metric.mapping_quality}")

# Writing
metrics = [
    AlignmentMetric(read_name="read1", mapping_quality=60),
    AlignmentMetric(read_name="read2", mapping_quality=30, is_duplicate=True),
]
with MetricWriter(AlignmentMetric, Path("output.tsv")) as writer:
    writer.writeall(metrics)

Example input file (alignments.tsv):

read_name	mapping_quality	is_duplicate
read1	60	false
read2	30	true
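
To try the reading example locally, the input file above can be generated with a few lines of standard-library Python. This is only a convenience sketch: the file name and contents come from the example, and it writes the file into the current working directory.

```python
from pathlib import Path

# Tab-separated content matching the example above: one header row, two records.
rows = [
    ("read_name", "mapping_quality", "is_duplicate"),
    ("read1", "60", "false"),
    ("read2", "30", "true"),
]

path = Path("alignments.tsv")
path.write_text("".join("\t".join(row) + "\n" for row in rows))
```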

Invalid data raises pydantic.ValidationError with details about which field failed.

Core Usage

Custom Delimiters

Both reading and writing support custom delimiters for working with CSV or other formats:

# Reading CSV files
for metric in MyMetric.read(Path("data.csv"), delimiter=","):
    ...

# Writing CSV files
with MetricWriter(MyMetric, Path("output.csv"), delimiter=",") as writer:
    ...

List Fields

Fields typed as list[T] are automatically parsed from and serialized to delimited strings:

class TaggedRead(Metric):
    read_id: str
    tags: list[str]           # "A,B,C" becomes ["A", "B", "C"]
    scores: list[int]         # "1,2,3" becomes [1, 2, 3]
    optional_tags: list[str] | None  # "" becomes None

The list delimiter defaults to a comma (,) but can be customized per metric:

class SemicolonMetric(Metric):
    collection_delimiter = ";"
    values: list[int]  # "1;2;3" becomes [1, 2, 3]

Counter Fields

When your file has categorical data with one column per category (e.g. base counts A, C, G, T), you can model them as a single Counter[StrEnum] field:

from collections import Counter
from enum import StrEnum
from fgmetric import Metric


class Base(StrEnum):
    A = "A"
    C = "C"
    G = "G"
    T = "T"


class BaseCountMetric(Metric):
    position: int
    counts: Counter[Base]


# Input TSV:
# position  A   C   G   T
# 1         10  5   3   2

# Parses to:
# BaseCountMetric(position=1, counts=Counter({Base.A: 10, Base.C: 5, ...}))

Contributing

See the contributing guide for development setup and testing instructions.
