Pydantic-backed Metric.
fgmetric
Type-validated Python models for delimited data files.
Overview
fgmetric lets you define Python classes ("Metrics") that map directly to rows in CSV/TSV files.
It handles parsing, type coercion (strings → int, float, bool), and validation automatically using Pydantic.
Installation
Requires Python 3.12 or later.
pip install fgmetric
Or with uv:
uv add fgmetric
Why fgmetric?
If you're a bioinformatician or data engineer processing delimited files in Python, you've probably written code like this:
import csv

with open("metrics.tsv") as f:
    reader = csv.DictReader(f, delimiter="\t")
    for row in reader:
        quality = int(row["mapping_quality"])
        is_duplicate = row["is_duplicate"].lower() in ("true", "1", "yes")
        if row["score"]:  # handle empty strings
            score = float(row["score"])
        # ... repeat for every field
fgmetric replaces this with:
for metric in AlignmentMetric.read(path):
    # metric.mapping_quality is already an int
    # metric.is_duplicate is already a bool
    # metric.score is already Optional[float]
    ...
How it compares:
- vs. csv + dataclasses: Automatic type coercion and validation without boilerplate. Because fgmetric is built on Pydantic, custom validators and serializers can be added readily.
- vs. pandas: Unlike pandas, fgmetric processes records lazily, so you can handle files larger than memory. Metrics are also type-validated and can be made immutable, making them safe to pass between functions without defensive copying.
- vs. Pydantic alone: fgmetric handles CSV/TSV specifics (header parsing, delimiter configuration) and provides out-of-the-box features like empty value handling and Counter field pivoting.
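To make the "lazy" point in the pandas comparison concrete, here is a minimal standard-library sketch of record-at-a-time reading. It is not fgmetric's implementation: `Row` and `iter_rows` are hypothetical names invented for this illustration, and a frozen dataclass stands in for an immutable Metric.

```python
import csv
import io
from collections.abc import Iterable, Iterator
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class Row:
    """Hypothetical record type used only for this sketch."""

    read_name: str
    mapping_quality: int


def iter_rows(lines: Iterable[str]) -> Iterator[Row]:
    """Yield one typed record at a time instead of building a full list."""
    for raw in csv.DictReader(lines, delimiter="\t"):
        yield Row(read_name=raw["read_name"], mapping_quality=int(raw["mapping_quality"]))


data = io.StringIO("read_name\tmapping_quality\nread1\t60\nread2\t30\n")
rows = iter_rows(data)
first = next(rows)  # the second data row has not been parsed yet
print(first)
```

Because the generator only advances when asked, memory use stays proportional to a single record rather than to the whole file.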
Quick Start
Define a class to represent each row:
from pathlib import Path

from fgmetric import Metric, MetricWriter


class AlignmentMetric(Metric):
    read_name: str
    mapping_quality: int
    is_duplicate: bool = False
Then read or write:
# Reading
for metric in AlignmentMetric.read(Path("alignments.tsv")):
    print(f"{metric.read_name}: MQ={metric.mapping_quality}")

# Writing
metrics = [
    AlignmentMetric(read_name="read1", mapping_quality=60),
    AlignmentMetric(read_name="read2", mapping_quality=30, is_duplicate=True),
]
with MetricWriter(AlignmentMetric, Path("output.tsv")) as writer:
    writer.writeall(metrics)
Example input file (alignments.tsv):
read_name mapping_quality is_duplicate
read1 60 false
read2 30 true
Invalid data raises pydantic.ValidationError with details about which field failed.
Core Usage
Custom Delimiters
Both reading and writing support custom delimiters for working with CSV or other formats:
# Reading CSV files
for metric in MyMetric.read(Path("data.csv"), delimiter=","):
    ...

# Writing CSV files
with MetricWriter(MyMetric, Path("output.csv"), delimiter=",") as writer:
    ...
List Fields
Fields typed as list[T] are automatically parsed from and serialized to delimited strings:
class TaggedRead(Metric):
    read_id: str
    tags: list[str]                  # "A,B,C" becomes ["A", "B", "C"]
    scores: list[int]                # "1,2,3" becomes [1, 2, 3]
    optional_tags: list[str] | None  # "" becomes None
The list delimiter defaults to a comma (,) but can be customized per metric:
class SemicolonMetric(Metric):
    collection_delimiter = ";"
    values: list[int]  # "1;2;3" becomes [1, 2, 3]
Counter Fields
When your file has categorical data with one column per category (e.g. base counts A, C, G, T), you can model them as a single Counter[StrEnum] field:
from collections import Counter
from enum import StrEnum

from fgmetric import Metric


class Base(StrEnum):
    A = "A"
    C = "C"
    G = "G"
    T = "T"


class BaseCountMetric(Metric):
    position: int
    counts: Counter[Base]

# Input TSV:
# position A C G T
# 1 10 5 3 2
# Parses to:
# BaseCountMetric(position=1, counts=Counter({Base.A: 10, Base.C: 5, ...}))
Contributing
See the contributing guide for development setup and testing instructions.