oplogger

A minimal library for capturing ML/NLP operation traces for later training data export.

Installation

pip install oplogger

# With PostgreSQL support
pip install oplogger[postgres]

# With export features (pandas, HuggingFace datasets)
pip install oplogger[export]

Note: The package is installed as oplogger but imported as oplog.

Quick Start

from oplog import configure, op, run, db, export

# Configure once at startup
configure(project="my_project", backend="sqlite:///traces.db")

# Log standalone operations
op("classify") \
    .model("setfit-intent") \
    .input(text="hello world") \
    .output(label="greeting", score=0.95) \
    .save()

# Log grouped operations within a run (with run-level metadata for A/B testing)
with run(strategy="rerank_v2", experiment="exp_042") as r:
    op("retrieve") \
        .model("bge-m3") \
        .input(query="capital of France?", k=10) \
        .output(candidates=["Paris is the capital..."]) \
        .save()

    op("rerank") \
        .model("bge-reranker-base") \
        .input(query="capital of France?", candidates=[...]) \
        .output(ranked=["Paris is the capital..."], scores=[0.94]) \
        .meta(latency_ms=42) \
        .save()
    # Both ops get meta={"strategy": "rerank_v2", "experiment": "exp_042", ...}

# Flag for training
db.flag(run_id=r.id, reason="training", note="clean example")

# Query and export
records = db.query(operation="rerank", flagged_for="training")
export.to_jsonl(records, "training_data.jsonl")

API Reference

Configuration

configure(project="name", backend="sqlite:///traces.db")

Backend formats:

  • SQLite: sqlite:///path/to/traces.db (auto-creates file and parent directories)
  • PostgreSQL: postgresql://user:pass@host:port/dbname
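As a sketch, a PostgreSQL backend URL can be assembled from environment variables at startup. The variable names below (PGUSER, PGPASSWORD, etc.) are the conventional libpq names, chosen for illustration; oplogger itself does not require them:

```python
import os

# Hedged sketch: build a backend URL in the documented
# postgresql://user:pass@host:port/dbname format from environment
# variables, with local-development defaults.
user = os.environ.get("PGUSER", "app")
password = os.environ.get("PGPASSWORD", "secret")
host = os.environ.get("PGHOST", "localhost")
port = os.environ.get("PGPORT", "5432")
dbname = os.environ.get("PGDATABASE", "traces")

backend = f"postgresql://{user}:{password}@{host}:{port}/{dbname}"
# e.g. "postgresql://app:secret@localhost:5432/traces"
```

The resulting string can be passed straight to `configure(project=..., backend=backend)`.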

Operations

op("operation_type")        # Start building an operation
    .model("model-name")    # Model identifier
    .input(**kwargs)        # Input data (JSON)
    .output(**kwargs)       # Output data (JSON)
    .meta(**kwargs)         # Metadata (latency, tokens, etc.)
    .tags("tag1", "tag2")   # Categorical tags
    .save()                 # Persist and return operation ID

Runs

with run() as r:            # Auto-generated run ID
    op(...).save()          # seq=0
    op(...).save()          # seq=1
    print(r.id)             # Access run ID

with run("custom-id"):      # Explicit run ID
    ...

# Run-level metadata (propagates to all operations in the run)
with run(strategy="methodA", experiment_id="exp123") as r:
    op("test").save()                      # meta={"strategy": "methodA", "experiment_id": "exp123"}
    op("test").meta(latency_ms=42).save()  # meta includes both run + op metadata

Run metadata is merged with operation metadata; on key conflicts, operation-level values override run-level values.
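The merge semantics above can be illustrated with a plain dict merge. This mirrors the documented behavior; it is not oplog's internal code:

```python
# Plain-Python illustration of the documented merge: operation-level
# metadata overrides run-level metadata on conflicting keys.
run_meta = {"strategy": "methodA", "experiment_id": "exp123"}
op_meta = {"latency_ms": 42, "strategy": "methodB"}  # "strategy" conflicts

merged = {**run_meta, **op_meta}  # later (op-level) values win
print(merged)
# {'strategy': 'methodB', 'experiment_id': 'exp123', 'latency_ms': 42}
```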

Database Operations

# Query
records = db.query(
    operation="rerank",     # Filter by operation type
    model="model-name",     # Filter by model
    run_id="...",           # Filter by run
    flagged_for="training", # Filter by flag
    tags=["tag1", "tag2"],  # Filter by tags (AND logic)
    limit=100,              # Pagination
    offset=0,
)
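The tags filter's AND logic means a record matches only if it carries every requested tag. A plain-Python sketch of that behavior (illustrative, not oplog internals):

```python
# A record matches the tags filter only if its tag set contains
# every requested tag (AND logic, not OR).
records = [
    {"id": 1, "tags": ["tag1", "tag2", "extra"]},
    {"id": 2, "tags": ["tag1"]},
    {"id": 3, "tags": ["tag2"]},
]
wanted = {"tag1", "tag2"}
matches = [r for r in records if wanted.issubset(r["tags"])]
print([r["id"] for r in matches])  # [1]
```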

# Flag
db.flag(ids=[...], reason="training", note="optional note")
db.flag(run_id="...", reason="review")

# Unflag
db.unflag(ids=[...])
db.unflag(run_id="...")

Export

# JSONL
export.to_jsonl(records, "output.jsonl")

# CSV
export.to_csv(records, "output.csv")

# pandas DataFrame
df = export.to_dataframe(records)

# HuggingFace Dataset
dataset = export.to_dataset(records)

# Field selection (dot notation for nested fields)
export.to_jsonl(records, "output.jsonl", fields=["inputs.query", "outputs.score"])
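Dot-notation field selection walks each dotted path into the nested record. A sketch of that lookup in plain Python (illustrative; oplog's actual export code may differ):

```python
# Resolve a dotted path like "inputs.query" against a nested record dict.
def pick(record, path):
    value = record
    for key in path.split("."):
        value = value[key]
    return value

record = {
    "inputs": {"query": "capital of France?"},
    "outputs": {"score": 0.94},
}
selected = {f: pick(record, f) for f in ["inputs.query", "outputs.score"]}
print(selected)
# {'inputs.query': 'capital of France?', 'outputs.score': 0.94}
```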

Multi-Tracer Usage

For multiple projects or explicit control:

from oplog import Tracer

tracer = Tracer(project="my_project", backend="sqlite:///traces.db")

tracer.op("classify").input(...).save()

with tracer.run() as r:
    tracer.op("rerank").input(...).save()

License

MIT
