# oplogger
A minimal library for capturing ML/NLP operation traces for later training data export.
## Installation

```bash
pip install oplogger

# With PostgreSQL support
pip install oplogger[postgres]

# With export features (pandas, HuggingFace datasets)
pip install oplogger[export]
```
Note: the package is installed as `oplogger` but imported as `oplog`.
## Quick Start

```python
from oplog import configure, op, run, db, export

# Configure once at startup
configure(project="my_project", backend="sqlite:///traces.db")

# Log standalone operations
op("classify") \
    .model("setfit-intent") \
    .input(text="hello world") \
    .output(label="greeting", score=0.95) \
    .save()

# Log grouped operations within a run (with run-level metadata for A/B testing)
with run(strategy="rerank_v2", experiment="exp_042") as r:
    op("retrieve") \
        .model("bge-m3") \
        .input(query="capital of France?", k=10) \
        .output(candidates=["Paris is the capital..."]) \
        .save()

    op("rerank") \
        .model("bge-reranker-base") \
        .input(query="capital of France?", candidates=[...]) \
        .output(ranked=["Paris is the capital..."], scores=[0.94]) \
        .meta(latency_ms=42) \
        .save()

# Both ops get meta={"strategy": "rerank_v2", "experiment": "exp_042", ...}

# Flag for training
db.flag(run_id=r.id, reason="training", note="clean example")

# Query and export
records = db.query(operation="rerank", flagged_for="training")
export.to_jsonl(records, "training_data.jsonl")
```
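A JSONL export like the one above is one JSON object per line, so it can be read back with the standard library alone. A minimal sketch (the record fields here are illustrative, not oplogger's actual schema):

```python
import json
import tempfile

# Hypothetical records shaped like the query example above
# (the exact record schema is an assumption).
records = [
    {
        "operation": "rerank",
        "inputs": {"query": "capital of France?"},
        "outputs": {"scores": [0.94]},
    },
]

# Write one JSON object per line...
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")
    path = f.name

# ...and read them back, one line at a time.
with open(path) as f:
    loaded = [json.loads(line) for line in f]

print(loaded[0]["outputs"]["scores"])  # [0.94]
```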
## API Reference

### Configuration

```python
configure(project="name", backend="sqlite:///traces.db")
```

Backend formats:

- SQLite: `sqlite:///path/to/traces.db` (auto-creates the file and parent directories)
- PostgreSQL: `postgresql://user:pass@host:port/dbname`
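Both backend strings are ordinary URLs, so their components can be inspected with the standard library if you need to build or validate one programmatically:

```python
from urllib.parse import urlsplit

backend = "postgresql://user:pass@host:5432/dbname"
parts = urlsplit(backend)

# The scheme selects the backend; the rest locates the database.
print(parts.scheme)    # postgresql
print(parts.hostname)  # host
print(parts.port)      # 5432
print(parts.path)      # /dbname
```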
### Operations

```python
op("operation_type")       # Start building an operation
    .model("model-name")   # Model identifier
    .input(**kwargs)       # Input data (JSON)
    .output(**kwargs)      # Output data (JSON)
    .meta(**kwargs)        # Metadata (latency, tokens, etc.)
    .tags("tag1", "tag2")  # Categorical tags
    .save()                # Persist and return operation ID
```
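The chained calls above follow the fluent-builder pattern: each setter records a field and returns `self` so the next call can chain. A minimal sketch of how such a builder can work (an illustration, not oplogger's implementation):

```python
class OpBuilder:
    """Illustrative fluent builder: each setter returns self so calls chain."""

    def __init__(self, operation):
        self.record = {
            "operation": operation,
            "inputs": {},
            "outputs": {},
            "meta": {},
            "tags": [],
        }

    def model(self, name):
        self.record["model"] = name
        return self

    def input(self, **kwargs):
        self.record["inputs"].update(kwargs)
        return self

    def output(self, **kwargs):
        self.record["outputs"].update(kwargs)
        return self

    def meta(self, **kwargs):
        self.record["meta"].update(kwargs)
        return self

    def tags(self, *names):
        self.record["tags"].extend(names)
        return self

    def save(self):
        # A real implementation would persist the record and return an ID;
        # here we just return the accumulated record.
        return self.record


rec = OpBuilder("classify").model("setfit-intent").input(text="hi").save()
print(rec["model"])  # setfit-intent
```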
### Runs

```python
with run() as r:        # Auto-generated run ID
    op(...).save()      # seq=0
    op(...).save()      # seq=1
    print(r.id)         # Access run ID

with run("custom-id"):  # Explicit run ID
    ...

# Run-level metadata (propagates to all operations in the run)
with run(strategy="methodA", experiment_id="exp123") as r:
    op("test").save()                      # meta={"strategy": "methodA", "experiment_id": "exp123"}
    op("test").meta(latency_ms=42).save()  # meta includes both run- and op-level metadata
```

Run metadata is merged with operation metadata; on key conflicts, operation-level values override run-level values.
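The merge rule can be stated precisely with plain dicts: run metadata first, then operation metadata, so operation-level keys win on conflict. A sketch of the documented semantics, not oplogger's code:

```python
run_meta = {"strategy": "methodA", "experiment_id": "exp123", "latency_ms": 999}
op_meta = {"latency_ms": 42}

# Later keys win in a dict merge, so op-level values override run-level ones.
merged = {**run_meta, **op_meta}

print(merged["latency_ms"])     # 42
print(merged["experiment_id"])  # exp123
```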
### Database Operations

```python
# Query
records = db.query(
    operation="rerank",      # Filter by operation type
    model="model-name",      # Filter by model
    run_id="...",            # Filter by run
    flagged_for="training",  # Filter by flag
    tags=["tag1", "tag2"],   # Filter by tags (AND logic)
    limit=100,               # Pagination
    offset=0,
)

# Flag
db.flag(ids=[...], reason="training", note="optional note")
db.flag(run_id="...", reason="review")

# Unflag
db.unflag(ids=[...])
db.unflag(run_id="...")
```
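The `tags=[...]` filter uses AND logic: a record matches only if it carries every listed tag. The same check expressed in plain Python over already-fetched records (field names here are assumed for illustration):

```python
# Hypothetical fetched records; "tags" field name is an assumption.
records = [
    {"id": 1, "tags": ["prod", "rerank"]},
    {"id": 2, "tags": ["prod"]},
]

wanted = {"prod", "rerank"}

# AND logic: every wanted tag must be present on the record.
matches = [r for r in records if wanted.issubset(r["tags"])]
print([r["id"] for r in matches])  # [1]
```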
### Export

```python
# JSONL
export.to_jsonl(records, "output.jsonl")

# CSV
export.to_csv(records, "output.csv")

# pandas DataFrame
df = export.to_dataframe(records)

# HuggingFace Dataset
dataset = export.to_dataset(records)

# Field selection (dot notation for nested fields)
export.to_jsonl(records, "output.jsonl", fields=["inputs.query", "outputs.score"])
```
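Dot-notation field selection walks nested dicts one key at a time. A stdlib sketch of that lookup (the helper name `get_path` is hypothetical, not part of oplogger's API):

```python
def get_path(record, dotted, default=None):
    """Resolve a dotted path like 'inputs.query' against nested dicts."""
    value = record
    for key in dotted.split("."):
        if not isinstance(value, dict) or key not in value:
            return default
        value = value[key]
    return value


record = {"inputs": {"query": "capital of France?"}, "outputs": {"score": 0.94}}

print(get_path(record, "inputs.query"))   # capital of France?
print(get_path(record, "outputs.score"))  # 0.94
print(get_path(record, "outputs.missing", default="n/a"))  # n/a
```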
## Multi-Tracer Usage

For multiple projects or explicit control, instantiate a `Tracer` directly:

```python
from oplog import Tracer

tracer = Tracer(project="my_project", backend="sqlite:///traces.db")

tracer.op("classify").input(...).save()

with tracer.run() as r:
    tracer.op("rerank").input(...).save()
```
## License
MIT