LLM-powered automated data labeling: schema-first, confidence-scored, and local-model ready
Tagnify
LLM-powered automated data labeling: schema-first, confidence-scored, local-model ready.
from tagnify import Tagnify, Schema, Example

schema = Schema(
    labels=["positive", "negative", "neutral"],
    examples=[Example(text="Great product!", label="positive")],
)

tagnify = Tagnify(model="qwen2.5:7b")
result = tagnify.label("This was a disappointing experience.", schema)
print(result.label)       # "negative"
print(result.confidence)  # 0.91
Features
- Schema-first design — labels and few-shot examples are defined upfront; examples are mandatory, not optional
- Confidence scoring — every label includes a confidence score; low-confidence results are automatically flagged
- Automatic retries — invalid or malformed model output triggers a retry with a stronger prompt, up to 3 attempts
- Reasoning traces — optionally request a one-line explanation for each label (reasoning=True)
- Pluggable backends — run locally for free via Ollama, or plug in your own LLM API
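The validation and flagging behaviour listed above can be pictured with a small standalone sketch. It does not reproduce Tagnify's internals, which this README does not show: `parse_label_response`, `LabelResult`, the JSON reply shape, and the 0.7 threshold are all assumptions chosen for illustration.

```python
import json
from dataclasses import dataclass

# Hypothetical cut-off; the README does not state the real threshold.
LOW_CONFIDENCE_THRESHOLD = 0.7


@dataclass
class LabelResult:
    label: str
    confidence: float
    flagged: bool  # True when confidence falls below the threshold


def parse_label_response(raw: str, allowed_labels: list[str]) -> LabelResult:
    """Validate a model reply, assumed to be JSON like
    {"label": "...", "confidence": 0.0-1.0}."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")

    label = data.get("label")
    if label not in allowed_labels:
        raise ValueError(f"label {label!r} is not in the schema")

    confidence = data.get("confidence")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")

    flagged = confidence < LOW_CONFIDENCE_THRESHOLD
    return LabelResult(label, float(confidence), flagged)


labels = ["positive", "negative", "neutral"]
confident = parse_label_response('{"label": "negative", "confidence": 0.91}', labels)
uncertain = parse_label_response('{"label": "neutral", "confidence": 0.42}', labels)
print(confident.flagged, uncertain.flagged)
```

In this sketch, a reply that fails validation raises `ValueError`, which is the kind of signal a caller could use to decide on a retry.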
Installation
pip install tagnify
Requires Ollama running locally for the default backend.
Custom Backends
Have your own LLM API (an internal company model, a provider Tagnify doesn't support yet,
anything that isn't Ollama)? Implement BaseBackend and wire it in with Tagnify.with_backend():
import httpx

from tagnify import Tagnify, Schema, Example
from tagnify.backends.base import BaseBackend
from tagnify.exceptions import BackendError


class MyCompanyBackend(BaseBackend):
    def __init__(self, endpoint: str, api_key: str, model: str):
        self.endpoint = endpoint
        self.api_key = api_key
        self.model = model

    def complete(self, prompt: str) -> str:
        """Send the prompt to the API and return the raw completion text."""
        try:
            response = httpx.post(
                self.endpoint,
                headers={"Authorization": f"Bearer {self.api_key}"},
                json={"model": self.model, "prompt": prompt},
                timeout=60.0,
            )
            response.raise_for_status()
        except httpx.HTTPError as e:
            # Wrap transport and HTTP status errors so Tagnify sees a BackendError.
            raise BackendError(f"Backend call failed: {e}") from e
        return response.json()["text"]


backend = MyCompanyBackend(endpoint="...", api_key="...", model="...")
tagnify = Tagnify.with_backend(backend)
# `schema` is the Schema defined in the first example.
result = tagnify.label("This was a disappointing experience.", schema)
Everything downstream (retries, parsing, validation, confidence scoring) works identically,
regardless of where complete() gets its text from. Wrap your own network/API errors in
BackendError so the retry logic behaves correctly: a BackendError is treated as an
infrastructure failure and is not retried, whereas a malformed model response is.
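The difference between the two failure modes can be sketched without Tagnify installed. The block below is an illustration under assumptions, not the library's retry code: `label_with_retries` is a hypothetical helper, `BackendError` is a local stand-in for tagnify.exceptions.BackendError, and the wording of the stricter follow-up prompt is invented.

```python
from typing import Callable


class BackendError(Exception):
    """Local stand-in for tagnify.exceptions.BackendError."""


MAX_ATTEMPTS = 3  # "up to 3 attempts", per the feature list


def label_with_retries(
    complete: Callable[[str], str], prompt: str, allowed: list[str]
) -> str:
    """Retry on malformed output; let BackendError propagate untouched."""
    current_prompt = prompt
    for _ in range(MAX_ATTEMPTS):
        # A BackendError raised here is not caught, so it is never retried.
        raw = complete(current_prompt).strip().lower()
        if raw in allowed:
            return raw
        # Malformed output: retry with stricter instructions (assumed wording).
        current_prompt = (
            f"{prompt}\nAnswer with exactly one of: {', '.join(allowed)}."
        )
    raise ValueError(f"no valid label after {MAX_ATTEMPTS} attempts")


# A fake model that answers badly once, then correctly.
replies = iter(["I think it is bad", "negative"])
calls: list[str] = []


def flaky_model(prompt: str) -> str:
    calls.append(prompt)
    return next(replies)


result = label_with_retries(
    flaky_model,
    "Label this text: 'This was a disappointing experience.'",
    ["positive", "negative", "neutral"],
)
print(result, len(calls))
```

Here the malformed first reply costs one extra call, while a `complete` callable that raises `BackendError` would fail on the first attempt without consuming the remaining ones.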
Documentation
Full docs at docs.tagnify.io — coming soon.
License
MIT
File details
Details for the file tagnify-0.2.0.tar.gz.
File metadata
- Download URL: tagnify-0.2.0.tar.gz
- Upload date:
- Size: 20.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 690d759b901310894ff0ad9f2067cb2ff2ffe11f538afcb6e73297a9c9bf2275 |
| MD5 | 87b0bb4286ea0423731773123842ab0a |
| BLAKE2b-256 | 37c0758ce58506d7b84be4a04d0c2c6a6f791d58466795ecbc71c33f1a52b42c |
Provenance
The following attestation bundles were made for tagnify-0.2.0.tar.gz:
Publisher: publish.yml on MaulanaArya30/tagnify
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tagnify-0.2.0.tar.gz
- Subject digest: 690d759b901310894ff0ad9f2067cb2ff2ffe11f538afcb6e73297a9c9bf2275
- Sigstore transparency entry: 1851365401
- Sigstore integration time:
- Permalink: MaulanaArya30/tagnify@34a20d066fb88c40c06235704ce44b2dfa04e3eb
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/MaulanaArya30
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@34a20d066fb88c40c06235704ce44b2dfa04e3eb
- Trigger Event: push
File details
Details for the file tagnify-0.2.0-py3-none-any.whl.
File metadata
- Download URL: tagnify-0.2.0-py3-none-any.whl
- Upload date:
- Size: 12.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bd5978537c4d3c9b23a64385774313ea46167048bf9f5c6b2fed85871dd53fbe |
| MD5 | cc3bc81d5a238c925debef5b792ef684 |
| BLAKE2b-256 | 7bec69323caefda57738fb8bde5824b0e70471e7884f998559e3f9fa7ac08f90 |
Provenance
The following attestation bundles were made for tagnify-0.2.0-py3-none-any.whl:
Publisher: publish.yml on MaulanaArya30/tagnify
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tagnify-0.2.0-py3-none-any.whl
- Subject digest: bd5978537c4d3c9b23a64385774313ea46167048bf9f5c6b2fed85871dd53fbe
- Sigstore transparency entry: 1851365496
- Sigstore integration time:
- Permalink: MaulanaArya30/tagnify@34a20d066fb88c40c06235704ce44b2dfa04e3eb
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/MaulanaArya30
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@34a20d066fb88c40c06235704ce44b2dfa04e3eb
- Trigger Event: push