ollama-classifier
A Python wrapper around the Ollama Python SDK for text classification with constrained output and confidence scoring. Now supports multiple inference backends: Ollama, vLLM, SGLang, and llama.cpp.
Features
- Constrained Output: Uses a JSON schema with enum constraints to ensure only valid choices are generated (see the sketch after this list)
- Confidence Scoring: Multi-call evaluation with softmax for calibrated probabilities
- Sync & Async: Full support for both synchronous and asynchronous operations
- Batch Processing: Classify multiple texts efficiently
- Flexible Choices: Support for simple labels or labels with descriptions
- Custom Prompts: Override the default system prompt for specialized tasks
- Multiple Backends: Use Ollama, vLLM, SGLang, or llama.cpp as your inference engine (local or remote)
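As a rough idea of what the enum constraint looks like, here is a hand-written sketch of such a schema as a Python dict (illustrative only, not necessarily the exact schema the library builds internally):

```python
# Illustrative enum-constrained JSON schema: the model may only emit
# one of the listed labels as the value of "label".
schema = {
    "type": "object",
    "properties": {
        "label": {"type": "string", "enum": ["positive", "negative", "neutral"]},
    },
    "required": ["label"],
}
```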
Installation
Core (Ollama only)
pip install ollama-classifier
Or with uv:
uv add ollama-classifier
With additional backends (vLLM, SGLang, llama.cpp)
pip install "ollama-classifier[backends]"
Or with uv:
uv add "ollama-classifier[backends]"
Prerequisites
- Ollama backend: Ollama installed and running, with a model pulled (e.g., ollama pull llama3.2)
- vLLM backend: A running vLLM server
- SGLang backend: A running SGLang server
- llama.cpp backend: A running llama.cpp server
Quick Start
Ollama (original backend)
from ollama import Client
from ollama_classifier import OllamaClassifier
client = Client()
classifier = OllamaClassifier(client, model="llama3.2")
result = classifier.classify(
    text="I love this product!",
    choices=["positive", "negative", "neutral"]
)
print(f"Prediction: {result.prediction}")
print(f"Confidence: {result.confidence:.2%}")
print(f"Probabilities: {result.probabilities}")
vLLM
from ollama_classifier.backends import VLLMBackend
from ollama_classifier import LLMClassifier
backend = VLLMBackend(
    model="meta-llama/Llama-3.2-3B-Instruct",
    base_url="http://localhost:8000/v1",
)
classifier = LLMClassifier(backend)
result = classifier.classify(
    text="I love this product!",
    choices=["positive", "negative", "neutral"]
)
SGLang
from ollama_classifier.backends import SGLangBackend
from ollama_classifier import LLMClassifier
backend = SGLangBackend(
    model="meta-llama/Llama-3.2-3B-Instruct",
    base_url="http://localhost:30000/v1",
)
classifier = LLMClassifier(backend)
result = classifier.classify(
    text="I love this product!",
    choices=["positive", "negative", "neutral"]
)
llama.cpp
from ollama_classifier.backends import LlamaCppBackend
from ollama_classifier import LLMClassifier
backend = LlamaCppBackend(
    model="model",
    base_url="http://localhost:8080/v1",
)
classifier = LLMClassifier(backend)
result = classifier.classify(
    text="I love this product!",
    choices=["positive", "negative", "neutral"]
)
Usage
Basic Classification
from ollama import Client
from ollama_classifier import OllamaClassifier
client = Client()
classifier = OllamaClassifier(client, model="llama3.2")
result = classifier.classify(
    text="The goalkeeper made an incredible save!",
    choices=["sports", "politics", "technology", "entertainment"]
)
Classification with Label Descriptions
Providing descriptions helps the model understand each category better:
choices = {
    "positive": "Text expresses happiness, satisfaction, or approval",
    "negative": "Text expresses anger, disappointment, or disapproval",
    "mixed": "Text contains both positive and negative sentiments",
    "neutral": "Text is factual without strong emotional content",
}

result = classifier.classify(
    text="The food was amazing but the service was terrible.",
    choices=choices
)
Custom System Prompt
result = classifier.classify(
    text="The quarterly earnings exceeded analyst expectations.",
    choices=["bullish", "bearish", "neutral"],
    system_prompt="You are a financial sentiment analyzer. "
    "Classify financial news based on market sentiment."
)
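If you reuse the same domain prompt across many calls, a small wrapper keeps call sites tidy. Here is a sketch using functools.partial (an illustrative pattern, not part of the library's API):

```python
from functools import partial

# Pre-bind the financial choices and prompt so each call only needs the text.
classify_finance = partial(
    classifier.classify,
    choices=["bullish", "bearish", "neutral"],
    system_prompt=(
        "You are a financial sentiment analyzer. "
        "Classify financial news based on market sentiment."
    ),
)

result = classify_finance(text="Shares slid after the guidance cut.")
```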
Scoring (Multi-Call with Softmax)
Get a calibrated probability distribution over all choices; this makes N API calls for N choices:
result = classifier.score(
    text="The movie was fantastic!",
    choices=["positive", "negative", "neutral"]
)
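For intuition, here is a minimal sketch of the softmax step this method relies on. The per-choice scoring calls are the library's job; the helper below only illustrates how raw scores become a probability distribution:

```python
import math

def softmax(scores: dict[str, float]) -> dict[str, float]:
    """Turn raw per-choice scores into a probability distribution."""
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - m) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Illustrative raw scores, one per choice:
print(softmax({"positive": -0.2, "negative": -2.1, "neutral": -1.4}))
```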
Generate Only (Fastest)
When you only need the prediction without confidence scores:
prediction = classifier.generate(
    text="The team won the championship!",
    choices=["sports", "finance", "politics"]
)
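Since generate returns only the prediction (no confidence scores, per the example above), it slots neatly into quick filtering loops. A hypothetical sketch, with illustrative headlines:

```python
# Hypothetical: keep only sports headlines using the bare-label output.
headlines = [
    "The team won the championship!",
    "Markets rallied after the announcement.",
]
sports_only = [
    h for h in headlines
    if classifier.generate(text=h, choices=["sports", "finance", "politics"]) == "sports"
]
print(sports_only)
```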
Batch Classification
texts = [
    "The goalkeeper made an incredible save!",
    "The central bank raised interest rates.",
    "The new smartphone features a revolutionary camera.",
]
results = classifier.batch_classify(
    texts=texts,
    choices=["sports", "finance", "technology"]
)
for text, result in zip(texts, results):
    print(f"{text} -> {result.prediction} ({result.confidence:.2%})")
Async Usage
import asyncio
from ollama import AsyncClient
from ollama_classifier import OllamaClassifier
async def main():
    client = AsyncClient()
    classifier = OllamaClassifier(client, model="llama3.2")

    # Single classification
    result = await classifier.aclassify(
        text="The concert was amazing!",
        choices=["positive", "negative", "neutral"]
    )

    # Batch classification (concurrent)
    results = await classifier.abatch_classify(
        texts=["Text 1", "Text 2", "Text 3"],
        choices=["positive", "negative", "neutral"]
    )

asyncio.run(main())
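If you need per-item control (for example, different choices per text), the same concurrency can be expressed directly with asyncio.gather. This is a sketch of an equivalent pattern, not the library's internals:

```python
# Run inside an async function such as main() above (illustrative):
results = await asyncio.gather(*(
    classifier.aclassify(text=t, choices=["positive", "negative", "neutral"])
    for t in ["Text 1", "Text 2", "Text 3"]
))
```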
Inference Backends
Ollama (default)
The original backend using the Ollama Python SDK. Requires Ollama installed locally.
from ollama import Client
from ollama_classifier import OllamaClassifier
classifier = OllamaClassifier(Client(), model="llama3.2")
vLLM
High-throughput serving engine. Supports local and remote servers.
Local server:
python -m vllm.entrypoints.openai.api_server \
--model meta-llama/Llama-3.2-3B-Instruct \
--host 0.0.0.0 --port 8000
Connect:
from ollama_classifier.backends import VLLMBackend
from ollama_classifier import LLMClassifier
backend = VLLMBackend(
    model="meta-llama/Llama-3.2-3B-Instruct",
    base_url="http://localhost:8000/v1",
)
classifier = LLMClassifier(backend)
Remote server:
backend = VLLMBackend(
    model="your-model",
    base_url="https://your-vllm-server.com/v1",
    api_key="your-api-key",  # if authentication is required
)
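To avoid hard-coding secrets, the API key can also be read from the environment. The variable name VLLM_API_KEY below is just an example:

```python
import os

backend = VLLMBackend(
    model="your-model",
    base_url="https://your-vllm-server.com/v1",
    api_key=os.environ["VLLM_API_KEY"],  # example variable name; never commit keys
)
```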
SGLang
Fast serving system for large language models. Supports local and remote servers.
Local server:
python -m sglang.launch_server \
--model-path meta-llama/Llama-3.2-3B-Instruct \
--host 0.0.0.0 --port 30000
Connect:
from ollama_classifier.backends import SGLangBackend
from ollama_classifier import LLMClassifier
backend = SGLangBackend(
    model="meta-llama/Llama-3.2-3B-Instruct",
    base_url="http://localhost:30000/v1",
)
classifier = LLMClassifier(backend)
llama.cpp
Lightweight inference via llama-server. Ideal for CPU or mixed CPU/GPU environments.
Local server:
./llama-server -m model.gguf --host 0.0.0.0 --port 8080 -c 4096
Connect:
from ollama_classifier.backends import LlamaCppBackend
from ollama_classifier import LLMClassifier
backend = LlamaCppBackend(
    model="model",
    base_url="http://localhost:8080/v1",
)
classifier = LLMClassifier(backend)
Note: JSON schema constraints and logprobs require llama.cpp to be compiled with the appropriate flags (e.g., LLAMA_JSON_SCHEMA and LLAMA_SUPPORT_LOGPROBS).
Backend Configuration
All backends share common configuration options:
| Parameter | Default | Description |
|---|---|---|
| model | (required) | Model identifier |
| base_url | Engine-specific | Base URL of the inference server |
| api_key | "not-needed" | API key for authentication |
| timeout | 120.0 | Request timeout in seconds |
| max_tokens | 256 | Maximum tokens to generate |
| extra_body | {} | Extra parameters merged into every request |
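For example, several of these options combined on a vLLM backend (the extra_body contents are illustrative; any field your server accepts could go there):

```python
from ollama_classifier.backends import VLLMBackend

backend = VLLMBackend(
    model="meta-llama/Llama-3.2-3B-Instruct",
    base_url="http://localhost:8000/v1",
    timeout=300.0,                    # generous timeout for cold model starts
    max_tokens=32,                    # labels are short, so cap generation
    extra_body={"temperature": 0.0},  # merged into every request
)
```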
API Reference
ClassificationResult
@dataclass
class ClassificationResult:
    prediction: str                  # The predicted choice label
    confidence: float                # Confidence score (0.0 to 1.0)
    probabilities: Dict[str, float]  # Probability distribution over all choices
    raw_response: Dict               # Raw response for debugging
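The fields fit together as you would expect; for instance, the top of the probability distribution should line up with the prediction. A small illustrative check, not library code:

```python
# Recover the most likely label from the distribution; for classify/score
# results this should match result.prediction.
top_label = max(result.probabilities, key=result.probabilities.get)
print(top_label, result.probabilities[top_label])
```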
OllamaClassifier Methods
| Method | Async | Description |
|---|---|---|
| generate(text, choices, system_prompt) | agenerate | Constrained output only (fastest) |
| score(text, choices, system_prompt) | ascore | Multi-call evaluation with softmax |
| classify(text, choices, system_prompt) | aclassify | Full classification with confidence scores |
| batch_generate(texts, choices, system_prompt) | abatch_generate | Batch constrained output |
| batch_score(texts, choices, system_prompt) | abatch_score | Batch scoring |
| batch_classify(texts, choices, system_prompt) | abatch_classify | Batch classification |
LLMClassifier Methods
LLMClassifier exposes the same API as OllamaClassifier but accepts any LLMBackend:
| Method | Async | Description |
|---|---|---|
| generate(text, choices, system_prompt) | agenerate | Constrained output only (fastest) |
| score(text, choices, system_prompt) | ascore | Multi-call evaluation with softmax |
| classify(text, choices, system_prompt) | aclassify | Full classification with confidence scores |
| batch_generate(texts, choices, system_prompt) | abatch_generate | Batch constrained output |
| batch_score(texts, choices, system_prompt) | abatch_score | Batch scoring |
| batch_classify(texts, choices, system_prompt) | abatch_classify | Batch classification |
Parameters
- text (str): The text to classify
- texts (List[str]): List of texts to classify (batch methods)
- choices (Union[List[str], Dict[str, str]]): Either a list of choice labels, or a dict mapping labels to descriptions
- system_prompt (str | None): Optional custom system prompt
Choosing a Method
| Use Case | Recommended Method |
|---|---|
| Speed is critical, no confidence needed | generate |
| Accurate confidence scores | classify / score |
| Batch processing | batch_classify or batch_score |
| Concurrent processing | Async variants (aclassify, etc.) |
License
MIT License
Development
This project has just started! Suggestions, issues, and pull requests are very welcome!
File details
Details for the file ollama_classifier-0.3.0.tar.gz.
File metadata
- Download URL: ollama_classifier-0.3.0.tar.gz
- Size: 11.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1f7c5ebc1038a601161a2ec1326c62267481b9ad9dbb72d0f1e21f4095366bb1 |
| MD5 | 90459b9253f6d056eb18e84604c04ea8 |
| BLAKE2b-256 | 10a3e25d4505aa11691faeb69773ce2b587ad7b4f7d6617dd848037d8fd38508 |
Provenance
The following attestation bundles were made for ollama_classifier-0.3.0.tar.gz:
Publisher: python-publish.yml on paluigi/ollama-classifier
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ollama_classifier-0.3.0.tar.gz
- Subject digest: 1f7c5ebc1038a601161a2ec1326c62267481b9ad9dbb72d0f1e21f4095366bb1
- Sigstore transparency entry: 1392789140
- Permalink: paluigi/ollama-classifier@7db4183bde486b98956012941462f130cb06baf8
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/paluigi
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@7db4183bde486b98956012941462f130cb06baf8
- Trigger Event: release
File details
Details for the file ollama_classifier-0.3.0-py3-none-any.whl.
File metadata
- Download URL: ollama_classifier-0.3.0-py3-none-any.whl
- Size: 18.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 42add244dba464fa59e9c81f6977fa5c80cc688fbd0b9e5b597d029363bf59b5 |
| MD5 | 6de2a25241060fb6d3308c26521adbc0 |
| BLAKE2b-256 | 32b4dcea9e38b97307f6ddbf8cce06b49afcdf9499df80d1038a62d6d8cc7ac3 |
Provenance
The following attestation bundles were made for ollama_classifier-0.3.0-py3-none-any.whl:
Publisher: python-publish.yml on paluigi/ollama-classifier
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ollama_classifier-0.3.0-py3-none-any.whl
- Subject digest: 42add244dba464fa59e9c81f6977fa5c80cc688fbd0b9e5b597d029363bf59b5
- Sigstore transparency entry: 1392789157
- Permalink: paluigi/ollama-classifier@7db4183bde486b98956012941462f130cb06baf8
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/paluigi
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@7db4183bde486b98956012941462f130cb06baf8
- Trigger Event: release