
ollama-classifier

A Python wrapper around the Ollama Python SDK for text classification with constrained output and confidence scoring. Now supports multiple inference backends: Ollama, vLLM, SGLang, and llama.cpp.

Features

  • Constrained Output: Uses JSON schema with enum constraints to ensure only valid choices are generated (see the sketch after this list)
  • Confidence Scoring: Multi-call evaluation with softmax for calibrated probabilities
  • Sync & Async: Full support for both synchronous and asynchronous operations
  • Batch Processing: Classify multiple texts efficiently
  • Flexible Choices: Support for simple labels or labels with descriptions
  • Custom Prompts: Override the default system prompt for specialized tasks
  • Multiple Backends: Use Ollama, vLLM, SGLang, or llama.cpp as your inference engine (local or remote)
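
A rough sketch of the kind of enum-constrained schema the first bullet refers to (the exact schema the library builds internally is an assumption here, not taken from the source):

# Hypothetical illustration only: a JSON schema whose enum restricts
# generation to the allowed labels
schema = {
    "type": "string",
    "enum": ["positive", "negative", "neutral"],
}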

Installation

Core (Ollama only)

pip install ollama-classifier

Or with uv:

uv add ollama-classifier

With additional backends (vLLM, SGLang, llama.cpp)

pip install "ollama-classifier[backends]"

Or with uv:

uv add "ollama-classifier[backends]"

Prerequisites

  • Ollama backend: Ollama installed and running, with a model pulled (e.g., ollama pull llama3.2)
  • vLLM backend: A running vLLM server
  • SGLang backend: A running SGLang server
  • llama.cpp backend: A running llama.cpp server

Quick Start

Ollama (original backend)

from ollama import Client
from ollama_classifier import OllamaClassifier

client = Client()
classifier = OllamaClassifier(client, model="llama3.2")

result = classifier.classify(
    text="I love this product!",
    choices=["positive", "negative", "neutral"]
)

print(f"Prediction: {result.prediction}")
print(f"Confidence: {result.confidence:.2%}")
print(f"Probabilities: {result.probabilities}")

vLLM

from ollama_classifier.backends import VLLMBackend
from ollama_classifier import LLMClassifier

backend = VLLMBackend(
    model="meta-llama/Llama-3.2-3B-Instruct",
    base_url="http://localhost:8000/v1",
)
classifier = LLMClassifier(backend)

result = classifier.classify(
    text="I love this product!",
    choices=["positive", "negative", "neutral"]
)

SGLang

from ollama_classifier.backends import SGLangBackend
from ollama_classifier import LLMClassifier

backend = SGLangBackend(
    model="meta-llama/Llama-3.2-3B-Instruct",
    base_url="http://localhost:30000/v1",
)
classifier = LLMClassifier(backend)

result = classifier.classify(
    text="I love this product!",
    choices=["positive", "negative", "neutral"]
)

llama.cpp

from ollama_classifier.backends import LlamaCppBackend
from ollama_classifier import LLMClassifier

backend = LlamaCppBackend(
    model="model",
    base_url="http://localhost:8080/v1",
)
classifier = LLMClassifier(backend)

result = classifier.classify(
    text="I love this product!",
    choices=["positive", "negative", "neutral"]
)

Usage

Basic Classification

from ollama import Client
from ollama_classifier import OllamaClassifier

client = Client()
classifier = OllamaClassifier(client, model="llama3.2")

result = classifier.classify(
    text="The goalkeeper made an incredible save!",
    choices=["sports", "politics", "technology", "entertainment"]
)

Classification with Label Descriptions

Providing descriptions helps the model understand each category better:

choices = {
    "positive": "Text expresses happiness, satisfaction, or approval",
    "negative": "Text expresses anger, disappointment, or disapproval",
    "mixed": "Text contains both positive and negative sentiments",
    "neutral": "Text is factual without strong emotional content",
}

result = classifier.classify(
    text="The food was amazing but the service was terrible.",
    choices=choices
)

Custom System Prompt

result = classifier.classify(
    text="The quarterly earnings exceeded analyst expectations.",
    choices=["bullish", "bearish", "neutral"],
    system_prompt="You are a financial sentiment analyzer. "
                  "Classify financial news based on market sentiment."
)

Scoring (Multi-Call with Softmax)

Get a calibrated probability distribution over all choices; this makes N API calls for N choices:

result = classifier.score(
    text="The movie was fantastic!",
    choices=["positive", "negative", "neutral"]
)
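
The softmax step itself is standard. A self-contained sketch, assuming one raw score per choice has already been collected from the N calls (the scores below are made up):

import math

def softmax(scores: dict[str, float]) -> dict[str, float]:
    # Subtract the max score for numerical stability, then normalize
    m = max(scores.values())
    exps = {label: math.exp(s - m) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Hypothetical per-choice scores from three separate API calls
raw = {"positive": -0.2, "negative": -3.1, "neutral": -1.8}
print(softmax(raw))  # approx {'positive': 0.80, 'negative': 0.04, 'neutral': 0.16}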

Generate Only (Fastest)

When you only need the prediction without confidence scores:

prediction = classifier.generate(
    text="The team won the championship!",
    choices=["sports", "finance", "politics"]
)
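
Judging by this example, generate returns the predicted label itself rather than a ClassificationResult, so it can be used directly:

print(prediction)  # e.g. "sports"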

Batch Classification

texts = [
    "The goalkeeper made an incredible save!",
    "The central bank raised interest rates.",
    "The new smartphone features a revolutionary camera.",
]

results = classifier.batch_classify(
    texts=texts,
    choices=["sports", "finance", "technology"]
)

for text, result in zip(texts, results):
    print(f"{text} -> {result.prediction} ({result.confidence:.2%})")

Async Usage

import asyncio
from ollama import AsyncClient
from ollama_classifier import OllamaClassifier

async def main():
    client = AsyncClient()
    classifier = OllamaClassifier(client, model="llama3.2")
    
    # Single classification
    result = await classifier.aclassify(
        text="The concert was amazing!",
        choices=["positive", "negative", "neutral"]
    )
    
    # Batch classification (concurrent)
    results = await classifier.abatch_classify(
        texts=["Text 1", "Text 2", "Text 3"],
        choices=["positive", "negative", "neutral"]
    )

asyncio.run(main())

Inference Backends

Ollama (default)

The original backend, using the Ollama Python SDK. Requires Ollama installed and running locally.

from ollama import Client
from ollama_classifier import OllamaClassifier

classifier = OllamaClassifier(Client(), model="llama3.2")

vLLM

High-throughput serving engine. Supports local and remote servers.

Local server:

python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Llama-3.2-3B-Instruct \
    --host 0.0.0.0 --port 8000

Connect:

from ollama_classifier.backends import VLLMBackend
from ollama_classifier import LLMClassifier

backend = VLLMBackend(
    model="meta-llama/Llama-3.2-3B-Instruct",
    base_url="http://localhost:8000/v1",
)
classifier = LLMClassifier(backend)

Remote server:

backend = VLLMBackend(
    model="your-model",
    base_url="https://your-vllm-server.com/v1",
    api_key="your-api-key",  # if authentication is required
)

SGLang

Fast serving system for large language models. Supports local and remote servers.

Local server:

python -m sglang.launch_server \
    --model-path meta-llama/Llama-3.2-3B-Instruct \
    --host 0.0.0.0 --port 30000

Connect:

from ollama_classifier.backends import SGLangBackend
from ollama_classifier import LLMClassifier

backend = SGLangBackend(
    model="meta-llama/Llama-3.2-3B-Instruct",
    base_url="http://localhost:30000/v1",
)
classifier = LLMClassifier(backend)

llama.cpp

Lightweight inference via llama-server. Ideal for CPU or mixed CPU/GPU environments.

Local server:

./llama-server -m model.gguf --host 0.0.0.0 --port 8080 -c 4096

Connect:

from ollama_classifier.backends import LlamaCppBackend
from ollama_classifier import LLMClassifier

backend = LlamaCppBackend(
    model="model",
    base_url="http://localhost:8080/v1",
)
classifier = LLMClassifier(backend)

Note: JSON schema constraints and logprobs support depend on your llama.cpp version; use a recent llama-server build, since older builds may not expose grammar-based JSON schema enforcement or logprobs through the OpenAI-compatible API.

Backend Configuration

All backends share common configuration options:

Parameter    Default            Description
model        (required)         Model identifier
base_url     engine-specific    Base URL of the inference server
api_key      "not-needed"       API key for authentication
timeout      120.0              Request timeout in seconds
max_tokens   256                Maximum tokens to generate
extra_body   {}                 Extra parameters merged into every request
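
For example, assuming these names map one-to-one onto the backend constructors' keyword arguments (the temperature override is only an illustration of extra_body):

from ollama_classifier.backends import VLLMBackend

backend = VLLMBackend(
    model="meta-llama/Llama-3.2-3B-Instruct",
    base_url="http://localhost:8000/v1",
    api_key="not-needed",             # default; set a real key for authenticated servers
    timeout=120.0,                    # request timeout in seconds
    max_tokens=256,                    # cap on generated tokens
    extra_body={"temperature": 0.0},  # merged into every request body
)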

API Reference

ClassificationResult

from dataclasses import dataclass
from typing import Dict

@dataclass
class ClassificationResult:
    prediction: str                  # The predicted choice label
    confidence: float                # Confidence score (0.0 to 1.0)
    probabilities: Dict[str, float]  # Probability distribution over all choices
    raw_response: Dict               # Raw response for debugging
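
All four fields are plain attributes, so the probability assigned to any single label can be read straight off the probabilities dict:

result = classifier.classify(
    text="Great value for the price.",
    choices=["positive", "negative", "neutral"],
)
print(result.probabilities["positive"])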

OllamaClassifier Methods

Method                                          Async variant      Description
generate(text, choices, system_prompt)          agenerate          Constrained output only (fastest)
score(text, choices, system_prompt)             ascore             Multi-call evaluation with softmax
classify(text, choices, system_prompt)          aclassify          Full classification with confidence scores
batch_generate(texts, choices, system_prompt)   abatch_generate    Batch constrained output
batch_score(texts, choices, system_prompt)      abatch_score       Batch scoring
batch_classify(texts, choices, system_prompt)   abatch_classify    Batch classification

LLMClassifier Methods

LLMClassifier exposes the same API as OllamaClassifier, summarized in the table above, but accepts any LLMBackend.

Parameters

  • text (str): The text to classify
  • texts (List[str]): List of texts to classify (batch methods)
  • choices (Union[List[str], Dict[str, str]]): Either a list of choice labels, or a dict mapping labels to descriptions
  • system_prompt (str | None): Optional custom system prompt

Choosing a Method

Use case                                    Recommended method
Speed is critical, no confidence needed     generate
Accurate confidence scores                  classify / score
Batch processing                            batch_classify or batch_score
Concurrent processing                       Async variants (aclassify, etc.)

License

MIT License

Development

This project has just started! Suggestions, issues, and pull requests are all welcome.
