
🧠 preLLM

One function for small LLM preprocessing before large LLM execution. Like litellm.completion() but with decomposition.

from prellm import preprocess_and_execute

result = await preprocess_and_execute(
    query="Deploy app to production",
    small_llm="ollama/qwen2.5:3b",
    large_llm="gpt-4o-mini",
)
print(result.content)

Install & Run in 60 Seconds

pip install prellm

# CLI — zero config
prellm query "Zdeployuj apkę na prod" --small ollama/qwen2.5:3b --large gpt-4o-mini   # "Deploy the app to prod"

# With strategy
prellm query "Refaktoryzuj kod" --strategy structure --json   # "Refactor the code"

# Docker
docker run prellm/prellm query "Deploy app" --small ollama/qwen2.5:3b --large gpt-4o-mini

How It Works

User Query → Small LLM (≤3B, local) → classify/structure/enrich → Large LLM (cloud) → Validated Response
              Qwen2.5 / Phi3 / Gemma      decomposition pipeline     GPT-4 / Claude / Llama

Result: 70-80% fewer tokens sent to the large model, for the added overhead of a single cheap small-LLM call.
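The savings figure can be sanity-checked with back-of-envelope arithmetic; the token counts below are illustrative assumptions, not measured values.

```python
# Illustrative cost sketch for the claimed savings (all numbers are assumptions).
# Without preprocessing, the raw query plus pasted context goes to the large model;
# with preprocessing, the small LLM first compresses it into a focused prompt.

RAW_PROMPT_TOKENS = 4000        # assumed: verbose query plus accumulated context
COMPOSED_PROMPT_TOKENS = 1000   # assumed: structured prompt composed by the small LLM

savings = 1 - COMPOSED_PROMPT_TOKENS / RAW_PROMPT_TOKENS
print(f"Prompt-token savings: {savings:.0%}")  # Prompt-token savings: 75%
```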

Python API

One Function (recommended)

from prellm import preprocess_and_execute

# Zero-config — just query + models
result = await preprocess_and_execute("Refaktoryzuj kod")  # "Refactor the code"

# Full control
result = await preprocess_and_execute(
    query="Deploy app to production",
    small_llm="ollama/qwen2.5:3b",      # local preprocessing
    large_llm="anthropic/claude-sonnet-4-20250514",  # cloud execution
    strategy="structure",                 # classify|structure|split|enrich|passthrough
    user_context="gdansk_embedded_python",
)

print(result.content)              # Large LLM response
print(result.decomposition)        # Small LLM analysis
print(result.model_used)           # "anthropic/claude-sonnet-4-20250514"
print(result.small_model_used)     # "ollama/qwen2.5:3b"

Sync Version

from prellm import preprocess_and_execute_sync

result = preprocess_and_execute_sync("Deploy app", large_llm="gpt-4o-mini")
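The sync variant presumably drives the async pipeline to completion on an event loop. A minimal sketch of that pattern, with a stand-in coroutine instead of prellm's real one:

```python
import asyncio

async def _pipeline(query: str, **kwargs):
    # Stand-in for the real async preprocess_and_execute coroutine.
    return f"handled: {query}"

def preprocess_and_execute_sync(query: str, **kwargs):
    # Drive the coroutine to completion on a fresh event loop.
    return asyncio.run(_pipeline(query, **kwargs))

print(preprocess_and_execute_sync("Deploy app"))  # handled: Deploy app
```

Note that `asyncio.run` cannot be called from inside a running event loop, so a wrapper like this is only usable from plain synchronous code.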

With Domain Rules

result = await preprocess_and_execute(
    query="Usuń bazę danych klientów",  # "Delete the customer database"
    small_llm="ollama/qwen2.5:3b",
    large_llm="gpt-4o-mini",
    domain_rules=[{
        "name": "destructive_db",
        "keywords": ["delete", "drop", "usuń"],
        "required_fields": ["target_database", "backup_confirmed"],
        "severity": "critical",
    }],
)
print(result.decomposition.missing_fields)  # ["target_database", "backup_confirmed"]
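The rule above flags destructive keywords and demands confirmation fields before execution. A standalone sketch of how such keyword matching and missing-field detection could work (hypothetical logic, not prellm's implementation):

```python
# Hypothetical domain-rule check: if any keyword matches the query, report
# which required fields the extracted parameters do not yet provide.

def check_rule(rule: dict, query: str, extracted: dict) -> list[str]:
    q = query.lower()
    if not any(kw in q for kw in rule["keywords"]):
        return []  # rule does not apply to this query
    return [f for f in rule["required_fields"] if f not in extracted]

rule = {
    "name": "destructive_db",
    "keywords": ["delete", "drop", "usuń"],
    "required_fields": ["target_database", "backup_confirmed"],
    "severity": "critical",
}

missing = check_rule(rule, "Usuń bazę danych klientów", extracted={})
print(missing)  # ['target_database', 'backup_confirmed']
```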

With YAML Config

result = await preprocess_and_execute(
    query="Deploy to staging",
    config_path="configs/prellm_config.yaml",
)

Use Cases

1. Code Refactoring

result = await preprocess_and_execute(
    query="Popraw mój projekt z hardcode'em",  # "Fix my project's hardcoded values"
    small_llm="ollama/qwen2.5:3b",
    large_llm="anthropic/claude-sonnet-4-20250514",
    strategy="structure",
    user_context="gdansk_embedded_python",
)
# Small LLM: classify intent, extract structure, compose prompt
# Large LLM: complete refactored code with tests
# Cost: $0.01 + $0.45 = $0.46

2. Kubernetes Diagnostics

result = await preprocess_and_execute(
    query="Zdiagnozuj problem z K8s podami",  # "Diagnose the issue with K8s pods"
    small_llm="ollama/qwen2.5:3b",
    large_llm="gpt-4o-mini",
    strategy="enrich",
    user_context={"cluster": "k8s-prod", "namespace": "backend"},
)
# Small LLM: parse context, identify missing fields, enrich prompt
# Large LLM: root cause + K8s manifests + Prometheus rules
# Cost: $0.02 + $0.38 = $0.40

3. Business Automation

result = await preprocess_and_execute(
    query="Zautomatyzuj kalkulację leasingu dla camper van",  # "Automate the leasing calculation for a camper van"
    small_llm="ollama/qwen2.5:3b",
    large_llm="anthropic/claude-sonnet-4-20250514",
    strategy="enrich",
    user_context="PL_automotive_leasing",
)
# Small LLM: domain=automotive, locale=PL, required=[VAT, WIBOR]
# Large LLM: Python calculator + Excel generator + PDF templates
# Cost: $0.015 + $0.52 = $0.535

5 Decomposition Strategies

Strategy      What it does                      Best for
classify      Classify intent + domain          General queries, routing
structure     Extract action, target, params    DevOps commands, API calls
split         Break into sub-queries            Complex multi-part requests
enrich        Add missing context               Incomplete prompts, safety
passthrough   No preprocessing                  Simple/direct queries
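For intuition, here is a toy heuristic for picking a strategy from the shape of a query. This is hypothetical string matching for illustration only; in prellm the choice is made by the small LLM (or set explicitly via the strategy argument).

```python
def pick_strategy(query: str) -> str:
    # Toy heuristics mirroring the table above; not prellm's actual classifier.
    q = query.lower()
    if " and " in q or ";" in q:
        return "split"        # multi-part request
    if any(verb in q for verb in ("deploy", "delete", "restart")):
        return "structure"    # DevOps-style command
    if len(q.split()) < 4:
        return "enrich"       # short query, probably missing context
    return "classify"

print(pick_strategy("Deploy app to production"))  # structure
```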

Configuration (YAML)

# configs/prellm_config.yaml
small_model:
  model: "ollama/qwen2.5:3b"
  fallback: ["phi3:mini"]
  max_tokens: 512

large_model:
  model: "gpt-4o-mini"
  fallback: ["llama3", "mistral"]
  max_tokens: 2048

default_strategy: classify

domain_rules:
  - name: production_deploy
    keywords: ["deploy", "push", "release"]
    required_fields: ["environment", "version"]
    severity: critical
    strategy: structure

Process Chains (DevOps Workflows)

from prellm import PreLLM, ProcessChain

engine = PreLLM("configs/prellm_config.yaml")
chain = ProcessChain("configs/deploy.yaml", engine=engine)
result = await chain.execute(env="production", dry_run=True)

for step in result.steps:
    print(f"{step.step_name}: {step.status}")
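The chain config's schema is not documented in this README; a hypothetical configs/deploy.yaml consistent with the call above might look like the following. All field names here are assumptions, not prellm's documented format.

```yaml
# Hypothetical configs/deploy.yaml — field names are illustrative assumptions.
name: deploy
steps:
  - name: build
    query: "Build release artifact for {env}"
  - name: migrate
    query: "Run database migrations on {env}"
  - name: release
    query: "Deploy version to {env}"
    strategy: structure
```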

Architecture

preprocess_and_execute(query, small_llm, large_llm)
    │
    ├── ContextEngine (env/git/system)
    ├── QueryDecomposer (small LLM ≤3B)
    │   ├── classify → intent + domain
    │   ├── structure → action + target + params
    │   ├── split → sub-queries
    │   ├── enrich → missing fields + context
    │   └── compose → optimized prompt
    ├── LLMProvider (large LLM via litellm)
    │   ├── retry + fallback chain
    │   └── 100+ models (OpenAI, Anthropic, Ollama, etc.)
    └── PreLLMResponse (Pydantic v2 validated)

Development

git clone https://github.com/wronai/prellm
cd prellm
poetry install
poetry run pytest          # 144+ tests
poetry run pytest --cov    # ~80% coverage

Roadmap

See ROADMAP.md for the full 12-month plan to make preLLM a standard.

License

Apache License 2.0 - see LICENSE for details.

Author

Created by Tom Sapletta - tom@sapletta.com
