A lightweight Python library for optimizing and cleaning LLM inputs

Prompt Refiner

🚀 Lightweight Python library for building production LLM applications with smart context management and automatic token optimization. Save 10-20% on API costs while fitting RAG docs, chat history, and prompts into your token budget.

🎯 Perfect for:

RAG Applications • Chatbots • Document Processing • Production LLM Apps • Cost Optimization

Why use Prompt Refiner?

Build production RAG applications with automatic token optimization and smart context management. Here's a complete example (see examples/quickstart.py for full code):

import json

from prompt_refiner import MessagesPacker, SchemaCompressor, ResponseCompressor, StripHTML
from openai import OpenAI, pydantic_function_tool
from pydantic import BaseModel, Field

# 1. Pack messages with token budget (101 → 56 tokens, 44.6% saved)
packer = MessagesPacker(max_tokens=1000, model="gpt-4o-mini")
packer.add("You are a helpful AI assistant that helps users find books.", role="system")
packer.add("Search for books about Python programming.", role="query")
packer.add("<div><h1>Installation Guide</h1>...</div>", role="context", refine_with=[StripHTML()])
messages = packer.pack()

# 2. Compress tool schema (139 → 131 tokens, 5.8% saved)
class SearchBooksInput(BaseModel):
    query: str = Field(description="Search query to find books")

tool_schema = pydantic_function_tool(SearchBooksInput, name="search_books")
compressed_schema = SchemaCompressor().process(tool_schema)

# 3. Execute tool call with OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    tools=[compressed_schema]
)
# search_books is your own tool implementation; it is not part of prompt_refiner
tool_call = response.choices[0].message.tool_calls[0]
tool_response = search_books(**json.loads(tool_call.function.arguments))

# 4. Compress tool response (19251 → 12813 tokens, 33.4% saved)
compressed_response = ResponseCompressor().process(tool_response)

💡 Run python examples/quickstart.py to see the complete workflow with real OpenAI API verification.

Key benefits:

Tool schema compression - Save 10-15% tokens on function definitions
Tool response compression - Save 30-70% tokens on API responses
Compose operations with | - Chain multiple cleaners into a pipeline
Save 10-20% tokens - Remove HTML, whitespace, duplicates, and redact PII automatically
Stay within budget - MessagesPacker fits content into the max_tokens budget you set (1000 in the example above) using priority-based selection
JIT cleaning - Clean content on-the-fly with refine_with parameter
Production ready - Output goes directly to OpenAI without extra steps
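To illustrate how `|` composition of this kind can work, here is a minimal self-contained sketch. The `Operation`, `Pipeline`, `StripTags`, and `CollapseWhitespace` classes are hypothetical stand-ins written for this example; they are not Prompt Refiner's classes, and the library's actual implementation may differ.

```python
import re


class Operation:
    """Hypothetical base class: one text-to-text cleaning step."""

    def process(self, text: str) -> str:
        raise NotImplementedError

    def __or__(self, other: "Operation") -> "Pipeline":
        # `a | b` builds a pipeline that runs a, then b.
        return Pipeline([self, other])


class Pipeline(Operation):
    """Runs its steps left to right."""

    def __init__(self, steps):
        self.steps = list(steps)

    def process(self, text: str) -> str:
        for step in self.steps:
            text = step.process(text)
        return text

    def __or__(self, other: Operation) -> "Pipeline":
        # Extending a pipeline keeps it flat instead of nesting.
        return Pipeline(self.steps + [other])


class StripTags(Operation):
    """Naive tag removal with a regex (not a real HTML parser)."""

    def process(self, text: str) -> str:
        return re.sub(r"<[^>]+>", " ", text)


class CollapseWhitespace(Operation):
    """Collapse runs of whitespace into single spaces."""

    def process(self, text: str) -> str:
        return re.sub(r"\s+", " ", text).strip()


pipeline = StripTags() | CollapseWhitespace()
cleaned = pipeline.process("<div><h1>Installation   Guide</h1></div>")
print(cleaned)  # Installation Guide
```

Overloading `__or__` keeps each step a small object with a single `process` method, so a pipeline is itself an operation and can be chained further.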

✨ Key Features

Module	Description	Components
Cleaner	Remove noise and save tokens	`StripHTML()`, `NormalizeWhitespace()`, `FixUnicode()`, `JsonCleaner()`
Compressor	Reduce size aggressively	`TruncateTokens()`, `Deduplicate()`
Scrubber	Protect sensitive data	`RedactPII()`
Tools	Optimize LLM tool schemas and responses	`SchemaCompressor()`, `ResponseCompressor()`
Packer	Fit content within token budgets	`MessagesPacker` (chat APIs), `TextPacker` (completion APIs)
Strategy	Benchmark-tested presets for quick setup	`MinimalStrategy`, `StandardStrategy`, `AggressiveStrategy`
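The Packer row describes fitting content into a token budget. One way to think about priority-based selection is the greedy sketch below. It is an illustration under stated assumptions, not the library's algorithm: the `Item` class, the `pack` function, the "lower number = more important" convention, and the word-count token estimate are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Item:
    text: str
    priority: int  # lower number = more important (an assumption of this sketch)


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: one token per whitespace-separated word.
    return len(text.split())


def pack(items: list[Item], max_tokens: int) -> list[str]:
    """Greedily keep the most important items that fit the budget,
    then return the kept texts in their original order."""
    order = sorted(range(len(items)), key=lambda i: items[i].priority)
    kept, used = set(), 0
    for i in order:
        cost = estimate_tokens(items[i].text)
        if used + cost <= max_tokens:
            kept.add(i)
            used += cost
    return [items[i].text for i in sorted(kept)]


items = [
    Item("You are a helpful assistant.", priority=0),
    Item("Old chat history that can be dropped first.", priority=3),
    Item("Retrieved document about Python packaging.", priority=2),
    Item("How do I install the package?", priority=1),
]
packed = pack(items, max_tokens=16)
for text in packed:
    print(text)
```

With a 16-token budget, the lowest-priority item (the old chat history) is the one left out, and the remaining items keep their original order.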

Installation

# Basic installation (lightweight, zero dependencies)
pip install llm-prompt-refiner

# With precise token counting (optional, installs tiktoken)
pip install "llm-prompt-refiner[token]"
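If you count tokens in your own code, you can mirror the optional-dependency pattern: use tiktoken when it is installed and fall back to an estimate otherwise. The sketch below is an assumption about how such a fallback could look, not a description of Prompt Refiner's internals; the `count_tokens` helper and the 4-characters-per-token heuristic are hypothetical.

```python
def count_tokens(text: str, encoding_name: str = "o200k_base") -> int:
    """Count tokens with tiktoken when available, else estimate."""
    try:
        import tiktoken  # optional dependency

        encoding = tiktoken.get_encoding(encoding_name)
        return len(encoding.encode(text))
    except Exception:
        # Fallback heuristic (an assumption of this sketch): roughly
        # 4 characters per token for English text, rounded up.
        return (len(text) + 3) // 4


print(count_tokens("Search for books about Python programming."))
```

The broad `except` is deliberate here: it covers a missing package, an unknown encoding name, and a failed download of the encoding file.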

Examples

Check out the examples/ folder for detailed examples:

strategy/ - Preset strategies (Minimal, Standard, Aggressive) with benchmark results
cleaner/ - HTML cleaning, JSON compression, whitespace normalization, Unicode fixing
compressor/ - Smart truncation, deduplication
scrubber/ - PII redaction (emails, phones, credit cards, etc.)
tools/ - Tool/API output cleaning for agent systems
packer/ - Context budget management with OpenAI integration
analyzer/ - Token counting and cost savings tracking
custom_operation.py - Build your own custom operations

📖 Full documentation: examples/README.md

📊 Proven Effectiveness

We benchmarked Prompt Refiner on 30 real-world test cases (SQuAD + RAG scenarios) to measure token reduction and response quality:

Strategy	Token Reduction	Quality (Cosine)	Judge Approval	Overall Equivalent
Minimal	4.3%	0.987	86.7%	86.7%
Standard	4.8%	0.984	90.0%	86.7%
Aggressive	15.0%	0.964	80.0%	66.7%

Key Insights:

Aggressive strategy achieves about 3.5x the savings of Minimal (15.0% vs 4.3%) while keeping cosine similarity at 0.964
Individual RAG tests showed 17-74% token savings with aggressive strategy
Deduplicate (Standard) shows minimal gains on typical RAG contexts (4.8% vs 4.3% for Minimal)
TruncateTokens (Aggressive) provides the largest cost reduction for long contexts
Trade-off: More aggressive = more savings, but lower judge approval (80.0% vs 86.7%) and lower overall equivalence (66.7% vs 86.7%)
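For readers unfamiliar with the "Quality (Cosine)" column, the sketch below shows how a cosine similarity score is computed between two vectors. It assumes the benchmark compares embedding vectors of responses; the source does not state which embedding model is used, and the vectors here are toy values for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for this sketch: no direction, no similarity
    return dot / (norm_a * norm_b)


# Toy vectors standing in for response embeddings.
original = [0.2, 0.7, 0.1]
refined = [0.25, 0.65, 0.1]
print(round(cosine_similarity(original, refined), 3))
```

A score near 1.0 means the two vectors point in nearly the same direction, which is why values such as 0.987 and 0.964 are read as "close to the original response".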

Example: RAG with duplicates

Minimal (HTML + Whitespace): 17% reduction
Standard (+ Deduplicate): 31% reduction
Aggressive (+ Truncate 150 tokens): 49% reduction 🎉

Token Reduction vs Quality

💰 Cost Savings: At 1M input tokens/month, a 15% reduction saves 150k tokens/month. Assuming GPT-4 input pricing of $30 per 1M tokens, that is ~$4.50/month (~$54/year).
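To estimate savings for your own volume, the arithmetic can be sketched as below. The `monthly_savings` helper is hypothetical, and the price per million tokens is an assumed placeholder; replace it with your provider's current rate.

```python
def monthly_savings(tokens_per_month: int, reduction: float, usd_per_million: float) -> float:
    """Dollars saved per month when `reduction` (0.15 = 15%) of input tokens is removed."""
    saved_tokens = tokens_per_month * reduction
    return saved_tokens / 1_000_000 * usd_per_million


PRICE_PER_MILLION_USD = 30.0  # assumed rate; check your provider's pricing

per_month = monthly_savings(1_000_000, 0.15, PRICE_PER_MILLION_USD)
print(f"${per_month:.2f} per month, ${per_month * 12:.2f} per year")
```

Savings scale linearly with both volume and reduction rate, so the same function covers the 4.3% (Minimal) and 4.8% (Standard) cases from the table above.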

📖 See full benchmark: benchmark/custom/README.md

⚡ Performance & Latency

"What's the latency overhead?" - Negligible. Prompt Refiner adds < 0.5ms per 1k tokens of overhead.

Strategy	@ 1k tokens	@ 10k tokens	@ 50k tokens	Overhead per 1k tokens
Minimal (HTML + Whitespace)	0.05ms	0.48ms	2.39ms	0.05ms
Standard (+ Deduplicate)	0.26ms	2.47ms	12.27ms	0.25ms
Aggressive (+ Truncate)	0.26ms	2.46ms	12.38ms	0.25ms

Key Insights:

⚡ Minimal strategy: Only 0.05ms per 1k tokens, or about 2.4ms for a 50k token prompt
🎯 Standard strategy: 0.25ms per 1k tokens - adds ~2.5ms to a 10k token prompt
📊 Context: Network + LLM TTFT is typically 600ms+, so refining a 10k token prompt adds < 0.5% overhead
🚀 Individual operations (HTML, whitespace) are < 0.5ms per 1k tokens

Real-world impact:

10k token RAG context refining: ~2.5ms overhead
Network latency: ~100ms
LLM Processing (TTFT): ~500ms+
Total overhead: < 0.5% of request time
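The figures above can be sanity-checked with a small timing harness. This is a sketch, not the project's benchmark script: `normalize_whitespace` is a stand-in operation written for this example, and `mean_latency_ms` and `overhead_share` are hypothetical helpers. Absolute timings will vary by machine.

```python
import re
import time


def normalize_whitespace(text: str) -> str:
    # Stand-in operation for this sketch, not the library's implementation.
    return re.sub(r"\s+", " ", text).strip()


def mean_latency_ms(func, text: str, runs: int = 50) -> float:
    """Average wall-clock time of func(text) in milliseconds."""
    start = time.perf_counter()
    for _ in range(runs):
        func(text)
    return (time.perf_counter() - start) * 1000 / runs


def overhead_share(refine_ms: float, request_ms: float) -> float:
    """Refining time as a percentage of total request time."""
    return refine_ms / (refine_ms + request_ms) * 100


sample = "word   \n" * 10_000  # 10k whitespace-separated words
latency = mean_latency_ms(normalize_whitespace, sample)
print(f"{latency:.3f} ms per call")
print(f"{overhead_share(2.5, 600):.2f}% of a 600 ms request")
```

Using `time.perf_counter` and averaging over many runs smooths out scheduler noise, which matters when a single call takes well under a millisecond.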

🔬 Run yourself: python benchmark/latency/benchmark.py (no API keys needed)

🎮 Interactive Demo

Try prompt-refiner in your browser - no installation required!

🚀 Launch Interactive Demo →

Play with different strategies, see real-time token savings, and find the perfect configuration for your use case. Features:

🎯 6 preset examples (e-commerce, support tickets, docs, RAG, etc.)
⚡ Quick strategy presets (Minimal, Standard, Aggressive)
💰 Real-time cost savings calculator
🔧 All 7 operations configurable
📊 Visual metrics dashboard

License

MIT

Release history

Version	Release date
0.2.3	Dec 17, 2025
0.2.2	Dec 16, 2025
0.2.1	Dec 13, 2025
0.2.0	Dec 13, 2025
0.1.8	Dec 13, 2025
0.1.7 (this version)	Dec 7, 2025
0.1.6	Dec 6, 2025
0.1.5	Dec 2, 2025
0.1.4	Dec 1, 2025
0.1.3	Nov 30, 2025
0.1.2	Nov 29, 2025
0.1.1	Nov 29, 2025
0.1.0	Nov 28, 2025

Download files

Source Distribution

llm_prompt_refiner-0.1.7.tar.gz (611.9 kB)

Uploaded Dec 7, 2025 Source

Built Distribution

llm_prompt_refiner-0.1.7-py3-none-any.whl (39.0 kB)

Uploaded Dec 7, 2025 Python 3

File details

Details for the file llm_prompt_refiner-0.1.7.tar.gz.

File metadata

Download URL: llm_prompt_refiner-0.1.7.tar.gz
Upload date: Dec 7, 2025
Size: 611.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.16

File hashes

Hashes for llm_prompt_refiner-0.1.7.tar.gz
Algorithm	Hash digest
SHA256	`88951d1b6166da604f2df06bab612deb65f4658d50f2d66f9ab86ea00c0cf8dd`
MD5	`548d67c6f6f83509cf097409029734f0`
BLAKE2b-256	`47b6e19f7d6ee1ed36b3ace5c5693c161c04e8c0cc7dfbcbb742568956d55006`

File details

Details for the file llm_prompt_refiner-0.1.7-py3-none-any.whl.

File metadata

Download URL: llm_prompt_refiner-0.1.7-py3-none-any.whl
Upload date: Dec 7, 2025
Size: 39.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.16

File hashes

Hashes for llm_prompt_refiner-0.1.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`53fb112310d90cbff2eea5d6864cf4c35a8f186a852e4e5d02f6aa32c88bd566`
MD5	`4d1ee6e39344f7e93b45f5729ebed415`
BLAKE2b-256	`361ea4de867bc60539f1b2c147c02631dc7a25cea9494923a711ca047b63d356`

llm-prompt-refiner 0.1.7
