Structural prompt compression with safety gating
prompt-compress
Structural prompt compression for production LLM apps. Where LLMLingua removes individual low-perplexity tokens, this library parses your system prompt into named components (instruction, examples, constraints, style, context), uses Bayesian optimisation to search which components to keep and how aggressively to compress each, scores candidates by semantic similarity to the original, and gates every output through a post-compression validator (persona / placeholder / similarity). Prompts that are already information-dense are detected up front and passed through unchanged.
Install
pip install prompt-compress
Quickstart — production integration
from prompt_compress import PromptCompressor, CompressionFailedError

compressor = PromptCompressor()
try:
    result = compressor.compress(
        SYSTEM_PROMPT,
        min_similarity=0.80,
        on_failure='raise',
    )
    SYSTEM_PROMPT = result.compressed_text
    print(f"Saved {result.tokens_saved} tokens per call ({result.compression_ratio:.1%})")
except CompressionFailedError as e:
    print(f"Compression unsafe, using original: {e}")
on_failure accepts 'fallback' (default — return the original silently with gate_passed=False), 'raise' (raise CompressionFailedError), or 'warn' (log a warning and return the fallback). The library never blocks on user input.
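The three modes can be pictured with a small standalone sketch. This is not the library's code: `apply_failure_policy` and `Outcome` are hypothetical names, and the `CompressionFailedError` class below is a local stand-in, defined here only so that the example runs without the package installed. It shows one plausible way the dispatch described above could behave.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("prompt_compress_sketch")


class CompressionFailedError(Exception):
    """Local stand-in for the library's exception of the same name."""


@dataclass
class Outcome:
    """Hypothetical result holder: the text to use and whether the gate passed."""
    text: str
    gate_passed: bool


def apply_failure_policy(original, candidate, gate_passed, on_failure="fallback"):
    """Return the candidate if it passed the gate, else apply the chosen policy."""
    if gate_passed:
        return Outcome(candidate, True)
    if on_failure == "raise":
        raise CompressionFailedError("candidate failed the post-compression gate")
    if on_failure == "warn":
        logger.warning("compression gate failed; returning the original prompt")
    elif on_failure != "fallback":
        raise ValueError(f"unknown on_failure mode: {on_failure!r}")
    # Both 'fallback' and 'warn' hand back the original prompt unchanged.
    return Outcome(original, False)
```

In every mode the caller either receives a usable prompt or an exception, which is why no interactive confirmation step is needed.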
Inspecting results
result = compressor.compress(SYSTEM_PROMPT)
print(result.summary()) # one-screen terminal summary
print(result.diff()) # side-by-side original vs compressed
result.to_dict() # JSON-serialisable, useful for caching/logging
Key properties on CompressionResult:
| Property | Description |
|---|---|
| compressed_text | the output you should use |
| compression_ratio | tokens saved / original tokens |
| tokens_saved | absolute token count saved |
| semantic_similarity | cosine similarity of original vs compressed (MiniLM) |
| compression_efficiency | compression_ratio × semantic_similarity |
| safe_to_use | True iff all validator checks passed |
| persona_preserved | True iff the "You are…" line survived |
| placeholders_preserved | True iff every {var} from the original is in the output |
| tier / tier_label | which pipeline tier ran (1 BO, 2 TextRank, 3 Preserved) |
| density | information density score used for routing |
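To make the validator-related properties concrete, here is a minimal sketch of how such checks could be computed. The function names and the placeholder regex are assumptions made for illustration; the library's actual checks may be stricter or use different matching rules.

```python
import re

# Assumption: placeholders look like {identifier}, as in str.format templates.
PLACEHOLDER = re.compile(r"\{[A-Za-z_][A-Za-z0-9_]*\}")


def placeholders_preserved(original, compressed):
    """True iff every {var} found in the original also appears in the output."""
    return set(PLACEHOLDER.findall(original)) <= set(PLACEHOLDER.findall(compressed))


def persona_preserved(original, compressed):
    """True iff each line starting with 'You are' survives verbatim."""
    persona_lines = [
        line.strip()
        for line in original.splitlines()
        if line.strip().lower().startswith("you are")
    ]
    return all(line in compressed for line in persona_lines)


def compression_ratio(original_tokens, compressed_tokens):
    """Tokens saved divided by original tokens."""
    return (original_tokens - compressed_tokens) / original_tokens


original = (
    "You are a helpful billing assistant.\n"
    "Always greet the user politely.\n"
    "Answer questions about {product} for {user_name}."
)
good = (
    "You are a helpful billing assistant.\n"
    "Answer questions about {product} for {user_name}."
)
bad = "Answer questions about {product}."

good_ok = placeholders_preserved(original, good) and persona_preserved(original, good)
bad_ok = placeholders_preserved(original, bad) and persona_preserved(original, bad)
```

Here `good_ok` is true, while `bad_ok` is false because the second candidate drops both the persona line and the `{user_name}` placeholder.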
Configuration
from prompt_compress import PromptCompressor, OptimisationConfig

compressor = PromptCompressor(
    # Optimiser variants:
    use_informed_prior=False,    # seed BO with P3-derived prior
    use_attention_prior=False,   # per-prompt attention prior + ISR safety gate
    # Trade-off knob:
    alpha=0.3,                   # "auto" → 0.3 (validated benchmark default)
    # Tune BO budget:
    optimisation_config=OptimisationConfig(
        n_iterations=20, n_init=5, beta=2.0, random_seed=42,
    ),
)
min_similarity and on_failure are per-call (compressor.compress(prompt, min_similarity=…, on_failure=…)) so different parts of your app can adopt different safety bars without rebuilding the compressor.
Benchmark results
Matched-subset comparison against LLMLingua on the 38 prompts both systems successfully compressed (see research/benchmark.py and research/evaluate.py to reproduce):
| Metric | Ours | LLMLingua |
|---|---|---|
| Compression ratio | 24.1% | 24.2% |
| LLM judge score (0–100) | 73.3 | 70.2 |
| Persona preservation | 100% | 53% |
| Compression efficiency | 0.179 | 0.155 |
Compression efficiency = compression_ratio × output_similarity — rewards being high on both axes.
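As a back-of-envelope reading of the table, the efficiency formula can be inverted to estimate the similarity each system achieved. The helper names are hypothetical, and the result is only approximate: if the reported figures are averages over per-prompt values, dividing one average by another does not recover the exact mean similarity.

```python
def compression_efficiency(ratio, similarity):
    """Efficiency as defined above: ratio times similarity."""
    return ratio * similarity


def implied_similarity(efficiency, ratio):
    """Invert the formula to estimate the similarity behind a reported efficiency."""
    return efficiency / ratio


# Figures copied from the benchmark table (ratios expressed as fractions).
ours = implied_similarity(0.179, 0.241)
llmlingua = implied_similarity(0.155, 0.242)

print(f"ours: {ours:.3f}, LLMLingua: {llmlingua:.3f}")
```

Under that assumption the implied similarities come out at roughly 0.74 and 0.64, which suggests that at near-identical compression ratios the efficiency gap is driven almost entirely by similarity.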
Citation
EMNLP manuscript in preparation.