Easy-to-use prompt & data compression for LLM workflows (CompactPrompt, Choi et al. 2025, arXiv:2510.18043).
CompactPrompt
Easy-to-use prompt & data compression for LLM workflows — a clean, faithful implementation of the four strategies from CompactPrompt: A Unified Pipeline for Prompt and Data Compression in LLM Workflows (Choi et al., 2025).
The headline call is one line:
from compactprompt import CompactPrompt
result = CompactPrompt.compact("Please could you very kindly go ahead and provide "
                               "a really concise summary of the quarterly report.")
print(result.text)  # the compressed prompt
print(f"{result.ratio:.2f}x smaller "
      f"({result.tokens_before} -> {result.tokens_after} tokens)")
This prints:
a really concise summary of the quarterly report.
1.69x smaller (22 -> 13 tokens)
Why
Long, data-rich prompts are expensive and bump into context limits.
compactprompt shrinks them while preserving meaning, using four complementary
techniques from the paper.
Install
The core (hard-prompt pruning + n-gram abbreviation) has no required
dependencies — CompactPrompt.compact() works on a clean Python install.
Heavy libraries are imported lazily, only when a strategy needs them. Install
just the extras you want:
pip install compactprompt # core
pip install 'compactprompt[freq]' # better static scores (wordfreq)
pip install 'compactprompt[dynamic]' # context-aware scoring (torch + transformers)
pip install 'compactprompt[phrases]' # grammar-preserving pruning (spaCy)
pip install 'compactprompt[ml]' # k-means quantization + exemplar selection
pip install 'compactprompt[embeddings]' # semantic embeddings (all-mpnet-base-v2)
pip install 'compactprompt[all]' # everything, faithful to the paper
For phrase-level pruning also download a spaCy model once:
python -m spacy download en_core_web_sm
Interactive demo
A tiny Streamlit app lets you paste a prompt and watch it compact in real time, with live token-savings metrics:
pip install 'compactprompt[app]'
streamlit run streamlit_app.py
The sidebar toggles each strategy (pruning aggressiveness, phrase preservation, context-aware scoring, reversible abbreviation, fidelity measurement); controls for unavailable optional dependencies are disabled with a hint to install them.
The four strategies
1. Hard Prompt Compression (lossy)
Drops low-information words/phrases, scored with hybrid static (corpus
rarity, -log2 p(t)) and dynamic (context surprisal, -log2 P_model(t|c))
self-information, fused with the paper's Δ=0.1 rule. Phrases are pruned as units
(via spaCy dependency parsing) to keep grammar intact, and named entities /
numbers are protected.
from compactprompt import CompactPrompt
# Target removing ~40% of tokens
r = CompactPrompt.compact(prompt, ratio=0.4)
# Or pin an absolute token budget
r = CompactPrompt.compact(prompt, budget=64)
Context-aware scoring is pluggable. Use the bundled offline model, or supply
your own scorer (any text -> [(token, start, end, bits), ...] callable):
from compactprompt import CompactPrompt, LocalLMScorer
r = CompactPrompt.compact(prompt, scorer=LocalLMScorer("gpt2")) # offline, no API key
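The scorer contract described above (text in, a list of (token, start, end, bits) tuples out) can be illustrated with a toy scorer. This is a minimal sketch, not the library's implementation: the name unigram_scorer and its add-one smoothing are hypothetical, and it computes only static self-information, -log2 p(t), from word counts rather than a language model's context surprisal. How compact() expects text to be tokenized is also an assumption.

```python
import math
import re
from collections import Counter


def unigram_scorer(text, reference_corpus=None):
    """Return [(token, start, end, bits), ...] with bits = -log2 p(token).

    Hypothetical helper: p(token) is estimated from word counts in
    `reference_corpus` (or in `text` itself when no corpus is given).
    """
    corpus = reference_corpus if reference_corpus is not None else text
    counts = Counter(w.lower() for w in re.findall(r"\w+", corpus))
    total = sum(counts.values())
    vocab = len(counts)

    scored = []
    for match in re.finditer(r"\w+", text):
        token = match.group(0)
        # Add-one smoothing keeps unseen tokens at a finite surprisal.
        p = (counts[token.lower()] + 1) / (total + vocab + 1)
        scored.append((token, match.start(), match.end(), -math.log2(p)))
    return scored


scores = unigram_scorer("the cat sat on the mat")
for token, start, end, bits in scores:
    print(f"{token!r:6} [{start}:{end}] {bits:.2f} bits")
```

Frequent words ("the") receive fewer bits than rare ones ("cat"), so they are the first candidates for pruning. If the callable shape matches the description above, it would be passed as CompactPrompt.compact(prompt, scorer=unigram_scorer).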
2. Textual N-gram Abbreviation (lossless / reversible)
Replaces frequent multi-word patterns with short, token-cheap placeholders, and guarantees an exact round trip. A token-savings guard ensures the output is never longer than the input.
import compactprompt as cp
doc = "operating cash flow rose. operating cash flow fell. operating cash flow held."
abbr = cp.abbreviate(doc, n=3)
print(abbr.text) # '@0 rose. @0 fell. @0 held.'
print(abbr.dictionary) # {'@0': 'operating cash flow'}
assert abbr.restore() == doc
Keep abbr.dictionary as a legend so the downstream model (or you) can expand it.
Enable it inside the pipeline with CompactPrompt.compact(text, abbreviate=True).
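The idea behind reversible abbreviation can be sketched in a few lines. This is an illustration under stated assumptions, not the package's algorithm: abbreviate_sketch and restore_sketch are hypothetical names, words are split on single spaces, the savings guard compares character lengths as a stand-in for token counts, and the function falls back to the unmodified text if the round trip is not exact.

```python
from collections import Counter


def restore_sketch(text, legend):
    """Expand placeholders, most recently assigned first."""
    for key in reversed(list(legend)):
        text = text.replace(key, legend[key])
    return text


def abbreviate_sketch(text, n=3, top_k=10):
    """Replace repeated word n-grams with '@0', '@1', ... placeholders."""
    words = text.split(" ")
    grams = Counter(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))

    out, legend = text, {}
    for phrase, _ in grams.most_common(top_k):
        key = f"@{len(legend)}"
        if key in text:
            break  # placeholder already occurs in the input; stop abbreviating
        # Savings guard: only abbreviate phrases that still repeat and that
        # are longer than the placeholder replacing them.
        if out.count(phrase) < 2 or len(key) >= len(phrase):
            continue
        out = out.replace(phrase, key)
        legend[key] = phrase

    # Keep the result only if it restores to the exact input.
    if restore_sketch(out, legend) != text:
        return text, {}
    return out, legend


doc = "operating cash flow rose. operating cash flow fell. operating cash flow held."
short, legend = abbreviate_sketch(doc, n=3)
print(short)   # @0 rose. @0 fell. @0 held.
print(legend)  # {'@0': 'operating cash flow'}
```

The final round-trip check is what makes the step safe to apply blindly: any input the sketch cannot restore exactly is returned untouched.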
3. Numerical Quantization (bounded-error)
Lowers the precision of numeric columns to save tokens, within a guaranteed error bound.
import compactprompt as cp
q = cp.quantize([1.0, 2.5, 3.3, 4.8, 9.2, 10.0], method="uniform", bits=8)
q.reconstruct() # approx originals
q.max_error # epsilon_max bound
q = cp.quantize(values, method="kmeans", k=16) # needs the `ml` extra
# Or a whole DataFrame:
new_df, results = cp.quantize_dataframe(df, bits=8)
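To make the error bound concrete, here is a minimal sketch of uniform b-bit quantization in plain Python. It is not the package's code: uniform_quantize and reconstruct are hypothetical names, and the bound shown (half a quantization step) is the standard property of rounding to a uniform grid, assumed here to correspond to what max_error reports.

```python
def uniform_quantize(values, bits=8):
    """Map each value to an integer code on a uniform grid over [min, max]."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1                      # 255 intervals for 8 bits
    step = (hi - lo) / levels if hi > lo else 0.0
    codes = [round((v - lo) / step) if step else 0 for v in values]
    return codes, lo, step


def reconstruct(codes, lo, step):
    """Invert the mapping; each value is recovered to within step / 2."""
    return [lo + c * step for c in codes]


values = [1.0, 2.5, 3.3, 4.8, 9.2, 10.0]
codes, lo, step = uniform_quantize(values, bits=8)
approx = reconstruct(codes, lo, step)
max_error = max(abs(a - v) for a, v in zip(approx, values))

print(codes)
print(f"max error {max_error:.5f} <= bound {step / 2:.5f}")
```

Halving the bound costs one extra bit per value, which is the trade-off the bits argument controls.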
4. Representative Example Selection (few-shot)
Picks a small, diverse set of exemplars by embedding candidates
(all-mpnet-base-v2), running k-means for k ∈ [5, 50], choosing k* by
maximum silhouette score, and keeping the point nearest each centroid.
from compactprompt import select_examples # needs `embeddings` + `ml`
sel = select_examples(candidate_texts, k_range=(5, 50))
few_shot = sel.examples # the chosen prototypes
sel.k_star # selected number of clusters
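The selection procedure described above (k-means over a range of k, pick k* by silhouette score, keep the point nearest each centroid) can be sketched with scikit-learn. This is an illustration, not the package's implementation: pick_prototypes is a hypothetical name, and random vectors stand in for the all-mpnet-base-v2 embeddings so the example runs without downloading a model.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin_min, silhouette_score


def pick_prototypes(embeddings, k_range=(5, 50), seed=0):
    """Return (k_star, indices of the points nearest each centroid)."""
    X = np.asarray(embeddings, dtype=float)
    k_min = max(2, k_range[0])
    k_max = min(k_range[1], len(X) - 1)   # silhouette needs 2 <= k <= n - 1

    best = None
    for k in range(k_min, k_max + 1):
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        score = silhouette_score(X, km.labels_)
        if best is None or score > best[0]:
            best = (score, k, km)
    if best is None:
        raise ValueError("not enough candidates for the requested k_range")

    _, k_star, km = best
    # For every centroid, find the index of the closest actual candidate.
    nearest, _ = pairwise_distances_argmin_min(km.cluster_centers_, X)
    return k_star, sorted(set(nearest.tolist()))


# Synthetic stand-in for sentence embeddings: six tight clusters in 8-D.
rng = np.random.default_rng(0)
centers = rng.normal(size=(6, 8)) * 10
X = np.vstack([c + rng.normal(scale=0.1, size=(20, 8)) for c in centers])

k_star, idx = pick_prototypes(X, k_range=(2, 10))
print(k_star, idx)
```

Returning real candidates rather than the centroids themselves matters for few-shot prompts: a centroid is an average vector with no text attached, while the nearest point is an actual example.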
Measuring fidelity
from compactprompt import cosine_fidelity # needs `embeddings`
f = cosine_fidelity(original_text, result.text)
print(f.mean, f.p5) # mean and worst-case (5th pct) cosine similarity
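The two summary numbers can be reproduced from any set of paired embedding vectors. The sketch below is an assumption about what such a metric computes, not the package's code: cosine_summary is a hypothetical name, and how cosine_fidelity splits the texts into pairs (for example per sentence or per chunk) is not stated in this README.

```python
import numpy as np


def cosine_summary(originals, compressed):
    """Mean and 5th-percentile cosine similarity over row-aligned vector pairs."""
    a = np.asarray(originals, dtype=float)
    b = np.asarray(compressed, dtype=float)
    sims = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    )
    return float(sims.mean()), float(np.percentile(sims, 5))


# Toy 2-D vectors: the first pair is identical, the second is 45 degrees apart.
mean, p5 = cosine_summary([[1, 0], [0, 1]], [[1, 0], [1, 1]])
print(f"mean={mean:.3f} p5={p5:.3f}")
```

Reporting the 5th percentile next to the mean exposes the worst-preserved pairs, which an average alone would hide.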
The result object
CompactPrompt.compact(...) returns a CompactResult:
| attribute | meaning |
|---|---|
| .text | the compressed prompt (also str(result)) |
| .original | the input |
| .tokens_before / .tokens_after | token counts (tiktoken if available) |
| .ratio | tokens_before / tokens_after (e.g. 2.3x) |
| .savings | fraction of tokens removed, [0, 1] |
| .dictionary | reversible n-gram map (when abbreviate=True) |
| .restore() | reverse the lossless abbreviation step |
| .steps / .stats | which strategies ran, with diagnostics |
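The ratio and savings attributes describe the same compression from two angles. Using the definitions in the table and the token counts from the headline example (22 before, 13 after), the relationship works out as follows; the variable names are illustrative only.

```python
tokens_before, tokens_after = 22, 13

ratio = tokens_before / tokens_after          # how many times smaller
savings = 1 - tokens_after / tokens_before    # fraction of tokens removed

# The two are linked by savings = 1 - 1 / ratio.
assert abs(savings - (1 - 1 / ratio)) < 1e-12

print(f"{ratio:.2f}x, {savings:.0%} removed")  # 1.69x, 41% removed
```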
All compact() options
CompactPrompt.compact(
    prompt,
    ratio=0.5,             # fraction of tokens to remove via pruning (ignored if budget set)
    budget=None,           # absolute target token count
    prune=True,            # hard-prompt pruning (lossy, usable as-is)
    abbreviate=False,      # also apply reversible n-gram abbreviation
    ngram=2,               # n-gram length (paper's best: 2)
    top_k=100,             # number of frequent n-grams to abbreviate
    scorer=None,           # pluggable dynamic self-information scorer
    static=None,           # static self-information scorer (default: best available)
    delta_threshold=0.1,   # static/dynamic fusion threshold (paper default)
    use_phrases=True,      # grammar-preserving phrase pruning (needs spaCy)
    spacy_model="en_core_web_sm",
)
Reuse an instance to amortize an expensive scorer/model across many prompts:
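from compactprompt import CompactPrompt, LocalLMScorer
compactor = CompactPrompt(scorer=LocalLMScorer("gpt2"))  # scorer is built once and reused
compactor.run(prompt_a)
compactor.run(prompt_b)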
Tests
pip install pytest
pytest
The suite runs on the zero-dependency core; embedding/clustering tests skip automatically when their optional dependencies are absent.
Citation
@article{choi2025compactprompt,
title={CompactPrompt: A Unified Pipeline for Prompt and Data Compression in LLM Workflows},
author={Choi, Joong Ho and Zhao, Jiayang and Shah, Jeel and Sonawane, Ritvika and
Singh, Vedant and Appalla, Avani and Flanagan, Will and Condessa, Filipe},
journal={arXiv preprint arXiv:2510.18043},
year={2025}
}
This is an independent implementation of the methodology; it is not affiliated with the paper's authors.