# promptcrab

Keep the meaning. Trim the spell.

Prompt rewrite pipeline with verifier and token counting.
promptcrab is a CLI for rewriting prompts for downstream LLMs with lower token cost and strict fidelity checks.
Instead of simply shortening text, it generates multiple rewrite candidates, verifies that they preserve task meaning and ordering, checks protected literals such as URLs, IDs, keys, and numbers, and then returns the safest compact version.
Requires Python 3.12 or newer.
## What It Does

- Rewrites a prompt into compact `zh`, `wenyan`, and `en` candidates
- Optionally verifies each candidate with a dedicated judge backend
- Checks whether important literals were dropped
- Estimates token counts
- Picks the best valid candidate, or falls back to the original prompt
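The literal check above can be illustrated with a short sketch. This is a hypothetical approximation, not promptcrab's actual implementation; the regex and the `dropped_literals` helper are made up for illustration:

```python
import re

# Hypothetical pattern for "protected" literals: URLs, ticket-style IDs,
# and standalone numbers. A rewrite is rejected if any of them go missing.
PROTECTED = re.compile(
    r"https?://\S+"           # URLs
    r"|\b[A-Z]{2,}[-_]\d+\b"  # IDs such as JIRA-42
    r"|\b\d+(?:\.\d+)?\b"     # integers and decimals
)

def dropped_literals(original: str, rewrite: str) -> list[str]:
    """Return protected literals from `original` missing in `rewrite`."""
    return [m for m in PROTECTED.findall(original) if m not in rewrite]

original = "Call https://api.example.com/v2 with retries=3 for JIRA-42."
ok = "Use https://api.example.com/v2, retries=3, ticket JIRA-42."
bad = "Use the v2 API with retries, ticket JIRA-42."
```

With this sketch, `dropped_literals(original, ok)` is empty, while the `bad` rewrite is flagged for dropping the URL and the number.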
## Supported Backends

- `minimax`: uses `MINIMAX_API_KEY` or `OPENAI_API_KEY`
- `gemini`: uses `GEMINI_API_KEY`
- `gemini_cli`: uses the local `gemini` executable and its own login/session
- `codex_cli`: uses the local `codex` executable
## Installation

If you are installing from a local checkout:

```shell
uv tool install .
```

Or install into a virtual environment:

```shell
uv pip install .
```

To see the available options:

```shell
promptcrab --help
```
## Configuration

promptcrab reads credentials in this order:

- CLI flags such as `--minimax-api-key` and `--gemini-api-key`
- Existing shell environment variables
- `--env-file /path/to/file.env`
- A `.env` file found by searching from the current working directory upward

This makes local project `.env` files work even when promptcrab is installed globally.
Example:

```
MINIMAX_API_KEY=your-key
GEMINI_API_KEY=your-key
OPENAI_API_KEY=your-key
```
Only set the variables required by the backend you actually use.
If you keep provider keys outside the project root, pass an explicit file:
```shell
promptcrab --env-file ~/.config/promptcrab/provider.env --help
```
## Quick Start

Rewrite a prompt with MiniMax through opencode:

```shell
promptcrab \
  --backend opencode_cli \
  --model minimax-coding-plan/MiniMax-M2.7-highspeed \
  --prompt "Summarize this API design and keep every field name unchanged."
```
Rewrite a prompt from a file with the local Gemini CLI:

```shell
promptcrab \
  --backend gemini_cli \
  --model gemini-3-flash-preview \
  --prompt-file ./prompt.txt
```
Use a fixed judge backend instead of self-verification:

```shell
promptcrab \
  --backend opencode_cli \
  --model minimax-coding-plan/MiniMax-M2.7-highspeed \
  --judge-backend codex_cli \
  --judge-model gpt-5.4 \
  --judge-codex-reasoning-effort medium \
  --prompt-file ./prompt.txt
```
Pipe a prompt through stdin:

```shell
cat ./prompt.txt | promptcrab --backend codex_cli --model gpt-5.4
```
## Common Usage

Show every candidate and its checks:

```shell
promptcrab \
  --backend opencode_cli \
  --model minimax-coding-plan/MiniMax-M2.7-highspeed \
  --prompt-file ./prompt.txt \
  --show-all
```
Return machine-readable JSON:
```shell
promptcrab \
  --backend gemini_cli \
  --model gemini-3-flash-preview \
  --prompt-file ./prompt.txt \
  --json-output
```
Write the best prompt to a file:
```shell
promptcrab \
  --backend opencode_cli \
  --model minimax-coding-plan/MiniMax-M2.7-highspeed \
  --prompt-file ./prompt.txt \
  --write-best-to ./optimized.txt
```
Optionally cap generation output if a specific provider/model needs it:
```shell
promptcrab \
  --backend gemini \
  --model gemini-3-flash-preview \
  --prompt-file ./prompt.txt \
  --max-output-tokens 4096
```
Use a non-default Codex executable path:
```shell
promptcrab \
  --backend codex_cli \
  --model gpt-5.4 \
  --codex-executable /path/to/codex \
  --prompt-file ./prompt.txt
```
## Current Model Guidance
Instead of checking in a small, stale benchmark table, promptcrab now ships a reproducible promptcrab-benchmark runner. It runs a built-in literal/format hard-case suite, pulls public web datasets, re-counts every prompt with one shared tokenizer, and evaluates rewrites with a multi-judge panel.
### Directional Snapshot
This single-judge snapshot was run on 2026-04-15 for a README-sized comparison that finishes quickly. It samples 4 MT-Bench cases and 4 IFEval cases, uses o200k_base as the shared tokenizer, keeps literal checks enabled, and evaluates every row with codex_cli + gpt-5.4 (medium) as the judge. Treat it as directional, not a final ranking; the GPT row is self-judged.
Avg accepted token reduction is computed only over cases where at least one candidate passed the fidelity gates.
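That metric definition can be made concrete with a tiny sketch over hypothetical per-case results (the numbers below are invented, not taken from the benchmark):

```python
# One entry per benchmark case: the accepted token reduction, or None
# when no candidate passed the fidelity gates for that case.
results = [0.12, None, 0.05, 0.20, None, 0.03]

# Cases with no accepted candidate are excluded before averaging.
accepted = [r for r in results if r is not None]
avg_accepted_reduction = sum(accepted) / len(accepted)
```

Here the average is taken over the four accepted cases only, giving 0.10 (10%), rather than diluting the mean with the two rejected cases.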
| Rewrite backend | Judge | Sample | Pass rate (95% CI) | Avg accepted token reduction (95% CI) | Dataset pass split | Notes |
|---|---|---|---|---|---|---|
| codex_cli + gpt-5.4 (medium) | codex_cli + gpt-5.4 (medium) | 8 | 6/8 = 75.0% (40.9-92.9%) | 4.8% (-5.5-12.3%) | MT-Bench 4/4, IFEval 2/4 | Self-judged; most conservative compression. IFEval failures came from strict literal/verbatim constraints. |
| opencode_cli + MiniMax-M2.7-highspeed | codex_cli + gpt-5.4 (medium) | 8 | 2/8 = 25.0% (7.1-59.1%) | 20.1% (19.2-20.9%) | MT-Bench 2/4, IFEval 0/4 | Highest accepted compression, but many IFEval cases failed on literal or format drift. |
| gemini_cli + gemini-3-flash-preview | codex_cli + gpt-5.4 (medium) | 8 | 4/8 = 50.0% (21.5-78.5%) | 7.8% (-16.7-26.3%) | MT-Bench 3/4, IFEval 1/4 | Middle fidelity; failures mostly came from translated or dropped literal constraints. |
Built-in prompt sources:

- `hard_cases`: built-in literal and format preservation prompts covering verbatim repeat, bullet templates, exact markers, section separators, case/count constraints, symbols, JSON keys, and URLs
- MT-Bench
- IFEval
The benchmark reports:
- per-judge pass rate with 95% Wilson confidence intervals
- panel consensus pass rate
- before-gate token reduction, showing how much the raw shortest candidate compressed before fidelity checks
- after-gate token reduction, showing accepted compression after literal and judge gates
- 95% bootstrap confidence intervals for mean token reduction
- pairwise judge agreement and Cohen's kappa
- per-dataset breakdowns
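The Wilson interval used for pass rates can be sketched in a few lines. This is the standard formula, not a claim about the runner's exact code or rounding:

```python
import math

# Wilson score interval for a binomial pass rate at ~95% confidence.
def wilson_ci(passes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    p = passes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - margin, center + margin

# 6/8 passes reproduces the 40.9-92.9% interval in the snapshot table.
lo, hi = wilson_ci(6, 8)
```

Unlike the naive normal interval, Wilson stays inside [0, 1] and behaves sensibly at the tiny n = 8 sample sizes used in the snapshot.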
Example: rerun the benchmark on hard cases and public real-world cases:

```shell
promptcrab-benchmark \
  --backend codex_cli \
  --model gpt-5.4 \
  --codex-reasoning-effort medium \
  --judge gemini_cli:gemini-3-flash-preview \
  --judge opencode_cli:minimax-coding-plan/MiniMax-M2.7-highspeed \
  --dataset hard_cases \
  --dataset mt_bench \
  --dataset ifeval \
  --cases-per-dataset 24 \
  --trials 2 \
  --tokenizer o200k_base
```
If you want to run the full datasets instead of a stratified sample:

```shell
promptcrab-benchmark \
  --backend codex_cli \
  --model gpt-5.4 \
  --codex-reasoning-effort medium \
  --judge gemini_cli:gemini-3-flash-preview \
  --judge opencode_cli:minimax-coding-plan/MiniMax-M2.7-highspeed \
  --dataset hard_cases \
  --dataset mt_bench \
  --dataset ifeval \
  --cases-per-dataset 0 \
  --tokenizer o200k_base
```
The built-in `hard_cases` suite is always evaluated in full when selected; `--cases-per-dataset` only limits sampled external datasets.
Recommended starting points:

- For highest fidelity and stability, use `codex_cli --model gpt-5.4`, optionally pin `--codex-reasoning-effort medium|high|xhigh`, and pick a different judge backend such as `gemini_cli` or `opencode_cli`.
- For strongest prompt compression, compare `opencode_cli --model minimax-coding-plan/MiniMax-M2.7-highspeed` with `codex_cli --model gpt-5.4` as judge.
- Use `gemini_cli --model gemini-3-flash-preview` as a rewrite backend only if you want to compare it explicitly; current literal-fidelity performance is weaker than `gpt-5.4` in the directional snapshot above.
If you omit `--judge-backend`, promptcrab skips judge-based verification and only applies literal checks. This is faster, but less safe.
Example: safer default rewrite:

```shell
promptcrab \
  --backend codex_cli \
  --model gpt-5.4 \
  --codex-reasoning-effort medium \
  --judge-backend gemini_cli \
  --judge-model gemini-3-flash-preview \
  --prompt-file ./prompt.txt
```
Example: stronger compression with an external judge:

```shell
promptcrab \
  --backend opencode_cli \
  --model minimax-coding-plan/MiniMax-M2.7-highspeed \
  --judge-backend codex_cli \
  --judge-model gpt-5.4 \
  --judge-codex-reasoning-effort medium \
  --prompt-file ./prompt.txt
```
For `codex_cli`, promptcrab can override reasoning effort with `--codex-reasoning-effort` and `--judge-codex-reasoning-effort`. If you omit those flags, Codex falls back to your local CLI configuration such as `~/.codex/config.toml`.
## Output Modes

- Default output: prints the selected best prompt
- `--show-all`: prints all candidates, checks, and verifier results
- `--json-output`: prints a JSON object for automation
- `--write-best-to`: saves the selected prompt to a file
## Notes

- If no candidate passes the fidelity gates, promptcrab returns the original prompt unchanged.
- If you set `--judge-backend`, promptcrab runs an extra verification pass before accepting a candidate.
- If you omit `--judge-backend`, promptcrab skips semantic verification and only uses literal checks.
- If you want a truly independent judge, set `--judge-backend` to a different backend than `--backend`.
- promptcrab does not set a generation output cap by default; if you need one for a specific backend or model, pass `--max-output-tokens`.
- `--max-output-tokens` is currently forwarded to `minimax` and `gemini`; `codex_cli` and `gemini_cli` do not expose a matching flag in this wrapper yet.
- Token counting depends on backend support and available credentials.
- The selected best candidate is language-agnostic; whichever valid rewrite is smallest wins.
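The smallest-valid-rewrite-wins rule with fallback can be sketched as follows. This is a simplified, assumed model of the selection step, with made-up names:

```python
# Hypothetical candidate records: (text, passed_all_gates, token_count).
def select_best(original: str,
                candidates: list[tuple[str, bool, int]]) -> str:
    """Return the smallest candidate that passed every check,
    or the original prompt if none did."""
    valid = [(tokens, text) for text, passed, tokens in candidates if passed]
    if not valid:
        return original  # fidelity fallback: never emit a failed rewrite
    return min(valid)[1]  # fewest tokens wins, regardless of language

original = "original prompt"
candidates = [
    ("compact zh rewrite", True, 30),
    ("compact en rewrite", True, 25),
    ("tiny but broken", False, 10),  # failed a gate; never selectable
]
```

Note that the failed 10-token candidate is ignored even though it is shortest: gates are checked before size.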
## Changelog

See `CHANGELOG.md`.