# promptcrab

Keep the meaning. Trim the spell.

Prompt rewrite pipeline with verifier and token counting.
promptcrab is a CLI for rewriting prompts for downstream LLMs with lower token cost and strict fidelity checks.
Instead of simply shortening text, it generates multiple rewrite candidates, verifies that they preserve task meaning and ordering, checks protected literals such as URLs, IDs, keys, and numbers, and then returns the safest compact version.
Requires Python 3.12 or newer.
## What It Does

- Rewrites a prompt into compact `zh`, `wenyan`, and `en` candidates
- Optionally verifies each candidate with a dedicated judge backend
- Checks whether important literals were dropped
- Estimates token counts
- Picks the best valid candidate, or falls back to the original prompt
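The literal check above can be illustrated with a short sketch. This is a hypothetical approximation, not promptcrab's actual implementation; the regex and the `dropped_literals` helper are made up for illustration:

```python
import re

# Hypothetical pattern for "protected" literals: URLs, ticket-style IDs,
# and standalone numbers. A rewrite is rejected if any of them go missing.
PROTECTED = re.compile(
    r"https?://\S+"           # URLs
    r"|\b[A-Z]{2,}[-_]\d+\b"  # IDs such as JIRA-42
    r"|\b\d+(?:\.\d+)?\b"     # integers and decimals
)

def dropped_literals(original: str, rewrite: str) -> list[str]:
    """Return protected literals from `original` missing in `rewrite`."""
    return [m for m in PROTECTED.findall(original) if m not in rewrite]

original = "Call https://api.example.com/v2 with retries=3 for JIRA-42."
ok = "Use https://api.example.com/v2, retries=3, ticket JIRA-42."
bad = "Use the v2 API with retries, ticket JIRA-42."
```

With this sketch, `dropped_literals(original, ok)` is empty, while the `bad` rewrite is flagged for dropping the URL and the number.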
## Supported Backends

- `minimax`: uses `MINIMAX_API_KEY` or `OPENAI_API_KEY`
- `gemini`: uses `GEMINI_API_KEY`
- `gemini_cli`: uses the local `gemini` executable and its own login/session
- `codex_cli`: uses the local `codex` executable
## Installation

If you are installing from a local checkout:

```shell
uv tool install .
```

Or install into a virtual environment:

```shell
uv pip install .
```

To see the available options:

```shell
promptcrab --help
```
## Configuration

promptcrab reads credentials in this order:

- CLI flags such as `--minimax-api-key` and `--gemini-api-key`
- Existing shell environment variables
- `--env-file /path/to/file.env`
- A `.env` file found by searching from the current working directory upward

This makes local project `.env` files work even when promptcrab is installed globally.
Example:

```
MINIMAX_API_KEY=your-key
GEMINI_API_KEY=your-key
OPENAI_API_KEY=your-key
```
Only set the variables required by the backend you actually use.
If you keep provider keys outside the project root, pass an explicit file:
```shell
promptcrab --env-file ~/.config/promptcrab/provider.env --help
```
## Quick Start

Rewrite a prompt with MiniMax through opencode:

```shell
promptcrab \
  --backend opencode_cli \
  --model minimax-coding-plan/MiniMax-M2.7-highspeed \
  --prompt "Summarize this API design and keep every field name unchanged."
```
Rewrite a prompt from a file with the local Gemini CLI:

```shell
promptcrab \
  --backend gemini_cli \
  --model gemini-3-flash-preview \
  --prompt-file ./prompt.txt
```
Use a fixed judge backend instead of self-verification:

```shell
promptcrab \
  --backend opencode_cli \
  --model minimax-coding-plan/MiniMax-M2.7-highspeed \
  --judge-backend codex_cli \
  --judge-model gpt-5.4 \
  --judge-codex-reasoning-effort medium \
  --prompt-file ./prompt.txt
```
Pipe a prompt through stdin:

```shell
cat ./prompt.txt | promptcrab --backend codex_cli --model gpt-5.4
```
## Common Usage

Show every candidate and its checks:

```shell
promptcrab \
  --backend opencode_cli \
  --model minimax-coding-plan/MiniMax-M2.7-highspeed \
  --prompt-file ./prompt.txt \
  --show-all
```
Return machine-readable JSON:
```shell
promptcrab \
  --backend gemini_cli \
  --model gemini-3-flash-preview \
  --prompt-file ./prompt.txt \
  --json-output
```
Write the best prompt to a file:
```shell
promptcrab \
  --backend opencode_cli \
  --model minimax-coding-plan/MiniMax-M2.7-highspeed \
  --prompt-file ./prompt.txt \
  --write-best-to ./optimized.txt
```
Optionally cap generation output if a specific provider/model needs it:
```shell
promptcrab \
  --backend gemini \
  --model gemini-3-flash-preview \
  --prompt-file ./prompt.txt \
  --max-output-tokens 4096
```
Use a non-default Codex executable path:
```shell
promptcrab \
  --backend codex_cli \
  --model gpt-5.4 \
  --codex-executable /path/to/codex \
  --prompt-file ./prompt.txt
```
## Current Model Guidance
Instead of checking in a small, stale benchmark table, promptcrab now ships a reproducible promptcrab-benchmark runner. It runs a built-in literal/format hard-case suite, pulls public web datasets, re-counts every prompt with one shared tokenizer, and evaluates rewrites with a multi-judge panel.
### Directional Snapshot
This single-judge snapshot was run on 2026-04-15 for a README-sized comparison that finishes quickly. It samples 4 MT-Bench cases and 4 IFEval cases, uses o200k_base as the shared tokenizer, keeps literal checks enabled, and evaluates every row with codex_cli + gpt-5.4 (medium) as the judge. Treat it as directional, not a final ranking; the GPT row is self-judged.
Avg accepted token reduction is computed only over cases where at least one candidate passed the fidelity gates.
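That metric definition can be made concrete with a tiny sketch over hypothetical per-case results (the numbers below are invented, not taken from the benchmark):

```python
# One entry per benchmark case: the accepted token reduction, or None
# when no candidate passed the fidelity gates for that case.
results = [0.12, None, 0.05, 0.20, None, 0.03]

# Cases with no accepted candidate are excluded before averaging.
accepted = [r for r in results if r is not None]
avg_accepted_reduction = sum(accepted) / len(accepted)
```

Here the average is taken over the four accepted cases only, giving 0.10 (10%), rather than diluting the mean with the two rejected cases.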
| Rewrite backend | Judge | Sample | Pass rate (95% CI) | Avg accepted token reduction (95% CI) | Dataset pass split | Notes |
|---|---|---|---|---|---|---|
| codex_cli + gpt-5.4 (medium) | codex_cli + gpt-5.4 (medium) | 8 | 6/8 = 75.0% (40.9-92.9%) | 4.8% (-5.5-12.3%) | MT-Bench 4/4, IFEval 2/4 | Self-judged; most conservative compression. IFEval failures came from strict literal/verbatim constraints. |
| opencode_cli + MiniMax-M2.7-highspeed | codex_cli + gpt-5.4 (medium) | 8 | 2/8 = 25.0% (7.1-59.1%) | 20.1% (19.2-20.9%) | MT-Bench 2/4, IFEval 0/4 | Highest accepted compression, but many IFEval cases failed on literal or format drift. |
| gemini_cli + gemini-3-flash-preview | codex_cli + gpt-5.4 (medium) | 8 | 4/8 = 50.0% (21.5-78.5%) | 7.8% (-16.7-26.3%) | MT-Bench 3/4, IFEval 1/4 | Middle fidelity; failures mostly came from translated or dropped literal constraints. |
Built-in prompt sources:

- `hard_cases`: built-in literal and format preservation prompts covering verbatim repeat, bullet templates, exact markers, section separators, case/count constraints, symbols, JSON keys, and URLs
- MT-Bench
- IFEval
The benchmark reports:
- per-judge pass rate with 95% Wilson confidence intervals
- panel consensus pass rate
- before-gate token reduction, showing how much the raw shortest candidate compressed before fidelity checks
- after-gate token reduction, showing accepted compression after literal and judge gates
- 95% bootstrap confidence intervals for mean token reduction
- pairwise judge agreement and Cohen's kappa
- per-dataset breakdowns
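The Wilson interval used for pass rates can be sketched in a few lines. This is the standard formula, not a claim about the runner's exact code or rounding:

```python
import math

# Wilson score interval for a binomial pass rate at ~95% confidence.
def wilson_ci(passes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    p = passes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - margin, center + margin

# 6/8 passes reproduces the 40.9-92.9% interval in the snapshot table.
lo, hi = wilson_ci(6, 8)
```

Unlike the naive normal interval, Wilson stays inside [0, 1] and behaves sensibly at the tiny n = 8 sample sizes used in the snapshot.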
Example: rerun the benchmark on hard cases and public real-world cases:

```shell
promptcrab-benchmark \
  --backend codex_cli \
  --model gpt-5.4 \
  --codex-reasoning-effort medium \
  --judge gemini_cli:gemini-3-flash-preview \
  --judge opencode_cli:minimax-coding-plan/MiniMax-M2.7-highspeed \
  --dataset hard_cases \
  --dataset mt_bench \
  --dataset ifeval \
  --cases-per-dataset 24 \
  --trials 2 \
  --tokenizer o200k_base
```
If you want to run the full datasets instead of a stratified sample:

```shell
promptcrab-benchmark \
  --backend codex_cli \
  --model gpt-5.4 \
  --codex-reasoning-effort medium \
  --judge gemini_cli:gemini-3-flash-preview \
  --judge opencode_cli:minimax-coding-plan/MiniMax-M2.7-highspeed \
  --dataset hard_cases \
  --dataset mt_bench \
  --dataset ifeval \
  --cases-per-dataset 0 \
  --tokenizer o200k_base
```
The built-in `hard_cases` suite is always evaluated in full when selected; `--cases-per-dataset` only limits sampled external datasets.
Recommended starting points:

- For highest fidelity and stability, use `codex_cli --model gpt-5.4`, optionally pin `--codex-reasoning-effort medium|high|xhigh`, and pick a different judge backend such as `gemini_cli` or `opencode_cli`.
- For strongest prompt compression, compare `opencode_cli --model minimax-coding-plan/MiniMax-M2.7-highspeed` with `codex_cli --model gpt-5.4` as judge.
- Use `gemini_cli --model gemini-3-flash-preview` as a rewrite backend only if you want to compare it explicitly; current literal-fidelity performance is weaker than `gpt-5.4` in the directional snapshot above.
If you omit `--judge-backend`, promptcrab skips judge-based verification and only applies literal checks. This is faster, but less safe.
Example: safer default rewrite:

```shell
promptcrab \
  --backend codex_cli \
  --model gpt-5.4 \
  --codex-reasoning-effort medium \
  --judge-backend gemini_cli \
  --judge-model gemini-3-flash-preview \
  --prompt-file ./prompt.txt
```
Example: stronger compression with an external judge:

```shell
promptcrab \
  --backend opencode_cli \
  --model minimax-coding-plan/MiniMax-M2.7-highspeed \
  --judge-backend codex_cli \
  --judge-model gpt-5.4 \
  --judge-codex-reasoning-effort medium \
  --prompt-file ./prompt.txt
```
For `codex_cli`, promptcrab can override reasoning effort with `--codex-reasoning-effort` and `--judge-codex-reasoning-effort`. If you omit those flags, Codex falls back to your local CLI configuration such as `~/.codex/config.toml`.
## Output Modes

- Default output: prints the selected best prompt
- `--show-all`: prints all candidates, checks, and verifier results
- `--json-output`: prints a JSON object for automation
- `--write-best-to`: saves the selected prompt to a file
## Notes

- If no candidate passes the fidelity gates, promptcrab returns the original prompt unchanged.
- If you set `--judge-backend`, promptcrab runs an extra verification pass before accepting a candidate.
- If you omit `--judge-backend`, promptcrab skips semantic verification and only uses literal checks.
- If you want a truly independent judge, set `--judge-backend` to a different backend than `--backend`.
- promptcrab does not set a generation output cap by default; if you need one for a specific backend or model, pass `--max-output-tokens`.
- `--max-output-tokens` is currently forwarded to `minimax` and `gemini`; `codex_cli` and `gemini_cli` do not expose a matching flag in this wrapper yet.
- Token counting depends on backend support and available credentials.
- The selected best candidate is language-agnostic; whichever valid rewrite is smallest wins.
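The smallest-valid-rewrite-wins rule with fallback can be sketched as follows. This is a simplified, assumed model of the selection step, with made-up names:

```python
# Hypothetical candidate records: (text, passed_all_gates, token_count).
def select_best(original: str,
                candidates: list[tuple[str, bool, int]]) -> str:
    """Return the smallest candidate that passed every check,
    or the original prompt if none did."""
    valid = [(tokens, text) for text, passed, tokens in candidates if passed]
    if not valid:
        return original  # fidelity fallback: never emit a failed rewrite
    return min(valid)[1]  # fewest tokens wins, regardless of language

original = "original prompt"
candidates = [
    ("compact zh rewrite", True, 30),
    ("compact en rewrite", True, 25),
    ("tiny but broken", False, 10),  # failed a gate; never selectable
]
```

Note that the failed 10-token candidate is ignored even though it is shortest: gates are checked before size.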
## Changelog

See `CHANGELOG.md`.