promptfit - Modular toolkit for optimizing LLM prompts: estimate token usage, rank by semantic relevance, and compress with LLMs to fit any token budget. Perfect for RAG, few-shot, and instruction-heavy GenAI workflows.
promptfit
📣 Author
Vedant Laxman Chandore
GitHub
The Core Problem
Modern LLMs (Cohere, OpenAI, Gemini, Anthropic, etc.) are powerful, but their token limits make it hard to fit rich, multi-section prompts—especially for Retrieval-Augmented Generation (RAG), few-shot learning, and instruction-heavy use cases. Developers waste time manually trimming prompts, risking loss of important context, incomplete responses, or costly token overages.
promptfit solves this by automating prompt analysis, compression, and optimization—so you get the most value from every token, every time.
✨ Features
- 🔢 Token Budget Estimator: Analyze and estimate token usage for prompt templates, sections, and variables—before sending to an LLM.
- 🧭 Semantic Relevance Scoring: Split prompts into sections, generate embeddings (Cohere), and rank by cosine similarity to your query or task.
- ✂️ Smart Prompt Pruner: Drop or trim low-salience sections first, keeping only the most relevant content to fit your token budget.
- ✍️ Paraphrasing Module: Use Cohere’s LLM to rewrite and compress over-budget prompts, preserving key instructions and meaning.
- 📦 Modular Design: Each feature is a standalone module—use them independently or together.
- 🧪 Test-Driven: Comprehensive unit tests with mocked or live Cohere API responses.
- 🔐 Secure API Key Handling: Loads your Cohere API key from a `.env` file or environment variable.
- 🖥️ CLI Support: Optimize prompts directly from the command line.
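The pruning step above can be pictured as a greedy loop over relevance-ranked sections. The following is a minimal, hypothetical sketch of the idea (not promptfit's actual implementation), assuming precomputed similarity scores and a caller-supplied token estimator:

```python
# Hypothetical sketch of budget-aware pruning: keep the highest-scoring
# sections, in their original order, until the token budget is exhausted.
def prune_sections(sections, scores, max_tokens, estimate_tokens):
    # Rank section indices from most to least relevant.
    ranked = sorted(range(len(sections)), key=lambda i: scores[i], reverse=True)
    kept, used = set(), 0
    for i in ranked:
        cost = estimate_tokens(sections[i])
        if used + cost <= max_tokens:
            kept.add(i)
            used += cost
    # Reassemble the kept sections in their original order.
    return " ".join(sections[i] for i in sorted(kept))

# Example with a crude one-token-per-word estimator.
sections = ["Summarize the user's issues.", "Mention the weather.", "Propose fixes."]
scores = [0.91, 0.12, 0.85]
pruned = prune_sections(sections, scores, max_tokens=8,
                        estimate_tokens=lambda s: len(s.split()))
```

Preserving the original order after selection matters: instructions usually depend on their position in the prompt, even when relevance alone decides what survives.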
🛠️ Tech Stack
- Language: Python 3.10+
- LLM: Cohere command-r-plus
- Embeddings: embed-english-v3.0
- Tokenizer: Cohere’s estimator (or manual fallback)
- Libraries: `cohere`, `scikit-learn`, `tiktoken`, `python-dotenv`, `nltk`/`spacy`, `rich`, `typer`, `pytest`
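The "manual fallback" for the tokenizer can be illustrated with a small, hypothetical helper (not promptfit's own `estimate_tokens`): prefer `tiktoken` when it is installed, and otherwise fall back to the common rough heuristic of about four characters per token for English text.

```python
# Hypothetical fallback token estimator, assuming tiktoken may be absent.
def estimate_tokens_fallback(text: str) -> int:
    try:
        import tiktoken
        enc = tiktoken.get_encoding("cl100k_base")
        return len(enc.encode(text))
    except Exception:
        # ~4 characters per token is a common rule of thumb for English.
        return max(1, len(text) // 4)

n = estimate_tokens_fallback("Summarize the support ticket in two sentences.")
```

Note that different providers tokenize differently, so any local estimate is approximate; it is best used for budgeting, not for exact limit checks.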
📦 Installation
pip install promptfit
🚀 Demo Usage
Python API
import os
import time
# 1. Set your Cohere key (or have it in your env)
os.environ["COHERE_API_KEY"] = "your-api-key-here"  # never hard-code real keys
from promptfit.token_budget import estimate_tokens
from promptfit.utils import split_sentences
from promptfit.embedder import get_embeddings
from promptfit.relevance import compute_cosine_similarities
from promptfit.optimizer import optimize_prompt
# 2. Your actual long prompt
prompt = """
You are a customer-support assistant. A user reports that their device fails
intermittently under cold conditions, the battery drains within two hours, and
previous support tickets went unanswered. They’ve provided logs and screenshots.
Please summarize the issues, note their emotional tone, propose immediate
fixes, and suggest long-term retention strategies.
"""
query = "Summarize issues, emotional tone, action items, and retention strategies."
print("=== PROMPT ===")
print(prompt)
# 3. Split into sentences
sentences = split_sentences(prompt)
print("=== SENTENCES ===")
for i, s in enumerate(sentences, 1):
    print(f"{i}: {s!r}")
print()
# 4. Compute embeddings (first the query, then the sentences)
all_texts = [query] + sentences
embs = get_embeddings(all_texts)
# 5. Compute cosine similarities between query and each sentence
query_emb = embs[0]
sent_embs = embs[1:]
scores = compute_cosine_similarities(query_emb, sent_embs)
# 6. Display relevance scores
print("=== RELEVANCE SCORES (COSINE SIMILARITY) ===")
for sent, score in zip(sentences, scores):
    print(f"{score:.4f} – {sent!r}")
print()
# 7. Show total token count before optimization
orig_tokens = estimate_tokens(prompt)
print(f"Original prompt ≈ {orig_tokens} tokens\n")
# 8. Run optimizer with timing
budget = 40
start_time = time.time()
optimized = optimize_prompt(prompt, query, max_tokens=budget)
end_time = time.time()
opt_tokens = estimate_tokens(optimized)
tokens_saved = orig_tokens - opt_tokens
percent_saved = (tokens_saved / orig_tokens) * 100
# 9. Output final result with efficiency stats
print(f"Optimized prompt ({opt_tokens} tokens ≤ {budget} budget):\n")
print(optimized)
print("\n--- Efficiency Stats ---")
print(f"Tokens saved: {tokens_saved}")
print(f"Reduction: {percent_saved:.1f}%")
print(f"Optimization time: {end_time - start_time:.2f} seconds")
Command Line
python -m promptfit.cli "YOUR_PROMPT" "YOUR_QUERY" --max-tokens 120
Full Demo Script
See demo/demo_usage.py for a comprehensive example covering:
- Token estimation
- Embedding generation
- Relevance ranking
- Pruning and paraphrasing
- End-to-end optimization
🗝️ Environment Setup
- Store your `COHERE_API_KEY` in a `.env` file in your project root: `COHERE_API_KEY=your-real-api-key-here`
- Or set it in your shell: `export COHERE_API_KEY=your-real-api-key-here`
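promptfit's `python-dotenv` integration handles this automatically; purely as an illustration of the same idea without the dependency, a minimal hand-rolled loader might look like this (a hypothetical sketch, not the library's actual code):

```python
import os

def load_api_key(env_path=".env"):
    """Read COHERE_API_KEY from the environment, or from a simple .env file."""
    # Environment variables take precedence over the file.
    key = os.getenv("COHERE_API_KEY")
    if key:
        return key
    if os.path.exists(env_path):
        with open(env_path) as f:
            for line in f:
                line = line.strip()
                if line.startswith("COHERE_API_KEY="):
                    return line.split("=", 1)[1]
    raise RuntimeError("COHERE_API_KEY not found in environment or .env")
```

Either way, keep the `.env` file out of version control (add it to `.gitignore`) so the key is never committed.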
📚 Directory Structure
promptfit/
│
├── __init__.py
├── token_budget.py
├── embedder.py
├── relevance.py
├── optimizer.py
├── paraphraser.py
├── cli.py
├── utils.py
├── config.py
│
├── README.md
├── requirements.txt
│
├── tests/
│ ├── test_token_budget.py
│ ├── test_relevance.py
│ ├── test_optimizer.py
│ └── test_paraphraser.py
│
└── demo/
└── demo_usage.py
💡 Why Use promptfit?
- Save tokens, save money: Only send the most relevant, concise prompts to your LLM.
- Prevent errors: Never exceed token limits or lose critical context.
- Automate prompt engineering: Focus on your app, not manual prompt trimming.
- Works with any LLM: Designed for Cohere, but easily adaptable to OpenAI, Gemini, Anthropic, and more.
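The "adaptable to any provider" point follows from the pipeline's shape: relevance ranking only needs a function that maps texts to vectors. A hypothetical, provider-agnostic sketch (the function and type names here are illustrative, not part of promptfit's API):

```python
# Hypothetical adapter sketch: any embedding callable with this shape works,
# whether it wraps Cohere, OpenAI, or a local model.
from typing import Callable, List

EmbedFn = Callable[[List[str]], List[List[float]]]

def rank_by_relevance(query: str, sections: List[str], embed: EmbedFn):
    """Rank sections by cosine similarity to the query, provider-agnostic."""
    vecs = embed([query] + sections)
    q, rest = vecs[0], vecs[1:]

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    scored = [(cosine(q, v), s) for v, s in zip(rest, sections)]
    return sorted(scored, reverse=True)
```

Swapping providers then means swapping only the `embed` callable; the ranking, pruning, and budgeting logic stay untouched.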
📝 License
MIT License
🤝 Contributing
Pull requests, issues, and stars are welcome! For major changes, please open an issue first to discuss what you’d like to change.
Built for the next generation of GenAI and LLM developers. Optimize your prompts, maximize your results!
File details
Details for the file promptfit-0.3.0.tar.gz.
File metadata
- Download URL: promptfit-0.3.0.tar.gz
- Upload date:
- Size: 12.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `c22d472566e311c6ea6e5a1104f0b8debc5f9b9a0326cde3acd1f55a2a739af6` |
| MD5 | `26c7a3e9249281e987c94f7cf7c9e6be` |
| BLAKE2b-256 | `5ad36a95817bfbc264286408afaf3c85fb68d63e163c595b18d4e12da4e1906f` |
File details
Details for the file promptfit-0.3.0-py3-none-any.whl.
File metadata
- Download URL: promptfit-0.3.0-py3-none-any.whl
- Upload date:
- Size: 12.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `af8dc03ed944b8981f8cef09c66c121582fa7860a79b17da5c91a03df7ef87ea` |
| MD5 | `a2b2ac42d00131d188d1f26ce540281e` |
| BLAKE2b-256 | `edd30e60b431b7c27138b01f802b84f2a1caa95a1ea77035ccbdd80cb88639e3` |