Prompt Sensitivity Analyzer — measure robustness, entropy, and token importance of LLM prompts.

promptgrad

Prompt Sensitivity Analyzer — measure how robust your LLM prompts are to linguistic perturbations.

Prompt engineering is chaos. The same instruction, phrased slightly differently, can produce wildly different outputs. promptgrad quantifies this instability.

Given a prompt, it:

1. Perturbs it with six linguistic strategies (synonym swap, word deletion, paraphrase, casing, punctuation, sentence reorder)
2. Embeds every variant using TF-IDF (built-in, zero dependencies) or semantic models
3. Computes cosine similarity shifts, output variance, and Shannon entropy
4. Detects unstable instructions and surfaces which tokens drive the instability
5. Outputs a robustness score (0–1), a per-token sensitivity heatmap, and a strategy-level radar chart
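
The steps above can be approximated in a few lines of standard-library Python. This is a rough sketch, not promptgrad's implementation: `tokenize`, `tfidf_vectors`, `cosine`, and `toy_robustness` are hypothetical helpers, the variants are written by hand rather than generated, and using the mean cosine similarity as the score is only one plausible way to get a 0–1 number.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,!?;:") for word in text.lower().split())
    return [tok for tok in tokens if tok]


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (dict: token -> weight) per text."""
    docs = [Counter(tokenize(t)) for t in texts]
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in doc)
    # Smoothed IDF so a token present in every text still gets a positive weight.
    idf = {tok: math.log((1 + n_docs) / (1 + df)) + 1 for tok, df in doc_freq.items()}
    return [{tok: count * idf[tok] for tok, count in doc.items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(tok, 0.0) for tok, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_robustness(prompt, variants):
    """Mean cosine similarity between the prompt and its variants (assumed score)."""
    if not variants:
        raise ValueError("need at least one variant")
    vectors = tfidf_vectors([prompt] + list(variants))
    sims = [cosine(vectors[0], v) for v in vectors[1:]]
    return sum(sims) / len(sims), sims


prompt = "Summarize the document in three sentences."
variants = [
    "summarize the document in three sentences",   # casing + punctuation
    "Summarize the document in sentences.",        # word deletion
    "Condense the document in three sentences.",   # synonym swap
]
score, sims = toy_robustness(prompt, variants)
print(f"toy robustness: {score:.3f}")
for variant, sim in zip(variants, sims):
    print(f"{sim:.3f}  {variant}")
```

Because TF-IDF is purely lexical, the casing and punctuation variant is indistinguishable from the original here, while the synonym swap registers as a shift even though the meaning is close. That gap is the reason for the optional semantic backends.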

Installation

# Minimal install (TF-IDF embeddings, no ML dependencies)
pip install promptgrad

# With local semantic embeddings (recommended)
pip install "promptgrad[local]"

# With OpenAI embeddings
pip install "promptgrad[openai]"

# With visualisation
pip install "promptgrad[viz]"

# Everything
pip install "promptgrad[all]"

Quick Start

from promptgrad import PromptAnalyzer

analyzer = PromptAnalyzer()

report = analyzer.analyze("Summarize the document in three sentences.")

print(report.robustness_score)      # e.g. 0.847
print(report.stability_label)       # "STABLE"
print(report.top_sensitive_tokens()) # [("Summarize", 0.91), ("three", 0.72), ...]

for warning in report.warnings:
    print(warning)

Visualise

analyzer.plot(report)                 # All four charts
analyzer.plot(report, kind="heatmap") # Just the token heatmap
analyzer.plot(report, kind="gauge")   # Just the robustness gauge
analyzer.plot(report, save_path="report.png", show=False)

Compare multiple prompts

result = analyzer.compare([
    "Summarize the document in three sentences.",
    "Please provide a summary of the document.",
    "What are the key points in this document?",
])

for prompt, score in result["ranked"]:
    print(f"{score:.3f}  {prompt}")

CLI

# Analyse a prompt
promptgrad analyze "List the top 5 programming languages."

# With semantic embeddings and save plot
promptgrad analyze "Explain quantum computing." \
    --backend sentence_transformers \
    --plot report.png

# Export JSON report
promptgrad analyze "Write a haiku about rain." --json report.json

# Compare prompts from a file (one per line)
promptgrad compare prompts.txt

Outputs

| Field | Type | Description |
| --- | --- | --- |
| `robustness_score` | `float` | 0.0 (unstable) → 1.0 (robust) |
| `stability_label` | `str` | `VERY STABLE` / `STABLE` / `FRAGILE` / `UNSTABLE` |
| `mean_cosine_similarity` | `float` | Average similarity to original across all perturbations |
| `embedding_shift_std` | `float` | Standard deviation of L2 shifts |
| `entropy` | `float` | Shannon entropy of the similarity distribution |
| `output_variance` | `float \| None` | Variance of LLM output embeddings (if provided) |
| `token_importance` | `dict[str, float]` | Token → sensitivity score (0–1) |
| `per_strategy` | `dict[str, float]` | Mean cosine similarity per perturbation type |
| `warnings` | `list[str]` | Human-readable instability warnings |
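
As a worked illustration of three of these fields, the sketch below derives a mean similarity, a spread of L2 shifts, and a Shannon entropy from a list of cosine similarities. It rests on assumptions not stated in this README: the similarity values are made up, the entropy is taken over a fixed-width histogram of the scores, and the L2 shift uses the identity for unit-length vectors, ‖a − b‖ = sqrt(2 − 2·cos). `shannon_entropy` and `l2_shift` are hypothetical helpers, not promptgrad functions.

```python
import math
import statistics
from collections import Counter


def shannon_entropy(similarities, bins=10):
    """Entropy in bits of a histogram of similarity scores over [0, 1]."""
    # Assumption: fixed-width bins; out-of-range scores are clamped to the edge bins.
    counts = Counter(max(0, min(int(s * bins), bins - 1)) for s in similarities)
    total = len(similarities)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def l2_shift(cosine_similarity):
    """L2 distance between two unit-length vectors with the given cosine."""
    return math.sqrt(max(0.0, 2.0 - 2.0 * cosine_similarity))


similarities = [0.98, 0.95, 0.91, 0.62, 0.97, 0.40]  # made-up values
mean_similarity = statistics.mean(similarities)
shift_std = statistics.pstdev([l2_shift(s) for s in similarities])
entropy_bits = shannon_entropy(similarities)

print(f"mean cosine similarity: {mean_similarity:.3f}")
print(f"std of L2 shifts:       {shift_std:.3f}")
print(f"entropy (bits):         {entropy_bits:.3f}")
```

Under this reading, an entropy of zero means every perturbation landed in the same similarity bin, and higher values mean the perturbations moved the prompt by very different amounts.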

Embedding Backends

| Backend | Requires | Quality | Use when |
| --- | --- | --- | --- |
| `tfidf` | nothing | lexical only | Fast tests, CI, offline |
| `sentence_transformers` | `pip install "promptgrad[local]"` | semantic | Recommended default |
| `openai` | `pip install "promptgrad[openai]"` + API key | semantic | Production / highest quality |

# Auto-selects sentence_transformers if installed, else tfidf
analyzer = PromptAnalyzer(embedding_backend="auto")

# Explicit
analyzer = PromptAnalyzer(embedding_backend="sentence_transformers",
                          model_name="all-MiniLM-L6-v2")

analyzer = PromptAnalyzer(embedding_backend="openai",
                          model="text-embedding-3-small",
                          api_key="sk-...")
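
The "auto" behaviour described in the comment above amounts to checking whether an optional package is importable. A minimal sketch of that check, assuming nothing about promptgrad's internals (`pick_backend` is a hypothetical helper):

```python
import importlib.util


def pick_backend(preferred="auto"):
    """Resolve "auto" to a concrete backend name; pass other names through."""
    if preferred != "auto":
        return preferred
    # find_spec returns None when the top-level package is not installed.
    if importlib.util.find_spec("sentence_transformers") is not None:
        return "sentence_transformers"
    return "tfidf"


print(pick_backend())         # depends on what is installed
print(pick_backend("tfidf"))  # explicit choices are returned unchanged
```

Checking with `find_spec` avoids importing the heavy package just to find out whether it exists.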

Perturbation Strategies

| Strategy | What it does |
| --- | --- |
| `synonym_substitution` | Swaps words with near-synonyms |
| `word_deletion` | Removes individual non-stopword tokens |
| `paraphrase` | Structural rewrites (imperative ↔ question, etc.) |
| `casing_variation` | Title, UPPER, lower, mIxEd case |
| `punctuation_variation` | Adds, removes, or changes terminal punctuation |
| `order_shuffle` | Shuffles sentence order in multi-sentence prompts |

Use all of them (default) or pick a subset:

analyzer = PromptAnalyzer(perturbations=["synonym_substitution", "word_deletion"])
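
To make three of the simpler strategies from the table concrete, here is a standalone sketch of what casing, punctuation, and word-deletion variants might look like. These functions are illustrative stand-ins, not the library's classes: the names, the tiny `STOPWORDS` set, and the exact variants produced are all assumptions.

```python
import random

STOPWORDS = {"a", "an", "the", "in", "of", "to", "and", "for"}  # tiny illustrative set


def casing_variants(prompt):
    """Title, UPPER, lower, and alternating-case versions of the prompt."""
    mixed = "".join(ch.upper() if i % 2 else ch.lower() for i, ch in enumerate(prompt))
    candidates = [prompt.title(), prompt.upper(), prompt.lower(), mixed]
    # Drop duplicates and anything identical to the original.
    return [v for v in dict.fromkeys(candidates) if v != prompt]


def punctuation_variants(prompt):
    """Remove or change the terminal punctuation mark."""
    stem = prompt.rstrip(".!?")
    candidates = [stem, stem + ".", stem + "!", stem + "?"]
    return [v for v in dict.fromkeys(candidates) if v != prompt]


def word_deletions(prompt, n=3, seed=None):
    """Delete one non-stopword at a time, for up to n randomly chosen positions."""
    words = prompt.split()
    candidates = [
        i for i, w in enumerate(words) if w.lower().strip(".,!?") not in STOPWORDS
    ]
    rng = random.Random(seed)
    chosen = rng.sample(candidates, k=min(n, len(candidates)))
    return [" ".join(w for j, w in enumerate(words) if j != i) for i in sorted(chosen)]


prompt = "Summarize the document in three sentences."
for variant in casing_variants(prompt) + punctuation_variants(prompt):
    print(variant)
for variant in word_deletions(prompt, n=2, seed=0):
    print(variant)
```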

Bring your own:

from promptgrad import Perturbation, PerturbationResult, PromptAnalyzer

class BackTranslation(Perturbation):
    name = "back_translation"

    def apply(self, prompt, n=5, seed=None):
        # ... call translation API ...
        translated = prompt  # placeholder so the example runs; replace with the API result
        return [PerturbationResult(original=prompt, perturbed=translated, strategy=self.name)]

analyzer = PromptAnalyzer(perturbations=[BackTranslation()])

Measuring Output Variance

If you have LLM responses for the perturbed prompts, pass them in to compute semantic output variance:

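import openai

from promptgrad import PromptAnalyzer
from promptgrad.perturbations import SynonymSubstitution

client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment
analyzer = PromptAnalyzer()
my_prompt = "Summarize the document in three sentences."

def complete(prompt):
    # Return the model's reply to a single user message
    return client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    ).choices[0].message.content

# Generate outputs for each perturbed variant
perturbed = SynonymSubstitution().apply(my_prompt, n=10)
outputs = [complete(r.perturbed) for r in perturbed]

# Pass the outputs in so the report includes output_variance
report = analyzer.analyze(my_prompt, output_texts=outputs)
print(report.output_variance)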
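
One common way to define the variance of a set of embeddings is the mean squared distance of each vector from their centroid. The sketch below shows that definition on hand-made 2-D vectors. Whether promptgrad uses exactly this formula is an assumption, and `embedding_variance` is a hypothetical helper.

```python
def embedding_variance(vectors):
    """Mean squared Euclidean distance of each vector from the centroid."""
    if not vectors:
        raise ValueError("need at least one vector")
    n = len(vectors)
    dim = len(vectors[0])
    centroid = [sum(v[d] for v in vectors) / n for d in range(dim)]
    total = sum(
        sum((v[d] - centroid[d]) ** 2 for d in range(dim)) for v in vectors
    )
    return total / n


# Made-up "output embeddings": identical outputs versus scattered outputs.
consistent = [[0.6, 0.8], [0.6, 0.8], [0.6, 0.8]]
scattered = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

print(embedding_variance(consistent))
print(embedding_variance(scattered))
```

Identical outputs give a variance of zero, and the value grows as the responses to the perturbed prompts drift apart in embedding space.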

Development

git clone https://github.com/Liodon-AI/promptgrad
cd promptgrad
pip install -e ".[dev]"
pytest -v

License

MIT © Liodon AI

Version 0.1.0, released Feb 23, 2026

