Debug AI responses, understand failures, and get automatic improvement suggestions
prompt-debugger
A Python library for debugging LLM interactions with automatic issue detection, quality scoring, and actionable suggestions.
Overview
prompt-debugger provides tools for analyzing and debugging interactions with Large Language Models. It detects common issues in prompts before API calls, analyzes response quality, and provides specific suggestions for improvement. The library works with any LLM provider and has zero dependencies.
Installation
pip install prompt-debugger
Quick Start
from prompt_debugger import PromptDebugger

debugger = PromptDebugger(verbose=True)

# your_llm_function is any callable that takes a prompt string and returns the response text
result = debugger.debug_call(
    llm_function=your_llm_function,
    prompt="Your prompt here",
)

print(f"Prompt Quality: {result['prompt_analysis']['quality_score']}/10")
print(f"Response Quality: {result['response_analysis']['quality_score']}/10")
Features
Prompt Analysis
prompt-debugger analyzes prompts before they are sent to the LLM, checking for vague or ambiguous language, insufficient context, overly complex or conflicting instructions, and length issues (too short or too long).
debugger = PromptDebugger()
result = debugger.debug_call(llm, "Tell me about stuff")
For this prompt the analysis reports an issue such as TOO_VAGUE: Contains 1 vague term, along with a quality score of 6/10.
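To make the vague-language check more concrete, here is a minimal sketch of how such a heuristic could work. It is not the library's implementation: the word list, the function names find_vague_terms and score_prompt, and the 2-points-per-term penalty are all assumptions chosen for illustration, so the score will not match the 6/10 reported by the library.

```python
import re
from typing import List

# Hypothetical word list; the library's actual vocabulary is not documented here.
VAGUE_TERMS = {"stuff", "things", "something", "anything", "whatever"}


def find_vague_terms(prompt: str) -> List[str]:
    """Return the vague terms found in the prompt, in order of appearance."""
    words = re.findall(r"[a-z']+", prompt.lower())
    return [w for w in words if w in VAGUE_TERMS]


def score_prompt(prompt: str) -> int:
    """Start at 10 and subtract 2 per vague term (assumed weights), never below 0."""
    return max(0, 10 - 2 * len(find_vague_terms(prompt)))


print(find_vague_terms("Tell me about stuff"))  # ['stuff']
print(score_prompt("Tell me about stuff"))      # 8
```

A word-list check like this is cheap enough to run before every API call, which is the main reason to do prompt analysis up front.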
Response Analysis
Every response is analyzed for empty or minimal responses, AI refusals or inability to answer, low relevance to the original prompt, incomplete answers, and potential hallucinations.
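The checks listed above can be approximated with simple string heuristics. The sketch below is an illustration only, not the library's code: the function name check_response, the issue labels, the refusal markers, and the thresholds (10 characters, 20% word overlap) are assumptions.

```python
from typing import List

# Hypothetical refusal markers; a real detector would likely use a longer list.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm unable", "i am unable", "as an ai")


def check_response(prompt: str, response: str) -> List[str]:
    """Return a list of issue labels (hypothetical names) for a response."""
    issues = []
    text = response.strip()
    lowered = text.lower()

    # Empty or minimal response: fewer than 10 characters (assumed threshold).
    if len(text) < 10:
        issues.append("EMPTY_OR_MINIMAL")

    # Refusal: any known refusal phrase appears in the response.
    if any(marker in lowered for marker in REFUSAL_MARKERS):
        issues.append("REFUSAL")

    # Relevance: share of longer prompt words that also appear in the response.
    prompt_words = {w for w in prompt.lower().split() if len(w) > 3}
    if prompt_words:
        overlap = sum(1 for w in prompt_words if w in lowered) / len(prompt_words)
        if overlap < 0.2:
            issues.append("LOW_RELEVANCE")

    return issues


print(check_response("Explain quicksort partitioning", "I cannot help with that request."))
```

Heuristics of this kind cannot reliably detect hallucinations or incomplete answers; those checks need more context than a keyword match provides.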
Automatic Improvements
Generate improved versions of problematic prompts:
original = "Explain things"
improved = debugger.get_improved_prompt(original)
This returns: "Please provide a detailed, specific response. Explain things"
Complete Logging
All interactions are logged with metadata for later analysis:
debugger.export_report("debug_session.json")
summary = debugger.get_session_summary()
print(f"Total calls: {summary['total_calls']}")
print(f"Issues found: {summary['issues_found']}")
Usage Examples
Basic Usage
from prompt_debugger import PromptDebugger
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def call_gpt(prompt):
    # Send a single user message and return the text of the first choice
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

debugger = PromptDebugger(verbose=True)

result = debugger.debug_call(
    llm_function=call_gpt,
    prompt="Write a Python function to sort a list",
)
Production Mode
debugger = PromptDebugger(verbose=False)
result = debugger.debug_call(llm, user_input)
if result['prompt_analysis']['quality_score'] < 7:
    improved = debugger.get_improved_prompt(user_input)
    result = debugger.debug_call(llm, improved)
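The retry pattern above can be wrapped in a small helper so that the threshold lives in one place. This is a sketch under assumptions: debug_with_retry is a hypothetical helper, not part of the library, and StubDebugger is a stand-in that only mimics the debug_call and get_improved_prompt methods so the example runs without the package installed. The "prompt" and "response" keys in the stub's result are also assumptions.

```python
from typing import Any, Callable, Dict


def debug_with_retry(
    debugger: Any,
    llm: Callable[[str], str],
    prompt: str,
    threshold: int = 7,
) -> Dict[str, Any]:
    """Call debug_call once; if the prompt scores below threshold, retry once with the improved prompt."""
    result = debugger.debug_call(llm, prompt)
    if result["prompt_analysis"]["quality_score"] < threshold:
        improved = debugger.get_improved_prompt(prompt)
        result = debugger.debug_call(llm, improved)
    return result


class StubDebugger:
    """Stand-in for demonstration only; scores prompts by word count."""

    def debug_call(self, llm_function, prompt):
        score = 5 if len(prompt.split()) < 4 else 8
        return {
            "prompt": prompt,
            "response": llm_function(prompt),
            "prompt_analysis": {"quality_score": score},
        }

    def get_improved_prompt(self, prompt):
        return "Please provide a detailed, specific response. " + prompt


result = debug_with_retry(StubDebugger(), lambda p: "echo: " + p, "Explain things")
print(result["prompt"])
```

Retrying only once keeps cost bounded: each retry is a second paid LLM call, so an unbounded loop on a stubbornly low score is rarely what you want in production.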
Batch Processing
prompts = ["Prompt 1", "Prompt 2", "Prompt 3"]
for prompt in prompts:
    result = debugger.debug_call(llm, prompt)
    print(f"Quality: {result['response_analysis']['quality_score']}/10")
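For larger batches it is often more useful to aggregate the scores than to print them one by one. The sketch below assumes only the result shape used in the loop above (a response_analysis dictionary with a quality_score); summarize_batch and its threshold parameter are hypothetical and not part of the library.

```python
from statistics import mean
from typing import Any, Dict, List


def summarize_batch(results: List[Dict[str, Any]], threshold: int = 7) -> Dict[str, Any]:
    """Summarize response quality scores across a batch of debug results."""
    scores = [r["response_analysis"]["quality_score"] for r in results]
    return {
        "count": len(scores),
        "mean_score": mean(scores) if scores else None,
        # Indices of results whose score falls below the threshold.
        "below_threshold": [i for i, s in enumerate(scores) if s < threshold],
    }


# Sample data standing in for real debug_call results.
sample = [{"response_analysis": {"quality_score": s}} for s in (9, 4, 8)]
print(summarize_batch(sample))
```

The indices in below_threshold map back to positions in the original prompts list, which makes it easy to re-run only the weak prompts.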
Compatibility
prompt-debugger works with any LLM provider, including OpenAI (GPT-3.5, GPT-4, GPT-4-turbo), Anthropic (Claude, Claude-2, Claude-3), Google (Gemini, PaLM), local models (Llama, Mistral, or anything served through Ollama), and any custom LLM function.
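Provider independence follows from the calling convention used in the examples: the debugger receives a plain function that takes a prompt string and returns the response text. The sketch below shows one way to adapt a chat-style client to that shape. The names make_llm_function and fake_chat are hypothetical, and fake_chat stands in for a real provider call.

```python
from typing import Callable, Dict, List

# The assumed contract: prompt text in, response text out.
LLMFunction = Callable[[str], str]
ChatCall = Callable[[List[Dict[str, str]]], str]


def make_llm_function(chat: ChatCall, system_prompt: str = "") -> LLMFunction:
    """Wrap a chat-style call (list of messages -> text) into a prompt -> text function."""

    def call(prompt: str) -> str:
        messages = []
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})
        messages.append({"role": "user", "content": prompt})
        return chat(messages)

    return call


def fake_chat(messages: List[Dict[str, str]]) -> str:
    """Stand-in for a provider client; echoes the last user message."""
    return "You said: " + messages[-1]["content"]


llm = make_llm_function(fake_chat, system_prompt="Be concise.")
print(llm("Hello"))  # You said: Hello
```

Swapping providers then means replacing fake_chat with a function that calls the provider's SDK, while the code that uses llm stays unchanged.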
Architecture
The library consists of four main components:
- PromptChecker: Analyzes prompts for common issues
- ResponseAnalyzer: Evaluates response quality and relevance
- DebugLogger: Persistent logging of all interactions
- PromptDebugger: Main interface that coordinates all components
Requirements
Python 3.8 or higher. No external dependencies (pure Python implementation).
Advanced Usage
Using Components Independently
from prompt_debugger import PromptChecker, ResponseAnalyzer

checker = PromptChecker()
analysis = checker.check_prompt("Your prompt here")

# prompt and response are the strings from an earlier LLM call
analyzer = ResponseAnalyzer()
quality = analyzer.analyze_response(prompt, response)
Accessing Debug Logs
from prompt_debugger import DebugLogger

logger = DebugLogger()
problematic_calls = logger.get_problematic_calls()

for call in problematic_calls:
    print(f"Call ID: {call['call_id']}")
    print(f"Issues: {len(call['prompt_analysis']['issues'])}")
Testing
Run pytest tests/ -v to execute the tests, or pytest tests/ --cov=prompt_debugger --cov-report=html to generate an HTML coverage report.
Contributing
Contributions are welcome. Fork the repository, create a feature branch, make your changes with appropriate tests, and submit a pull request.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Author
Your Name
GitHub: https://github.com/yourusername
Email: your.email@example.com
Changelog
Version 0.1.0 (2025-10-17)
Initial release with prompt quality analysis, response quality analysis, automatic suggestion generation, complete interaction logging, and session tracking and export functionality.
Support
For bug reports and feature requests, please use the GitHub issue tracker at https://github.com/yourusername/prompt-debugger/issues
Acknowledgments
This project was developed to address the need for better debugging tools in LLM application development.