Debug AI responses, understand failures, and get automatic improvement suggestions
Prompt-Debugger
A Python library for debugging LLM interactions with automatic issue detection, quality scoring, and actionable suggestions.
Overview
prompt-debugger provides tools for analyzing and debugging interactions with Large Language Models. It detects common issues in prompts before API calls, analyzes response quality, and provides specific suggestions for improvement. The library works with any LLM provider and has zero dependencies.
Installation
    pip install prompt-debugger
Quick Start
    from prompt_debugger import AIDebugger

    debugger = AIDebugger(verbose=True)

    # your_llm_function is any callable that takes a prompt and returns the response text
    result = debugger.debug_call(
        llm_function=your_llm_function,
        prompt="Your prompt here",
    )

    print(f"Prompt Quality: {result['prompt_analysis']['quality_score']}/10")
    print(f"Response Quality: {result['response_analysis']['quality_score']}/10")
Output:

    Debug Report - Call 20251017_160000_1
    Status: SUCCESS
    Latency: 1.234s
    No prompt issues detected
    Response Quality: 8/10
    ======================================================================
Features
Prompt Analysis
prompt-debugger analyzes prompts before they are sent to the LLM, checking for:
- Vague or ambiguous language
- Insufficient context
- Overly complex or conflicting instructions
- Length issues (too short or too long)
    debugger = AIDebugger()
    # llm is any callable that takes a prompt and returns the response text
    result = debugger.debug_call(llm, "Tell me about stuff")

Detects:

    Prompt Issues:
    TOO_VAGUE: Contains 1 vague term
    Quality Score: 6/10
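To make the idea of a vague-term check concrete, here is a minimal sketch of how such a heuristic could work. It is not the library's implementation: the word list, the `TOO_SHORT` rule, the penalty of 2 points per issue, and the names `VAGUE_TERMS`, `find_vague_terms`, and `score_prompt` are all assumptions made for illustration, so its scores will not necessarily match what prompt-debugger reports.

```python
import re
from typing import Dict, List

# Hypothetical word list; the library's real list is not documented in this README.
VAGUE_TERMS = {"stuff", "things", "something", "anything", "whatever"}


def find_vague_terms(prompt: str) -> List[str]:
    """Return the vague terms found in the prompt, in order of appearance."""
    words = re.findall(r"[a-z']+", prompt.lower())
    return [word for word in words if word in VAGUE_TERMS]


def score_prompt(prompt: str) -> Dict[str, object]:
    """Start from 10 and subtract an assumed penalty of 2 per detected issue."""
    issues = []
    vague = find_vague_terms(prompt)
    if vague:
        issues.append(f"TOO_VAGUE: Contains {len(vague)} vague term(s)")
    if len(prompt.split()) < 3:
        issues.append("TOO_SHORT: Fewer than 3 words")
    score = max(0, 10 - 2 * len(issues))
    return {"issues": issues, "quality_score": score}


print(score_prompt("Tell me about stuff"))
```

A word-list approach like this is cheap and runs before any API call, which fits the library's stated goal of catching prompt issues up front, but it cannot judge context or intent.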
Response Analysis
Every response is analyzed for:
- Empty or minimal responses
- AI refusals or inability to answer
- Low relevance to the original prompt
- Incomplete answers
- Potential hallucinations
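Several of the checks above can be approximated with simple text heuristics. The sketch below is an illustration under stated assumptions, not the library's `ResponseAnalyzer`: the refusal markers, the thresholds, the issue labels, and the function names are all hypothetical, and hallucination detection is deliberately left out because it cannot be done reliably with string matching alone.

```python
import re
from typing import List

# Hypothetical refusal markers, for illustration only.
REFUSAL_MARKERS = ("i cannot", "i can't", "i am unable", "i'm unable", "as an ai")


def keyword_overlap(prompt: str, response: str) -> float:
    """Fraction of the prompt's words of 4+ letters that reappear in the response."""
    prompt_words = set(re.findall(r"[a-z]{4,}", prompt.lower()))
    response_words = set(re.findall(r"[a-z]{4,}", response.lower()))
    if not prompt_words:
        return 1.0
    return len(prompt_words & response_words) / len(prompt_words)


def check_response(prompt: str, response: str) -> List[str]:
    """Return a list of assumed issue labels for a response."""
    text = response.strip()
    if not text:
        return ["EMPTY"]
    issues = []
    if len(text.split()) < 5:
        issues.append("MINIMAL")
    lowered = text.lower()
    if any(marker in lowered for marker in REFUSAL_MARKERS):
        issues.append("REFUSAL")
    if keyword_overlap(prompt, text) < 0.2:
        issues.append("LOW_RELEVANCE")
    # A response that stops without closing punctuation may have been cut off.
    if not text.endswith((".", "!", "?")):
        issues.append("POSSIBLY_INCOMPLETE")
    return issues


print(check_response("Explain Python decorators", "I cannot help with that request."))
```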
Automatic Improvements
Generate improved versions of problematic prompts:
    original = "Explain things"
    improved = debugger.get_improved_prompt(original)

    print(improved)
Complete Logging
All interactions are logged with metadata for later analysis:
    debugger.export_report("debug_session.json")

    summary = debugger.get_session_summary()
    print(f"Total calls: {summary['total_calls']}")
    print(f"Issues found: {summary['issues_found']}")
Usage Examples
Example 1: Basic Usage with OpenAI
    from prompt_debugger import AIDebugger
    import openai


    def call_gpt(prompt):
        # Legacy ChatCompletion interface from openai<1.0
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
        )
        # choices is a list; return the text of the first completion
        return response.choices[0].message.content


    debugger = AIDebugger(verbose=True)

    result = debugger.debug_call(
        llm_function=call_gpt,
        prompt="Write a Python function to sort a list",
    )
Example 2: Production Mode
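    debugger = AIDebugger(verbose=False)

    # llm is your LLM callable; user_input is the prompt received from the user
    result = debugger.debug_call(llm, user_input)

    # Retry once with an improved prompt when the original scores below 7
    if result['prompt_analysis']['quality_score'] < 7:
        improved = debugger.get_improved_prompt(user_input)
        result = debugger.debug_call(llm, improved)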
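The retry pattern in Example 2 can be wrapped in a small helper. The sketch below is one possible shape, not part of the library: `debug_with_retry`, `min_score`, and `max_retries` are hypothetical names, and `StubDebugger` is an offline stand-in that mimics only the `debug_call` and `get_improved_prompt` calls and the `prompt_analysis` result key shown in this README. The `response` key in the stub's result is an assumption.

```python
from typing import Callable


def debug_with_retry(debugger, llm: Callable[[str], str], prompt: str,
                     min_score: int = 7, max_retries: int = 1) -> dict:
    """Call the LLM through the debugger, retrying with an improved prompt
    while the prompt quality score stays below min_score."""
    result = debugger.debug_call(llm, prompt)
    for _ in range(max_retries):
        if result["prompt_analysis"]["quality_score"] >= min_score:
            break
        prompt = debugger.get_improved_prompt(prompt)
        result = debugger.debug_call(llm, prompt)
    return result


class StubDebugger:
    """Offline stand-in; scores prompts by word count purely for the demo."""

    def debug_call(self, llm, prompt):
        score = 8 if len(prompt.split()) >= 5 else 4
        return {"prompt_analysis": {"quality_score": score}, "response": llm(prompt)}

    def get_improved_prompt(self, prompt):
        return f"{prompt}. Please give a detailed, step-by-step answer."


result = debug_with_retry(StubDebugger(), lambda p: p.upper(), "Explain things")
print(result["prompt_analysis"]["quality_score"])
```

Capping the number of retries keeps the cost bounded: each retry is another LLM call, so an unbounded loop on a prompt that never improves would be expensive.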
Example 3: Batch Processing
    prompts = ["Prompt 1", "Prompt 2", "Prompt 3"]

    # llm is your LLM callable
    for prompt in prompts:
        result = debugger.debug_call(llm, prompt)
        print(f"Quality: {result['response_analysis']['quality_score']}/10")
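When processing a batch, it is often more useful to aggregate the scores than to print them one by one. The helper below is a sketch, not a library function: `summarize_scores` is a hypothetical name, it assumes each result carries `response_analysis.quality_score` as in the examples above, and the sample results are made up so the snippet runs without an LLM.

```python
from statistics import mean
from typing import Dict, List


def summarize_scores(results: List[dict]) -> Dict[str, float]:
    """Aggregate response quality scores from a list of debug_call-style results."""
    scores = [r["response_analysis"]["quality_score"] for r in results]
    if not scores:
        return {"count": 0, "mean": 0.0, "min": 0, "max": 0}
    return {
        "count": len(scores),
        "mean": mean(scores),
        "min": min(scores),
        "max": max(scores),
    }


# Made-up results, shaped like the dictionaries used in the examples above.
sample_results = [
    {"response_analysis": {"quality_score": 8}},
    {"response_analysis": {"quality_score": 5}},
    {"response_analysis": {"quality_score": 9}},
]
print(summarize_scores(sample_results))
```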
Compatibility
prompt-debugger works with any LLM provider:
- OpenAI (GPT-3.5, GPT-4, GPT-4-turbo)
- Anthropic (Claude, Claude-2, Claude-3)
- Google (Gemini, PaLM)
- Local models (Llama, Mistral, Ollama)
- Any custom LLM function
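Judging by the `call_gpt` example, an LLM function is a callable that takes a prompt string and returns the response text. Under that assumption, any provider can be plugged in through a thin adapter. The sketch below uses hypothetical names (`make_llm_function`, `fake_provider`, `text_key`) and an offline stand-in instead of a real API client.

```python
from typing import Callable


def make_llm_function(send: Callable[[str], dict],
                      text_key: str = "text") -> Callable[[str], str]:
    """Adapt a provider-specific client into a plain prompt -> text callable."""

    def llm_function(prompt: str) -> str:
        payload = send(prompt)
        return str(payload[text_key])

    return llm_function


def fake_provider(prompt: str) -> dict:
    """Offline stand-in for a real API client; returns a provider-style payload."""
    return {"text": f"echo: {prompt}", "tokens": len(prompt.split())}


local_llm = make_llm_function(fake_provider)
print(local_llm("Hello there"))  # prints "echo: Hello there"
```

The resulting `local_llm` could then be passed wherever the examples use `llm` or `your_llm_function`.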
Architecture
The library consists of four main components:
| Component | Description |
|---|---|
| PromptChecker | Analyzes prompts for common issues |
| ResponseAnalyzer | Evaluates response quality and relevance |
| DebugLogger | Persistent logging of all interactions |
| AIDebugger | Main interface coordinating all components |
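The table describes a coordinator that ties together a prompt checker, a response analyzer, and a logger. The sketch below shows one way such a composition could look. It is purely illustrative and makes no claim about the library's internals: `MiniDebugger`, the in-memory `log` list, and the `latency_s` and `call_id` fields are assumptions.

```python
import time
from typing import Callable, List


class MiniDebugger:
    """Hypothetical coordinator; not the library's AIDebugger."""

    def __init__(self, check_prompt: Callable[[str], dict],
                 analyze_response: Callable[[str, str], dict]):
        self.check_prompt = check_prompt
        self.analyze_response = analyze_response
        self.log: List[dict] = []  # in-memory stand-in for a persistent logger

    def debug_call(self, llm_function: Callable[[str], str], prompt: str) -> dict:
        # 1. Analyze the prompt before the LLM is called.
        prompt_analysis = self.check_prompt(prompt)
        # 2. Time the LLM call itself.
        start = time.perf_counter()
        response = llm_function(prompt)
        latency = time.perf_counter() - start
        # 3. Analyze the response and record everything.
        record = {
            "call_id": len(self.log) + 1,
            "prompt_analysis": prompt_analysis,
            "response_analysis": self.analyze_response(prompt, response),
            "latency_s": latency,
        }
        self.log.append(record)
        return record


mini = MiniDebugger(
    check_prompt=lambda p: {"quality_score": 10 if len(p.split()) > 3 else 5},
    analyze_response=lambda p, r: {"quality_score": 10 if r.strip() else 0},
)
record = mini.debug_call(
    lambda p: "Use sorted(items).",
    "Write a Python function to sort a list",
)
print(record["call_id"], record["prompt_analysis"], record["response_analysis"])
```

Passing the checker and analyzer in as callables keeps each piece independently testable, which mirrors the README's point that the components can also be used on their own.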
Requirements
- Python 3.8+
- No external dependencies (pure Python implementation)
Advanced Usage
Using Components Independently
    from prompt_debugger import PromptChecker, ResponseAnalyzer

    prompt = "Your prompt here"
    response = "Text returned by your LLM"  # placeholder response to analyze

    checker = PromptChecker()
    analysis = checker.check_prompt(prompt)

    analyzer = ResponseAnalyzer()
    quality = analyzer.analyze_response(prompt, response)
Accessing Debug Logs
    from prompt_debugger import DebugLogger

    logger = DebugLogger()
    logs = logger.get_all_interactions()

    for log in logs:
        print(f"Call ID: {log['call_id']}")
        print(f"Quality: {log['prompt_analysis']['quality_score']}/10")
Author
Mohamed Imthiyas - Creator and Maintainer
- PyPI: prompt-debugger
Support
For questions or support, contact the maintainer through the project's PyPI page.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Changelog
Version 0.1.1 (2025-10-17)
- Updated author information
- Improved documentation
- Enhanced README with better examples
Version 0.1.0 (2025-10-17)
- Initial release
- Prompt quality analysis
- Response quality scoring
- Automatic suggestion generation
- Complete interaction logging
- Session tracking functionality
Acknowledgments
This project was developed to address the growing need for better debugging tools in LLM application development.
If you find this useful, please consider giving it a positive review on PyPI!