Real-time quality monitoring and failure detection for production AI agents
Project description
🧠 AgentIQ - Production-Ready AI Agent Performance Intelligence
The complete platform for monitoring, testing, and optimizing any AI agent in production.
🚀 What You Get
AgentIQ is a finished product that agents can immediately use to build and test. No more low-level work - just integrate and get instant insights.
✅ Production-Ready SDK
- One-line integration:
iq.track(user_input, agent_response) - Non-blocking monitoring: Never slows down your agent
- Auto-batching: Efficient data transmission
- Error-safe: Agent keeps running even if AgentIQ is down
✅ Autonomous Testing Framework
- Comprehensive test suites: Coding, Customer Service, Data Science, Sales
- Automated evaluation: LLM judges score every response
- Performance grading: A+ to F grades with specific recommendations
- Continuous monitoring: Real-time health checks
✅ Enterprise Dashboard
- Clear agent identification: See exactly which agents are monitored
- Evaluation coverage: Prominent display of sample percentages
- Key metrics at top: Quality scores, success rates, performance indicators
- Actionable insights: Specific developer recommendations with priorities
📦 Complete Installation
# 1. Clone AgentIQ
git clone <repo>
cd AgentIQ
# 2. API is already deployed at:
# https://agentiq-api-z9it.onrender.com
# 3. Dashboard is running at:
# http://localhost:8509
That's it. AgentIQ is ready for production use.
🎯 5-Minute Quick Start
Step 1: Monitor Any Agent (2 lines of code)
from agentiq_sdk import AgentIQ
# Initialize once
iq = AgentIQ(agent_id="my-awesome-agent")
# Monitor any interaction (non-blocking)
iq.track(
user_input="How do I fix this bug?",
agent_response="Here's how to fix it..."
)
# Get real-time insights
insights = iq.get_insights()
performance_score = iq.get_performance_score() # 0.0 - 1.0
recommendations = iq.get_recommendations()
Step 2: Test Agent Performance (1 line)
from agent_testing_framework import AgentTester
# Test any agent function
def my_agent(user_input: str) -> str:
return "Agent response here"
# Run comprehensive tests
tester = AgentTester("my-agent")
tester.register_agent(my_agent)
results = tester.run_full_test_suite()
# Get performance report
report = tester.generate_performance_report()
print(report) # Detailed A+ to F grade with recommendations
Step 3: View Enterprise Dashboard
URL: http://localhost:8509
- ✅ Agent identification: See which agents are being evaluated
- ✅ Evaluation coverage: 1.8% (3 of 171 interactions evaluated)
- ✅ Key metrics: Quality scores, success rates, performance indicators
- ✅ Actionable insights: Specific developer recommendations
🏭 Production Examples
Coding Agent Integration
class CodingAgent:
def __init__(self):
self.agentiq = AgentIQ(agent_id="production-coding-agent")
def respond(self, user_input: str) -> str:
response = self.generate_response(user_input)
# Track with AgentIQ (non-blocking)
self.agentiq.track(user_input, response)
return response
Customer Service Agent
class CustomerServiceAgent:
def __init__(self):
self.agentiq = AgentIQ(agent_id="customer-service-agent")
def handle_request(self, customer_input: str) -> str:
response = self.generate_response(customer_input)
# Automatic performance monitoring
self.agentiq.track(customer_input, response)
return response
Autonomous Testing
# Test any agent automatically
tester = AgentTester("production-agent")
tester.register_agent(my_agent_function)
# Run full test suite
results = tester.run_full_test_suite()
# Output: Pass rate: 85.2% (Grade: A)
# Continuous monitoring
tester.continuous_monitoring(interval_minutes=60)
📊 What AgentIQ Monitors
Usage Analytics
- Session volumes and patterns
- Intent classification across all agent types
- Workflow completion rates
- Response times and performance
Quality Assessment
- LLM-as-a-Judge evaluation: Autonomous scoring of every response
- Quality scores by agent type and intent
- Failure pattern detection
- Root cause analysis
Performance Insights
- Real-time recommendations: Specific actions to improve agent performance
- A+ to F grading system
- Critical issue identification
- Developer action items with priorities
Loss Pattern Analysis
- Dropout detection in agent workflows
- High-impact failure identification
- Recommended fixes for common problems
🎯 Agent Testing Framework
Comprehensive Test Suites
- Coding Agents: Debug errors, write functions, optimize code
- Customer Service: Handle complaints, billing issues, account recovery
- Data Science: Analyze data, create visualizations, generate insights
- Sales/BDR: Qualify leads, handle objections, close deals
- General: Basic reasoning, explanations, problem-solving
Automated Evaluation
# Example test results
🏆 AGENT PERFORMANCE REPORT
Agent ID: my-coding-agent
📊 OVERALL PERFORMANCE
• Tests Run: 12
• Pass Rate: 85.2% (10/12)
• Average Quality: 0.82/1.0
• Average Response Time: 1,200ms
🎯 RECOMMENDATIONS
1. 🔴 PRIORITY: Improve data_science performance (60% pass rate)
2. ⚡ Optimize response times for complex queries
3. 📈 Continue monitoring - overall performance is solid
🎓 OVERALL GRADE: A
🏢 Enterprise Features
Multi-Agent Monitoring
- Monitor coding assistants, customer service, data science, sales, marketing agents
- Unified dashboard showing performance across all agent types
- Comparative analysis and benchmarking
Production-Safe Integration
- Non-blocking tracking: Never impacts agent performance
- Error-resilient: Agent continues working even if AgentIQ is down
- Efficient batching: Minimal network overhead
- Auto-retry logic: Handles network failures gracefully
Actionable Developer Insights
- Specific recommendations: "Improve customer_service responses (quality: 0.65)"
- Priority levels: Critical, High, Medium with timelines
- Expected impact: "Could improve 1,500 interactions/month"
- Root cause analysis: Identify exactly what needs fixing
🔗 Complete System
1. AgentIQ SDK (agentiq_sdk.py)
- Production-ready Python SDK
- One-line agent integration
- Real-time performance insights
- Non-blocking monitoring
2. Testing Framework (agent_testing_framework.py)
- Autonomous agent testing
- Comprehensive test suites
- A+ to F performance grading
- Continuous monitoring
3. Enterprise Dashboard (http://localhost:8509)
- Professional monitoring interface
- Clear agent identification
- Key metrics prominently displayed
- Actionable developer insights
4. Production API (https://agentiq-api-z9it.onrender.com)
- Deployed and ready for use
- High availability monitoring
- Real-time data processing
- Secure agent data handling
📈 Immediate Value
For Developers
- Zero setup time: Works immediately with any agent
- Clear performance metrics: Know exactly how your agent is performing
- Specific improvements: Get actionable recommendations, not vague scores
- Production confidence: Test thoroughly before deployment
For Enterprises
- Multi-agent visibility: Monitor all AI agents from one dashboard
- Performance benchmarking: Compare agents and identify top performers
- Risk mitigation: Catch performance degradation before it impacts users
- ROI measurement: Prove business impact of agent improvements
For Product Teams
- User experience insights: See where agents fail and frustrate users
- Optimization roadmap: Clear priority list of improvements
- Quality assurance: Automated testing prevents regressions
- Competitive advantage: Higher quality agents = better user experience
🚀 Ready for Production
AgentIQ is a complete, finished product. Your agents can start using it immediately:
# 1. Install (copy 3 files)
# agentiq_sdk.py, agent_testing_framework.py, complete_agentiq_example.py
# 2. Integrate (2 lines)
from agentiq_sdk import AgentIQ
iq = AgentIQ(agent_id="your-agent")
iq.track(user_input, agent_response)
# 3. Test (1 line)
from agent_testing_framework import AgentTester
AgentTester("your-agent").run_full_test_suite()
# 4. Monitor (dashboard)
# http://localhost:8509
No more low-level work. No more building infrastructure. AgentIQ handles everything so you can focus on building great agents.
📞 Support
- API Endpoint: https://agentiq-api-z9it.onrender.com
- Enterprise Dashboard: http://localhost:8509
- Complete Examples:
python3 complete_agentiq_example.py - Production Ready: Copy 3 files and start monitoring
AgentIQ: The finished product for AI agent performance intelligence.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file kalytera-0.1.0.tar.gz.
File metadata
- Download URL: kalytera-0.1.0.tar.gz
- Upload date:
- Size: 1.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e2b93fedf19edc0d0eaca70b373d9755e97971e0f83483759516c703d5f328e0
|
|
| MD5 |
7e96287e4d9929c6a37ecaba352cac6b
|
|
| BLAKE2b-256 |
c9608b197fdd9aa9de2639eeaefa6e09d2bf53dc0b7fe9e4b85a1039d29f0b8a
|
File details
Details for the file kalytera-0.1.0-py3-none-any.whl.
File metadata
- Download URL: kalytera-0.1.0-py3-none-any.whl
- Upload date:
- Size: 27.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e52215d888ccee5d0488721d013c7ce5e42903c76c63feb3a0e2ad076cf38074
|
|
| MD5 |
36d7b2751e8c907ddf82ecc22ce412d2
|
|
| BLAKE2b-256 |
c30b6415c73add360ffd22f33b766487e501fa4dd073c05bf41832312f1bb7de
|