Skip to main content

Real-time quality monitoring and failure detection for production AI agents

Project description

🧠 AgentIQ - Production-Ready AI Agent Performance Intelligence

The complete platform for monitoring, testing, and optimizing any AI agent in production.


🚀 What You Get

AgentIQ is a finished product that agents can immediately use to build and test. No more low-level work - just integrate and get instant insights.

Production-Ready SDK

  • One-line integration: iq.track(user_input, agent_response)
  • Non-blocking monitoring: Never slows down your agent
  • Auto-batching: Efficient data transmission
  • Error-safe: Agent keeps running even if AgentIQ is down

Autonomous Testing Framework

  • Comprehensive test suites: Coding, Customer Service, Data Science, Sales
  • Automated evaluation: LLM judges score every response
  • Performance grading: A+ to F grades with specific recommendations
  • Continuous monitoring: Real-time health checks

Enterprise Dashboard

  • Clear agent identification: See exactly which agents are monitored
  • Evaluation coverage: Prominent display of sample percentages
  • Key metrics at top: Quality scores, success rates, performance indicators
  • Actionable insights: Specific developer recommendations with priorities

📦 Complete Installation

# 1. Clone AgentIQ
git clone <repo>
cd AgentIQ

# 2. API is already deployed at:
# https://agentiq-api-z9it.onrender.com

# 3. Dashboard is running at:
# http://localhost:8509

That's it. AgentIQ is ready for production use.


🎯 5-Minute Quick Start

Step 1: Monitor Any Agent (2 lines of code)

from agentiq_sdk import AgentIQ

# Initialize once
iq = AgentIQ(agent_id="my-awesome-agent")

# Monitor any interaction (non-blocking)
iq.track(
    user_input="How do I fix this bug?", 
    agent_response="Here's how to fix it..."
)

# Get real-time insights
insights = iq.get_insights()
performance_score = iq.get_performance_score()  # 0.0 - 1.0
recommendations = iq.get_recommendations()

Step 2: Test Agent Performance (1 line)

from agent_testing_framework import AgentTester

# Test any agent function
def my_agent(user_input: str) -> str:
    return "Agent response here"

# Run comprehensive tests
tester = AgentTester("my-agent")
tester.register_agent(my_agent)
results = tester.run_full_test_suite()

# Get performance report
report = tester.generate_performance_report()
print(report)  # Detailed A+ to F grade with recommendations

Step 3: View Enterprise Dashboard

URL: http://localhost:8509

  • Agent identification: See which agents are being evaluated
  • Evaluation coverage: 1.8% (3 of 171 interactions evaluated)
  • Key metrics: Quality scores, success rates, performance indicators
  • Actionable insights: Specific developer recommendations

🏭 Production Examples

Coding Agent Integration

class CodingAgent:
    def __init__(self):
        self.agentiq = AgentIQ(agent_id="production-coding-agent")
    
    def respond(self, user_input: str) -> str:
        response = self.generate_response(user_input)
        
        # Track with AgentIQ (non-blocking)
        self.agentiq.track(user_input, response)
        
        return response

Customer Service Agent

class CustomerServiceAgent:
    def __init__(self):
        self.agentiq = AgentIQ(agent_id="customer-service-agent")
    
    def handle_request(self, customer_input: str) -> str:
        response = self.generate_response(customer_input)
        
        # Automatic performance monitoring
        self.agentiq.track(customer_input, response)
        
        return response

Autonomous Testing

# Test any agent automatically
tester = AgentTester("production-agent")
tester.register_agent(my_agent_function)

# Run full test suite
results = tester.run_full_test_suite()
# Output: Pass rate: 85.2% (Grade: A)

# Continuous monitoring
tester.continuous_monitoring(interval_minutes=60)

📊 What AgentIQ Monitors

Usage Analytics

  • Session volumes and patterns
  • Intent classification across all agent types
  • Workflow completion rates
  • Response times and performance

Quality Assessment

  • LLM-as-a-Judge evaluation: Autonomous scoring of every response
  • Quality scores by agent type and intent
  • Failure pattern detection
  • Root cause analysis

Performance Insights

  • Real-time recommendations: Specific actions to improve agent performance
  • A+ to F grading system
  • Critical issue identification
  • Developer action items with priorities

Loss Pattern Analysis

  • Dropout detection in agent workflows
  • High-impact failure identification
  • Recommended fixes for common problems

🎯 Agent Testing Framework

Comprehensive Test Suites

  • Coding Agents: Debug errors, write functions, optimize code
  • Customer Service: Handle complaints, billing issues, account recovery
  • Data Science: Analyze data, create visualizations, generate insights
  • Sales/BDR: Qualify leads, handle objections, close deals
  • General: Basic reasoning, explanations, problem-solving

Automated Evaluation

# Example test results
🏆 AGENT PERFORMANCE REPORT
Agent ID: my-coding-agent

📊 OVERALL PERFORMANCE
 Tests Run: 12
 Pass Rate: 85.2% (10/12)
 Average Quality: 0.82/1.0
 Average Response Time: 1,200ms

🎯 RECOMMENDATIONS
1. 🔴 PRIORITY: Improve data_science performance (60% pass rate)
2.  Optimize response times for complex queries
3. 📈 Continue monitoring - overall performance is solid

🎓 OVERALL GRADE: A

🏢 Enterprise Features

Multi-Agent Monitoring

  • Monitor coding assistants, customer service, data science, sales, marketing agents
  • Unified dashboard showing performance across all agent types
  • Comparative analysis and benchmarking

Production-Safe Integration

  • Non-blocking tracking: Never impacts agent performance
  • Error-resilient: Agent continues working even if AgentIQ is down
  • Efficient batching: Minimal network overhead
  • Auto-retry logic: Handles network failures gracefully

Actionable Developer Insights

  • Specific recommendations: "Improve customer_service responses (quality: 0.65)"
  • Priority levels: Critical, High, Medium with timelines
  • Expected impact: "Could improve 1,500 interactions/month"
  • Root cause analysis: Identify exactly what needs fixing

🔗 Complete System

1. AgentIQ SDK (agentiq_sdk.py)

  • Production-ready Python SDK
  • One-line agent integration
  • Real-time performance insights
  • Non-blocking monitoring

2. Testing Framework (agent_testing_framework.py)

  • Autonomous agent testing
  • Comprehensive test suites
  • A+ to F performance grading
  • Continuous monitoring

3. Enterprise Dashboard (http://localhost:8509)

  • Professional monitoring interface
  • Clear agent identification
  • Key metrics prominently displayed
  • Actionable developer insights

4. Production API (https://agentiq-api-z9it.onrender.com)

  • Deployed and ready for use
  • High availability monitoring
  • Real-time data processing
  • Secure agent data handling

📈 Immediate Value

For Developers

  • Zero setup time: Works immediately with any agent
  • Clear performance metrics: Know exactly how your agent is performing
  • Specific improvements: Get actionable recommendations, not vague scores
  • Production confidence: Test thoroughly before deployment

For Enterprises

  • Multi-agent visibility: Monitor all AI agents from one dashboard
  • Performance benchmarking: Compare agents and identify top performers
  • Risk mitigation: Catch performance degradation before it impacts users
  • ROI measurement: Prove business impact of agent improvements

For Product Teams

  • User experience insights: See where agents fail and frustrate users
  • Optimization roadmap: Clear priority list of improvements
  • Quality assurance: Automated testing prevents regressions
  • Competitive advantage: Higher quality agents = better user experience

🚀 Ready for Production

AgentIQ is a complete, finished product. Your agents can start using it immediately:

# 1. Install (copy 3 files)
# agentiq_sdk.py, agent_testing_framework.py, complete_agentiq_example.py

# 2. Integrate (2 lines)
from agentiq_sdk import AgentIQ
iq = AgentIQ(agent_id="your-agent")
iq.track(user_input, agent_response)

# 3. Test (1 line)
from agent_testing_framework import AgentTester
AgentTester("your-agent").run_full_test_suite()

# 4. Monitor (dashboard)
# http://localhost:8509

No more low-level work. No more building infrastructure. AgentIQ handles everything so you can focus on building great agents.


📞 Support

  • API Endpoint: https://agentiq-api-z9it.onrender.com
  • Enterprise Dashboard: http://localhost:8509
  • Complete Examples: python3 complete_agentiq_example.py
  • Production Ready: Copy 3 files and start monitoring

AgentIQ: The finished product for AI agent performance intelligence.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kalytera-0.1.0.tar.gz (1.1 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

kalytera-0.1.0-py3-none-any.whl (27.7 kB view details)

Uploaded Python 3

File details

Details for the file kalytera-0.1.0.tar.gz.

File metadata

  • Download URL: kalytera-0.1.0.tar.gz
  • Upload date:
  • Size: 1.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for kalytera-0.1.0.tar.gz
Algorithm Hash digest
SHA256 e2b93fedf19edc0d0eaca70b373d9755e97971e0f83483759516c703d5f328e0
MD5 7e96287e4d9929c6a37ecaba352cac6b
BLAKE2b-256 c9608b197fdd9aa9de2639eeaefa6e09d2bf53dc0b7fe9e4b85a1039d29f0b8a

See more details on using hashes here.

File details

Details for the file kalytera-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: kalytera-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 27.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for kalytera-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e52215d888ccee5d0488721d013c7ce5e42903c76c63feb3a0e2ad076cf38074
MD5 36d7b2751e8c907ddf82ecc22ce412d2
BLAKE2b-256 c30b6415c73add360ffd22f33b766487e501fa4dd073c05bf41832312f1bb7de

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page