Agent Evaluation
Agent Evaluation is a generative AI-powered framework for testing virtual agents.
Internally, Agent Evaluation implements an LLM agent (the evaluator) that orchestrates conversations with your own agent (the target) and evaluates the target's responses over the course of the conversation.
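Conversations are typically described in a declarative test plan. The fragment below is an illustrative sketch only: the key names and values are assumptions, not the framework's exact schema, so check the documentation for the real format.

```yaml
# Illustrative test plan (keys and values are assumptions, not the exact schema).
evaluator:
  model: claude-3          # the LLM that judges the target's responses
target:
  type: bedrock-agent      # a built-in AWS target, or bring your own
tests:
  retrieve_missing_documents:
    steps:
      - Ask the agent which documents are missing for claim-006.
    expected_results:
      - The agent lists the missing documents.
```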
✨ Key features
- Built-in support for popular AWS services including Amazon Bedrock, Amazon Q Business, and Amazon SageMaker. You can also bring your own agent to test using Agent Evaluation.
- Orchestrate concurrent, multi-turn conversations with your agent while evaluating its responses.
- Define hooks to perform additional tasks such as integration testing.
- Incorporate Agent Evaluation into CI/CD pipelines to speed up delivery while keeping agents stable in production environments.
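To sketch the hook idea in plain Python (the class and method names here are hypothetical, not the framework's actual hook interface), a post-test hook might record outcomes or verify side effects produced during a conversation:

```python
# Hypothetical hook sketch: names and signatures are illustrative only,
# not the Agent Evaluation hook API (see the docs for the real interface).
class CleanupHook:
    """Runs extra checks after each test, e.g. integration-test assertions."""

    def __init__(self):
        self.events = []

    def post_evaluate(self, test_name: str, passed: bool) -> None:
        # Record the outcome; a real hook might verify side effects
        # (rows written, tickets created) or tear down test fixtures.
        self.events.append((test_name, passed))


hook = CleanupHook()
hook.post_evaluate("retrieve_missing_documents", True)
```

A hook like this keeps test-specific setup and teardown out of the test plan itself.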
📚 Documentation
To get started, please visit the full documentation. To contribute, please refer to CONTRIBUTING.md.
👏 Contributors
Shout out to these awesome contributors:
Download files
Download the file for your platform.
Source Distribution
agent_evaluation-0.1.0.tar.gz (22.3 kB)
Built Distribution
agent_evaluation-0.1.0-py3-none-any.whl
Hashes for agent_evaluation-0.1.0-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | 6c8876df6bb6fb932c724fa9ad16e31f81e60d4f7fd5986eba4651925ae183c5 |
| MD5 | 0f9d744eff88fc238a56ae6f6df7944e |
| BLAKE2b-256 | d7e6731c68918c5fce4c71c58096ffb36d87ce70d925bdd825538e46d12fdab2 |