Agent Evaluation
Agent Evaluation is a generative AI-powered framework for testing virtual agents.
Internally, Agent Evaluation implements an LLM agent (the evaluator) that orchestrates conversations with your own agent (the target) and evaluates its responses throughout the conversation.
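The evaluator-target loop described above can be sketched in plain Python. This is an illustrative sketch of the concept only, not the Agent Evaluation API: the function names, the stub target, and the judge are all hypothetical.

```python
# Illustrative sketch (not the Agent Evaluation API): an "evaluator"
# drives a multi-turn conversation with a "target" agent and judges
# each response. A real evaluator would call an LLM; here both roles
# are stubbed so the control flow is clear.

def run_conversation(evaluator_turns, target, judge, max_turns=10):
    """Send each evaluator utterance to the target and collect verdicts."""
    transcript = []
    for turn, prompt in enumerate(evaluator_turns):
        if turn >= max_turns:
            break
        response = target(prompt)          # target agent answers
        verdict = judge(prompt, response)  # evaluator scores the answer
        transcript.append((prompt, response, verdict))
    return transcript

# Stub target and judge for demonstration only.
def echo_target(prompt):
    return f"You asked: {prompt}"

def contains_prompt(prompt, response):
    return prompt in response

transcript = run_conversation(
    ["What is my claim status?", "List missing documents."],
    echo_target,
    contains_prompt,
)
for prompt, response, passed in transcript:
    print(passed, "-", response)
```

In the real framework both the target (e.g., a Bedrock agent) and the evaluator (an LLM) would sit behind these roles; the loop above only shows how a multi-turn conversation and per-response evaluation fit together.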
✨ Key features
- Built-in support for popular AWS services including Amazon Bedrock, Amazon Q Business, and Amazon SageMaker. You can also bring your own agent to test using Agent Evaluation.
- Orchestrate concurrent, multi-turn conversations with your agent while evaluating its responses.
- Define hooks to perform additional tasks such as integration testing.
- Can be incorporated into CI/CD pipelines to expedite the time to delivery while maintaining the stability of agents in production environments.
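The hooks feature in the list above can be pictured as callables that run before and after an evaluation, e.g. to seed fixtures or verify side effects for integration testing. The `Hook` class and method names below are hypothetical, not the framework's actual interface.

```python
# Hypothetical sketch of the "hooks" idea: code that runs before and
# after an evaluation so you can set up fixtures or make integration
# assertions. Class and method names here are illustrative only.

class Hook:
    def pre_evaluate(self, context):
        """Runs before the conversation starts (e.g., seed test data)."""
        context["records"] = ["claim-006"]

    def post_evaluate(self, context):
        """Runs after the conversation ends (e.g., verify side effects)."""
        assert "claim-006" in context["records"]

def evaluate_with_hooks(hook, evaluation):
    context = {}
    hook.pre_evaluate(context)
    result = evaluation(context)   # the evaluation itself, stubbed here
    hook.post_evaluate(context)
    return result

result = evaluate_with_hooks(
    Hook(),
    lambda ctx: f"evaluated {len(ctx['records'])} record(s)",
)
print(result)  # evaluated 1 record(s)
```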
📚 Documentation
To get started, please visit the full documentation. To contribute, please refer to CONTRIBUTING.md.
👏 Contributors
Shout out to these awesome contributors.
Download files
Source Distribution

agent_evaluation-0.2.0.tar.gz (22.3 kB)
Built Distribution

agent_evaluation-0.2.0-py3-none-any.whl

Hashes for agent_evaluation-0.2.0-py3-none-any.whl:

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | ba24cc7e845435e9c5a50fadcaa9cbdd121dd7ead3edd135638f303e4babd312 |
| MD5 | e24e4c79a950f67627047cc128dbab0a |
| BLAKE2b-256 | 2fbcbbe6230edacd58b04c2d83c09fe8db799e5f4e3d372e5b1ca93da93012b9 |
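A short sketch of how a downloaded wheel can be checked against the published SHA256 digest from the table above, using only the standard library. The local file path is an assumption; the digest is the one published here.

```python
# Verify a downloaded file against a published SHA256 digest.
import hashlib

# Digest published for agent_evaluation-0.2.0-py3-none-any.whl (see above).
EXPECTED_SHA256 = "ba24cc7e845435e9c5a50fadcaa9cbdd121dd7ead3edd135638f303e4babd312"

def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Usage (assumes the wheel has been downloaded to the current directory):
# assert sha256_of("agent_evaluation-0.2.0-py3-none-any.whl") == EXPECTED_SHA256
```

pip can also enforce this automatically with hash-checking mode (`--require-hashes` and `package==version --hash=sha256:...` entries in a requirements file).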