Agent Evaluation
Agent Evaluation is a generative AI-powered framework for testing virtual agents.
Internally, Agent Evaluation implements an LLM agent (the evaluator) that orchestrates conversations with your own agent (the target) and evaluates the target's responses as the conversation progresses.
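The evaluator/target relationship described above can be pictured with a small, self-contained sketch. Everything here is hypothetical and for illustration only: `Target`, `ScriptedEvaluator`, and `toy_target` are not part of Agent Evaluation's API, and the real evaluator uses an LLM to generate messages and judge responses, whereas this stand-in uses scripted messages and a simple keyword check.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical type: the agent under test maps a user message to a reply.
Target = Callable[[str], str]


@dataclass
class ScriptedEvaluator:
    """Stand-in for the LLM evaluator (assumption: scripted, keyword-based)."""
    steps: List[Tuple[str, str]]  # (message to send, keyword expected in reply)
    transcript: List[Tuple[str, str]] = field(default_factory=list)

    def run(self, target: Target) -> bool:
        """Drive the conversation turn by turn, judging each reply as it arrives."""
        for message, keyword in self.steps:
            reply = target(message)
            self.transcript.append((message, reply))
            if keyword.lower() not in reply.lower():
                return False  # stop at the first reply that fails the check
        return True


def toy_target(message: str) -> str:
    """A trivial agent used only to exercise the loop."""
    if "weather" in message.lower():
        return "It is sunny today."
    return "Sorry, I do not understand."


evaluator = ScriptedEvaluator(steps=[("What is the weather?", "sunny")])
passed = evaluator.run(toy_target)
```

The point of the sketch is the control flow: the evaluator owns the conversation, and the target only answers one message at a time.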
✨ Key features
- Built-in support for popular AWS services including Amazon Bedrock, Amazon Q Business, and Amazon SageMaker. You can also bring your own agent to test using Agent Evaluation.
- Orchestrate concurrent, multi-turn conversations with your agent while evaluating its responses.
- Define hooks to perform additional tasks such as integration testing.
- Incorporate into CI/CD pipelines to shorten time to delivery while keeping agents in production environments stable.
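As a rough illustration of how concurrent multi-turn conversations, hooks, and a CI-friendly result could fit together, here is a minimal sketch using only the Python standard library. The names `run_test`, `run_suite`, `exit_code`, and the hook signatures are assumptions made for this example; they are not Agent Evaluation's actual interface, and the pass/fail check is a plain substring match rather than an LLM judgment.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional, Tuple

Turn = Tuple[str, str]            # (user message, text expected in the reply)
Target = Callable[[str], str]     # hypothetical agent under test
PreHook = Callable[[str], None]   # called with the test name before it runs
PostHook = Callable[[str, bool], None]  # called with the name and the result


def run_test(name: str, turns: List[Turn], target: Target,
             pre: Optional[PreHook] = None,
             post: Optional[PostHook] = None) -> bool:
    """Run one multi-turn conversation, calling hooks before and after."""
    if pre is not None:
        pre(name)  # e.g. seed test data
    passed = True
    for message, expected in turns:
        if expected not in target(message):
            passed = False
            break
    if post is not None:
        post(name, passed)  # e.g. an integration check against a backend
    return passed


def run_suite(tests: Dict[str, List[Turn]], target: Target,
              pre: Optional[PreHook] = None,
              post: Optional[PostHook] = None,
              max_workers: int = 4) -> Dict[str, bool]:
    """Run all conversations concurrently and collect a result per test."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(run_test, name, turns, target, pre, post)
                   for name, turns in tests.items()}
        return {name: fut.result() for name, fut in futures.items()}


def exit_code(results: Dict[str, bool]) -> int:
    """Non-zero when any test failed, so a CI job can fail the build."""
    return 0 if all(results.values()) else 1


finished: List[str] = []
results = run_suite(
    tests={
        "shouting": [("hello", "HELLO"), ("bye", "BYE")],
        "mismatch": [("hello", "goodbye")],
    },
    target=lambda message: message.upper(),  # toy agent for the demo
    post=lambda name, ok: finished.append(name),
)
```

In a pipeline, a wrapper script would typically end with `sys.exit(exit_code(results))` so that a failed conversation blocks the deployment.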
📚 Documentation
To get started, see the full documentation. To contribute, refer to CONTRIBUTING.md.
Download files
Source Distribution
- agent_evaluation-0.2.0.tar.gz (22.3 kB)
Built Distribution
- agent_evaluation-0.2.0-py3-none-any.whl (34.7 kB)
File details
Details for the file agent_evaluation-0.2.0.tar.gz.
File metadata
- Download URL: agent_evaluation-0.2.0.tar.gz
- Upload date:
- Size: 22.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.11.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | afaada1e206022d4c3c2fece8e1494571aef4ca64d912badd6dd851b4fd4b2ac
MD5 | 372ea8b92c13456e6b20fae18312884b
BLAKE2b-256 | de348fc0850168c265da48c5082d8099cd50815372e164bd54b30479035b35d6
File details
Details for the file agent_evaluation-0.2.0-py3-none-any.whl.
File metadata
- Download URL: agent_evaluation-0.2.0-py3-none-any.whl
- Upload date:
- Size: 34.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.11.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | ba24cc7e845435e9c5a50fadcaa9cbdd121dd7ead3edd135638f303e4babd312
MD5 | e24e4c79a950f67627047cc128dbab0a
BLAKE2b-256 | 2fbcbbe6230edacd58b04c2d83c09fe8db799e5f4e3d372e5b1ca93da93012b9
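If you download either file manually, you may want to confirm that it matches the SHA256 digests listed above. The following is a small sketch using Python's standard `hashlib`; the helper names `sha256_of` and `verify` are illustrative, and the script assumes the downloaded file keeps its original filename.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file details above.
EXPECTED_SHA256 = {
    "agent_evaluation-0.2.0.tar.gz":
        "afaada1e206022d4c3c2fece8e1494571aef4ca64d912badd6dd851b4fd4b2ac",
    "agent_evaluation-0.2.0-py3-none-any.whl":
        "ba24cc7e845435e9c5a50fadcaa9cbdd121dd7ead3edd135638f303e4babd312",
}


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path) -> bool:
    """True only if the filename is known and its digest matches."""
    expected = EXPECTED_SHA256.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

For example, `verify(Path("agent_evaluation-0.2.0.tar.gz"))` returns `False` for a file whose contents differ from the published release, and also for any filename not in the table.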