A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents
AgentDojo: Benchmarking the Capabilities and Adversarial Robustness of LLM Agents
Quickstart
pip install agentdojo
If you want to use the prompt injection detector, you need to install the transformers extra:
pip install "agentdojo[transformers]"
Running the benchmark
The benchmark can be run with the benchmark script. Documentation on how to use the script can be obtained with the --help flag.
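For instance, the following invocation prints the script's full list of options (this assumes the agentdojo package is already installed):

```shell
# Show the benchmark script's available options and flags
python -m agentdojo.scripts.benchmark --help
```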
For example, to run the workspace suite on user tasks 0 and 1, with gpt-4o-2024-05-13 as the LLM, the tool filter as a defense, and the attacker with tool knowledge, run the following command:
python -m agentdojo.scripts.benchmark -s workspace -ut user_task_0 \
-ut user_task_1 --model gpt-4o-2024-05-13 \
--defense tool_filter --attacker tool_knowledge
To run the same benchmark on all suites and tasks, run the following:
python -m agentdojo.scripts.benchmark --model gpt-4o-2024-05-13 \
--defense tool_filter --attacker tool_knowledge
Documentation of the Dojo
Take a look at our documentation.
Development set-up
Take a look at the development set-up docs.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
agentdojo-0.1.1.tar.gz (16.9 MB)
Built Distribution
agentdojo-0.1.1-py3-none-any.whl (139.0 kB)
Hashes for agentdojo-0.1.1-py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | 7ef511ad184483829f8e858b653f70b29ff0434bb44815f2f0fec53be487c900
MD5 | 810887f72557f615bc934322af3e2686
BLAKE2b-256 | 21ba95619b5d3cbf6dc4eee773fd7a9dbdcd179e002e76b85f5a12377e291de7