A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents
Project description
AgentDojo: Benchmarking the Capabilities and Adversarial Robustness of LLM Agents
Quickstart
pip install agentdojo
If you want to use the prompt injection detector, you need to install the transformers extra:
pip install "agentdojo[transformers]"
Running the benchmark
The benchmark can be run with the benchmark script. Documentation on how to use the script can be obtained with the --help flag.
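For instance, the help text can be printed as follows (a minimal sketch, assuming agentdojo is installed in the current environment):

```shell
# Show all available options for the benchmark script.
python -m agentdojo.scripts.benchmark --help
```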
For example, to run the workspace suite on tasks 0 and 1, with gpt-4o-2024-05-13 as the LLM, the tool filter as a defense, and the tool-knowledge attacker, run the following command:
python -m agentdojo.scripts.benchmark -s workspace -ut user_task_0 \
-ut user_task_1 --model gpt-4o-2024-05-13 \
--defense tool_filter --attacker tool_knowledge
To run the above, but on all suites and tasks, run the following:
python -m agentdojo.scripts.benchmark --model gpt-4o-2024-05-13 \
--defense tool_filter --attacker tool_knowledge
Documentation of the Dojo
Take a look at our documentation.
Development set-up
Take a look at the development set-up docs.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
agentdojo-0.1.3.tar.gz (16.9 MB)
Built Distribution
agentdojo-0.1.3-py3-none-any.whl (139.1 kB)
Hashes for agentdojo-0.1.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | a17173ad384331c7b0481b4c05787f861d081b3dfffac44c5a411df8e7c3bfa4
MD5 | 9176642ad254a983b568c8b79c34640d
BLAKE2b-256 | 7cd5b5b0f311bb60d2f3e722ae733274bfc684381fc6984451bd4cf0254349bf