Chaos testing for AI apps. 18 extreme personas attack your AI to find edge cases before users do. OWASP LLM Top 10 coverage.
Project description
House Monkey 🐒
Chaos testing for AI apps. 18 extreme personas attack your AI to find edge cases before users do.
pip install housemonkey
housemonkey run --target https://your-api.com/chat --owasp
One command. 18 extreme personas. OWASP LLM Top 10 coverage. Terminal report in 2 minutes.
What it does
House Monkey attacks your AI app with realistic extreme users:
- The Jailbreaker — tries to extract your system prompt (OWASP LLM01)
- The Angry Customer — escalating hostility, demands manager
- The Confused Grandma — off-topic, misunderstands everything
- The Hallucination Baiter — asks about things that don't exist (OWASP LLM09)
- The Permission Escalator — tricks AI into unauthorized actions (OWASP LLM06)
- The RAG Poisoner — manipulates retrieval context (OWASP LLM08)
- ...and 12 more
Each persona runs a multi-turn conversation against your API, then an LLM judge evaluates if your AI handled it correctly.
Quick start
# Install
pip install housemonkey
# List all personas
housemonkey list
# Test your AI (needs OpenAI API key for persona generation + judging)
export OPENAI_API_KEY=sk-...
housemonkey run --target https://your-api.com/chat
# Run only OWASP-mapped personas
housemonkey run --target https://your-api.com/chat --owasp
# Run specific personas
housemonkey run --target https://your-api.com/chat --persona jailbreaker oversharer
# Custom API format (non-OpenAI)
housemonkey run --target https://your-api.com/ask --payload-template '{"input": "{{message}}"}'
# Save JSON report
housemonkey run --target https://your-api.com/chat --output report.json
OWASP LLM Top 10 coverage
| OWASP ID | Vulnerability | Persona |
|---|---|---|
| LLM01 | Prompt Injection | The Jailbreaker |
| LLM02 | Sensitive Info Disclosure | The Oversharer |
| LLM05 | Improper Output Handling | The JSON Breaker |
| LLM06 | Excessive Agency | The Permission Escalator |
| LLM08 | Vector/Embedding Weakness | The RAG Poisoner |
| LLM09 | Misinformation | The Hallucination Baiter |
| LLM10 | Unbounded Consumption | The Resource Abuser |
How it works
- Each persona has a system prompt that simulates an extreme user type
- An LLM generates realistic messages as that persona
- Messages are sent to your target API
- An LLM judge evaluates if your AI handled the persona correctly
- Terminal report shows pass/fail with specific failure reasons
Try it on a broken chatbot
# Start the intentionally broken test target (7 built-in flaws)
python test_target.py
# In another terminal, attack it
housemonkey run --target http://127.0.0.1:8888 --owasp
Requirements
- Python 3.10+
- OpenAI API key (for persona generation + judging)
- Your AI app must have an HTTP API endpoint
License
MIT. Powered by ClawClaw Soul open-source persona engine.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file housemonkey-0.1.0.tar.gz.
File metadata
- Download URL: housemonkey-0.1.0.tar.gz
- Upload date:
- Size: 15.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
3aa033f8d8b8a4d5b73f0db0986d988a86bb1bd63c85db3df9ea59b1af717d61
|
|
| MD5 |
c748a39b4bd06c9c19d12a6945a1af17
|
|
| BLAKE2b-256 |
e8a28200e1054a83289325b45161ed112101cf2bc5f40b496442b7addb0c93ec
|
File details
Details for the file housemonkey-0.1.0-py3-none-any.whl.
File metadata
- Download URL: housemonkey-0.1.0-py3-none-any.whl
- Upload date:
- Size: 16.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8a987ee1693ea8f9296f47b54382ff4d8fca5e370983d222f9f7e8e72e941fdc
|
|
| MD5 |
b6b04e09fa71e6ec8ea9f1a05f20c2c1
|
|
| BLAKE2b-256 |
7f5721023877d76219d6cc88f7ad4ba88862a7fd529ae0d9e1431807c15aa958
|