# Humanbound CLI
CLI-first security testing for AI agents and chatbots. Adversarial attacks, behavioral QA, posture scoring, and guardrails export — from your terminal to your CI/CD pipeline.
```bash
pip install humanbound-cli
```
## Overview
Humanbound runs automated adversarial attacks against your bot's live endpoint, evaluates responses using LLM-as-a-judge, and produces structured findings aligned with the OWASP Top 10 for LLM Applications and the OWASP Agentic AI Threats.
## Platform Services
| Service | Description |
|---|---|
| CLI Tool | Full-featured command line interface. Initialize projects, run tests, check posture, export guardrails. |
| pytest Plugin | Native pytest integration with markers, fixtures, and baseline comparison. Run security tests alongside unit tests. |
| Adversarial Testing | OWASP-aligned attack scenarios: single-turn, multi-turn, adaptive, and agentic. |
| Behavioral Testing | Validate intent boundaries, response quality, and functional correctness. |
| Posture Scoring | Quantified 0-100 security score with breakdown by findings, coverage, and resilience. Track over time. |
| Guardrails Export | Generate protection rules from test findings. Export to OpenAI, Azure AI Content Safety, AWS Bedrock, or Humanbound format. |
## Why Humanbound?
Manual red-teaming doesn't scale. Static analysis can't catch runtime behavior. Generic pentesting tools don't understand LLM-specific attack vectors like prompt injection, jailbreaks, or tool abuse.
Humanbound is built for this. Point it at your bot's endpoint, define the scope (or let it extract one from your system prompt), and get a structured security report with actionable findings — all mapped to OWASP LLM and Agentic AI categories.
Testing feeds into hardening: export guardrails, track posture across releases, and catch regressions before they reach production. Works with any chatbot or agent, cloud or on-prem.
## Get Started

### 1. Install & authenticate

```bash
pip install humanbound-cli
hb login
```

### 2. Scan your bot & create a project

`hb init` scans your bot, extracts its scope and risk profile, and creates a project — all in one step. Point it at one or more sources:
```bash
# From a system prompt file
hb init -n "My Bot" --prompt ./system_prompt.txt

# From a live bot endpoint (API probing)
hb init -n "My Bot" -e ./bot-config.json

# From a live URL (browser discovery)
hb init -n "My Bot" -u https://my-bot.example.com

# Combine sources for better analysis
hb init -n "My Bot" --prompt ./system.txt -e ./bot-config.json
```
The `--endpoint`/`-e` flag accepts a JSON config (file or inline string) matching the experiment integration shape:

```json
{
  "streaming": false,
  "thread_auth": {"endpoint": "", "headers": {}, "payload": {}},
  "thread_init": {"endpoint": "https://bot.com/threads", "headers": {}, "payload": {}},
  "chat_completion": {"endpoint": "https://bot.com/chat", "headers": {"Authorization": "Bearer token"}, "payload": {"content": "$PROMPT"}}
}
```
After scanning, you'll see the extracted scope, policies (permitted/restricted intents), and a risk dashboard with a threat profile. Confirm to create the project.
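In CI or other non-interactive setups, skip the confirmation prompt with `--yes` (documented under the `init` reference below):

```bash
hb init -n "My Bot" -e ./bot-config.json --yes
```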
### 3. Run a security test

```bash
# Run against your bot (uses project's default integration if configured during init)
hb test

# Or specify an endpoint directly
hb test -e ./bot-config.json

# Choose test category and depth
hb test -t humanbound/adversarial/owasp_multi_turn -l system
```
### 4. Review results

```bash
# Watch experiment progress
hb status --watch

# View logs
hb logs

# Check posture score
hb posture

# Export guardrails
hb guardrails --vendor openai -o guardrails.json
```
## Test Categories

| Category | Mode | Description |
|---|---|---|
| `owasp_single_turn` | Adversarial | Single-prompt attacks: prompt injection, jailbreaks, data exfiltration. Fast coverage of basic vulnerabilities. |
| `owasp_multi_turn` | Adversarial | Conversational attacks that build context over multiple turns. Tests context manipulation and gradual escalation. |
| `owasp_agentic_multi_turn` | Adversarial | Targets tool-using agents. Tests goal hijacking, tool misuse, and privilege escalation. |
| `behavioral` | QA | Intent boundary validation and response quality testing. Ensures the agent behaves within defined scope. |
**Adaptive mode:** Both `owasp_multi_turn` and `owasp_agentic_multi_turn` support an `--adaptive` flag that enables evolutionary search — the attack strategy adapts based on bot responses instead of following scripted prompts.
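For example, to run the multi-turn suite with the adaptive strategy enabled:

```bash
hb test -t owasp_multi_turn --adaptive
```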
## Testing Levels

| Level | Description |
|---|---|
| `unit` | Standard coverage (~20 min) — default |
| `system` | Deep testing (~45 min) |
| `acceptance` | Full coverage (~90 min) |
## pytest Integration
Run security tests alongside your existing test suite with native pytest markers and fixtures.
```python
# test_security.py
import pytest


@pytest.mark.hb
def test_prompt_injection(hb):
    """Test prompt injection defenses."""
    result = hb.test("llm001")
    assert result.passed, f"Failed: {result.findings}"


@pytest.mark.hb
def test_posture_threshold(hb_posture):
    """Ensure posture meets minimum."""
    assert hb_posture["score"] >= 70


@pytest.mark.hb
def test_no_regressions(hb, hb_baseline):
    """Compare against baseline."""
    result = hb.test("llm001")
    if hb_baseline:
        regressions = result.compare(hb_baseline)
        assert not regressions
```

```bash
# Run with Humanbound enabled
pytest --hb tests/

# Filter by category
pytest --hb --hb-category=adversarial

# Set failure threshold
pytest --hb --hb-fail-on=high

# Compare to baseline
pytest --hb --hb-baseline=baseline.json

# Save new baseline
pytest --hb --hb-save-baseline=baseline.json
```
## CI/CD Integration

Block insecure deployments automatically with exit codes.

Build → Unit Tests → AI Security (`hb test`) → Deploy
```yaml
# .github/workflows/security.yml
name: AI Security Tests
on: [push, pull_request]

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install humanbound-cli
      - name: Run Security Tests
        env:
          HUMANBOUND_API_KEY: ${{ secrets.HUMANBOUND_API_KEY }}
        run: |
          hb test --wait --fail-on=high
```
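The gate is just the exit code (see Exit Codes below), so the same pattern works in any CI system; `./deploy.sh` here stands in for your own deploy step:

```bash
# Deploy only if no findings meet the --fail-on threshold
hb test --wait --fail-on=high && ./deploy.sh
```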
## Usage

```
hb [--base-url URL] COMMAND [OPTIONS] [ARGS]
```
### Authentication

| Command | Description |
|---|---|
| `login` | Authenticate via browser (OAuth PKCE) |
| `logout` | Clear stored credentials |
| `whoami` | Show current authentication status |
### Organisation Management

| Command | Description |
|---|---|
| `orgs list` | List available organisations |
| `orgs current` | Show current organisation |
| `switch <id>` | Switch to an organisation |
### Provider Management

Providers are LLM configurations used for running security tests.

| Command | Description |
|---|---|
| `providers list` | List configured providers |
| `providers add` | Add new provider |
| `providers update <id>` | Update provider config |
| `providers remove <id>` | Remove provider |
`providers add` options:

```
--name, -n       Provider name: openai, claude, azureopenai, gemini, grok, custom
--api-key, -k    API key
--endpoint, -e   Endpoint URL (required for azureopenai, custom)
--model, -m      Model name (optional)
--default        Set as default provider
--interactive    Interactive configuration mode
```
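For example, registering OpenAI as the default provider (the environment variable is a placeholder for wherever you keep the key):

```bash
# $OPENAI_API_KEY is a placeholder; substitute your own secret storage
hb providers add --name openai --api-key "$OPENAI_API_KEY" --default
```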
### Project Management

| Command | Description |
|---|---|
| `projects list` | List projects |
| `projects use <id>` | Select project |
| `projects current` | Show current project |
| `projects show [id]` | Show project details |
| `projects update [id]` | Update project name/description |
| `projects delete [id]` | Delete project (with confirmation) |
### init — scan bot & create project

```
hb init --name NAME [OPTIONS]
```

Sources (at least one required):

```
--prompt, -p PATH       System prompt file (text source)
--url, -u URL           Live bot URL for browser discovery (url source)
--endpoint, -e CONFIG   Bot integration config — JSON string or file path (endpoint source)
--repo, -r PATH         Repository path to scan (agentic or text source)
--openapi, -o PATH      OpenAPI spec file (text source)
```

Options:

```
--description, -d       Project description
--timeout, -t SECONDS   Scan timeout (default: 180)
--yes, -y               Auto-confirm project creation (no interactive prompts)
```
### Test Execution

`test` — run security tests on the current project

```
hb test [OPTIONS]

Test Category:
  --test-category, -t   Test to run (default: owasp_multi_turn)
                        Values: owasp_single_turn, owasp_multi_turn,
                                owasp_agentic_multi_turn, behavioral

Testing Level:
  --testing-level, -l   Depth of testing (default: unit)
                        unit | system | acceptance

Endpoint Override (optional — only needed if no default integration):
  -e, --endpoint        Bot integration config — JSON string or file path.
                        Same shape as 'hb init --endpoint'. Overrides default.

Other:
  --provider-id         Provider to use (default: first available)
  --name, -n            Experiment name (auto-generated if omitted)
  --lang                Language (default: english). Accepts codes: en, de, es...
  --adaptive            Enable adaptive mode (evolutionary attack strategy)
  --no-auto-start       Create without starting (manual mode)
  --wait, -w            Wait for completion
  --fail-on SEVERITY    Exit non-zero if findings >= severity
                        Values: critical, high, medium, low, any
```
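Putting the flags together: a deep, adaptive, agentic run that fails the pipeline on high-severity findings:

```bash
hb test -t owasp_agentic_multi_turn -l system --adaptive --wait --fail-on=high
```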
### Experiment Management

| Command | Description |
|---|---|
| `experiments list` | List experiments |
| `experiments show <id>` | Show experiment details |
| `experiments status <id>` | Check status |
| `experiments status <id> --watch` | Watch until completion |
| `experiments wait <id>` | Wait with progressive backoff (30s → 60s → 120s → 300s) |
| `experiments logs <id>` | List experiment logs |
| `experiments terminate <id>` | Stop a running experiment |
| `experiments delete <id>` | Delete experiment (with confirmation) |
`status` is also available as a top-level alias — without an ID it shows the most recent experiment:

```
hb status [experiment_id] [--watch]
```
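A typical review loop once a run has been created:

```bash
hb experiments list        # find the experiment ID
hb experiments wait <id>   # block until completion (progressive backoff)
hb experiments logs <id>   # inspect the results
```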
### Findings

Track long-term security vulnerabilities across experiments.

| Command | Description |
|---|---|
| `findings` | List findings (filterable by `--status`, `--severity`) |
| `findings update <id>` | Update finding status or severity |
Finding states: `open` → `stale` (30+ days unseen) → `fixed` (resolved). Findings can also regress (previously fixed, now reappearing).
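For example, to triage what is currently open at high severity:

```bash
hb findings --status open --severity high
```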
### Coverage

| Command | Description |
|---|---|
| `coverage` | Test coverage summary |
| `coverage --gaps` | Include untested categories |
### Campaigns

Continuous security assurance with automated campaign management (ASCAM).

| Command | Description |
|---|---|
| `campaigns` | Show current campaign plan |
| `campaigns break` | Stop a running campaign |
ASCAM phases: Reconnaissance → Hardening → Red Teaming → Analysis → Monitoring
### Upload Conversation Logs

Evaluate real production conversations against security judges.

| Command | Description |
|---|---|
| `upload-logs <file>` | Upload JSON conversation logs. Options: `--tag`, `--lang` |
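For example (the file name and tag are placeholders):

```bash
hb upload-logs ./conversations.json --tag production --lang en
```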
### API Keys

| Command | Description |
|---|---|
| `api-keys list` | List API keys |
| `api-keys create` | Create new key (`--name` required, `--scopes`: admin/write/read) |
| `api-keys update <id>` | Update key name, scopes, or active state |
| `api-keys revoke <id>` | Revoke (delete) an API key |
### Members

| Command | Description |
|---|---|
| `members list` | List organisation members |
| `members invite <email>` | Invite member (`--role`: admin/developer) |
| `members remove <id>` | Remove member |
## Results & Export

```bash
# View experiment results (table, json, or csv)
hb logs [experiment_id] [--format table] [--verdict pass|fail] [--page N] [--size N]

# Security posture
hb posture [--json] [--trends]

# Test coverage
hb coverage [--gaps] [--json]

# Findings
hb findings [--status open] [--severity high] [--json]

# Export guardrails configuration
hb guardrails [--vendor humanbound|openai] [--format json|yaml] [-o FILE]
```
## Documentation

```bash
hb docs
```

Opens the documentation in your browser.
## Examples

### End-to-end: scan, create project, test, review

```bash
hb login
hb switch abc123

# Scan bot & create project (uses endpoint config file)
hb init -n "Support Bot" -e ./bot-config.json

# Run adversarial test (uses project's default integration)
hb test -t humanbound/adversarial/owasp_multi_turn -l unit

# Watch and review
hb status --watch
hb logs
hb posture
```
### Multi-source project init

```bash
# Combine system prompt + live endpoint for best scope extraction
hb init \
  --name "Support Bot" \
  --prompt ./prompts/system.txt \
  --endpoint ./bot-config.json

# From repository + OpenAPI spec
hb init \
  --name "API Agent" \
  --repo ./my-agent \
  --openapi ./openapi.yaml
```
### Bot config with auth + thread init

```json
{
  "streaming": false,
  "thread_auth": {
    "endpoint": "https://bot.com/oauth/token",
    "headers": {},
    "payload": {"client_id": "x", "client_secret": "y"}
  },
  "thread_init": {
    "endpoint": "https://bot.com/threads",
    "headers": {"Content-Type": "application/json"},
    "payload": {}
  },
  "chat_completion": {
    "endpoint": "https://bot.com/chat",
    "headers": {"Content-Type": "application/json"},
    "payload": {"messages": [{"role": "user", "content": "$PROMPT"}]}
  }
}
```

```bash
# Use with init or test
hb init -n "My Bot" -e ./bot-config.json
hb test -e ./bot-config.json
```
### Export guardrails

```bash
hb guardrails --vendor openai --format json -o guardrails.json
```
## On-Premises

```bash
export HUMANBOUND_BASE_URL=https://api.your-domain.com
hb login
```
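Alternatively, the global `--base-url` flag (see Usage) can point a single invocation at your deployment:

```bash
hb --base-url https://api.your-domain.com whoami
```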
## Files

| Path | Description |
|---|---|
| `~/.humanbound/` | Configuration directory |
| `~/.humanbound/credentials.json` | Auth tokens (mode 600) |
## Exit Codes

| Code | Meaning |
|---|---|
| `0` | Success |
| `1` | Error or test failure (with `--fail-on`) |