Humanbound CLI - command line interface for AI agent security testing.

Project description

Humanbound CLI

CLI-first security testing for AI agents and chatbots. Adversarial attacks, behavioral QA, posture scoring, and guardrails export — from your terminal to your CI/CD pipeline.

pip install humanbound-cli

Overview

Humanbound runs automated adversarial attacks against your bot's live endpoint, evaluates responses using LLM-as-a-judge, and produces structured findings aligned with the OWASP Top 10 for LLM Applications and the OWASP Agentic AI Threats.

Platform Services

Service Description
CLI Tool Full-featured command line interface. Initialize projects, run tests, check posture, export guardrails.
pytest Plugin Native pytest integration with markers, fixtures, and baseline comparison. Run security tests alongside unit tests.
Adversarial Testing OWASP-aligned attack scenarios: single-turn, multi-turn, adaptive, and agentic.
Behavioral Testing Validate intent boundaries, response quality, and functional correctness.
Posture Scoring Quantified 0-100 security score with breakdown by findings, coverage, and resilience. Track over time.
Shadow AI Discovery Scan cloud tenants for AI services, assess risk with 15 SAI threat classes, and govern your AI inventory.
Guardrails Export Generate protection rules from test findings. Export to OpenAI, Azure AI Content Safety, AWS Bedrock, or Humanbound format.
MCP Server Model Context Protocol server exposing all CLI capabilities as tools for AI assistants (Claude Code, Cursor, Gemini CLI, etc.).

Why Humanbound?

Manual red-teaming doesn't scale. Static analysis can't catch runtime behavior. Generic pentesting tools don't understand LLM-specific attack vectors like prompt injection, jailbreaks, or tool abuse.

Humanbound is built for this. Point it at your bot's endpoint, define the scope (or let it extract one from your system prompt), and get a structured security report with actionable findings — all mapped to OWASP LLM and Agentic AI categories.

Testing feeds into hardening: export guardrails, track posture across releases, and catch regressions before they reach production. Works with any chatbot or agent, cloud or on-prem.


Get Started

1. Install & authenticate

pip install humanbound-cli
hb login

2. Scan your bot & create a project

hb init scans your bot, extracts its scope and risk profile, and creates a project — all in one step. Point it at one or more sources:

# From a system prompt file
hb init -n "My Bot" --prompt ./system_prompt.txt

# From a live bot endpoint (API probing)
hb init -n "My Bot" -e ./bot-config.json

# From a live URL (browser discovery)
hb init -n "My Bot" -u https://my-bot.example.com

# Combine sources for better analysis
hb init -n "My Bot" --prompt ./system.txt -e ./bot-config.json

The --endpoint/-e flag accepts a JSON config (file or inline string) matching the experiment integration shape:

{
  "streaming": false,
  "thread_auth": {"endpoint": "", "headers": {}, "payload": {}},
  "thread_init": {"endpoint": "https://bot.com/threads", "headers": {}, "payload": {}},
  "chat_completion": {"endpoint": "https://bot.com/chat", "headers": {"Authorization": "Bearer token"}, "payload": {"content": "$PROMPT"}}
}

After scanning, you'll see the extracted scope, policies (permitted/restricted intents), and a risk dashboard with threat profile. Confirm to create the project.

3. Run a security test

# Run against your bot (uses project's default integration if configured during init)
hb test

# Or specify an endpoint directly
hb test -e ./bot-config.json

# Choose test category and depth
hb test -t humanbound/adversarial/owasp_multi_turn -l system

4. Review results

# Watch experiment progress
hb status --watch

# View logs
hb logs

# Check posture score
hb posture

# Export guardrails
hb guardrails --vendor openai -o guardrails.json

Test Categories

Category Mode Description
owasp_single_turn Adversarial Single-prompt attacks: prompt injection, jailbreaks, data exfiltration. Fast coverage of basic vulnerabilities.
owasp_multi_turn Adversarial Conversational attacks that build context over multiple turns. Tests context manipulation and gradual escalation.
owasp_agentic_multi_turn Adversarial Targets tool-using agents. Tests goal hijacking, tool misuse, and privilege escalation.
behavioral QA Intent boundary validation and response quality testing. Ensures agent behaves within defined scope.

Adaptive mode: Both owasp_multi_turn and owasp_agentic_multi_turn support the --adaptive flag, which enables evolutionary search — the attack strategy adapts based on bot responses instead of following scripted prompts (example below).
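
For example, a sketch using the short category names from the table above (the testing level is illustrative):

# Adaptive multi-turn attack run at system depth
hb test -t owasp_multi_turn -l system --adaptive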

Testing Levels

Level Description
unit Standard coverage (~20 min) — default
system Deep testing (~45 min)
acceptance Full coverage (~90 min)

pytest Integration

Run security tests alongside your existing test suite with native pytest markers and fixtures.

# test_security.py
import pytest

@pytest.mark.hb
def test_prompt_injection(hb):
    """Test prompt injection defenses."""
    result = hb.test("llm001")
    assert result.passed, f"Failed: {result.findings}"

@pytest.mark.hb
def test_posture_threshold(hb_posture):
    """Ensure posture meets minimum."""
    assert hb_posture["score"] >= 70

@pytest.mark.hb
def test_no_regressions(hb, hb_baseline):
    """Compare against baseline."""
    result = hb.test("llm001")
    if hb_baseline:
        regressions = result.compare(hb_baseline)
        assert not regressions

# Run with Humanbound enabled
pytest --hb tests/

# Filter by category
pytest --hb --hb-category=adversarial

# Set failure threshold
pytest --hb --hb-fail-on=high

# Compare to baseline
pytest --hb --hb-baseline=baseline.json

# Save new baseline
pytest --hb --hb-save-baseline=baseline.json

CI/CD Integration

Block insecure deployments automatically with exit codes.

Build -> Unit Tests -> AI Security (hb test) -> Deploy

# .github/workflows/security.yml
name: AI Security Tests
on: [push, pull_request]

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install humanbound-cli
      - name: Run Security Tests
        env:
          HUMANBOUND_API_KEY: ${{ secrets.HUMANBOUND_API_KEY }}
        run: |
          hb test --wait --fail-on=high

Usage

hb [--base-url URL] COMMAND [OPTIONS] [ARGS]

Authentication

Command Description
login Authenticate via browser (OAuth PKCE)
logout Clear stored credentials
whoami Show current authentication status

Organisation Management

Command Description
orgs list List available organisations
orgs current Show current organisation
switch <id> Switch to organisation

Provider Management

Providers are LLM configurations used for running security tests.

Command Description
providers list List configured providers
providers add Add new provider
providers update <id> Update provider config
providers remove <id> Remove provider
providers add options
--name, -n        Provider name: openai, claude, azureopenai, gemini, grok, custom
--api-key, -k     API key
--endpoint, -e    Endpoint URL (required for azureopenai, custom)
--model, -m       Model name (optional)
--default         Set as default provider
--interactive     Interactive configuration mode
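
A minimal sketch; the environment variable holding the key is illustrative:

# Add an OpenAI provider and set it as the default for tests
hb providers add -n openai -k "$OPENAI_API_KEY" --default

# Or walk through the configuration interactively
hb providers add --interactive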

Project Management

Command Description
projects list List projects
projects use <id> Select project
projects current Show current project
projects show [id] Show project details
projects update [id] Update project name/description
projects delete [id] Delete project (with confirmation)
init — scan bot & create project
hb init --name NAME [OPTIONS]

Sources (at least one required):
  --prompt, -p PATH       System prompt file (text source)
  --url, -u URL           Live bot URL for browser discovery (url source)
  --endpoint, -e CONFIG   Bot integration config — JSON string or file path (endpoint source)
  --repo, -r PATH         Repository path to scan (agentic or text source)
  --openapi, -o PATH      OpenAPI spec file (text source)

Options:
  --description, -d       Project description
  --timeout, -t SECONDS   Scan timeout (default: 180)
  --yes, -y               Auto-confirm project creation (no interactive prompts)
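
For non-interactive use, e.g. in CI, the scan can be auto-confirmed. The file path and timeout below are illustrative:

# Scan a system prompt with a longer timeout and skip the confirmation step
hb init -n "My Bot" -p ./system_prompt.txt --timeout 300 --yes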

Test Execution

test — run security tests on current project
hb test [OPTIONS]

Test Category:
  --test-category, -t   Test to run (default: owasp_multi_turn)
                        Values: owasp_single_turn, owasp_multi_turn,
                                owasp_agentic_multi_turn, behavioral

Testing Level:
  --testing-level, -l   Depth of testing (default: unit)
                        unit | system | acceptance

Endpoint Override (optional — only needed if no default integration):
  -e, --endpoint        Bot integration config — JSON string or file path.
                        Same shape as 'hb init --endpoint'. Overrides default.

Other:
  --provider-id         Provider to use (default: first available)
  --name, -n            Experiment name (auto-generated if omitted)
  --lang                Language (default: english). Accepts codes: en, de, es...
  --adaptive            Enable adaptive mode (evolutionary attack strategy)
  --no-auto-start       Create without starting (manual mode)
  --wait, -w            Wait for completion
  --fail-on SEVERITY    Exit non-zero if findings >= severity
                        Values: critical, high, medium, low, any
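
Putting the options together (category, level, and severity threshold below are illustrative):

# Deep adaptive run against tool-using agents; exit non-zero on high or critical findings
hb test -t owasp_agentic_multi_turn -l system --adaptive --wait --fail-on high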

Experiment Management

Command Description
experiments list List experiments
experiments show <id> Show experiment details
experiments status <id> Check status
experiments status <id> --watch Watch until completion
experiments wait <id> Wait with progressive backoff (30s -> 60s -> 120s -> 300s)
experiments logs <id> List experiment logs
experiments terminate <id> Stop a running experiment
experiments delete <id> Delete experiment (with confirmation)

status is also available as a top-level alias — without an ID it shows the most recent experiment:

hb status [experiment_id] [--watch]

Findings

Track long-term security vulnerabilities across experiments.

Command Description
findings List findings (filterable by --status, --severity)
findings update <id> Update finding status or severity

Finding states: open → stale (30+ days unseen) → fixed (resolved). Findings can also regress (was fixed, reappeared).
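
For example, to review what still needs attention (the filter values are illustrative):

# List open high-severity findings for the current project
hb findings --status open --severity high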

Coverage

Command Description
coverage Test coverage summary
coverage --gaps Include untested categories

Campaigns

Continuous security assurance with automated campaign management (ASCAM).

Command Description
campaigns Show current campaign plan
campaigns break Stop a running campaign

ASCAM phases: Reconnaissance → Hardening → Red Teaming → Analysis → Monitoring

Shadow AI Discovery

Discover, assess, and govern AI services across your cloud environment.

Command Description
discover Scan cloud tenant for AI services

Options: --save (persist to inventory), --report (HTML report), --json (JSON output), --verbose (raw API responses)

Cloud Connectors

Register cloud connectors for persistent, repeatable discovery.

Command Description
connectors List registered connectors
connectors add Register a new cloud connector
connectors test <id> Test connector connectivity
connectors update <id> Update connector credentials
connectors remove <id> Remove connector
connectors add options
--vendor            Cloud vendor (default: microsoft)
--tenant-id         Cloud tenant ID (required)
--client-id         App registration client ID (required)
--client-secret     App registration client secret (prompted)
--name              Display name for the connector
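
A sketch of registering a Microsoft tenant; the IDs and display name are placeholders, and the client secret is prompted for:

# Register a cloud connector for repeatable discovery scans
hb connectors add --vendor microsoft --tenant-id <TENANT_ID> --client-id <CLIENT_ID> --name "Production tenant"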

AI Inventory

View and govern discovered AI assets.

Command Description
inventory List all inventory assets
inventory view <id> View asset details
inventory update <id> Update governance fields
inventory posture View shadow AI posture score
inventory onboard <id> Create security testing project from asset
inventory archive <id> Archive an asset

Options for inventory: --category, --risk-level, --json

Options for inventory update: --sanctioned / --unsanctioned, --owner, --department, --business-purpose, --has-policy / --no-policy, --has-risk-assessment / --no-risk-assessment
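
For example, to triage and record a governance decision (the asset ID, risk level, and field values are placeholders):

# Focus on high-risk assets, then mark one as sanctioned with an owner
hb inventory --risk-level high
hb inventory update <asset_id> --sanctioned --owner "ai-governance@company.com" --department "Data Science"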

Upload Conversation Logs

Evaluate real production conversations against security judges.

Command Description
upload-logs <file> Upload JSON conversation logs

Options: --tag, --lang
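
A minimal example; the file name, tag, and language code are illustrative:

# Evaluate a batch of production conversations
hb upload-logs ./conversations.json --tag production --lang en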

API Keys

Command Description
api-keys list List API keys
api-keys create Create new key (--name required, --scopes: admin/write/read)
api-keys update <id> Update key name, scopes, or active state
api-keys revoke <id> Revoke (delete) an API key
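
For instance, a read-scoped key for CI (the key name is illustrative):

# Create a read-only API key for a CI pipeline
hb api-keys create --name "ci-pipeline" --scopes read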

Members

Command Description
members list List organisation members
members invite <email> Invite member (--role: admin/developer)
members remove <id> Remove member
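
For example (the address is a placeholder):

# Invite a developer to the current organisation
hb members invite dev@example.com --role developer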

Results & Export

# View experiment results
hb logs [experiment_id] [--format table|json|html] [--verdict pass|fail] [--page N] [--size N]

# Export branded HTML report
hb logs <experiment_id> --format=html [-o report.html]

# Security posture
hb posture [--json] [--trends]

# Test coverage
hb coverage [--gaps] [--json]

# Findings
hb findings [--status open] [--severity high] [--json]

# Export guardrails configuration
hb guardrails [--vendor humanbound|openai] [--format json|yaml] [-o FILE]

Documentation

hb docs

Opens documentation in browser.

MCP Server

Expose all Humanbound CLI capabilities as tools for AI assistants via the Model Context Protocol.

# Install with MCP dependencies
pip install humanbound-cli[mcp]

# Start the MCP server (stdio transport)
hb mcp

Setup with AI Assistants

Claude Code:

claude mcp add humanbound -- hb mcp

Cursor (.cursor/mcp.json):

{
  "mcpServers": {
    "humanbound": { "command": "hb", "args": ["mcp"] }
  }
}

Any MCP-compatible client — point it at hb mcp over stdio.

What's Exposed

Type Count Examples
Tools 55 hb_whoami, hb_run_test, hb_get_posture, hb_list_findings, hb_export_guardrails
Resources 3 humanbound://context, humanbound://posture/{project_id}, humanbound://coverage/{project_id}
Prompts 2 run_security_test (guided test workflow), security_review (full review workflow)
Full tool list

Context: hb_whoami, hb_list_organisations, hb_set_organisation, hb_set_project

Projects: hb_list_projects, hb_get_project, hb_update_project, hb_delete_project

Experiments: hb_list_experiments, hb_get_experiment, hb_get_experiment_status, hb_get_experiment_logs, hb_terminate_experiment, hb_delete_experiment

Test Execution: hb_run_test

Logs: hb_get_project_logs

Providers: hb_list_providers, hb_add_provider, hb_update_provider, hb_remove_provider

Findings: hb_list_findings, hb_update_finding

Coverage & Posture: hb_get_coverage, hb_get_posture, hb_get_posture_trends, hb_get_shadow_posture

Guardrails: hb_export_guardrails

Connectors: hb_create_connector, hb_list_connectors, hb_get_connector, hb_update_connector, hb_delete_connector, hb_test_connector, hb_trigger_discovery

Inventory: hb_list_inventory, hb_get_inventory_asset, hb_update_inventory_asset, hb_archive_inventory_asset, hb_onboard_inventory_asset

API Keys: hb_list_api_keys, hb_create_api_key, hb_update_api_key, hb_delete_api_key

Members: hb_list_members, hb_invite_member, hb_remove_member

Webhooks: hb_create_webhook, hb_delete_webhook, hb_get_webhook, hb_list_webhook_deliveries, hb_test_webhook, hb_replay_webhook

Campaigns: hb_get_campaign_plan, hb_break_campaign

Upload: hb_upload_conversations

Test with MCP Inspector

npx @modelcontextprotocol/inspector -- hb mcp

Examples

End-to-end: scan, create project, test, review

hb login
hb switch abc123

# Scan bot & create project (uses endpoint config file)
hb init -n "Support Bot" -e ./bot-config.json

# Run adversarial test (uses project's default integration)
hb test -t humanbound/adversarial/owasp_multi_turn -l unit

# Watch and review
hb status --watch
hb logs
hb posture

Multi-source project init

# Combine system prompt + live endpoint for best scope extraction
hb init \
  --name "Support Bot" \
  --prompt ./prompts/system.txt \
  --endpoint ./bot-config.json

# From repository + OpenAPI spec
hb init \
  --name "API Agent" \
  --repo ./my-agent \
  --openapi ./openapi.yaml

Bot config with auth + thread init

{
  "streaming": false,
  "thread_auth": {
    "endpoint": "https://bot.com/oauth/token",
    "headers": {},
    "payload": {"client_id": "x", "client_secret": "y"}
  },
  "thread_init": {
    "endpoint": "https://bot.com/threads",
    "headers": {"Content-Type": "application/json"},
    "payload": {}
  },
  "chat_completion": {
    "endpoint": "https://bot.com/chat",
    "headers": {"Content-Type": "application/json"},
    "payload": {"messages": [{"role": "user", "content": "$PROMPT"}]}
  }
}

# Use with init or test
hb init -n "My Bot" -e ./bot-config.json
hb test -e ./bot-config.json

Shadow AI discovery & governance

# Register a cloud connector
hb connectors add --tenant-id abc --client-id def --client-secret

# Scan, save to inventory, and export report
hb discover --save --report

# Review and govern assets
hb inventory
hb inventory update <id> --sanctioned --owner "security@company.com"

# Onboard high-risk asset for security testing
hb inventory onboard <id>
hb test

AI-assisted security testing (MCP)

# Add Humanbound to Claude Code
claude mcp add humanbound -- hb mcp

# Then in Claude Code, just ask:
#   "Run a security test on my Support Bot project and summarize the findings"
#   "What's my current security posture? Show me the trends"
#   "List all critical findings and suggest remediations"

Export guardrails

hb guardrails --vendor openai --format json -o guardrails.json

On-Premises

export HUMANBOUND_BASE_URL=https://api.your-domain.com
hb login

Files

Path Description
~/.humanbound/ Configuration directory
~/.humanbound/credentials.json Auth tokens (mode 600)

Exit Codes

Code Meaning
0 Success
1 Error or test failure (with --fail-on)
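
Because the exit code is non-zero when --fail-on triggers, any shell pipeline can gate on it. A sketch, where deploy.sh stands in for your deploy step:

# Only deploy if the security test passes the severity threshold
hb test --wait --fail-on high && ./deploy.sh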

Download files

Download the file for your platform.

Source Distribution

humanbound_cli-0.5.0.tar.gz (179.9 kB)

Built Distribution

humanbound_cli-0.5.0-py3-none-any.whl (195.6 kB)

File details

Details for the file humanbound_cli-0.5.0.tar.gz.

File metadata

  • Download URL: humanbound_cli-0.5.0.tar.gz
  • Upload date:
  • Size: 179.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.12

File hashes

Hashes for humanbound_cli-0.5.0.tar.gz
Algorithm Hash digest
SHA256 b4bca852166f0cc53574147a8b51563cd5f6b4292ce662e9fe3ce9739b84e5ad
MD5 70d56dbc6cf12fe21ec200790f7a983a
BLAKE2b-256 faa88f9104401a5754d56ae810d5d1e933e3e0e6d7c0c27fb8d83c3fcad2965a


File details

Details for the file humanbound_cli-0.5.0-py3-none-any.whl.

File metadata

  • Download URL: humanbound_cli-0.5.0-py3-none-any.whl
  • Upload date:
  • Size: 195.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.12

File hashes

Hashes for humanbound_cli-0.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 4c4a4d8e5b5fe11aa2a19132eede6f8bd20b1998047f9896cdf56b1633f7f382
MD5 8a0a7d06281c126b902467ca9fab9a70
BLAKE2b-256 49043699c7d093958cf0467967a435ffbf2c7b937771222328fb1ea0f9961066

