Skip to main content

Agentiva — open-source runtime for AI agent safety

Project description

Agentiva

Preview deployments for AI agents.

See what your AI agent would do before it does it.

Tests OWASP Python License

litellm was compromised March 24, 2026 — SSH keys, AWS credentials, and database passwords stolen from 97M monthly downloads. Agentiva catches this class of attack at the action layer.

Quick start (2 minutes)

pip install agentiva
agentiva serve --port 8000
# Open localhost:3000 for the dashboard (from repo: cd dashboard && npm run dev)

Protect your agent (3 lines)

from agentiva import Agentiva

shield = Agentiva(mode="shadow")
tools = shield.protect([your_existing_tools])

# Your agent works exactly the same.
# Every action is intercepted, scored, and logged.

Run the demo

# See 4 real incident recreations
python demo/real_incidents_demo.py

# See PayBot (fintech startup) demo
python demo/paybot_demo.py

# See proof: before vs after comparison
python demo/proof_demo.py

Or use the project venv so dependencies resolve: source .venv/bin/activate then the commands above.

What it catches

Tested against real-world incidents:

  • litellm supply chain attack (March 2026) — credential exfiltration blocked
  • Amazon Kiro (December 2025) — infrastructure destruction blocked
  • Microsoft Copilot (January 2026) — zero-click data theft blocked
  • Replit agent (2026) — mass record deletion blocked

Verified results

Benchmark Result
Agentiva test suite 24,599 tests passing
OWASP LLM Top 10 21/21 (100%)
DeepTeam (Confident AI) 38/47 (80.85%)
Garak (NVIDIA) 2,500 probes scanned
PyRIT (Microsoft) 9/9 scenarios completed

Run benchmarks yourself:

python -m pytest tests/ -m "slow or not slow"  # Full test suite
python benchmarks/run_benchmark.py              # OWASP + incidents
python benchmarks/run_all_benchmarks.py         # All frameworks

Five operating modes

Mode What it does
Shadow Observe without executing
Simulation Preview impact before acting
Approval Human-in-the-loop for risky actions
Negotiation Agent learns to self-correct
Rollback Undo what the agent did

Dashboard

Real-time monitoring at localhost:3000:

  • Overview — stats, charts, recent activity
  • Live Feed — actions streaming via WebSocket
  • Audit Log — searchable history with compliance exports
  • Agents — registry with reputation and kill switch
  • Policies — YAML rule editor
  • Security Co-pilot — ask questions about your agent's behavior

Security co-pilot

Ask naturally:

  • "What was blocked?" → real data from your audit log
  • "Why was send_email blocked?" → specific tool analysis
  • "Is this HIPAA compliant?" → compliance check with regulation citations
  • "Is my agent safe for production?" → honest assessment

Basic mode works without any API key. Add OPENROUTER_API_KEY for Claude-powered deep analysis via OpenRouter.

Works with

LangChain, CrewAI, OpenAI Agents SDK, Anthropic, MCP Protocol, or any custom agent via REST API.

Compliance-ready evidence

Generates audit trails aligned with:

  • HIPAA — PHI access logs per 45 CFR § 164.312
  • SOC2 — Evidence for CC6-CC8 controls
  • PCI-DSS — Cardholder data monitoring per Req 3, 7, 10

Note: Agentiva helps prepare for compliance audits. Certification requires a third-party assessor.

Pricing

Tier Price Agents
Free $0/forever 1 agent
Pro $18/month Up to 3
Team $54/month Unlimited
Enterprise Custom Custom

Self-hosted is free forever. Cloud dashboard on waitlist.

Architecture

┌────────────────────┐
│  AI Agent          │  LangChain · CrewAI · OpenAI · MCP · custom tools
└─────────┬──────────┘
          │ tool_call
          ▼
┌─────────────────────────────────────────────────────────────────────┐
│                        AGENTIVA API (FastAPI)                        │
│  /api/v1/intercept · /api/v1/audit · /api/v1/chat · WebSocket feed   │
└───────────────────────────────┬─────────────────────────────────────┘
                                │
          ┌─────────────────────┴─────────────────────┐
          ▼                                           ▼
┌──────────────────────┐                 ┌──────────────────────┐
│  Interceptor         │                 │  Shield Chat          │
│  PolicyEngine (YAML) │                 │  Sessions + messages  │
│  SmartRiskScorer     │                 │  (SQLite persistence) │
│  PHI detector        │                 │  + optional LLM layer │
│  Behavior / drift    │                 └──────────┬────────────┘
└──────────┬───────────┘                            │
           │                                        │
           ▼                                        ▼
┌──────────────────────┐                 ┌──────────────────────┐
│  Modes               │                 │  Compliance KB        │
│  Shadow · Approve ·  │                 │  HIPAA · SOC 2 ·      │
│  Live · Negotiation  │                 │  PCI-DSS citations +  │
│  Simulator · Rollback│                 │  evidence SQL hooks   │
└──────────┬───────────┘                 └──────────────────────┘
           │
           ▼
┌─────────────────────────────────────────────────────────────────────┐
│  Persistence: action_logs (audit) · agent registry · approvals ·     │
│  chat_sessions / chat_messages                                       │
└─────────────────────────────────────────────────────────────────────┘
           │
           ▼
┌──────────────────────┐
│  Tools / APIs        │  Email · DB · Slack · shell · payments…
└──────────────────────┘

API reference (short)

Endpoint Method Description
/health GET Health check + mode + risk threshold
/api/v1/intercept POST Intercept an agent action
/api/v1/audit GET Query audit log
/api/v1/report GET Summary report
/api/v1/settings PUT Runtime mode + risk threshold
/ws/actions WebSocket Real-time action stream

Full OpenAPI at http://localhost:8000/docs.

Testing

python -m pytest tests/ -q
python -m pytest tests/ -m "slow or not slow" -q

Contributing

See CONTRIBUTING.md.

License

Apache 2.0 — see LICENSE.

Built by

Rishav Aryan — ML Engineer, George Mason University

GitHub · Twitter

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agentiva-0.1.0.tar.gz (167.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

agentiva-0.1.0-py3-none-any.whl (204.2 kB view details)

Uploaded Python 3

File details

Details for the file agentiva-0.1.0.tar.gz.

File metadata

  • Download URL: agentiva-0.1.0.tar.gz
  • Upload date:
  • Size: 167.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for agentiva-0.1.0.tar.gz
Algorithm Hash digest
SHA256 46148efd417a215387f9e2d89923fff4c19e6cba571f567b798cb05595ff7676
MD5 883512bc4c5dd766a6c07c565b4fd163
BLAKE2b-256 acddf9614f437933fe3e8cd58d2bdb95a0875c5a84282e1b504133ac8e478c9c

See more details on using hashes here.

File details

Details for the file agentiva-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: agentiva-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 204.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for agentiva-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 127c4eb9ea18c8bc6c6c7389e16565b8cf204869453c1448e25ae91c4e7267f8
MD5 20422795ac6ea559e6ed99634850d2ac
BLAKE2b-256 2757ca8f39d9d8ee0dba84abf6c03c83cfd99ee65edace4f87c67697182ee800

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page