
Talos: A secure, decentralized protocol for AI Agent communication

Talos Protocol: A Secure Communication and Trust Layer for Autonomous AI Agents

Academic Abstract: The rapid rise of autonomous AI agents calls for a trustworthy communication substrate that does not depend on centralized identity and authorization silos. The Talos Protocol introduces a decentralized, contract-driven architecture integrating self-sovereign identity (DIDs), capability-based authorization (RFC-style scopes), and forward-secure messaging (Double Ratchet). This work presents the first production-grade implementation of a trust layer specifically optimized for high-performance agentic interactions, achieving <2ms p50 authorization overhead while maintaining blockchain-anchored accountability.


1. Introduction

Autonomous agents lack a trustworthy substrate for cross-organizational interaction. Current approaches rely on centralized OAuth or opaque platform-specific silos, which introduce single points of failure and prevent verifiable accountability. Talos addresses this by providing:

  • Cryptographic Identity: Self-sovereign DIDs for every agent and service.
  • Granular Authorization: Capability-based tokens with deterministic scope matching.
  • Agent-to-Agent Communication: Forward-secret channels with Double Ratchet encryption (Phase 10).
  • Production Hardening: Rate limiting, distributed tracing, health checks, graceful shutdown (Phase 11).
  • Verifiable Audit: Blockchain-anchored, non-repudiable logs of all tool invocations.
  • Performance: A Rust-based core capable of 600k+ auth/sec with <2ms p50 latency.

2. Related Work & Competitive Analysis

| Feature | TLS/OAuth (Standard) | DID/VC (General) | Talos Protocol |
| --- | --- | --- | --- |
| Identity | Centralized (IdP) | Decentralized (DID) | Decentralized (DID) |
| Authorization | Bearer Tokens | Verifiable Creds | Capability Tokens (L1) |
| Messaging | TLS (Point-to-point) | Varies | Double Ratchet (E2EE) |
| Rate Limiting | Basic (if any) | None | Token Bucket + Redis |
| Observability | Basic Metrics | Varies | OpenTelemetry + Redaction |
| Accountability | Database Logs | Optional Ledger | Blockchain-Anchored |
| Latency (p50) | 50ms - 200ms | >1s (usually) | <2ms (C-Kernel) |

3. System Architecture

Talos follows a Contract-Driven Design where the contracts repository serves as the single source of truth for all schemas and test vectors.

Non-negotiable: this project is contract-first; protocol logic and validation must come from published contracts artifacts, not re-implemented in consumers.

System Architecture Overview

graph TB
    subgraph "Client Layer"
        Agent[AI Agents]
        SDK[SDKs<br/>Python/TS/Go/Java/Rust]
    end

    subgraph "Talos Core Services"
        Gateway[AI Gateway<br/>LLM Safety & Logic]
        Audit[Audit Service<br/>Merkle Chaining]
        Config[Configuration<br/>Service]
        Core[Security Kernel<br/>FastAPI + Rust]
    end

    subgraph "Data Layer"
        PG_Primary[(Postgres<br/>Primary)]
        PG_Replica[(Postgres<br/>Replica)]
        Redis[(Redis<br/>Budgets/Rate Limits)]
        Jaeger[Jaeger<br/>Tracing]
    end

    subgraph "External"
        LLM[LLM Providers<br/>OpenAI/Anthropic]
        MCP[MCP Servers<br/>Tools]
    end

    Agent -->|E2EE SESSION| Core
    SDK -->|mTLS/REST| Core
    Core -->|Policy Check| Config
    Core -->|Async Audit| Audit
    Core -->|Authorized| Gateway
    Gateway -->|Safe Request| LLM
    Gateway -->|Secure Tools| MCP

    Core -->|Write| PG_Primary
    Core -->|Read| PG_Replica
    Config -->|State| Redis
    Audit -->|Receipts| PG_Primary

    style Core fill:#4a90e2
    style Gateway fill:#f9f
    style Config fill:#bbf
    style Audit fill:#77e2a8
    style Redis fill:#ff9966

Production Features (Phases 7-15)

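| Phase | Feature | Status |
| --- | --- | --- |
| Phase 7 | RBAC Enforcement | Completed |
| Phase 9.2 | Tool Read/Write Separation | Completed |
| Phase 9.3 | Runtime Resilience (TGA) | Completed |
| Phase 10 | A2A Encrypted Channels | Completed |
| Phase 11 | Rate Limiting, Tracing, Health Checks | Completed |
| Phase 12 | Multi-Region (Circuit Breaker) | Completed |
| Phase 13 | Secrets Rotation (Multi-KEK) | Completed |
| Phase 15 | Adaptive Budgets | Completed |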

Core Components

  • contracts: JSON Schemas for identity, capabilities, and audit.
  • core: Rust implementation of cryptographic primitives (PyO3 bindings).
  • services/gateway: High-performance entry point for agent requests.
  • services/audit: Secure collector for non-repudiable event logs.
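To make the idea of tamper-evident audit logs concrete, here is a minimal sketch of a hash-chained log in Python. It is an illustration only, not the Talos audit service: the entry layout, the `GENESIS` value, and the function names are assumptions, and `json.dumps` with sorted keys stands in for full RFC 8785 canonicalization.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder predecessor for the first entry


def canonical(event: dict) -> bytes:
    # Approximates canonical JSON (sorted keys, compact separators).
    # This is not a complete RFC 8785 (JCS) implementation.
    return json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")


def append_event(chain: list, event: dict) -> dict:
    # Each entry commits to the previous entry's hash, so editing any
    # earlier event changes every hash that follows it.
    prev_hash = chain[-1]["entry_hash"] if chain else GENESIS
    entry_hash = hashlib.sha256(prev_hash.encode() + canonical(event)).hexdigest()
    entry = {"event": event, "prev_hash": prev_hash, "entry_hash": entry_hash}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    # Recompute every hash from the genesis value and compare.
    prev_hash = GENESIS
    for entry in chain:
        expected = hashlib.sha256(prev_hash.encode() + canonical(entry["event"])).hexdigest()
        if entry["prev_hash"] != prev_hash or entry["entry_hash"] != expected:
            return False
        prev_hash = entry["entry_hash"]
    return True
```

In a design like this, only the latest `entry_hash` would need to be anchored externally to make the whole history verifiable.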

4. Technical Design (High-Level)

4.1 Agent-to-Agent Communication Channels (Phase 10)

Talos enables secure, forward-secret communication between autonomous agents via A2A Channels. Built on the Signal Double Ratchet protocol, A2A sessions provide:

  • Session Lifecycle: Create, accept, and rotate sessions with ratchet state persistence
  • Frame Encryption: Authenticated encryption with replay protection and sequence tracking
  • Group Messaging: Multi-party secure channels with membership management
  • API Surface: RESTful endpoints (/a2a/sessions, /a2a/frames, /a2a/groups)

Each frame includes a ciphertext_hash for integrity verification and sender_seq/recipient_seq for strict ordering guarantees.
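The integrity and ordering checks described above can be sketched as follows. This is a hedged illustration, not the Talos frame format: it assumes `ciphertext_hash` is a hex SHA-256 digest of the ciphertext and that `sender_seq` starts at 0 and increases by one per frame. The `Frame` and `FrameReceiver` names are hypothetical, and the Double Ratchet encryption itself is out of scope here.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass
class Frame:
    sender_seq: int
    ciphertext: bytes
    ciphertext_hash: str  # assumed: hex SHA-256 of ciphertext


def make_frame(sender_seq: int, ciphertext: bytes) -> Frame:
    return Frame(sender_seq, ciphertext, hashlib.sha256(ciphertext).hexdigest())


class FrameReceiver:
    """Accepts frames only if they are intact and strictly in order."""

    def __init__(self) -> None:
        self.next_seq = 0  # assumed starting sequence number

    def accept(self, frame: Frame) -> bool:
        digest = hashlib.sha256(frame.ciphertext).hexdigest()
        # Constant-time comparison of the recomputed and claimed digests.
        if not hmac.compare_digest(digest, frame.ciphertext_hash):
            return False
        # Strict ordering: replays and gaps are both rejected.
        if frame.sender_seq != self.next_seq:
            return False
        self.next_seq += 1
        return True
```

Rejecting any frame whose sequence number is not exactly the next expected value gives replay protection as a side effect, at the cost of requiring in-order delivery.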

4.2 Production Hardening (Phase 11)

The Gateway implements enterprise-grade reliability features:

  • Rate Limiting: Token bucket algorithm with Redis backend, surface-specific limits, fail-closed in production
  • Distributed Tracing: OpenTelemetry integration with automatic redaction of sensitive data (Authorization headers, A2A frames, secrets)
  • Health Checks: /health/live (always available) and /health/ready (dependency validation)
  • Graceful Shutdown: Request draining, background task cleanup, zero-downtime deployments

All features enforce strict fail-closed behavior in production mode, per the Phase 11 specification.
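As an illustration of the rate-limiting behavior, the sketch below shows an in-memory token bucket plus a fail-closed wrapper. It is not the Gateway's implementation: the real backend is Redis, whereas this version keeps state in process, and `TokenBucket`, `check_rate_limit`, and the `bucket_lookup` callable are hypothetical names.

```python
import time


class TokenBucket:
    """Token bucket: holds up to `capacity` tokens, refilled at `refill_rate` per second."""

    def __init__(self, capacity: float, refill_rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        elapsed = now - self.updated
        # Refill lazily, based on the time since the last call.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def check_rate_limit(bucket_lookup, key: str, production: bool = True) -> bool:
    # bucket_lookup stands in for a backend call (hypothetical).
    try:
        bucket = bucket_lookup(key)
    except ConnectionError:
        # Fail closed in production: a backend outage denies the request.
        return not production
    return bucket.allow()
```

Passing the clock in as a parameter keeps the bucket deterministic under test. A shared backend would need the refill and decrement to happen atomically, which this single-process sketch does not address.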

4.3 Multi-Region & High Availability (Phase 12)

The runtime layer supports read/write database splitting with circuit-breaker failover, ensuring sub-5ms latency across geographic regions while maintaining strong consistency for security-critical secrets.
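A circuit breaker guarding replica reads might look like the sketch below. It is a simplified illustration under stated assumptions, not the Talos runtime: the thresholds, the `CircuitBreaker` class, and the `read_with_failover` helper are hypothetical, and `read_replica` and `read_primary` are caller-supplied callables.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def is_open(self) -> bool:
        if self.opened_at is None:
            return False
        # After the timeout the breaker is half-open: one trial call is allowed.
        return self.clock() - self.opened_at < self.reset_timeout

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


def read_with_failover(breaker: CircuitBreaker, read_replica, read_primary):
    # Prefer the replica; fall back to the primary while the breaker is open
    # or when the replica call fails.
    if not breaker.is_open():
        try:
            result = read_replica()
            breaker.record_success()
            return result
        except ConnectionError:
            breaker.record_failure()
    return read_primary()
```

While the breaker is open, the replica is not contacted at all, which avoids paying a connection timeout on every read during an outage.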

4.4 Automated Secret Rotation (Phase 13)

Talos implements zero-downtime key rotation using a MultiKekProvider with background workers and Postgres advisory locking, mitigating the risk of long-term credential exposure.

4.5 Adaptive Budgeting (Phase 15)

Autonomous agents are constrained by atomic BudgetService enforcement, preventing runaway costs and ensuring fair resource allocation via off/warn/hard enforcement modes.
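The three enforcement modes can be illustrated with a small in-process ledger. This is a sketch under assumptions, not the actual `BudgetService`: `BudgetLedger` and `BudgetDecision` are hypothetical names, and a `threading.Lock` stands in for whatever atomic primitive the real service uses.

```python
import threading
from dataclasses import dataclass


@dataclass
class BudgetDecision:
    allowed: bool
    warning: bool
    spent: float


class BudgetLedger:
    def __init__(self, limit: float, mode: str = "hard"):
        if mode not in ("off", "warn", "hard"):
            raise ValueError(f"unknown enforcement mode: {mode}")
        self.limit = limit
        self.mode = mode
        self.spent = 0.0
        self._lock = threading.Lock()

    def charge(self, cost: float) -> BudgetDecision:
        # Check and update under one lock so concurrent charges cannot
        # both pass the limit check before either is recorded.
        with self._lock:
            over = self.spent + cost > self.limit
            if self.mode == "hard" and over:
                return BudgetDecision(False, True, self.spent)
            self.spent += cost
            return BudgetDecision(True, over and self.mode == "warn", self.spent)
```

In this sketch, `off` records spend without ever flagging it, `warn` allows the charge but marks the decision, and `hard` rejects the charge and leaves the recorded spend unchanged.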


5. Security Analysis

Talos is designed to withstand the following threat vectors:

  • Identity Spoofing: Prevented by Ed25519-signed DIDs.
  • Replay Attacks: Mitigated by session-bound correlation IDs and sliding window caches.
  • Privilege Escalation: Blocked by deterministic scope containment rules in the Policy Engine.
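To show what a deterministic scope containment rule can look like, here is a minimal sketch. The scope syntax is an assumption made for illustration (colon-separated segments with an optional trailing `*` wildcard); the actual Talos scope grammar is defined by the contracts and may differ, and both function names are hypothetical.

```python
def scope_contains(granted: str, requested: str) -> bool:
    """Return True if `granted` covers `requested` (assumed scope syntax)."""
    g = granted.split(":")
    r = requested.split(":")
    for i, seg in enumerate(g):
        if seg == "*":
            # A wildcard is only honored as the final granted segment and
            # must cover at least one requested segment.
            return i == len(g) - 1 and i < len(r)
        if i >= len(r) or seg != r[i]:
            return False
    # Without a wildcard, the scopes must match segment for segment.
    return len(g) == len(r)


def is_attenuation(parent_scopes: list, child_scopes: list) -> bool:
    # A delegated token is valid only if every scope it carries is
    # contained in some scope of the parent token.
    return all(any(scope_contains(p, c) for p in parent_scopes) for c in child_scopes)
```

Because the check is a pure function of the two scope strings, every verifier reaches the same decision, and a delegated token can narrow a parent's authority but never widen it.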

6. Getting Started

Quick Start

./scripts/bootstrap.sh
docker-compose up -d

Table of Services

| Service | Port | Description |
| --- | --- | --- |
| Security Kernel | 8000 | Core Identity & Policy (Rust/Py) |
| AI Gateway | 8001 | LLM Orchestration & Safety |
| Audit Service | 8002 | Tamper-proof Logging & Merkle |
| Config Service | 8003 | Adaptive Budgets & Global Policy |
| Dashboard | 3000 | Admin UI Control Plane |
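After starting the stack, it can be useful to wait until the Gateway reports ready before sending traffic. The sketch below polls the `/health/ready` endpoint using only the standard library. It assumes the services are reachable on `localhost` and that the Gateway on port 8001 returns HTTP 200 when ready; the helper names are hypothetical.

```python
import time
import urllib.error
import urllib.request

# Assumed URL: Gateway port from the table above, readiness path from section 4.2.
GATEWAY_READY_URL = "http://localhost:8001/health/ready"


def probe(url: str, timeout: float = 2.0) -> bool:
    # Any connection error or non-200 response counts as "not ready".
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(url: str, attempts: int = 10, delay: float = 1.0,
                     probe_fn=probe, sleep=time.sleep) -> bool:
    for attempt in range(attempts):
        if probe_fn(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    print("ready" if wait_until_ready(GATEWAY_READY_URL) else "not ready")
```

The `probe_fn` and `sleep` parameters exist so the retry loop can be exercised without a running stack.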

📖 Full Documentation: Documentation | Deployment Guide


7. Production Status

Completed Phases (Production-Ready) ✅

  • Phase 7: RBAC Enforcement with policy engine
  • Phase 9.2: Tool Servers Read/Write Separation
  • Phase 9.3: Runtime Loop and Resilience with TGA
  • Phase 10: A2A Communication Channels (Double Ratchet E2EE)
  • Phase 11: Production Hardening (rate limiting, tracing, health checks, graceful shutdown)
  • Phase 12: Multi-Region Architecture (read/write splitting, circuit breaker)
  • Phase 13: Secrets Rotation Automation (atomic updates, advisory locks, Multi-KEK)
  • Phase 15: Adaptive Budgets (Redis Lua, atomic enforcement)

Future Work

  • Phase 14: Global Load Balancing (infrastructure-level via Ingress/Service Mesh)
  • Phase 16: Zero-Knowledge Proofs for capability obfuscation
  • Phase 17: Hardware Security Module (HSM) native integration

8. References

  • [1] Nakamoto, S. (2008). "Bitcoin: A Peer-to-Peer Electronic Cash System."
  • [2] Bernstein, D. J. (2012). "High-speed high-security signatures." (Ed25519).
  • [3] Signal Messenger. "The Double Ratchet Algorithm."
  • [4] W3C. "Decentralized Identifiers (DIDs) v1.0."
  • [5] IETF RFC 8785. "JSON Canonicalization Scheme (JCS)."

10. Contact


11. License

This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.
