
Emergent Language Translator

High-performance semantic compression API for AI agent communication. Achieves 95%+ compression with 88,000+ messages/second throughput.

Live API

| Service | Endpoint |
| --- | --- |
| Translator API | https://emergent-language.fly.dev |
| Collector v2 | https://emergent-collector.fly.dev |
```bash
# Health check
curl https://emergent-language.fly.dev/health

# Single message translation
curl -X POST https://emergent-language.fly.dev/translate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer eudaimonia-translator-demo" \
  -d '{"data": {"task": "analyze", "priority": "high"}}'

# Batch translation (recommended for throughput)
curl -X POST https://emergent-language.fly.dev/translate/batch \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer eudaimonia-translator-demo" \
  -d '{"messages": [{"task": "analyze"}, {"task": "execute"}]}'
```

Python SDK

Install the SDK for easy integration into your AI agents:

```bash
# Basic SDK (auto-batching, ~8,000 msg/s)
pip install emergent-language

# With collector sidecar support (~9,700 msg/s)
pip install "emergent-language[collector]"
```

Basic Usage

```python
import asyncio

from emergent_language import EmergentClient

async def main():
    async with EmergentClient() as client:
        # Fire-and-forget - returns immediately
        await client.send({"task": "analyze", "data": "market"})
        await client.send({"agent_id": "agent_001", "status": "complete"})

        # Check stats
        print(f"Sent: {client.stats.messages_sent}")
        print(f"Bytes saved: {client.stats.bytes_saved}")

asyncio.run(main())
```

With Collector Sidecar (Higher Throughput)

For larger deployments, run the collector sidecar alongside your agents:

```bash
# Terminal 1: Start collector
emergent-collector --port 8080

# Terminal 2: Your agents (SDK auto-detects collector)
python your_agent.py
```

The SDK automatically detects if a collector is running on localhost and routes through it.

| Mode | Throughput | When |
| --- | --- | --- |
| Direct (no collector) | ~8,000 msg/s | SDK batches in-process |
| Sidecar (with collector) | ~9,700 msg/s | Collector on localhost |
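The detection logic can be sketched roughly as follows. This is an illustrative guess, not the SDK's actual implementation: `probe_collector`, the default port, and the fallback endpoint choice are all assumptions.

```python
import socket

def probe_collector(host: str = "localhost", port: int = 8080,
                    timeout: float = 0.2) -> bool:
    """Hypothetical helper: is anything listening on the collector port?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Route through the sidecar when present; otherwise batch directly to the API.
endpoint = (
    "http://localhost:8080/collect" if probe_collector()
    else "https://emergent-language.fly.dev/translate/batch"
)
```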

Docker / Kubernetes Sidecar

```yaml
# docker-compose.yml
services:
  collector:
    image: ghcr.io/maco144/emergent-collector:v2
    ports: ["8080:8080"]

  agent:
    image: your-agent:latest
    environment:
      # SDK auto-detects collector on localhost
      - COLLECTOR_URL=http://collector:8080
```

Performance Research Results

Comprehensive stress testing was conducted on Fly.io infrastructure with real production workloads.

Peak Performance Achieved

| Metric | Value |
| --- | --- |
| Peak throughput | 88,545 msg/s |
| Optimal batch size | 75 messages |
| Best compression | 96.5% |
| Daily capacity | 7.65 billion messages |
| Concurrent agents | 17,709 (@ 5 msg/s each) |
| Cost | $0.63 per billion messages |

Batch Size Optimization

| Batch Size | Throughput | Compression | P95 Latency | Recommendation |
| --- | --- | --- | --- | --- |
| 10 | 12,724 msg/s | 77.7% | 45 ms | Small agent groups |
| 25 | 32,677 msg/s | 91.1% | 78 ms | Good for small clusters |
| 50 | 50,882 msg/s | 94.6% | 300 ms | Balanced |
| 75 | 54,955 msg/s | 95.9% | 376 ms | Optimal sweet spot |
| 100 | 49,706 msg/s | 96.5% | 438 ms | Best compression |
| 200 | 44,289 msg/s | 96.7% | 1,777 ms | Diminishing returns |
| 500 | 46,436 msg/s | 97.8% | 3,386 ms | High latency |
| 1000 | 34,810 msg/s | 98.2% | 5,176 ms | Too large |
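Given the measurements above, a batch size can be chosen programmatically against a latency budget. The numbers below are copied from the table; `pick_batch_size` is an illustrative helper, not part of the SDK:

```python
# (batch_size, throughput msg/s, compression %, p95 latency ms)
# copied from the benchmark table above
RESULTS = [
    (10, 12_724, 77.7, 45),
    (25, 32_677, 91.1, 78),
    (50, 50_882, 94.6, 300),
    (75, 54_955, 95.9, 376),
    (100, 49_706, 96.5, 438),
    (200, 44_289, 96.7, 1_777),
    (500, 46_436, 97.8, 3_386),
    (1000, 34_810, 98.2, 5_176),
]

def pick_batch_size(p95_budget_ms: float) -> int:
    """Highest-throughput batch size whose P95 latency fits the budget."""
    candidates = [r for r in RESULTS if r[3] <= p95_budget_ms]
    return max(candidates, key=lambda r: r[1])[0]
```

With a 500 ms budget this picks 75, matching the "optimal sweet spot" row; with a tight 100 ms budget it falls back to batch-25.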

Infrastructure Scaling

| Configuration | Throughput | Cost/hr | $/Billion msgs | Notes |
| --- | --- | --- | --- | --- |
| 2x shared-cpu-1x | 1,350 msg/s | $0.02 | $4.12 | Baseline |
| 5x performance-2x | 41,401 msg/s | $0.45 | $3.02 | Medium |
| 10x shared-cpu-4x | 38,476 msg/s | $0.24 | $1.73 | Good |
| 3x shared + 1x perf-8x | 41,136 msg/s | $0.25 | $1.69 | Good |
| 3x shared-cpu-8x | 66,405 msg/s | $0.15 | $0.63 | Best value |
| 12x performance-8x | 88,545 msg/s | $2.50 | $7.84 | Peak throughput |

Key Research Findings

1. **RAM doesn't matter.** Compression is 100% CPU-bound.
   - 16 GB vs 2 GB RAM: performance within 2%
   - Use shared-cpu-8x (2 GB) instead of performance-8x (16 GB) for 72% cost savings
2. **Batch-75 is optimal.** Best balance of throughput and compression.
   - 95.9% compression ratio
   - 54,955 msg/s peak throughput
   - Ideal for clustering 75 agents per region
3. **Client CPU matters.** Async HTTP clients need compute power.
   - Strong client: 88,545 msg/s
   - Weak 1-core VPS: 25,171 msg/s
   - Same 2 ms latency to the API in both cases
4. **Collector v2 outperforms client batching.** See below.

Benchmark Visualization

See benchmark_results.png for performance charts and benchmark_table.txt for detailed ASCII tables.

Collector v2: The Performance Breakthrough

The Collector v2 sidecar is faster than client-side batching while keeping agent code dead simple.

Performance Comparison

| Method | Throughput | Speedup | Agent Complexity |
| --- | --- | --- | --- |
| Individual requests | 927 msg/s | 1.0x | Simple |
| Client-side batching | 3,461 msg/s | 3.7x | Complex (async) |
| Collector v2 (sidecar) | 9,721 msg/s | 10.5x | Simple |

Collector v2 is 2.8x faster than client-side batching!

Why It's Faster

Three key optimizations:

  1. Fire-and-forget: Agents don't wait for translation response
  2. Parallel batchers: 4 workers sending batches concurrently
  3. Shared queue: All agents feed one queue, batches fill faster

```text
COLLECTOR V2 ARCHITECTURE:
┌─────────────────────────────────────────────────────────────────┐
│  Sidecar (same machine as agents)                               │
│                                                                 │
│  ┌─────────┐     ┌─────────────────────────────────────────┐   │
│  │ Agent 1 │────►│                                         │   │
│  │ Agent 2 │────►│  Shared Queue (fire-and-forget)         │   │
│  │   ...   │────►│           │                             │   │
│  │ Agent75 │────►│           ▼                             │   │
│  └─────────┘     │  ┌─────────────────────────────────┐   │   │
│                  │  │ Batcher 1 ──►│                   │   │   │
│                  │  │ Batcher 2 ──►│  Translator API   │   │   │
│                  │  │ Batcher 3 ──►│  (batch-75)       │   │   │
│                  │  │ Batcher 4 ──►│                   │   │   │
│                  │  └─────────────────────────────────┘   │   │
│                  └─────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
```
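The pattern above (a shared queue feeding parallel batcher workers) can be stripped down to a few dozen lines. This is an illustrative sketch, not the actual collector_v2.py; `send_batch` stands in for the real POST to /translate/batch, and `SENT` exists only to make the behavior observable:

```python
import asyncio

BATCH_SIZE = 75
NUM_BATCHERS = 4
SENT = []  # batches "delivered", recorded for illustration only


async def send_batch(batch: list) -> None:
    # Stand-in for the real HTTP POST to the Translator's /translate/batch.
    SENT.append(list(batch))
    await asyncio.sleep(0)


async def batcher(queue: asyncio.Queue) -> None:
    """Worker: drain the shared queue into batches of up to BATCH_SIZE."""
    while True:
        batch = [await queue.get()]
        while len(batch) < BATCH_SIZE and not queue.empty():
            batch.append(queue.get_nowait())
        await send_batch(batch)
        for _ in batch:
            queue.task_done()


def collect(queue: asyncio.Queue, message: dict) -> None:
    """Fire-and-forget: enqueue and return immediately, no response awaited."""
    queue.put_nowait(message)


async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    workers = [asyncio.create_task(batcher(queue)) for _ in range(NUM_BATCHERS)]
    for i in range(300):
        collect(queue, {"task": "analyze", "seq": i})
    await queue.join()  # wait until every queued message was batched out
    for w in workers:
        w.cancel()


asyncio.run(main())
```

Because agents only enqueue, their hot path never blocks on the translator; the batchers absorb the latency in parallel.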

Running Collector v2

```bash
# Start collector v2 on same machine as agents
python collector_v2.py --port 8080 --batchers 4

# Agents just fire messages (don't wait for response)
curl -X POST http://localhost:8080/collect \
  -H "Content-Type: application/json" \
  -d '{"data": {"task": "analyze"}}'

# Check throughput stats
curl http://localhost:8080/stats
```

Kubernetes Sidecar Deployment

```yaml
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: agent
    image: your-agent:latest
    env:
    - name: TRANSLATOR_URL
      value: "http://localhost:8080/collect"  # Fire-and-forget to sidecar
  - name: collector
    image: ghcr.io/maco144/emergent-collector:v2
    ports:
    - containerPort: 8080
    args: ["--batchers", "4"]
```

Hive Mind Architecture

```text
     Region A              Region B              Region C
    ┌────────┐            ┌────────┐            ┌────────┐
    │75 agents│           │75 agents│           │75 agents│
    └───┬────┘            └───┬────┘            └───┬────┘
        ▼                     ▼                     ▼
    ┌────────┐            ┌────────┐            ┌────────┐
    │Collector│           │Collector│           │Collector│
    │  v2    │            │  v2    │            │  v2    │
    └───┬────┘            └───┬────┘            └───┬────┘
        └──────────────┬──────┴──────────┬──────────┘
                       ▼                 ▼
              ┌─────────────────────────────────┐
              │   Translator API (Fly.io)       │
              │   3x shared-cpu-8x              │
              │   66,405 msg/s capacity         │
              │   $0.63 per billion messages    │
              └─────────────────────────────────┘
```

Capacity per cluster: ~10K msg/s = 2,000 agents @ 5 msg/s each

Cost-Optimized Architecture

Recommended setup:

```text
3x shared-cpu-8x (8 CPUs, 2GB RAM each)
├── Throughput: 66,405 msg/s
├── Cost: ~$0.15/hr (~$20-40/mo with auto-stop)
├── Daily capacity: 5.7 billion messages
└── Supports: 13,281 concurrent agents
```
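The capacity and cost figures above follow directly from the measured throughput; a quick back-of-the-envelope check:

```python
throughput = 66_405   # msg/s, 3x shared-cpu-8x (from the scaling table)
cost_per_hr = 0.15    # USD

daily_msgs = throughput * 86_400            # seconds per day
agents = throughput // 5                    # agents sending 5 msg/s each
msgs_per_hr = throughput * 3_600
cost_per_billion = cost_per_hr / (msgs_per_hr / 1e9)

print(f"{daily_msgs / 1e9:.1f}B msgs/day, {agents} agents, "
      f"${cost_per_billion:.2f}/billion")
```

This reproduces the 5.7 billion messages/day, 13,281 agents, and roughly $0.63 per billion quoted above.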

API Endpoints

Translator API

| Endpoint | Method | Description |
| --- | --- | --- |
| `/health` | GET | Health check |
| `/translate` | POST | Single message translation |
| `/translate/batch` | POST | Batch translation (recommended) |
| `/metrics` | GET | Prometheus metrics |

Collector v2 API

| Endpoint | Method | Description |
| --- | --- | --- |
| `/health` | GET | Health check |
| `/collect` | POST | Fire-and-forget single message |
| `/collect/bulk` | POST | Fire-and-forget multiple messages |
| `/stats` | GET | Throughput and compression stats |

Local Development

```bash
# Clone and install
git clone https://github.com/maco144/emergent-language
cd emergent-language
pip install -r requirements.txt

# Run translator locally
python app.py

# Run collector v2 locally
python collector_v2.py --port 8080 --batchers 4

# Run with Docker
docker build -t emergent-language .
docker run -p 8080:8080 emergent-language
```

Deployment

Fly.io (recommended):

```bash
fly launch
fly deploy
fly scale count 3
fly scale vm shared-cpu-8x
```

Emergent Symbol Encoding

The translator uses a custom binary encoding scheme optimized for AI agent communication:

  • 127 common keys mapped to single bytes (task, agent_id, status, etc.)
  • 80+ common values mapped to single bytes (analyze, execute, pending, etc.)
  • zlib compression for additional reduction
  • Batch header optimization for multi-message payloads

Example compression:

```text
Original JSON:  {"task": "analyze", "data": "market", "priority": "high"}
Compressed:     e7 02 01 61 8a 01 83 20 61 8a 10
Reduction:      82% smaller
```
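A minimal sketch of this style of encoding. The symbol tables and byte values below are made up for illustration; the real codebook and wire format are internal to the translator, and the zlib pass applied on top is omitted here for brevity:

```python
import json

# Hypothetical symbol tables: the real service maps 127 keys and 80+ values.
KEY_CODES = {"task": 0x01, "agent_id": 0x02, "status": 0x03,
             "priority": 0x04, "data": 0x05}
VALUE_CODES = {"analyze": 0x81, "execute": 0x82, "pending": 0x83, "high": 0x84}


def encode(message: dict) -> bytes:
    """Substitute known keys/values with single bytes; unknown values
    are length-prefixed UTF-8."""
    out = bytearray()
    for key, value in message.items():
        out.append(KEY_CODES.get(key, 0x00))
        if value in VALUE_CODES:
            out.append(VALUE_CODES[value])
        else:
            raw = str(value).encode()
            out.append(len(raw))
            out.extend(raw)
    return bytes(out)


msg = {"task": "analyze", "data": "market", "priority": "high"}
packed = encode(msg)
print(f"{len(json.dumps(msg).encode())} bytes -> {len(packed)} bytes")
```

Even this toy version shrinks the example message from 57 JSON bytes to 12, in the same ballpark as the 82% reduction shown above.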

Files

| File | Description |
| --- | --- |
| `app.py` | FastAPI Translator API |
| `collector.py` | Original collector (blocking) |
| `collector_v2.py` | High-performance collector (fire-and-forget) |
| `benchmark_results.py` | Generate performance visualizations |
| `benchmark_results.png` | Performance charts |
| `benchmark_table.txt` | Detailed ASCII result tables |
| `test_collector.py` | Collector comparison tests |
| `fly.toml` | Fly.io Translator config |
| `fly.collector.toml` | Fly.io Collector config |
| `Dockerfile` | Translator container |
| `Dockerfile.collector` | Collector container |

The Optimization Journey

```text
WHERE WE STARTED:
└─ Individual requests: ~1,000 msg/s

DISCOVERY 1: Batch endpoint
└─ 32x speedup → ~32,000 msg/s

DISCOVERY 2: Optimal batch size = 75
└─ Sweet spot for throughput + compression (95.9%)

DISCOVERY 3: RAM doesn't matter (CPU-bound)
└─ 98% performance with 87% less RAM → huge cost savings

DISCOVERY 4: Client CPU is a bottleneck
└─ Strong client: 88K msg/s vs Weak VPS: 25K msg/s

DISCOVERY 5: Collector v2 pattern
└─ Fire-and-forget + parallel batchers
└─ 10.5x faster than individual
└─ 2.8x faster than client-side batching (!!)
```

License

MIT License

Download files

Source distribution: emergent_language-1.0.0.tar.gz (14.7 kB), uploaded via twine/6.2.0 (CPython/3.12.3).

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 7c2fe90b389e1a9d72c814bfa85e42dae28b278aede0f717dc3b74c26d8ec6ce |
| MD5 | 82eaf677d863d1b200b38b1b654cb658 |
| BLAKE2b-256 | 911cb993cbd3e5fe57e0a8579131cd67ea0279d3c9f782b05744d5816aea7903 |

Built distribution: emergent_language-1.0.0-py3-none-any.whl (12.2 kB, Python 3).

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | d8ba300f52a082611e92788e008d559352550ad89440fc5911e33f6a2df9051b |
| MD5 | 3e0520adf53a0ad7224d3c43228e25be |
| BLAKE2b-256 | f7877317ed3d85b3e954c19e3ed3dfdc63edaa1aefa602066b5e0f34e149cf69 |
