
Drop-in resilience layer for LLM apps via TrueFoundry's AI Gateway. Two lines of code; live chaos demo; client-side MCP failover.


Unsinkable Ship

Two lines of code. Your LLM agents become unsinkable.

A drop-in resilience layer for any Python LLM/agent app, powered by TrueFoundry's AI Gateway. When OpenAI browns out, Claude rate-limits, or your MCP server crashes — your agent keeps going. Your users never notice.

Built for the DevNetwork [AI + ML] Hackathon 2025 — TrueFoundry "Resilient Agents" challenge.

The pitch

Modern LLM apps are one provider outage away from a status-page incident. TrueFoundry's gateway already solves the infrastructure — fallback chains, retries, virtual MCP servers, observability. Unsinkable is the missing two-line bridge: it wires your existing OpenAI-SDK app to that gateway with zero refactor, and ships a chaos CLI + live dashboard so you can prove your resilience before production does.

Install (when published)

pip install unsinkable

Wire it in (2 lines)

# Before
from openai import OpenAI
client = OpenAI()

# After
from unsinkable import OpenAI
client = OpenAI()  # routed through TrueFoundry, with GPT-4o → Claude → Gemini fallback

That's the whole change. Your chat.completions.create(...) calls work unchanged. If the primary model errors or browns out, the gateway transparently falls back to the next model in the chain. The shim emits live events to the dashboard so you can watch it happen.
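To make the fallback behaviour concrete, here is a minimal, stdlib-only sketch of a priority-ordered fallback chain. In Unsinkable the fallback happens inside the TrueFoundry gateway, not in your process, so this is a conceptual model rather than the project's implementation. `Target`, `ProviderError`, and `complete_with_fallback` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


class ProviderError(Exception):
    """Raised by a (hypothetical) provider call that fails."""


@dataclass
class Target:
    name: str                   # illustrative label, e.g. "openai"
    call: Callable[[str], str]  # takes a prompt, returns a completion


def complete_with_fallback(targets: List[Target], prompt: str) -> Tuple[str, List[str]]:
    """Try each target in priority order.

    Returns the first successful answer plus the names of targets that failed
    before it. Raises ProviderError if every target fails.
    """
    failed: List[str] = []
    for target in targets:
        try:
            return target.call(prompt), failed
        except ProviderError:
            failed.append(target.name)  # record the hop, then try the next target
    raise ProviderError(f"all targets failed: {failed}")


def broken(prompt: str) -> str:
    # Stand-in for a provider that is down.
    raise ProviderError("simulated outage")


def healthy(prompt: str) -> str:
    # Stand-in for a provider that answers.
    return f"echo: {prompt}"


chain = [Target("openai", broken), Target("anthropic", broken), Target("gemini", healthy)]
answer, hops = complete_with_fallback(chain, "ping")
print(answer, hops)
```

The list of failed hops is the kind of information a dashboard could surface per request; the caller only ever sees the final answer.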

See it survive chaos

# Terminal 1 — start the dashboard
unsinkable dashboard

# Terminal 2 — run the scripted 14-step demo (LLM + MCP resilience)
unsinkable demo

# OR manually:
unsinkable chaos break openai          # priority-0 OpenAI target fails → fallback fires
unsinkable chaos break anthropic       # same idea: the Anthropic target fails at the gateway
unsinkable chaos break cascade         # both OpenAI and Anthropic down → Gemini saves you
unsinkable chaos brownout 8            # +8s latency per request
unsinkable chaos break mcp-primary     # MCP tool server primary skipped → secondary answers
unsinkable chaos clear                 # back to normal

The dashboard at http://localhost:8765 shows every request, every retry, every fallback hop — in real time.
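The dashboard streams events over Server-Sent Events (SSE). As a rough illustration of what one such event could look like on the wire, the sketch below serializes a dictionary into an SSE frame. The event name and payload fields (`fallback`, `from`, `to`) are assumptions made for this example, not the project's actual schema.

```python
import json
from typing import Any, Dict, Optional


def sse_frame(event: str, data: Dict[str, Any], event_id: Optional[str] = None) -> str:
    """Serialize one event as a Server-Sent Events frame.

    A frame is a set of "field: value" lines terminated by a blank line.
    """
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    # Compact JSON keeps the payload on a single "data:" line.
    lines.append(f"data: {json.dumps(data, separators=(',', ':'))}")
    return "\n".join(lines) + "\n\n"


# Hypothetical payload describing one fallback hop.
frame = sse_frame("fallback", {"from": "openai", "to": "anthropic"}, event_id="7")
print(frame, end="")
```

A browser `EventSource` or any SSE client would receive this as a single `fallback` event whose data is the JSON object.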

Demo agent

python examples/research_buddy.py

A small MCP-powered research assistant. We use it as our chaos victim of choice.

Architecture

┌─────────────────┐    ┌──────────────────────┐    ┌────────────────────┐
│ Your agent code │    │ unsinkable.OpenAI    │    │ TrueFoundry        │
│ from unsinkable │───▶│ shim (SDK subclass)  │───▶│ AI Gateway         │
│ import OpenAI   │    │ + instrumentation    │    │ • Virtual Model    │
└─────────────────┘    └──────────────────────┘    │ • Virtual MCP      │
        │                       │                  │ • Retries          │
        ▼                       │                  └────────────────────┘
┌─────────────────┐             │                            │
│ Chaos CLI       │             ▼                            ▼
│ (fault inject)  │      ┌────────────────┐         ┌────────────────────┐
└─────────────────┘      │ Dashboard      │         │ Real providers     │
                         │ (FastAPI + SSE)│         │ + MCP servers      │
                         └────────────────┘         └────────────────────┘
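The "shim + instrumentation" box in the diagram can be pictured as a thin wrapper that records an event before and after each call. The sketch below shows that pattern with the standard library only. It does not use the OpenAI SDK, and `instrumented`, `events`, and `fake_completion` are hypothetical names, so treat it as an illustration of the idea rather than the shim's real code.

```python
import time
from typing import Any, Callable, Dict, List

# Stand-in for the dashboard's event sink (assumption: the real sink is a network call).
events: List[Dict[str, Any]] = []


def instrumented(name: str, fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap fn so each call records request, response, and error events."""

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        events.append({"type": "request", "target": name})
        try:
            result = fn(*args, **kwargs)
        except Exception as exc:
            events.append({
                "type": "error",
                "target": name,
                "error": type(exc).__name__,
                "latency_s": time.perf_counter() - start,
            })
            raise  # instrumentation must not swallow the caller's error
        events.append({
            "type": "response",
            "target": name,
            "latency_s": time.perf_counter() - start,
        })
        return result

    return wrapper


def fake_completion(prompt: str) -> str:
    # Placeholder for a real chat.completions.create(...) call.
    return prompt.upper()


call = instrumented("chat.completions.create", fake_completion)
result = call("hello")
print(result, [e["type"] for e in events])
```

Because the wrapper re-raises, application code sees the same exceptions it would without instrumentation.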

TrueFoundry setup (~10 min, mostly tfy apply)

  1. Tenant + token — at https://<tenant>.truefoundry.cloud go to Access → Personal Access Tokens and create one. Copy .env.example to .env and fill in TFY_API_KEY + TFY_HOST.
  2. Install + log in to the CLI:
    pip install -U truefoundry
    tfy login --host $TFY_HOST --api-key $TFY_API_KEY
    
  3. Provider integrations (UI, ~5 min): in AI Gateway → Model Integrations → New, add 5 integrations:
    • openai (real key) with gpt-4o-mini
    • anthropic (real key) with claude-sonnet-4-6
    • google-gemini (real key) with gemini-2.5-flash-lite
    • openai-broken (bogus key, e.g. sk-broken-on-purpose) with gpt-4o
    • anthropic-broken (bogus key) with claude-sonnet-4-6
  4. Virtual Models (CLI, ~10 s):
    tfy apply -f gateway-config/resilient_chat.yaml \
              -f gateway-config/chaos_openai_down.yaml \
              -f gateway-config/chaos_anthropic_down.yaml
    
  5. Verify: python examples/smoke_test.py — should print "all checks passed".
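Before running the smoke test, it can help to confirm that .env actually contains the two settings from step 1. The following is a small pre-flight sketch under stated assumptions: it parses simple KEY=VALUE lines only, and `parse_dotenv` and `missing_settings` are hypothetical helpers, not part of the unsinkable package.

```python
from typing import Dict, List

# The two settings named in step 1 of the setup.
REQUIRED = ("TFY_API_KEY", "TFY_HOST")


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(values: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED if not values.get(key)]


# Example input with a placeholder host and an empty API key.
sample = "# example\nTFY_HOST=https://example.invalid\nTFY_API_KEY=\n"
print(missing_settings(parse_dotenv(sample)))
```

To check a real file, pass its contents instead of `sample`, for example `parse_dotenv(open(".env").read())`.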

Status

Hackathon-quality. Solo dev, ~48-hour sprint. See MEMORY.md for design decisions.
