Keiro client — call the EB1 multi-model ensemble API.

Keiro

EB1 multi-model ensemble inference. Run multiple frontier models in parallel and synthesize the best response.

Quick start

pip install keiro
keiro setup

Then call the ensemble from Python:

from keiro import models

print(models("eb1-preview", "What is machine learning?"))

Or from the command line:

keiro "What is machine learning?"

How it works

EB1 sends your prompt to multiple frontier models (Claude, GPT, Gemini) in parallel, then a judge synthesizes the strongest elements of their answers into a single response. The goal is a response that is more accurate and more complete than the output of any individual model.
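
The fan-out-then-judge pattern described above can be illustrated with a small local sketch. This is not EB1's actual implementation: model_a, model_b, judge, and ensemble are hypothetical stand-ins, and the toy judge simply picks the longest candidate rather than synthesizing one.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


# Hypothetical stand-ins for real model calls: each takes a prompt, returns text.
def model_a(prompt: str) -> str:
    return f"A: short answer to {prompt!r}"


def model_b(prompt: str) -> str:
    return f"B: a longer, more detailed answer to {prompt!r}"


def judge(candidates: list[str]) -> str:
    # Toy judge (assumption): keep the longest candidate. A real judge would
    # be a model that merges the strongest parts of several candidates.
    return max(candidates, key=len)


def ensemble(prompt: str, members: list[Callable[[str], str]]) -> str:
    # Fan the prompt out to every member in parallel, then hand all
    # candidate answers to the judge.
    with ThreadPoolExecutor(max_workers=len(members)) as pool:
        candidates = list(pool.map(lambda member: member(prompt), members))
    return judge(candidates)


result = ensemble("What is machine learning?", [model_a, model_b])
print(result)
```

The thread pool stands in for parallel network calls; in a hosted ensemble the members would be remote models rather than local functions.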

Models

Model                    Description
-----------------------  -------------------------------------
eb1-preview (default)    Adaptive GNN-routed ensemble
eb1-delta-preview        Adaptive ensemble with orchestration
eb1                      Standard 5-model ensemble
eb1-pro                  Extended 6-model ensemble
eb1-frontier             Highest quality, max reasoning
eb1-codex                Optimized for code and SWE tasks
eb1-fast                 Low latency, lighter models
eb1-fast-preview         Adaptive routing, low latency
eb1-frontier-preview     Adaptive routing, max quality
claude-opus-4-6          Direct passthrough (no ensemble)
gpt-5.2                  Direct passthrough

from keiro import models

# Default adaptive ensemble
answer = models("eb1-preview", "Solve this step by step: what is 23 * 47?")

# Max quality
answer = models("eb1-frontier", "Prove that sqrt(2) is irrational.")

# Low latency
answer = models("eb1-fast", "Summarize this in one sentence.")

# Direct passthrough to a single model
answer = models("claude-opus-4-6", "Write a haiku")

Prompt-first API

from keiro import models

# Structured response with usage metadata
reply = models.response("eb1-preview", "Explain quantum computing.")
print(reply.text)
print(reply.usage)

# Reusable model binding with fixed parameters
creative = models.instance("eb1-preview", temperature=0.8)
print(creative("Write a limerick about debugging."))

# Streaming
for chunk in models.stream("eb1-preview", "Draft a launch email."):
    print(chunk, end="")
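
When streaming, it is often useful to keep the full text as well as print it incrementally. The sketch below assumes each chunk is a plain string, which the print call above suggests but the source does not confirm. collect_stream and fake_stream are hypothetical helpers; fake_stream stands in for models.stream so the example runs without credentials.

```python
from typing import Callable, Iterable, Optional


def collect_stream(
    chunks: Iterable[str],
    on_chunk: Optional[Callable[[str], None]] = None,
) -> str:
    # Consume an iterable of text chunks, optionally forwarding each chunk
    # to a callback (for example, to print it), and return the joined text.
    parts = []
    for chunk in chunks:
        if on_chunk is not None:
            on_chunk(chunk)
        parts.append(chunk)
    return "".join(parts)


def fake_stream():
    # Hypothetical stand-in for a streaming call; yields text chunks.
    yield from ["Dear ", "team, ", "we launch today."]


shown = []
text = collect_stream(fake_stream(), on_chunk=shown.append)
print(text)
```

With a real stream, passing a callback that prints each chunk with end="" would reproduce the loop above while still returning the complete reply.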

Full client

from keiro import Client

client = Client()

# Chat completions API
response = client.chat(
    messages=[{"role": "user", "content": "Explain quantum computing."}],
    model="eb1-preview",
)
print(response["choices"][0]["message"]["content"])

# Rate limit visibility
print(client.rate_limits)
# RateLimitInfo(limit_requests=1000, remaining_requests=999, ...)

client.close()
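
Because the client is closed explicitly, an exception raised between construction and close() would skip the cleanup. One way to guard against that is contextlib.closing from the standard library, which works with any object that has a close() method. The sketch below uses a hypothetical StubClient in place of keiro's Client so that it runs offline; whether Client itself supports the with statement directly is not stated in the source.

```python
from contextlib import closing


class StubClient:
    """Hypothetical stand-in for the real client; only chat() and close()."""

    def __init__(self):
        self.closed = False

    def chat(self, messages, model):
        # Mimics the chat-completions response shape shown above.
        return {"choices": [{"message": {"content": "stub reply"}}]}

    def close(self):
        self.closed = True


client = StubClient()
content = None
try:
    # closing() calls client.close() on exit, even if the body raises.
    with closing(client) as c:
        response = c.chat(
            messages=[{"role": "user", "content": "Explain quantum computing."}],
            model="eb1-preview",
        )
        content = response["choices"][0]["message"]["content"]
        raise RuntimeError("simulated failure after the call")
except RuntimeError:
    pass

print(content, client.closed)
```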

CLI

keiro "What is ML?"                 # one-shot response
keiro                               # interactive REPL
keiro gui                           # local browser chat UI
keiro -m eb1-fast "Quick answer"    # specific model
echo context | keiro "Summarize"    # pipe context as input
keiro setup                         # configure credentials
keiro models                        # list available models

In the interactive REPL, streamed code fences render as numbered code blocks. Use /copy [n] to copy a block from the last assistant reply.
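
To make the numbered-block behavior concrete, here is a rough sketch of how code fences in a reply could be extracted and addressed by number. It is an illustration only, not the CLI's actual code: extract_code_blocks and copy_block are hypothetical, and 1-based numbering is an assumption.

```python
FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


def extract_code_blocks(reply: str) -> list[str]:
    # Walk the reply line by line, collecting the text between fence pairs.
    blocks, current, inside = [], [], False
    for line in reply.splitlines():
        if line.strip().startswith(FENCE):
            if inside:
                blocks.append("\n".join(current))
                current = []
            inside = not inside
        elif inside:
            current.append(line)
    return blocks


def copy_block(blocks: list[str], n: int) -> str:
    # Assumption: blocks are numbered from 1, as a user would type /copy 1.
    return blocks[n - 1]


reply = "\n".join(
    ["Try this:", FENCE + "python", "print('hi')", FENCE,
     "Or in a shell:", FENCE, "echo hi", FENCE]
)
blocks = extract_code_blocks(reply)
for number, block in enumerate(blocks, start=1):
    print(f"[{number}] {block}")
```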

keiro gui opens the local browser chat UI in Chrome when available. If startup takes longer than expected, the CLI prints a manual URL and log path.

Configuration

Interactive setup (recommended):

keiro setup

This validates your API key against the gateway and saves credential metadata to ~/.keiro/credentials. Secret bytes are stored in owner-only sidecar files under ~/.keiro/secrets/, and the metadata file stores file:// references.
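
The storage scheme above (owner-only secret files, with metadata holding file:// references) can be sketched as follows. This is an illustration under assumptions, not keiro's code: store_secret, the file name api_key, and the directory layout inside the temporary folder are hypothetical, and the permission handling is POSIX-specific.

```python
import os
import stat
import tempfile
from pathlib import Path


def store_secret(root: Path, name: str, secret: bytes) -> str:
    # Write the secret into root/secrets/<name> with owner-only permissions
    # and return a file:// reference suitable for a metadata file.
    secrets_dir = root / "secrets"
    secrets_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    path = secrets_dir / name
    # Create the file with mode 0o600 so it is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(secret)
    os.chmod(path, 0o600)  # also tighten a file that already existed
    return path.resolve().as_uri()


with tempfile.TemporaryDirectory() as tmp:
    ref = store_secret(Path(tmp), "api_key", b"example-key")
    secret_path = Path(tmp) / "secrets" / "api_key"
    mode = stat.S_IMODE(secret_path.stat().st_mode)
    stored = secret_path.read_bytes()

print(ref, oct(mode))
```

Keeping only a reference in the metadata file means that file can be read or backed up without exposing the key itself.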

Explicit arguments:

from keiro import Client

client = Client(api_key="your-key", base_url="https://your-keiro-gateway.example")

The API key and endpoint are resolved from explicit arguments first, then from the credentials file. Runtime credential and gateway URL environment variables are ignored; run keiro setup or keiro endpoint local to update saved credentials.
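
The precedence rule can be expressed as a short sketch. resolve_setting and the saved dictionary are hypothetical and only illustrate the order of lookup; the real credentials file format is not described here. Note that the function never consults os.environ, matching the statement that environment variables are ignored.

```python
from typing import Optional


def resolve_setting(explicit: Optional[str], saved: Optional[str], name: str) -> str:
    # Explicit arguments win; saved credentials are the only fallback.
    if explicit is not None:
        return explicit
    if saved is not None:
        return saved
    raise LookupError(f"{name} was not passed and is not in the saved credentials")


# Hypothetical contents loaded from the credentials file.
saved = {"api_key": "saved-key", "base_url": "https://your-keiro-gateway.example"}

api_key = resolve_setting("explicit-key", saved.get("api_key"), "api_key")
base_url = resolve_setting(None, saved.get("base_url"), "base_url")
print(api_key, base_url)
```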

Requirements

  • Python 3.11+
  • No GPU required (inference runs on hosted infrastructure)
