
KiboUP

Framework-agnostic library for deploying AI agents via HTTP, A2A, and MCP — with built-in observability, prompt management, evaluation, and agent discovery through KiboStudio.



Overview

KiboUP lets you build and deploy AI agents with one codebase and expose them over three industry-standard protocols:

| Protocol | Best For | Server Class | Client Class |
|----------|----------|--------------|--------------|
| HTTP | Web apps, REST APIs, microservices | KiboAgentApp | KiboAgentClient |
| MCP | Tool-based agents, IDE integrations | KiboAgentMcp | KiboMcpClient |
| A2A | Agent-to-agent communication | KiboAgentA2A | KiboA2AClient |

All three protocols share:

  • API key authentication middleware
  • Structured JSON logging with LLMUsage metadata
  • Health checks and task management
  • Optional KiboStudio integration for observability

Installation

# Core (HTTP only)
uv add kiboup

# With MCP support
uv add "kiboup[mcp]"

# With A2A support
uv add "kiboup[a2a]"

# With KiboStudio (observability, prompts, eval, discovery)
uv add "kiboup[studio]"

# Everything
uv add "kiboup[all]"

Protocols: When to Use What

HTTP (KiboAgentApp / KiboAgentClient)

Use HTTP when you need a standard REST API for your agent. This is the most versatile option: it works with any frontend, supports SSE streaming, WebSocket connections, and task tracking, and integrates seamlessly with KiboStudio tracing.

Best for: Web applications, mobile backends, microservice architectures, any client that speaks HTTP.

Features:

  • POST /invocations — invoke the agent
  • GET /ping — health check (Healthy / Busy)
  • GET /tasks — list active tasks
  • DELETE /tasks/{id} — cancel a task (see the sketch after this list)
  • WS /ws — WebSocket endpoint
  • SSE streaming support
  • API key authentication
  • Automatic KiboStudio trace reporting
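
These endpoints can also be exercised directly over HTTP. A minimal sketch using httpx against the endpoints above (the printed response shapes are assumptions, not a documented schema, and the task id is a placeholder):

import httpx

headers = {"X-API-Key": "sk-frontend-abc"}
base = "http://127.0.0.1:8080"

# Health check: reports Healthy or Busy
print(httpx.get(f"{base}/ping", headers=headers).json())

# List active tasks
print(httpx.get(f"{base}/tasks", headers=headers).json())

# Cancel a running task by id (placeholder id)
httpx.delete(f"{base}/tasks/some-task-id", headers=headers)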

MCP (KiboAgentMcp / KiboMcpClient)

Use MCP when your agent exposes tools that other agents or IDEs can discover and call. The Model Context Protocol is the standard for tool-based interactions — think of it as a plugin system for LLMs.

Best for: IDE integrations (Cursor, VS Code), tool-based agents, agents that expose capabilities as callable functions.

Features:

  • Tool registration via @app.tool() decorator
  • Resource and prompt registration (sketched after this list)
  • SSE and stdio transports
  • Compatible with MCP Inspector and all MCP clients
  • API key authentication
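
Resources and prompts register much like tools. A sketch assuming KiboAgentMcp mirrors the FastMCP decorator API (the decorator signatures here are an assumption, not confirmed by this README):

@app.resource("docs://readme")
def readme() -> str:
    """Expose a static resource that MCP clients can read."""
    return "KiboUP agent documentation."

@app.prompt()
def review(code: str) -> str:
    """A reusable prompt template with one variable."""
    return f"Review this code for bugs:\n\n{code}"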

A2A (KiboAgentA2A / KiboA2AClient)

Use A2A when agents need to discover and communicate with each other using Google's Agent-to-Agent protocol. Each agent publishes an Agent Card at /.well-known/agent.json describing its skills.

Best for: Multi-agent systems, agent marketplaces, cross-organization agent communication.

Features:

  • Agent Card auto-generation at /.well-known/agent.json (see the fetch sketch after this list)
  • Skill-based routing
  • Task lifecycle management (create, cancel)
  • Bearer token and API key authentication
  • Compatible with any A2A client
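
You can inspect a running agent's card directly. A quick sketch (field access follows the A2A Agent Card format; whether the card endpoint requires a key depends on your deployment):

import httpx

# Fetch the published Agent Card and list its skills
card = httpx.get("http://localhost:8000/.well-known/agent.json").json()
print(card["name"])
print([skill["name"] for skill in card.get("skills", [])])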

Quick Start

HTTP Agent (Server + Client)

Server (agent_server_example.py):

from langchain_openai import ChatOpenAI
from langgraph.graph import START, MessagesState, StateGraph

from kiboup import KiboAgentApp, LLMUsage

app = KiboAgentApp(
    api_keys={
        "sk-frontend-abc": "web-app",
        "sk-agent-xyz": "recommender-agent",
    }
)

llm = ChatOpenAI(model="gpt-4o-mini")
graph_builder = StateGraph(MessagesState)

def chatbot(state: MessagesState):
    return {"messages": [llm.invoke(state["messages"])]}

graph_builder.add_node("chatbot", chatbot)
graph_builder.add_edge(START, "chatbot")
graph = graph_builder.compile()

def _extract_llm_usage(ai_message) -> LLMUsage:
    usage_meta = getattr(ai_message, "usage_metadata", None) or {}
    resp_meta = getattr(ai_message, "response_metadata", {})
    return LLMUsage(
        model=resp_meta.get("model_name"),
        provider="openai",
        input_tokens=usage_meta.get("input_tokens"),
        output_tokens=usage_meta.get("output_tokens"),
        total_tokens=usage_meta.get("total_tokens"),
    )

@app.entrypoint
async def invoke(payload, context):
    prompt = payload.get("prompt", "")
    result = await graph.ainvoke({"messages": [{"role": "user", "content": prompt}]})
    last_message = result["messages"][-1]

    usage = _extract_llm_usage(last_message)
    context._llm_usage = usage

    return {
        "response": last_message.content,
        "called_by": context.client_id,
        "llm_usage": usage.to_dict(),
    }

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080, reload=True)

Client (agent_client_example.py):

import asyncio
from kiboup import KiboAgentClient

async def main():
    async with KiboAgentClient(
        base_url="http://127.0.0.1:8080",
        api_key="sk-frontend-abc",
    ) as client:
        health = await client.ping()
        print(f"Server health: {health}")

        result = await client.invoke({"prompt": "What is the capital of France?"})
        print(f"Response: {result['response']}")

if __name__ == "__main__":
    asyncio.run(main())

Test with curl:

curl -X POST http://127.0.0.1:8080/invocations \
    -H "Content-Type: application/json" \
    -H "X-API-Key: sk-frontend-abc" \
    -d '{"prompt": "What is the capital of France?"}'

Streaming (SSE)

Server (stream_server_example.py):

from langchain_openai import ChatOpenAI
from langgraph.graph import START, MessagesState, StateGraph
from kiboup import KiboAgentApp

app = KiboAgentApp(api_keys={"sk-chat-abc": "chat-client"})
llm = ChatOpenAI(model="gpt-4o-mini", streaming=True)

graph_builder = StateGraph(MessagesState)

def chatbot(state: MessagesState):
    return {"messages": [llm.invoke(state["messages"])]}

graph_builder.add_node("chatbot", chatbot)
graph_builder.add_edge(START, "chatbot")
graph = graph_builder.compile()

@app.entrypoint
async def invoke(payload, context):
    prompt = payload.get("prompt", "")
    messages = payload.get("messages", [{"role": "user", "content": prompt}])

    async def token_stream():
        async for event in graph.astream_events({"messages": messages}, version="v2"):
            if event.get("event") == "on_chat_model_stream":
                content = event["data"]["chunk"].content
                if content:
                    yield {"token": content}
        yield {"done": True}

    return token_stream()

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

Client (stream_client_example.py):

import asyncio, sys
from kiboup import KiboAgentClient

async def chat_loop():
    async with KiboAgentClient("http://localhost:8080", api_key="sk-chat-abc") as client:
        health = await client.ping()
        print(f"Connected ({health['status']})\n")

        while True:
            user_input = input("You: ")
            if not user_input.strip():
                continue

            sys.stdout.write("AI: ")
            async for chunk in client.stream({"prompt": user_input}):
                token = chunk.get("token")
                if token:
                    sys.stdout.write(token)
                    sys.stdout.flush()
            print("\n")

if __name__ == "__main__":
    asyncio.run(chat_loop())

MCP Server + Client

Server (mcp_server_example.py):

from langchain_openai import ChatOpenAI
from langgraph.graph import START, MessagesState, StateGraph
from kiboup import KiboAgentMcp

llm = ChatOpenAI(model="gpt-4o-mini")
graph_builder = StateGraph(MessagesState)

def chatbot(state: MessagesState):
    return {"messages": [llm.invoke(state["messages"])]}

graph_builder.add_node("chatbot", chatbot)
graph_builder.add_edge(START, "chatbot")
graph = graph_builder.compile()

app = KiboAgentMcp(
    name="LangGraph MCP Server",
    api_keys={"sk-mcp-abc": "web-app"},
)

@app.tool()
async def ask(question: str) -> str:
    """Ask a question to the LangGraph agent powered by GPT-4o-mini."""
    result = await graph.ainvoke(
        {"messages": [{"role": "user", "content": question}]}
    )
    return result["messages"][-1].content

@app.tool()
def summarize(text: str) -> str:
    """Summarize the given text using GPT-4o-mini."""
    result = graph.invoke(
        {"messages": [{"role": "user", "content": f"Summarize this text:\n\n{text}"}]}
    )
    return result["messages"][-1].content

if __name__ == "__main__":
    app.run(transport="sse")

Client (mcp_client_example.py):

import asyncio
from kiboup import KiboMcpClient

async def main():
    async with KiboMcpClient("http://localhost:8000/sse", api_key="sk-mcp-abc") as client:
        tools = await client.list_tools()
        print(f"Available tools: {tools}")

        result = await client.call_tool("ask", {"question": "What is the capital of France?"})
        print(f"Result: {result}")

if __name__ == "__main__":
    asyncio.run(main())

A2A Server + Client

Server (a2a_server_example.py):

from langchain_openai import ChatOpenAI
from langgraph.graph import START, MessagesState, StateGraph
from kiboup.a2a.server import AgentExecutor, AgentSkill, KiboAgentA2A, TaskUpdater

llm = ChatOpenAI(model="gpt-4o-mini")
graph_builder = StateGraph(MessagesState)

def chatbot(state: MessagesState):
    return {"messages": [llm.invoke(state["messages"])]}

graph_builder.add_node("chatbot", chatbot)
graph_builder.add_edge(START, "chatbot")
graph = graph_builder.compile()

app = KiboAgentA2A(
    name="LangGraph Chat Agent",
    description="A simple chat agent using LangGraph with GPT-4o-mini",
    api_keys={"sk-a2a-xyz": "agent-client"},
    skills=[
        AgentSkill(
            id="chat",
            name="Chat",
            description="Answer questions using GPT-4o-mini via LangGraph",
            tags=["chat", "qa", "langgraph"],
            input_modes=["text/plain"],
            output_modes=["text/plain"],
        )
    ],
)

@app.executor
class ChatAgent(AgentExecutor):
    async def execute(self, context, event_queue):
        from a2a.utils import new_agent_text_message
        user_input = context.get_user_input()
        result = await graph.ainvoke(
            {"messages": [{"role": "user", "content": user_input}]}
        )
        await event_queue.enqueue_event(
            new_agent_text_message(result["messages"][-1].content)
        )

    async def cancel(self, context, event_queue):
        updater = TaskUpdater(event_queue, context.task_id, context.context_id)
        await updater.cancel()

if __name__ == "__main__":
    app.run()

Client (a2a_client_example.py):

import asyncio
from kiboup import KiboA2AClient

async def main():
    async with KiboA2AClient("http://localhost:8000", api_key="sk-a2a-xyz") as client:
        print(f"Agent: {client.agent_card.name}")
        print(f"Skills: {[s.name for s in client.agent_card.skills]}")

        response = await client.send("What is the capital of France?")
        print(f"Response: {response}")

if __name__ == "__main__":
    asyncio.run(main())

Chainlit Chat UI

import chainlit as cl
from kiboup import KiboAgentClient

SERVER_URL = "http://localhost:8080"
API_KEY = "sk-chat-abc"

@cl.on_chat_start
async def on_start():
    cl.user_session.set("history", [])

@cl.on_message
async def on_message(message: cl.Message):
    history = cl.user_session.get("history", [])
    history.append({"role": "user", "content": message.content})

    response = cl.Message(content="")
    await response.send()

    full_response = ""
    async with KiboAgentClient(SERVER_URL, api_key=API_KEY) as client:
        async for chunk in client.stream({
            "prompt": message.content,
            "messages": history,
        }):
            token = chunk.get("token")
            if token:
                full_response += token
                await response.stream_token(token)

    await response.update()
    history.append({"role": "assistant", "content": full_response})
    cl.user_session.set("history", history)

Start the streaming server first, then run:

uv run chainlit run examples/chainlit_example.py

mTLS (Mutual TLS)

KiboUP supports automatic mutual TLS for all protocols. Certificates are auto-generated on first run and renewed automatically before expiry.

Server:

from kiboup import KiboAgentApp

app = KiboAgentApp()

@app.entrypoint
async def invoke(payload, context):
    return {"response": "Hello from mTLS!"}

app.run(host="0.0.0.0", port=8443, mtls=True)

Client:

import asyncio
from kiboup import KiboAgentClient

async def main():
    async with KiboAgentClient(
        base_url="https://localhost:8443",
        mtls=True,
    ) as client:
        result = await client.invoke({"prompt": "Hello!"})
        print(result["response"])

asyncio.run(main())

Custom certificate directory via environment variable:

KIBO_CERTS_DIR=/path/to/certs uv run python my_server.py

Custom configuration via MTLSConfig:

from kiboup import MTLSConfig

config = MTLSConfig(
    certs_dir="/custom/certs",
    hostname="myagent.example.com",
    validity_days=365,
    renew_before_days=30,
)

app.run(port=8443, mtls=config)

Certificates are stored in ~/.kiboserve/certs/ by default. The CA certificate is valid for 10 years; server and client certificates for 1 year with auto-renewal at 30 days before expiry.


KiboStudio

KiboStudio is the built-in developer console for observability, prompt management, evaluation, and agent discovery. It runs as a standalone web server with a SQLite backend.

Getting Started

from kiboup.studio import KiboStudio

studio = KiboStudio(db_path="kibostudio.db", debug=True)

if __name__ == "__main__":
    studio.run(host="0.0.0.0", port=8000, reload=True)

Open http://127.0.0.1:8000 in your browser.

Agent Discovery & Multi-Agent Collaboration

KiboStudio acts as a service registry where agents register themselves, send heartbeats, and discover each other at runtime.

from kiboup import KiboAgentApp
from kiboup.studio import StudioClient

app = KiboAgentApp()
studio = StudioClient(
    studio_url="http://127.0.0.1:8000",
    agent_id="researcher",
    agent_name="researcher",
    agent_endpoint="http://127.0.0.1:8081",
    capabilities=["research", "delegate"],
)
app.attach_studio(studio)

Once registered, agents can discover each other:

agents = await studio.list_agents()
writer = next((a for a in agents if a.get("agent_id") == "writer"), None)
if writer is not None:
    endpoint = writer["endpoint"]

The Discovery tab in the UI shows all registered agents with health status, uptime, memory usage, and capabilities.

Traces & Observability

Every invocation through KiboAgentApp with an attached StudioClient automatically reports traces with:

  • Span hierarchy: invocation > agent_run > llm_call / tool_call / retrieval
  • Input/Output data for each span
  • LLM token usage: model, provider, input/output/total tokens
  • Duration and status (ok/error)
  • Attributes: custom key-value pairs

The Traces tab groups traces by agent and shows timing, status, and token consumption at a glance.
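
Traces are reported automatically, but StudioClient also exposes send_traces for manual reporting. The payload below is a hypothetical illustration of the span fields listed above, not a documented schema:

trace_data = {
    "trace_id": "trace-123",   # field names here are hypothetical
    "agent_id": "researcher",
    "spans": [
        {
            "name": "invocation",
            "status": "ok",
            "duration_ms": 812,
            "input": {"prompt": "What is the capital of France?"},
            "output": {"response": "Paris."},
            "llm_usage": {
                "model": "gpt-4o-mini",
                "provider": "openai",
                "input_tokens": 12,
                "output_tokens": 5,
                "total_tokens": 17,
            },
        }
    ],
}
await studio.send_traces(trace_data)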

Graph Visualization

The Graph tab renders an ADK-style visual graph of each trace's span hierarchy:

  • Agent nodes: green filled ellipses with robot emoji
  • Tool nodes: rounded rectangles with wrench emoji
  • LLM nodes: rounded rectangles with brain emoji
  • Retrieval nodes: rounded rectangles with magnifier emoji
  • Dark background (#333537), left-to-right layout, bezier curve edges with arrowheads
  • Click any node to inspect its span details

Chat Interface

The Chat tab provides a built-in chat interface to test any registered agent directly from the browser. Select an agent, type a message, and see the response rendered with full markdown support.

Feature Flags & Parameters

Control agent behavior at runtime without redeploying:

Feature Flags — toggle capabilities on/off:

delegate_enabled = await studio.is_flag_enabled("delegate_to_writer")
if not delegate_enabled:
    return {"response": research, "delegated": False}

Parameters — dynamic configuration values:

writer_style = await studio.get_param("writer_style", default="markdown")

Both support global (apply to all agents) and per-agent scopes. The SDK caches values with a 30-second TTL for performance.
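
The two calls compose naturally inside an entrypoint. A sketch using the documented is_flag_enabled and get_param helpers (the delegation logic and do_research helper are illustrative):

@app.entrypoint
async def invoke(payload, context):
    research = do_research(payload)  # placeholder for your agent's real work

    # Both lookups hit the 30-second cache after the first call.
    if await studio.is_flag_enabled("delegate_to_writer"):
        style = await studio.get_param("writer_style", default="markdown")
        return {"response": research, "delegated": True, "style": style}

    return {"response": research, "delegated": False}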

Prompt Management

The Prompts tab lets you manage prompt templates with:

  • Version history
  • Variable extraction
  • Active version selection
  • Model configuration per version

Agents can fetch prompts at runtime:

prompt = await studio.get_prompt("research_system_prompt")
content = prompt["content"]
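
Template variables can then be filled in before use. A sketch assuming standard Python {variable} placeholders in the prompt content (the placeholder syntax is an assumption):

prompt = await studio.get_prompt("research_system_prompt")
system_prompt = prompt["content"].format(topic="quantum computing")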

Evaluation (LLM-as-Judge)

The Eval tab runs automated quality evaluation on traces using an LLM-as-judge approach (GPT-4o-mini). It scores each trace on four metrics:

| Metric | Description |
|--------|-------------|
| Answer Relevancy | How relevant the response is to the input question |
| Coherence | Logical flow and consistency of the response |
| Completeness | Whether the response fully addresses the query |
| Harmfulness | Detection of harmful or inappropriate content |

Each metric is scored 0.0 to 1.0. Results are stored per trace and displayed with visual score bars.

StudioClient SDK

The StudioClient provides a full async Python SDK for agents to interact with KiboStudio:

from kiboup.studio import StudioClient

studio = StudioClient(
    studio_url="http://127.0.0.1:8000",
    agent_id="my-agent",
    agent_name="My Agent",
    agent_endpoint="http://127.0.0.1:8081",
    capabilities=["chat"],
    heartbeat_interval_s=15,
)

async with studio:
    # Discovery
    agents = await studio.list_agents()

    # Feature flags
    enabled = await studio.is_flag_enabled("my_flag")

    # Parameters
    value = await studio.get_param("my_param", default="fallback")

    # Prompts
    prompt = await studio.get_prompt("system_prompt")

    # Traces
    await studio.send_traces(trace_data)

The client can also be embedded directly into KiboAgentClient or KiboMcpClient:

async with KiboAgentClient(
    base_url="http://localhost:8080",
    studio_url="http://localhost:8000",
    agent_id="my-agent",
) as client:
    result = await client.invoke({"prompt": "Hello"})
    flags = await client.studio.get_flags()

Examples

| Example | File | Description |
|---------|------|-------------|
| HTTP Server | examples/agent_server_example.py | LangGraph + GPT-4o-mini with LLMUsage tracking |
| HTTP Client | examples/agent_client_example.py | Async client with health check and invocation |
| SSE Streaming Server | examples/stream_server_example.py | Token-by-token streaming via SSE |
| SSE Streaming Client | examples/stream_client_example.py | Interactive CLI chat with streaming |
| MCP Server | examples/mcp_server_example.py | Tool-based MCP server with ask and summarize |
| MCP Client | examples/mcp_client_example.py | MCP client listing tools and calling them |
| A2A Server | examples/a2a_server_example.py | A2A agent with skill registration |
| A2A Client | examples/a2a_client_example.py | A2A client reading agent card and sending messages |
| Chainlit UI | examples/chainlit_example.py | Web chat interface with streaming |
| KiboStudio | examples/studio_example.py | Launch the developer console |
| mTLS | examples/mtls_example.py | mTLS server + client with auto-generated certificates |
| Multi-Agent | examples/multi_agent_example.py | Researcher + Writer agents with discovery, flags, and params |

Run any example:

OPENAI_API_KEY=sk-... uv run python examples/<example_file>.py

License

MIT
