Local-first repair loop for debugging and improving AI agents.

Maintainers

kayba

Project description

Kyoko

Kyoko is the all-in-one, fully local tool for debugging and improving your AI agents.

Point it at any agent you're building (instrument it with OpenTelemetry or the SDKs), or plug straight into CLI agents you already run like Codex, Claude Code, OpenClaw, and Hermes. Kyoko captures what your agent actually does and runs a closed repair loop over it: it analyses real runs into a living state reflection of the system, files recurring and generalised failures as issues, drafts concrete fixes, and proves them with replay and evals before anything ships. Everything runs on your machine (traces, database, and dashboard), and any model or external call is opt-in.

Most agent tooling stops at showing you traces; you still have to read them, guess what went wrong, write the fix, and hope it didn't break something else. Kyoko closes that gap end to end, in one place.

That state reflection is cumulative: Kyoko keeps learning from traces, issues, fixes, replays, and evals, so it can surface the problems humans would not think to measure by hand while still respecting the detectors and judges you explicitly choose.

Why Kyoko

OpenTelemetry-native. Ingests OTLP/GenAI spans; SDKs and importers for the rest.
Runs on your coding agent. Codex, Claude Code, OpenClaw, or Hermes does the analysis and authors fixes through its own CLI login, so Kyoko needs no separate API keys and adds no extra spend.
Fully local. SQLite + loopback UI. Nothing leaves your machine unless you opt in to an external call.
Cumulative analysis. Builds a state reflection from traces, issues, evals, and fixes, so fixes for recurring behavior become more accurate over time.
Measured, not guessed. Failure rate comes from real evals, not from status flags on traces.
Safe by default. No change ships without passing the gate. No shortcuts, anywhere.
Zero-fuss. One kyoko CLI, near-zero deps, --json everywhere. No server, no cloud.

The loop

        ┌─────────────────┐           ┌─────────────────┐
        │  1. Analyse     │ ────────▶ │  2. Issues      │
        │  traces in      │           │  recurring      │
        │                 │           │  failures       │
        └─────────────────┘           └─────────────────┘
                 ▲                            │
                 │ measure                    │ accept
                 │                            ▼
        ┌─────────────────┐  ┌──────┐ ┌─────────────────┐
        │  4. Evals       │◀─┤ gate ├─│  3. Proposals   │
        │  failure rate   │  └──────┘ │  candidate      │
        │                 │   apply   │  fixes          │
        └─────────────────┘           └─────────────────┘

   Gate = checks · replay · policy · locks; a fix applies only if it passes.
   Evals score the result and feed the next analysis; the loop tightens.

Analyse: Kyoko reads your agent's traces for you, diagnoses what went wrong, and updates a state reflection of how the system behaves over time. No manual log-digging.
Issues: it surfaces the failures to you automatically as first-class, evidence-backed issues, grouped by category and severity so you fix the pattern, not the symptom, including problems you did not predefine as a metric.
Proposals: each accepted issue becomes a concrete fix (to context/skills or the agent's harness), then runs the gate: generated checks, bounded replay, autonomy policy, and human locks. It applies only if it passes.
Evals: a measurement plane of deterministic detectors and LLM judges scores runs into a failure rate, before vs after. Failure is decided by evals, never by a status flag on a trace.
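
To make "failure is decided by evals, never by a status flag" concrete, here is a minimal sketch of a before-versus-after failure rate computed from eval verdicts. The `EvalResult` record, its field names, and the evaluator labels are hypothetical and are not Kyoko's data model; a run counts as failed here if any eval flags it, which is one possible aggregation rule among several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """Hypothetical eval verdict for one run (not Kyoko's schema)."""
    run_id: str
    evaluator: str  # e.g. a deterministic detector or an LLM judge
    failed: bool


def failure_rate(results: list[EvalResult]) -> float:
    """Fraction of evaluated runs that at least one eval marked as failed."""
    runs = {r.run_id for r in results}
    if not runs:
        return 0.0
    failed_runs = {r.run_id for r in results if r.failed}
    return len(failed_runs) / len(runs)


# Verdicts collected before a fix was applied (illustrative data only).
before = [
    EvalResult("run-1", "detector:tool-loop", True),
    EvalResult("run-1", "judge:answer-quality", False),
    EvalResult("run-2", "detector:tool-loop", False),
    EvalResult("run-3", "judge:answer-quality", True),
    EvalResult("run-4", "detector:tool-loop", False),
]
# Verdicts collected after the fix (illustrative data only).
after = [
    EvalResult("run-5", "detector:tool-loop", False),
    EvalResult("run-6", "judge:answer-quality", True),
    EvalResult("run-7", "detector:tool-loop", False),
    EvalResult("run-8", "judge:answer-quality", False),
]

print(f"before: {failure_rate(before):.2f}, after: {failure_rate(after):.2f}")
```

Note that no trace status appears anywhere in the calculation: only eval verdicts feed the rate.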

Run it your way. The same loop, the same gate. You pick the autonomy level:

Human-in-the-loop: Kyoko surfaces issues and drafts fixes, and you review and approve each change before it applies.
Fully autonomous: the policy auto-applies any change that clears replay, evals, and human locks, and parks anything that doesn't for you to look at.

Either way, nothing behavior-changing ships without passing the gate.
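
The two autonomy levels can be pictured as one decision function over the same gate result. This is only a sketch: `Mode`, `GateResult`, the outcome strings, and the `human_approved` flag are hypothetical names chosen for illustration, not Kyoko's policy API.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    HUMAN_IN_THE_LOOP = "human-in-the-loop"
    FULLY_AUTONOMOUS = "fully-autonomous"


@dataclass(frozen=True)
class GateResult:
    """Hypothetical summary of what the gate found for one change."""
    replay_passed: bool
    evals_passed: bool
    touches_locked_target: bool


def decide(mode: Mode, gate: GateResult, human_approved: bool = False) -> str:
    """Return 'apply', 'await-review', or 'park' for a candidate change."""
    cleared = (
        gate.replay_passed
        and gate.evals_passed
        and not gate.touches_locked_target
    )
    if not cleared:
        return "park"  # a change that fails the gate is never applied
    if mode is Mode.FULLY_AUTONOMOUS:
        return "apply"  # the policy applies cleared changes on its own
    # Human-in-the-loop: a cleared change still waits for explicit approval.
    return "apply" if human_approved else "await-review"


clean = GateResult(replay_passed=True, evals_passed=True, touches_locked_target=False)
locked = GateResult(replay_passed=True, evals_passed=True, touches_locked_target=True)

print(decide(Mode.FULLY_AUTONOMOUS, clean))
print(decide(Mode.HUMAN_IN_THE_LOOP, clean))
print(decide(Mode.FULLY_AUTONOMOUS, locked))
```

The mode only changes who gives the final go-ahead; the gate check runs first in both branches.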

Quick demo

Kyoko requires Python 3.12 or newer. From this checkout:

python3 -m pip install .
kyoko demo --db /tmp/kyoko-demo.db --json
kyoko serve --db /tmp/kyoko-demo.db

Open http://127.0.0.1:8765.

The demo runs the full loop against bundled fixture data, so it needs no live model, framework adapter, or replay server.
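
Because the CLI accepts --json, its output can be consumed from a script. The helper below is a minimal sketch that runs a command and parses its standard output as JSON. It assumes the command prints a single JSON document, which this README does not confirm for any specific kyoko command, and it makes no assumption about the keys in that output. The stand-in command exists only so the sketch runs without Kyoko installed.

```python
import json
import subprocess
import sys


def run_json(command: list[str]) -> object:
    """Run a command and parse its standard output as one JSON document."""
    completed = subprocess.run(
        command,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)


# Stand-in for a real CLI call: a child Python process that prints JSON.
stand_in = [sys.executable, "-c", "import json; print(json.dumps({'ok': True}))"]
print(run_json(stand_in))
```

With Kyoko installed, the demo command from this section could be passed instead, for example `run_json(["kyoko", "demo", "--db", "/tmp/kyoko-demo.db", "--json"])`. Inspect the returned structure before relying on any particular field.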

Install

git clone https://github.com/kayba-ai/kyoko.git
cd kyoko
python3 -m pip install .

Kyoko is published on PyPI, so you can instead use an isolated CLI install:

pipx install kyoko

See docs/INSTALL.md for uv, editable installs, the installer script, upgrades, and common setup fixes.

Use it in your project

Run this from the root of an agent project:

kyoko project-bootstrap \
  --project-dir . \
  --profile-name my-agent \
  --source-framework generic-python \
  --replay-framework generic-python \
  --mcp-target codex

project-bootstrap writes .kyoko/kyoko.db, source/replay scaffolds, MCP config, operator presets, and .kyoko/NEXT_STEPS.md. Then check readiness and start the dashboard:

kyoko doctor --db .kyoko/kyoko.db --safe-smokes --json
kyoko serve --db .kyoko/kyoko.db

Point telemetry at Kyoko with the Python or TypeScript SDK, a generated adapter, or an importer. See Getting Started for the end-to-end walkthrough.

What you get

Telemetry in: Python SDK, TypeScript SDK, generated source adapters, OTLP/GenAI JSON, Hermes import, OpenClaw import.
Diagnosis: per-trace and cumulative analysis that folds behavior into a state reflection, then turns recurring or generalised weaknesses into evidence-backed issues with category, severity, and the spans where they happened.
Fixes out: issues become validated LearningProposal records, authored by you or an operator agent (Codex, Claude, or a generic command).
Verification: generated checks plus bounded replay against external commands or managed loopback replay servers.
Measurement: an evidence-only eval plane (deterministic detectors and LLM-judge evals) for what you choose to measure, alongside analysis that surfaces unmeasured patterns from observed behavior.
Surfaces: a local dashboard, a JSON-everywhere CLI, and a stdio MCP server for coding agents, all sharing the same gated apply path.

Area	Supported paths
Source telemetry	Python SDK, TypeScript SDK, generated source adapters, OTLP/GenAI JSON, Hermes import, OpenClaw import
Replay	External replay commands, managed HTTP replay servers, generated replay scaffolds
Operator agents	Codex, Claude, generic command adapters, local presets
Agent clients	Dashboard, JSON CLI, stdio MCP server
Framework scaffolds	Generic Python/TypeScript, LangGraph, Pydantic AI, OpenAI Agents, CrewAI, Hermes, OpenClaw, AI SDK

See docs/INTEGRATIONS.md and examples/README.md.

How safety works

Every behavior-changing path (operator output, imports, MCP tools, and kyoko improve) flows through one gate:

Validate the proposal against its schema.
Resolve the evidence it references.
Generate or select checks.
Run bounded replay and the checks.
Evaluate the autonomy policy.
Enforce human locks on protected targets.
Apply context or harness changes only if the gate allows it.
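
The steps above can be read as an ordered pipeline that stops at the first failure. The following is a minimal sketch of that idea only: the `Proposal` fields, the lock and evidence sets, and the step predicates are hypothetical stand-ins, not Kyoko's implementation or schema.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Proposal:
    """Hypothetical stand-in for a proposed change (not Kyoko's schema)."""
    target: str
    evidence_ids: list[str]
    checks: list[Callable[[], bool]]
    replay_passed: bool
    policy_allows: bool


LOCKED_TARGETS = {"harness/payments.py"}   # hypothetical human lock
KNOWN_EVIDENCE = {"trace-17", "trace-42"}  # hypothetical evidence store

# Each step is a name plus a predicate; order mirrors the list above.
STEPS: list[tuple[str, Callable[[Proposal], bool]]] = [
    ("validate schema", lambda p: bool(p.target)),
    ("resolve evidence",
     lambda p: bool(p.evidence_ids) and set(p.evidence_ids) <= KNOWN_EVIDENCE),
    ("generate or select checks", lambda p: len(p.checks) > 0),
    ("run replay and checks",
     lambda p: p.replay_passed and all(check() for check in p.checks)),
    ("evaluate autonomy policy", lambda p: p.policy_allows),
    ("enforce human locks", lambda p: p.target not in LOCKED_TARGETS),
]


def run_gate(proposal: Proposal) -> tuple[bool, str]:
    """Run every step in order; report the first step that fails."""
    for name, passes in STEPS:
        if not passes(proposal):
            return False, name
    return True, "apply"  # the write happens only after every step passed


ok = Proposal("skills/search.md", ["trace-17"], [lambda: True], True, True)
locked = Proposal("harness/payments.py", ["trace-42"], [lambda: True], True, True)

print(run_gate(ok))
print(run_gate(locked))
```

Keeping the apply step outside the list of predicates means there is a single code path to a write, and it is reachable only when nothing before it failed.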

Context writes update Kyoko-managed skills and delivery rules; harness writes create reviewable patch transactions against an explicit workspace root. Replay server URLs are loopback-only unless you pass --allow-remote-server, and evidence exported to prompts, MCP, API, or bundles is redacted by default. See docs/SECURITY.md and docs/ARCHITECTURE.md.
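
A loopback-only rule for replay server URLs could look like the sketch below. It is an illustration of the idea, not Kyoko's validation code: the function names are hypothetical, the `allow_remote_server` parameter merely mirrors the --allow-remote-server flag, and treating the literal name `localhost` as loopback without resolving it is an assumption of this sketch.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """Return True if the URL's host is a loopback address or 'localhost'."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; resolving it would need DNS, so treat as remote.
        return False


def check_replay_url(url: str, allow_remote_server: bool = False) -> str:
    """Accept loopback URLs; accept remote ones only with an explicit opt-in."""
    if is_loopback_url(url) or allow_remote_server:
        return url
    raise ValueError(f"replay server must be loopback unless opted in: {url}")


for candidate in ("http://127.0.0.1:8765", "http://[::1]:9000/replay",
                  "https://example.com/replay"):
    print(candidate, is_loopback_url(candidate))
```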

Documentation

Getting Started: demo, project bootstrap, telemetry, inspection, and the repair loop.
Install: install paths, verification, data location, and common setup fixes.
Integrations: source adapters, replay adapters, operator agents, MCP, and SDKs.
CLI Reference: grouped command reference.
Architecture: runtime model, data model, and the gate.
Security: local data, loopback serving, tokens, redaction, and write boundaries.
Scope: what v0 is and is not.
Development: tests, dashboard bundle, release smoke, and contract artifacts.

Specs, schemas, fixtures, and design decisions live under docs/ as reference contracts.

Contributing

Issues and pull requests are welcome. See CONTRIBUTING.md for local setup, the test and validation gates, and how to submit a change. To report a security vulnerability, follow SECURITY.md rather than opening a public issue.

Repository layout

kyoko/              Python import package, CLI runtime, dashboard/API, bundled assets
frontend/           React/Vite dashboard source
sdk/typescript/     Dependency-free TypeScript telemetry SDK
examples/           Source and replay hook examples
scripts/            Installer, release smoke, fixture and artifact helpers
tests/              Python unittest suite and CLI contract tests
docs/               User docs plus specs, schemas, fixtures, and decisions

License

Apache-2.0. See LICENSE.

Built by Kayba and the open-source community.

Release history

0.1.2: Jun 10, 2026
0.1.1: Jun 9, 2026
0.1.0 (this version): Jun 8, 2026

Download files

Source Distribution

kyoko-0.1.0.tar.gz (2.9 MB), uploaded Jun 8, 2026, Source

Built Distribution

kyoko-0.1.0-py3-none-any.whl (2.2 MB), uploaded Jun 8, 2026, Python 3

File details

Details for the file kyoko-0.1.0.tar.gz.

File metadata

Download URL: kyoko-0.1.0.tar.gz
Upload date: Jun 8, 2026
Size: 2.9 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kyoko-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`1576d4de9f824b555a347116654933c32c7da357acc1283108fbc873bd996c66`
MD5	`6f05d8da33a6faff9c6318578c4c79d5`
BLAKE2b-256	`e74ae346f273b4d0f7af176a8f1d3f5b904f01b094a90349a267344653c7a843`
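
A downloaded file can be checked against the SHA256 digest listed above with Python's standard library. This is a generic sketch: the function names are illustrative, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, published_hex: str) -> bool:
    """Compare a file's digest with a published hex digest, ignoring case."""
    return sha256_of(path) == published_hex.strip().lower()
```

For example, with the source distribution saved in the current directory, `matches_published(Path("kyoko-0.1.0.tar.gz"), "1576d4de9f824b555a347116654933c32c7da357acc1283108fbc873bd996c66")` should return True for an intact download.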


Provenance

The following attestation bundles were made for kyoko-0.1.0.tar.gz:

Publisher: release.yml on kayba-ai/Kyoko

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: kyoko-0.1.0.tar.gz
- Subject digest: 1576d4de9f824b555a347116654933c32c7da357acc1283108fbc873bd996c66
- Sigstore transparency entry: 1758614329
- Sigstore integration time: Jun 8, 2026
Source repository:
- Permalink: kayba-ai/Kyoko@a69091fa67a8d3cca640b98ca6b61fc65968af52
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/kayba-ai
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@a69091fa67a8d3cca640b98ca6b61fc65968af52
- Trigger Event: push

File details

Details for the file kyoko-0.1.0-py3-none-any.whl.

File metadata

Download URL: kyoko-0.1.0-py3-none-any.whl
Upload date: Jun 8, 2026
Size: 2.2 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kyoko-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`dce592f4462c143ea54147289ccde85f406ae2bffac96eac5ae0fdb2bd13f32b`
MD5	`382fd26e44ae28b01ec37642d5f65bdb`
BLAKE2b-256	`0ffe0d6e1061cd8b4d0c7abee11d6deb6c660aa49d18d180fc27007e1be20313`


Provenance

The following attestation bundles were made for kyoko-0.1.0-py3-none-any.whl:

Publisher: release.yml on kayba-ai/Kyoko

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: kyoko-0.1.0-py3-none-any.whl
- Subject digest: dce592f4462c143ea54147289ccde85f406ae2bffac96eac5ae0fdb2bd13f32b
- Sigstore transparency entry: 1758614398
- Sigstore integration time: Jun 8, 2026
Source repository:
- Permalink: kayba-ai/Kyoko@a69091fa67a8d3cca640b98ca6b61fc65968af52
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/kayba-ai
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@a69091fa67a8d3cca640b98ca6b61fc65968af52
- Trigger Event: push
