Open compatibility and evaluation layer for OSS TTS.
TimbreGrid
TimbreGrid is a local-first compatibility, evaluation, and routing layer for OpenAI-compatible open-source text-to-speech systems.
Open-source TTS has many promising models and servers, but comparing and integrating them is still awkward: every project has different install steps, voice names, audio formats, runtime assumptions, and benchmark claims. TimbreGrid makes those pieces explicit through manifests, raw benchmark JSON, conformance checks, diagnostic reports, routing policy, and a small reference /v1/audio/speech gateway.
Status: early MVP. The fake gateway, manifest registry, diagnostic CLI, benchmark CLI, conformance suite, benchmark validation, optional Kokoro adapter, and optional KittenTTS adapter are implemented. Chatterbox and Qwen3-TTS are currently manifest-only examples.
Use TimbreGrid when you want to:
- diagnose whether an OpenAI-compatible TTS server behaves well enough for basic /v1/audio/speech usage;
- run a local OSS TTS model behind a reference OpenAI-compatible speech endpoint;
- compare adapters with reviewable raw benchmark output instead of hand-written summary tables;
- describe model capabilities, licenses, voices, formats, and runtime requirements in validated manifests;
- test another TTS server's basic OpenAI-compatible speech behavior;
- route model="auto" requests by benchmark evidence, manifest capabilities, response format, availability, and license policy;
- keep local/custom voice provenance and consent metadata explicit before cloning workflows become first-class.
Current Value
| Area | What works now |
|---|---|
| Diagnostics | timbregrid doctor produces a compatibility report for any OpenAI-compatible /v1/audio/speech server. |
| Runtime | fake:tts, optional kokoro:82m, and optional kitten-tts:nano-0.8 can serve POST /v1/audio/speech. |
| Evaluation | Benchmark suites emit raw JSON, and validation recomputes counts, failures, latency averages, memory, and prompt coverage. |
| Compatibility | Basic OpenAI-compatible speech conformance checks and Python OpenAI SDK compatibility tests are included. |
| Registry | YAML manifests generate registry/index.json, the support matrix, release assets, and a hosted latest registry. |
| Routing | model="auto" can choose from available models using benchmark data and manifest policy. |
| Voice governance | Builtin and local voices are discoverable through GET /v1/audio/voices; local/custom voices require consent and provenance metadata. |
Quickstart
Try the compatibility stack without downloading model weights. The built-in fake adapter is deterministic and exists so manifests, benchmarks, conformance, routing, Docker, and SDK compatibility can be tested quickly.
Run the latest published alpha container:
docker run --rm -p 8889:8889 ghcr.io/kiyeonjeon21/timbregrid:alpha
Then call the OpenAI-compatible speech endpoint:
curl http://localhost:8889/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"model":"fake:tts","input":"Hello from TimbreGrid","voice":"alloy","response_format":"wav"}' \
--output speech.wav
The generated speech.wav is test audio, not natural speech. Install the optional Kokoro or KittenTTS adapters below when you want real local synthesis.
For a real voice demo, see docs/real-audio-demo.md.
To diagnose an existing OpenAI-compatible TTS server instead of TimbreGrid's gateway, see docs/external-server-proof.md.
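For callers that prefer Python over curl, the same request can be sketched with the standard library alone. This is a minimal illustration, not a TimbreGrid API: build_speech_request and synthesize are hypothetical helper names, and the base URL assumes the gateway from the docker command above is listening on port 8889.

```python
import json
import urllib.request


def build_speech_request(base_url, model, text, voice, response_format="wav"):
    """Build the same POST request that the curl example sends."""
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(request, output_path, timeout=30):
    """Send the request and write the returned audio bytes to disk."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        audio = response.read()
    with open(output_path, "wb") as handle:
        handle.write(audio)
    return len(audio)
```

With the gateway running, passing build_speech_request("http://localhost:8889/v1", "fake:tts", "Hello from TimbreGrid", "alloy") to synthesize with the output path speech.wav should produce the same file as the curl call.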
Run the CLI directly from GitHub while the PyPI alpha is being prepared:
uvx --from git+https://github.com/kiyeonjeon21/timbregrid timbregrid --help
After the PyPI alpha is published, use:
uvx --from timbregrid==0.1.0a2 timbregrid --help
Run From Source
Install dependencies with uv, then validate the built-in model manifest:
uv sync --all-groups
uv run timbregrid manifest validate manifests/fake-tts.yaml
Run a benchmark and validate the raw JSON output:
uv run timbregrid bench fake:tts \
--suite realtime-agent \
--hardware-profile generic-ci \
--output /tmp/timbregrid-bench.json
uv run timbregrid bench validate /tmp/timbregrid-bench.json
Start the reference gateway from source:
uv run timbregrid serve --model fake:tts --port 8889
Call the same speech endpoint:
curl http://localhost:8889/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"model":"fake:tts","input":"Hello from TimbreGrid","voice":"alloy","response_format":"wav"}' \
--output speech.wav
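To confirm that the saved file is a well-formed WAV rather than an error body, a quick check with Python's wave module can help. This is a sketch under assumptions: describe_wav is a hypothetical helper, and the wave module only reads PCM-encoded WAV data, so other encodings would raise wave.Error.

```python
import io
import wave


def describe_wav(data):
    """Return basic properties of WAV bytes; raises wave.Error if malformed."""
    with wave.open(io.BytesIO(data), "rb") as reader:
        frames = reader.getnframes()
        rate = reader.getframerate()
        return {
            "channels": reader.getnchannels(),
            "sample_width_bytes": reader.getsampwidth(),
            "sample_rate_hz": rate,
            # Duration follows from frame count divided by sample rate.
            "duration_s": frames / rate if rate else 0.0,
        }
```

Reading speech.wav in binary mode and passing the bytes to describe_wav shows the channel count, sample width, sample rate, and duration of the response.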
What Works Today
- Validate TimbreGrid model manifests from YAML, including link, license, runtime, format, and consent consistency.
- Generate a static registry index and support matrix from manifests.
- Run fake-adapter benchmark suites and write raw JSON output.
- Validate benchmark JSON examples and submissions for model ids, suites, hardware profiles, prompts, and aggregate metrics.
- Produce a basic compatibility diagnosis for an OpenAI-compatible TTS server with timbregrid doctor.
- Run basic OpenAI-compatible speech conformance checks.
- Serve POST /v1/audio/speech for fake:tts.
- Expose builtin and local voice metadata through GET /v1/audio/voices and enforce known voice metadata during synthesis.
- Verify Python OpenAI SDK compatibility and run a direct SDK example against the local gateway.
- Route model="auto" requests by benchmark data, manifest capabilities, response format, availability, and license policy.
- Run kokoro:82m when optional Kokoro dependencies and espeak-ng are installed.
- Run kitten-tts:nano-0.8 when optional KittenTTS dependencies are installed.
Not included yet:
- Chatterbox or Qwen3-TTS inference adapters.
- SQLite voice metadata storage or custom voice synthesis.
- Published PyPI package. The package metadata is prepared for the next alpha, but the first upload still requires maintainer Trusted Publishing setup and workflow execution.
- SSE audio streaming.
- Pipecat or LiveKit integration examples; Open WebUI is currently a docs-only TTS guide.
Model Registry
Manifests live under manifests/. Generated registry artifacts include registry/index.json and the support matrix.
A hosted copy of the latest published registry is available, and versioned registry artifacts are also attached to GitHub releases.
Regenerate and check them with:
uv run timbregrid registry build
uv run timbregrid registry build --check
uv run timbregrid registry audit --skip-network
Required PR checks use --skip-network so external GitHub or Hugging Face outages do not block unrelated changes. Release and scheduled registry checks run the full URL audit.
Known model entries:
- fake:tts: deterministic test adapter.
- kokoro:82m: optional executable adapter via timbregrid[kokoro].
- kitten-tts:nano-0.8: optional executable edge/CPU adapter when KittenTTS is installed from a source checkout.
- chatterbox:tts: manifest-only expressive/cloning example.
- qwen3-tts:0.6b-base: manifest-only multilingual/cloning example.
Benchmarks
Benchmark suites are defined for:
- realtime-agent
- narration
- multilingual
- cloning
- dialogue
Example:
uv run timbregrid bench fake:tts \
--suite realtime-agent \
--hardware-profile cpu \
--output /tmp/fake.json
uv run timbregrid bench validate /tmp/fake.json
The checked-in benchmark under benchmarks/examples is deterministic fake data. It documents the JSON format and supports tests; it is not a hardware performance claim.
Raw real-hardware submissions live under benchmarks/submissions. The checked-in Kokoro and KittenTTS Apple Silicon artifacts are contributor machine runs, not general performance guarantees.
Benchmark validation recomputes run counts, failures, failure rate, average latency metrics, peak memory, and suite prompts before accepting a submission.
See docs/benchmarking.md and docs/benchmark-submissions.md.
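The idea behind recomputing aggregates during validation can be sketched as follows. The field names used here (runs, summary, ok, latency_ms, memory_mb) are assumptions made for illustration and are not TimbreGrid's actual benchmark schema; the real validator also checks suite prompts, which this sketch omits.

```python
from statistics import fmean


def recompute_summary(runs):
    """Recompute aggregate metrics from raw per-run records."""
    total = len(runs)
    failures = sum(1 for run in runs if not run["ok"])
    # Only successful runs contribute to the latency average.
    latencies = [run["latency_ms"] for run in runs if run["ok"]]
    return {
        "run_count": total,
        "failure_count": failures,
        "failure_rate": failures / total if total else 0.0,
        "avg_latency_ms": fmean(latencies) if latencies else None,
        "peak_memory_mb": max((run["memory_mb"] for run in runs), default=None),
    }


def summary_mismatches(report, tolerance=1e-6):
    """List summary fields whose claimed value differs from the recomputed one."""
    expected = recompute_summary(report["runs"])
    mismatches = []
    for key, value in expected.items():
        claimed = report["summary"].get(key)
        if isinstance(value, float) and isinstance(claimed, (int, float)):
            # Floats are compared with a tolerance to absorb rounding.
            if abs(value - claimed) > tolerance:
                mismatches.append(key)
        elif claimed != value:
            mismatches.append(key)
    return mismatches
```

Because the summary is derived from the raw runs instead of being trusted as submitted, a hand-edited average or failure count shows up as a mismatch.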
Examples
Run the OpenAI Python SDK example against a local gateway:
uv run python examples/openai_sdk_speech.py
For KittenTTS:
TIMBREGRID_MODEL=kitten-tts:nano-0.8 \
TIMBREGRID_VOICE=Jasper \
TIMBREGRID_OUTPUT=/tmp/kitten-sdk.wav \
uv run python examples/openai_sdk_speech.py
For Kokoro:
TIMBREGRID_MODEL=kokoro:82m \
TIMBREGRID_VOICE=af_heart \
TIMBREGRID_OUTPUT=/tmp/kokoro-sdk.wav \
uv run python examples/openai_sdk_speech.py
Doctor And Conformance
For a user-facing diagnosis of a TTS server, run:
uv run timbregrid doctor http://localhost:8889/v1 \
--model fake:tts \
--voice alloy \
--response-format wav \
--output doctor.json
The doctor command wraps conformance results into a compatibility report for basic Open WebUI-style and Pipecat OpenAI TTS-style readiness. It is not a full integration certification.
Example against an external Speaches server:
uv run timbregrid doctor http://localhost:8000/v1 \
--model speaches-ai/Kokoro-82M-v1.0-ONNX \
--voice af_heart \
--response-format wav \
--output demo-assets/speaches-doctor.json
Run conformance checks against any OpenAI-compatible TTS server:
uv run timbregrid conformance http://localhost:8889/v1 \
--endpoint audio.speech \
--model fake:tts \
--voice alloy \
--response-format wav \
--output conformance.json
See docs/doctor.md and docs/conformance.md.
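As a rough illustration of what a basic speech conformance check might look at, the sketch below inspects a response's status, content type, and leading bytes. It is not TimbreGrid's actual conformance suite: check_speech_response is a hypothetical function, and the expectation of an audio/* content type is an assumption about well-behaved servers.

```python
def check_speech_response(status, content_type, body, response_format="wav"):
    """Return a list of problems found in a /v1/audio/speech response."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    if not body:
        problems.append("response body is empty")
    # Ignore parameters such as "; charset=..." when comparing media types.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if not media_type.startswith("audio/"):
        problems.append(f"unexpected content type: {media_type or 'missing'}")
    if response_format == "wav" and body:
        # A RIFF/WAVE container starts with "RIFF", four size bytes, then "WAVE".
        if body[:4] != b"RIFF" or body[8:12] != b"WAVE":
            problems.append("body does not start with a RIFF/WAVE header")
    return problems
```

An empty list means the response passed these three checks; a JSON error body returned with a 500 status would fail all of them.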
Routing
Explain how model="auto" is resolved:
uv run timbregrid route explain \
--model auto \
--voice alloy \
--response-format wav \
--purpose realtime \
--license-policy commercial_ok \
--target-latency-ms 350 \
--hardware-profile generic-ci
If matching benchmark data is missing, routing falls back to manifest capabilities and model availability.
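The resolution order described above can be sketched in a few lines. This is an assumption-laden illustration, not TimbreGrid's routing code: the Candidate fields and the resolve_auto function are hypothetical, and the tie-breaking rule (fastest measured model within the latency target, otherwise the first eligible model) is one plausible policy among several.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    model_id: str
    available: bool
    formats: tuple
    license_tags: tuple
    avg_latency_ms: Optional[float] = None  # None when no benchmark matches


def resolve_auto(candidates, response_format, license_policy, target_latency_ms):
    """Pick a model id for model="auto", or None if nothing is eligible."""
    # Availability, response format, and license policy act as hard filters.
    eligible = [
        c for c in candidates
        if c.available
        and response_format in c.formats
        and license_policy in c.license_tags
    ]
    if not eligible:
        return None
    within_target = [
        c for c in eligible
        if c.avg_latency_ms is not None and c.avg_latency_ms <= target_latency_ms
    ]
    if within_target:
        return min(within_target, key=lambda c: c.avg_latency_ms).model_id
    # No benchmark meets the target (or none exists): fall back to the first
    # eligible model, using manifest capabilities and availability alone.
    return eligible[0].model_id
```

Treating benchmark data as a preference and never as a requirement means a model without submissions can still serve requests.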
Voice Metadata
List builtin voices and local voice records:
curl "http://localhost:8889/v1/audio/voices?model=fake:tts"
Local voice records can be supplied as JSON without committing private assets:
TIMBREGRID_VOICE_CATALOG=/path/to/voices.json uv run timbregrid serve
Speech requests must use a known builtin voice or a local catalog voice for the selected model. Custom or local voices must set builtin=false, set source to local or custom, include consent="granted", and provide a non-empty provenance value. TimbreGrid exposes these records for governance and discovery; it does not synthesize cloned voices yet.
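The four rules for local and custom voice records can be expressed as a small validator. The field names builtin, source, consent, and provenance come from the description above; voice_record_problems is a hypothetical helper, and treating provenance as a string is an assumption about the catalog format.

```python
def voice_record_problems(record):
    """Return the governance rules a local or custom voice record violates."""
    problems = []
    if record.get("builtin") is not False:
        problems.append("builtin must be false")
    if record.get("source") not in ("local", "custom"):
        problems.append("source must be 'local' or 'custom'")
    if record.get("consent") != "granted":
        problems.append("consent must be 'granted'")
    provenance = record.get("provenance")
    # Whitespace-only values are treated as empty.
    if not isinstance(provenance, str) or not provenance.strip():
        problems.append("provenance must be a non-empty string")
    return problems
```

Running a check like this over each entry before pointing TIMBREGRID_VOICE_CATALOG at a file can catch incomplete records early.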
Optional KittenTTS Adapter
Install KittenTTS dependencies from a source checkout:
uv sync --all-groups
uv pip install \
"kittentts @ https://github.com/KittenML/KittenTTS/releases/download/0.8.1/kittentts-0.8.1-py3-none-any.whl" \
"onnxruntime<1.26"
To keep Kokoro installed in the same environment, include the Kokoro extra in the sync command before installing KittenTTS:
uv sync --all-groups --extra kokoro
uv pip install \
"kittentts @ https://github.com/KittenML/KittenTTS/releases/download/0.8.1/kittentts-0.8.1-py3-none-any.whl" \
"onnxruntime<1.26"
Try the adapter:
uv run timbregrid models inspect kitten-tts:nano-0.8
uv run timbregrid manifest validate manifests/kitten-tts-nano-0.8.yaml
uv run timbregrid serve --model kitten-tts:nano-0.8 --port 8889
Use response_format="wav" or response_format="pcm" and a KittenTTS voice such as Jasper.
See docs/kitten-tts.md for the packaging caveat behind this explicit install path.
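A pcm response carries raw samples with no header, so most players need it wrapped in a container first. The sketch below does that with Python's wave module. pcm_to_wav is a hypothetical helper, and the sample rate, channel count, and sample width are assumptions the caller must supply, since this README does not state what the adapter emits.

```python
import io
import wave


def pcm_to_wav(pcm, sample_rate_hz, channels=1, sample_width_bytes=2):
    """Wrap raw PCM samples in a WAV container and return the WAV bytes."""
    frame_size = channels * sample_width_bytes
    if len(pcm) % frame_size:
        raise ValueError("PCM length is not a whole number of frames")
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(channels)
        writer.setsampwidth(sample_width_bytes)
        writer.setframerate(sample_rate_hz)
        writer.writeframes(pcm)
    return buffer.getvalue()
```

If the wrapped audio plays too fast or too slow, the sample rate passed in is the first value to check.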
Integrations
TimbreGrid can be used as a local OpenAI-compatible TTS backend for tools that call /v1/audio/speech.
- Open WebUI integration guide
- Open WebUI + TimbreGrid compose example
- External server proof with Speaches
Integration examples are intentionally narrow until streaming and broader gateway compatibility stabilize.
Release Status
The alpha release path publishes GitHub release assets, a hosted registry, and a lightweight GHCR image. PyPI publishing is prepared through Trusted Publishing, with maintainer setup notes in docs/pypi-publishing.md. Release maintainer notes live in docs/release-runbook.md.
Docker
Run the published alpha image:
docker run --rm -p 8889:8889 ghcr.io/kiyeonjeon21/timbregrid:alpha
Or build the fake gateway container locally:
docker compose up --build
The Docker image is intentionally lightweight. It does not include Kokoro, espeak-ng, or PyTorch-class model dependencies.
Optional Kokoro Adapter
Install optional Kokoro dependencies:
uv sync --all-groups --extra kokoro
Kokoro may also require the system espeak-ng package. On macOS:
brew install espeak-ng
Try the adapter:
uv run timbregrid models inspect kokoro:82m
uv run timbregrid manifest validate manifests/kokoro-82m.yaml
uv run timbregrid bench kokoro:82m \
--suite realtime-agent \
--hardware-profile cpu \
--output /tmp/kokoro.json
uv run timbregrid serve --model kokoro:82m --port 8889
Use response_format="wav" and a Kokoro voice such as af_heart.
Roadmap
Detailed phases and checklists live in docs/roadmap.md. Public status is intentionally conservative:
| Phase | Status | Focus |
|---|---|---|
| Phase 0: Spec-first planning | complete | Manifest schema, speech models, benchmark suites, conformance cases, example manifests. |
| Phase 1: Useful OSS before runtime | partial | Manifest validation, benchmark CLI, conformance tooling, submission validation, and Kokoro/KittenTTS Apple Silicon artifacts exist; broader hardware coverage still needs contributors. |
| Phase 2: Reference gateway MVP | partial | Fake gateway, optional Kokoro and KittenTTS adapters, Docker smoke path, and benchmark-aware routing work; expressive/cloning adapters are next. |
| Phase 3: Community registry | partial | Local registry, generated support matrix, release assets, hosted latest registry, PR/issue templates, deterministic registry audit, scheduled URL audit, and CI checks exist; checksum validation and broader install smoke coverage remain. |
| Phase 4: Voice governance and integrations | partial | Local voice records, consent/provenance metadata, /v1/audio/voices, synthesis-time voice checks, a real-audio demo guide, and Open WebUI docs exist; Pipecat and LiveKit examples remain. |
Near-term next work:
- Publish external-server doctor proof guides, starting with Speaches and then LocalAI where feasible.
- Use timbregrid doctor reports to harden integration examples (Open WebUI guide and compose example wired to doctor preflight; Pipecat and LiveKit pending).
- Collect more real raw benchmark examples for CPU, CUDA, and additional Apple Silicon environments.
- Implement an expressive or cloning adapter, likely Chatterbox first.
- Harden checksum metadata, optional install smoke coverage, and the first PyPI alpha release.
Contributing
See CONTRIBUTING.md. Manifest, benchmark, conformance, and adapter contributions are welcome.
Before opening a PR, run:
uv run pytest
uv run timbregrid registry audit --skip-network
uv run timbregrid registry build --check
for benchmark in benchmarks/examples/*.json; do uv run timbregrid bench validate "$benchmark"; done
for benchmark in benchmarks/submissions/*.json; do uv run timbregrid bench validate "$benchmark"; done
Security
Do not submit cloned voice samples, private datasets, API keys, or consent records to this repository. See SECURITY.md.
License
TimbreGrid core is licensed under the MIT License. See LICENSE.
Upstream model code and weights keep their own licenses as listed in each model manifest.