Open-source MCP server that reads text aloud locally using Windows SAPI without API keys or cloud services.

Text to Speech MCP Server

Text to Speech is an open-source Model Context Protocol (MCP) server that lets AI assistants read text aloud on the user's computer. On Windows it uses the built-in Speech API (SAPI) by default, so no API key, account, subscription, or cloud text-to-speech service is required.

The server exposes one model-controlled tool:

speak_text(text: string)

Use it for user-provided text, assistant answers, accessibility workflows, or spoken progress updates while an agent works.
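For readers curious what a call to this tool looks like on the wire, here is a minimal sketch of the JSON-RPC message an MCP client would send over stdio. It uses the standard MCP `tools/call` method; the helper name `build_speak_text_request` is hypothetical, and a real client would first complete the MCP initialization handshake, which is not shown.

```python
import json


def build_speak_text_request(text: str, request_id: int = 1) -> str:
    """Serialize an MCP tools/call request for speak_text as a single JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "speak_text", "arguments": {"text": text}},
    }
    # The stdio transport is newline-delimited, so the message must stay on one line.
    # json.dumps escapes any newline inside the text as \n.
    return json.dumps(message)


if __name__ == "__main__":
    print(build_speak_text_request("The deployment completed successfully."))
```

In practice the MCP client builds and sends this message itself; the sketch only illustrates the shape of the request.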

Features

Local playback through Windows SAPI by default.
No cloud API and no API key for the default setup.
FIFO playback: concurrent requests are spoken one at a time, in order.
Blocking tool completion: each call returns after its audio finishes.
Bounded input and queue sizes to prevent unbounded resource use.
Temporary generated WAV files are removed after playback by default.
Standard MCP stdio transport through the official Python SDK.
Optional Piper, Transformers MMS, and local HTTP backends for advanced users.
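The FIFO, blocking, and bounded-queue behavior listed above can be sketched with the standard library. This is an illustration under stated assumptions, not the project's actual implementation: the class name `SpeechQueue`, the `play` callback, and the return message are hypothetical, and only the limit of 32 pending requests comes from the Tool Contract.

```python
import queue
import threading

MAX_PENDING = 32  # documented queue limit; everything else here is illustrative


class SpeechQueue:
    """Plays requests one at a time, in arrival order, and blocks each caller."""

    def __init__(self, play, max_pending=MAX_PENDING):
        self._play = play  # hypothetical callback that speaks one string
        self._jobs = queue.Queue(maxsize=max_pending)
        threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            job = self._jobs.get()  # queue.Queue hands out jobs in FIFO order
            try:
                self._play(job["text"])
            except Exception as exc:  # keep the worker alive; report to the caller
                job["error"] = exc
            finally:
                job["done"].set()

    def speak(self, text):
        job = {"text": text, "done": threading.Event(), "error": None}
        try:
            self._jobs.put_nowait(job)  # reject instead of growing without bound
        except queue.Full:
            raise RuntimeError("speech queue is full") from None
        job["done"].wait()  # return only after playback has finished
        if job["error"] is not None:
            raise job["error"]
        return "Finished speaking."
```

A single worker thread is what gives the one-at-a-time guarantee; callers on other threads simply wait on their own event.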

The MCP server source is open source under the MIT License. Windows SAPI is a proprietary component included with Windows; it is not an open-source speech engine.

Requirements

Windows 10 or Windows 11 for the zero-configuration SAPI backend.
Python 3.10 or newer.
An MCP client such as Codex, Claude Desktop, or another compatible client.
uv/uvx is recommended for package-based MCP installation.

Install

The package is published on PyPI, so an MCP client can be configured to run:

uvx text-to-speech-mcp

For Codex, add this to ~/.codex/config.toml:

[mcp_servers.text_to_speech]
command = "uvx"
args = ["text-to-speech-mcp"]
startup_timeout_sec = 30
tool_timeout_sec = 300
enabled = true

Restart the MCP client after changing its configuration.
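Clients that read a JSON configuration with an `mcpServers` object (Claude Desktop is one such client) need the same command and arguments in a different shape. The sketch below generates that entry. The server key `text_to_speech` is an assumption carried over from the Codex example, and the location of the configuration file depends on the client, so it is not shown.

```python
import json


def mcp_servers_entry(name: str = "text_to_speech") -> dict:
    """Build an mcpServers entry that launches the server through uvx."""
    return {
        "mcpServers": {
            name: {
                "command": "uvx",
                "args": ["text-to-speech-mcp"],
            }
        }
    }


if __name__ == "__main__":
    # Merge the printed object into the client's existing configuration file.
    print(json.dumps(mcp_servers_entry(), indent=2))
```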

Install from source

git clone https://github.com/engr-faizanali/text-to-speech-mcp.git
cd text-to-speech-mcp
python -m pip install -e .

Then configure the client to run the text-to-speech-mcp command directly, without uvx.

Prompt Examples

Read arbitrary text:

Use the Text to Speech tool to read aloud: The deployment completed successfully.

Read the final answer:

Use the Text to Speech tool to read your final response aloud before displaying it.

Read visible intermediate progress updates in order:

Use the text_to_speech MCP server's speak_text tool for spoken progress updates.

For every meaningful intermediate update that you display to me:
1. Write a concise, natural-language version of the update.
2. Call speak_text with that text.
3. Wait for the call to finish before producing or speaking the next update.
4. Then display the same update in text.

Also read the final answer aloud before displaying it. Never narrate hidden
reasoning, chain-of-thought, secrets, credentials, raw tool output, terminal
logs, or source code. Do not invoke speech calls in parallel. If the tool is
unavailable, continue normally in text and report the failure once.

In this prompt, text_to_speech is the client-side server name defined in the Codex configuration ([mcp_servers.text_to_speech]). Other clients may display a different namespace, but the tool name remains speak_text.

Tool Contract

Field	Value
Tool name	`speak_text`
Input	`text`, required string, 1-10,000 characters
Result	Completion message after local playback finishes
Ordering	FIFO, one active playback at a time
Queue limit	32 pending requests
Network use with SAPI	None
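The input bound in the table can be expressed as a small validation step. This is a sketch only: the function name `validate_text` and the error messages are hypothetical, and whether the real server trims whitespace or rejects whitespace-only input is not stated here, so the sketch does neither.

```python
MAX_TEXT_CHARS = 10_000  # upper bound from the tool contract


def validate_text(text):
    """Return text unchanged if it is a string of 1 to 10,000 characters."""
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    if not 1 <= len(text) <= MAX_TEXT_CHARS:
        raise ValueError(
            f"text must be 1 to {MAX_TEXT_CHARS} characters, got {len(text)}"
        )
    return text
```

Rejecting oversized input before it reaches the queue keeps a single request from occupying playback for an unbounded time.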

The tool is model-controlled under MCP. The user decides when to ask the model to call it, and the MCP client may show or require approval for tool calls.

Privacy

With the default SAPI backend, text is passed from the MCP client to a local Python process and then to Windows speech components. It is not sent to this project, an external API, or a cloud TTS provider. Generated WAV files are written under %TEMP%\text-to-speech-mcp and deleted after playback unless TEXT_TO_SPEECH_KEEP_AUDIO=true is set.
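The cleanup behavior described above can be sketched as a context manager. The folder name and the `TEXT_TO_SPEECH_KEEP_AUDIO` variable come from this README; the helper names and the exact parsing of the variable (case-insensitive comparison with `true`) are assumptions, not the project's confirmed logic.

```python
import os
import tempfile
import wave
from contextlib import contextmanager
from pathlib import Path


def keep_audio(env=os.environ):
    """Assumed parsing: only the value 'true' (any case) keeps the files."""
    return env.get("TEXT_TO_SPEECH_KEEP_AUDIO", "").strip().lower() == "true"


@contextmanager
def temporary_wav(env=os.environ):
    """Yield a WAV path under <temp>/text-to-speech-mcp and delete it afterwards."""
    folder = Path(tempfile.gettempdir()) / "text-to-speech-mcp"
    folder.mkdir(parents=True, exist_ok=True)
    fd, name = tempfile.mkstemp(suffix=".wav", dir=folder)
    os.close(fd)  # only the unique path is needed; the caller reopens it
    path = Path(name)
    try:
        yield path
    finally:
        if not keep_audio(env):
            path.unlink(missing_ok=True)


def write_silence(path, seconds=0.1, rate=16000):
    """Stand-in for speech synthesis: write a short silent mono 16-bit WAV."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
```

Placing the deletion in a `finally` clause means the file is removed even when synthesis or playback raises.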

Do not ask an AI assistant to speak secrets, credentials, private keys, hidden reasoning, or sensitive tool output.

Optional Backends

The default requires no configuration:

TEXT_TO_SPEECH_BACKEND = "sapi"

Advanced users can set TEXT_TO_SPEECH_BACKEND to piper, transformers_mms, or http. These options require their own local model, binary, Python dependencies, or endpoint. See PACKAGE_MCP.md.

Legacy CODEX_TTS_BACKEND and CODEX_TTS_FALLBACK_BACKEND environment variables remain supported for compatibility.
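One way the backend could be resolved from the environment is sketched below. The variable names and backend values come from this README, but the precedence (new variable first, then the legacy one, then `sapi`) is an assumption, and `CODEX_TTS_FALLBACK_BACKEND` is left out because its semantics are not described here.

```python
import os

SUPPORTED_BACKENDS = ("sapi", "piper", "transformers_mms", "http")


def resolve_backend(env=os.environ):
    """Pick a backend name; the precedence order here is assumed, not documented."""
    raw = (
        env.get("TEXT_TO_SPEECH_BACKEND")
        or env.get("CODEX_TTS_BACKEND")  # legacy name kept for compatibility
        or "sapi"  # documented zero-configuration default
    )
    name = raw.strip().lower()
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(f"unsupported backend: {raw!r}")
    return name
```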

Development

python -m pip install -e ".[dev]"
python -m unittest discover -s tests -v
python scripts/validate_release.py --online
python -m build
python -m twine check dist/*

See MCP_PUBLIC_RELEASE.md for the full release process.

Standards

MCP transport: stdio
MCP tool implementation: official Python MCP SDK
Registry metadata: server.json using the 2025-12-11 schema
Package registry: PyPI
Registry ownership marker: this README's mcp-name comment
Registry namespace: io.github.engr-faizanali/text-to-speech

License

MIT. See LICENSE.

Project details

Maintainers

faizan_ali

Release history

0.2.1 (Jun 22, 2026)

0.2.0 (Jun 22, 2026), this version

Download files

Source Distribution

text_to_speech_mcp-0.2.0.tar.gz (12.2 kB)

Uploaded Jun 22, 2026 Source

Built Distribution

text_to_speech_mcp-0.2.0-py3-none-any.whl (9.6 kB)

Uploaded Jun 22, 2026 Python 3

File details

Details for the file text_to_speech_mcp-0.2.0.tar.gz.

File metadata

Download URL: text_to_speech_mcp-0.2.0.tar.gz
Upload date: Jun 22, 2026
Size: 12.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for text_to_speech_mcp-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`41c0e7ee4ffe9b45ed90afb3d085cb5d43e86b4e11be91d66c4a20ad1ea11597`
MD5	`ea7291d29a84aa203343c91b3c6525b4`
BLAKE2b-256	`72bb3bde1947f9310f0b1a3061d172610a19e3a6a25bdc3d3307f11ae3f87874`
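A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below assumes the archive has already been downloaded; the function names and the local path in the usage comment are hypothetical.

```python
import hashlib

# SHA256 digest published above for text_to_speech_mcp-0.2.0.tar.gz
EXPECTED_SHA256 = "41c0e7ee4ffe9b45ed90afb3d085cb5d43e86b4e11be91d66c4a20ad1ea11597"


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected=EXPECTED_SHA256):
    """Compare the file's SHA256 with the expected hex digest, ignoring case."""
    return sha256_of(path) == expected.strip().lower()


# Usage, with a hypothetical download location:
# matches_digest("downloads/text_to_speech_mcp-0.2.0.tar.gz")
```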

Provenance

The following attestation bundles were made for text_to_speech_mcp-0.2.0.tar.gz:

Publisher: publish.yml on Engr-FaizanAli/text-to-speech-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: text_to_speech_mcp-0.2.0.tar.gz
- Subject digest: 41c0e7ee4ffe9b45ed90afb3d085cb5d43e86b4e11be91d66c4a20ad1ea11597
- Sigstore transparency entry: 1906790196
- Sigstore integration time: Jun 22, 2026
Source repository:
- Permalink: Engr-FaizanAli/text-to-speech-mcp@8c41bab5c63f82ffb41e3d598ff9a538ca84a549
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/Engr-FaizanAli
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8c41bab5c63f82ffb41e3d598ff9a538ca84a549
- Trigger Event: push

File details

Details for the file text_to_speech_mcp-0.2.0-py3-none-any.whl.

File metadata

Download URL: text_to_speech_mcp-0.2.0-py3-none-any.whl
Upload date: Jun 22, 2026
Size: 9.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for text_to_speech_mcp-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f7eed41af6835ee160f4e988cb80f88d9681206386dc29fea770d2a7701c73b3`
MD5	`22df5789e56eb6a70352771a0fb8e193`
BLAKE2b-256	`2d2d44568fddc8a34b7bdf95d93a2384dbf38a671da58cd51fdcb41794f2da32`

Provenance

The following attestation bundles were made for text_to_speech_mcp-0.2.0-py3-none-any.whl:

Publisher: publish.yml on Engr-FaizanAli/text-to-speech-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: text_to_speech_mcp-0.2.0-py3-none-any.whl
- Subject digest: f7eed41af6835ee160f4e988cb80f88d9681206386dc29fea770d2a7701c73b3
- Sigstore transparency entry: 1906790291
- Sigstore integration time: Jun 22, 2026
Source repository:
- Permalink: Engr-FaizanAli/text-to-speech-mcp@8c41bab5c63f82ffb41e3d598ff9a538ca84a549
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/Engr-FaizanAli
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8c41bab5c63f82ffb41e3d598ff9a538ca84a549
- Trigger Event: push

text-to-speech-mcp 0.2.0
