AWS Polly TTS MCP server and CLI for language learning

langlearn-tts

Text-to-speech toolkit for language learning. Provides both an MCP server (for Claude Desktop) and a CLI with identical functionality.

Currently supports AWS Polly; ElevenLabs and OpenAI TTS backends are planned. The goal is to let you pick the TTS provider that fits your setup, so that no AWS account is required once the alternative backends ship.

Features

  • Single synthesis — convert text to MP3 in any supported language
  • Batch synthesis — synthesize multiple texts, optionally merged into one file
  • Pair synthesis — stitch two languages together: [English audio] [pause] [L2 audio]
  • Pair batch — batch process vocabulary lists as stitched pairs
  • Auto-play — MCP tools play audio immediately after synthesis via afplay
  • Configurable speech rate — default 90% speed for learner-friendly pacing
  • 93 voices, 41 languages — any voice from the AWS Polly voice list works out of the box
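
The configurable speech rate maps naturally onto SSML, which Polly accepts as input. The following is a minimal sketch, assuming the rate is applied through a prosody tag; the helper name build_ssml is hypothetical and the package may implement this differently.

```python
from xml.sax.saxutils import escape


def build_ssml(text: str, rate_percent: int = 90) -> str:
    """Wrap text in SSML that requests a given speaking rate.

    Hypothetical helper, not part of the documented langlearn-tts API.
    """
    # Polly documents 20%-200% as the valid range for the prosody rate.
    if not 20 <= rate_percent <= 200:
        raise ValueError("rate_percent must be between 20 and 200")
    # Escape &, < and > so arbitrary input text cannot break the markup.
    body = escape(text)
    return f'<speak><prosody rate="{rate_percent}%">{body}</prosody></speak>'


print(build_ssml("Guten Morgen"))
```

A rate of 100 would leave the voice at its normal speed; the default of 90 mirrors the learner-friendly pacing described above.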

Prerequisites

  • Python 3.13+
  • uv — Python package manager
  • ffmpeg — required for audio stitching
  • AWS credentials configured (see AWS Configuration)

Installation

From PyPI (recommended)

# Using uv
uv tool install langlearn-tts

# Or using pip
pip install langlearn-tts

From source (development)

git clone https://github.com/jmf-pobox/langlearn-tts-mcp.git
cd langlearn-tts-mcp
uv sync

Verify the installation:

langlearn-tts --help

AWS Configuration

The tool requires AWS credentials with polly:SynthesizeSpeech and polly:DescribeVoices permissions. Configure using any standard AWS method:

Option A — AWS CLI (recommended):

aws configure

Option B — Environment variables:

export AWS_ACCESS_KEY_ID=your-key
export AWS_SECRET_ACCESS_KEY=your-secret
export AWS_DEFAULT_REGION=us-east-1

Option C — Credentials file (~/.aws/credentials):

[default]
aws_access_key_id = your-key
aws_secret_access_key = your-secret
region = us-east-1
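
The two permissions named at the top of this section can be written as an IAM policy document. This sketch only builds and prints the JSON; the wildcard resource is an assumption made for brevity, and how you attach the policy to a user or role is up to you.

```python
import json


def polly_policy() -> dict:
    """Return an IAM policy document granting the two Polly actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["polly:SynthesizeSpeech", "polly:DescribeVoices"],
                # Assumption: no resource scoping in this sketch.
                "Resource": "*",
            }
        ],
    }


print(json.dumps(polly_policy(), indent=2))
```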

ffmpeg

Audio stitching (pairs, merged batches) requires ffmpeg:

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg
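
To illustrate what stitching involves, here is one way to join two clips with a silent gap using the ffmpeg command line. This is a sketch under assumptions, not necessarily how langlearn-tts invokes ffmpeg: the function name, the 500 ms default pause, and the 22050 Hz sample rate are all illustrative.

```python
from pathlib import Path


def stitch_command(first: Path, second: Path, out: Path,
                   pause_ms: int = 500, sample_rate: int = 22050) -> list[str]:
    """Build an ffmpeg argv that joins first + silence + second."""
    pause_s = f"{pause_ms / 1000:.3f}"
    return [
        "ffmpeg", "-y",
        "-i", str(first),
        # Generated silence: -t limits the otherwise endless anullsrc input.
        "-f", "lavfi", "-t", pause_s,
        "-i", f"anullsrc=r={sample_rate}:cl=mono",
        "-i", str(second),
        # Concatenate the three audio streams in input order.
        "-filter_complex", "[0:a][1:a][2:a]concat=n=3:v=0:a=1[out]",
        "-map", "[out]",
        str(out),
    ]


print(" ".join(stitch_command(Path("en.mp3"), Path("de.mp3"), Path("pair.mp3"))))
```

Passing the resulting list to subprocess.run(cmd, check=True) would execute it; building the argv as a list avoids shell quoting problems with file names.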

Claude Desktop Setup

Automatic (recommended)

langlearn-tts install

This registers the MCP server with Claude Desktop. Options:

  • --output-dir PATH — custom audio output directory (default: ~/Claude-Audio)
  • --uvx-path PATH — override the uvx binary path

Restart Claude Desktop after running install.

Manual

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "langlearn-tts": {
      "command": "/absolute/path/to/uvx",
      "args": ["--from", "langlearn-tts", "langlearn-tts-server"],
      "env": {
        "POLLY_OUTPUT_DIR": "/absolute/path/to/output/directory"
      }
    }
  }
}

Claude Desktop does not inherit your shell PATH. All paths must be absolute. Find your uvx path with which uvx.

The POLLY_OUTPUT_DIR environment variable sets the default output directory. If unset, files are saved to ~/Claude-Audio/.

Restart Claude Desktop after editing the config.

Troubleshooting

langlearn-tts doctor

Checks Python version, ffmpeg, AWS credentials, Polly access, uvx, Claude Desktop config, and output directory. Required checks must pass: the command exits with code 1 if any of them fails. Optional checks only display a marker next to their result.
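
The required-versus-optional exit code rule can be sketched like this. The CheckResult type and the split between required and optional checks are assumptions for illustration; the README does not say which checks fall into which group.

```python
import shutil
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    required: bool


def exit_code(results: list[CheckResult]) -> int:
    """Return 1 if any required check failed, otherwise 0."""
    return 1 if any(r.required and not r.passed for r in results) else 0


def run_local_checks() -> list[CheckResult]:
    """Run a few checks that need no network access (assumed grouping)."""
    return [
        CheckResult("python>=3.13", sys.version_info >= (3, 13), required=True),
        CheckResult("ffmpeg", shutil.which("ffmpeg") is not None, required=True),
        CheckResult("uvx", shutil.which("uvx") is not None, required=False),
    ]
```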

Voices

Any voice from the AWS Polly voice list is supported. Voice names are case-insensitive. The tool queries the Polly API on first use and caches the result.
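
A minimal sketch of case-insensitive lookup with a first-use cache follows. The fetch_voices function is a stand-in with hypothetical data; in the real tool the list comes from the Polly API.

```python
from functools import lru_cache


def fetch_voices() -> tuple[str, ...]:
    """Stand-in for the Polly voice query (hypothetical data)."""
    return ("Joanna", "Daniel", "Léa", "Tatyana")


@lru_cache(maxsize=1)
def voice_index() -> dict[str, str]:
    # Built once on first use; later calls reuse the cached mapping.
    return {name.casefold(): name for name in fetch_voices()}


def resolve_voice(name: str) -> str:
    """Map a user-supplied name to the canonical voice name."""
    try:
        return voice_index()[name.casefold()]
    except KeyError:
        raise ValueError(f"unknown voice: {name!r}") from None
```

Using casefold rather than lower keeps the comparison correct for accented names such as léa.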

Common voices for language learning:

Voice    Language            Engine
-------  ------------------  --------
joanna   English (US)        neural
matthew  English (US)        neural
daniel   German              neural
vicki    German              neural
lucia    Spanish (European)  neural
lupe     Spanish (US)        neural
léa      French              neural
tatyana  Russian             standard
seoyeon  Korean              neural
takumi   Japanese            neural
zhiyu    Chinese (Mandarin)  neural

The engine (neural, standard, generative, long-form) is selected automatically — neural preferred when available.
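
Automatic engine selection can be sketched as a preference list. Only "neural first" comes from this README; the order of the remaining engines here is an assumption, and pick_engine is a hypothetical name.

```python
# Assumed fallback order after neural; the README only states that
# neural is preferred when available.
PREFERENCE = ("neural", "standard", "generative", "long-form")


def pick_engine(supported: list[str]) -> str:
    """Return the most preferred engine a voice supports."""
    for engine in PREFERENCE:
        if engine in supported:
            return engine
    raise ValueError("voice reports no supported engine")


print(pick_engine(["standard", "neural"]))
```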

CLI Usage

# Single synthesis
langlearn-tts synthesize "Guten Morgen" --voice daniel -o morning.mp3

# Custom speech rate (percentage, default 90)
langlearn-tts synthesize "Привет" --voice tatyana --rate 70 -o privet.mp3

# Pair: English + German stitched with a pause
langlearn-tts synthesize-pair "good morning" "Guten Morgen" \
  --voice1 joanna --voice2 daniel -o pair.mp3

# Batch from JSON file (["hello", "world", "good morning"])
langlearn-tts synthesize-batch words.json -d output/

# Batch merged into single file
langlearn-tts synthesize-batch words.json -d output/ --merge --pause 800

# Pair batch from JSON file ([["strong", "stark"], ["house", "Haus"]])
langlearn-tts synthesize-pair-batch pairs.json -d output/
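
The batch commands read plain JSON files in the shapes shown in the comments above: a flat list of strings for synthesize-batch and a list of two-item lists for synthesize-pair-batch. This sketch writes both files; the helper name is hypothetical.

```python
import json
from pathlib import Path


def write_batch_inputs(directory: Path) -> tuple[Path, Path]:
    """Write words.json and pairs.json in the formats shown above."""
    words = ["hello", "world", "good morning"]
    pairs = [["strong", "stark"], ["house", "Haus"]]
    words_path = directory / "words.json"
    pairs_path = directory / "pairs.json"
    # ensure_ascii=False keeps non-Latin text readable in the file.
    words_path.write_text(json.dumps(words, ensure_ascii=False), encoding="utf-8")
    pairs_path.write_text(json.dumps(pairs, ensure_ascii=False), encoding="utf-8")
    return words_path, pairs_path
```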

MCP Tools

All four tools are available in Claude Desktop once the server is configured:

Tool                   Description
---------------------  ---------------------------------
synthesize             Single text to MP3
synthesize_batch       Multiple texts, optionally merged
synthesize_pair        Two texts stitched with a pause
synthesize_pair_batch  Multiple pairs, optionally merged

Each tool accepts auto_play (default: true) to play audio immediately after synthesis.
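
For readers curious about what Claude Desktop sends, an MCP tool invocation is a JSON-RPC tools/call request. The sketch below builds one; the argument names text and voice are assumptions, since only auto_play is documented here.

```python
import json


def tool_call(tool: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP tools/call request as JSON-RPC 2.0."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        },
        ensure_ascii=False,
    )


# "text" and "voice" are hypothetical parameter names.
request = tool_call(
    "synthesize",
    {"text": "Guten Morgen", "voice": "daniel", "auto_play": False},
)
print(request)
```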

Roadmap

Provider Abstraction Layer

A TTSProvider protocol that decouples CLI/MCP tools from any specific backend. Enables --provider flag, provider auto-detection from API keys, and provider-specific doctor checks.
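
Since this layer is still on the roadmap, the following is only a sketch of what such a protocol and key-based auto-detection might look like. The method signature and the precedence order are assumptions; the environment variable names are the ones listed for the planned backends.

```python
from collections.abc import Mapping
from pathlib import Path
from typing import Protocol


class TTSProvider(Protocol):
    """Structural interface a backend would satisfy (hypothetical shape)."""

    name: str

    def synthesize(self, text: str, voice: str, output: Path) -> Path: ...


def detect_provider(env: Mapping[str, str]) -> str:
    """Pick a provider name from whichever API key is present."""
    # Assumed precedence: ElevenLabs, then OpenAI, then Polly as fallback.
    if env.get("ELEVENLABS_API_KEY"):
        return "elevenlabs"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    return "polly"


print(detect_provider({"OPENAI_API_KEY": "sk-example"}))
```

A Protocol keeps the CLI and MCP tools independent of any backend class hierarchy: any object with a matching synthesize method would qualify.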

ElevenLabs Backend

Highest voice quality. 29+ languages, 5,000+ voices, voice cloning. Setup: pip install langlearn-tts[elevenlabs] + ELEVENLABS_API_KEY env var. Free tier: 10K chars/month.

OpenAI TTS Backend

Broadest adoption — most users already have an OpenAI key. 6 built-in voices, 50+ languages. Setup: pip install langlearn-tts[openai] + OPENAI_API_KEY env var. $15/1M chars (tts-1).

Development

# Install with dev dependencies
uv sync --all-extras

# Run tests
uv run pytest tests/ -v

# Linting and formatting
uv run ruff check src/ tests/
uv run ruff format src/ tests/

# Type checking
uv run mypy src/ tests/
uv run pyright src/ tests/

License

MIT
