AWS Polly TTS MCP server and CLI for language learning
langlearn-tts
Generate audio flashcards and vocabulary drills from text. Ask Claude to synthesize words and phrases in any language, or batch-process entire vocabulary lists from the command line. Audio is slowed to 90% speed by default so learners can hear pronunciation clearly.
The pair mode is the core workflow: give it an English word and its translation, and it produces a single MP3 — [English audio] [pause] [target language audio] — ready for Anki, spaced repetition, or passive listening.
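The pair layout can be pictured as three audio segments joined end to end. The sketch below builds an ffmpeg command that does this with the concat filter and a generated silence segment. It is an illustration only, not the tool's actual implementation: the function name, the 500 ms pause, and the 24 kHz sample rate are all assumptions.

```python
from pathlib import Path


def build_pair_command(
    first: Path,
    second: Path,
    output: Path,
    pause_ms: int = 500,       # assumed pause length, not a documented default
    sample_rate: int = 24000,  # assumed; should match the synthesized audio
) -> list[str]:
    """Return an ffmpeg argv that joins: first clip, silence, second clip."""
    pause_s = f"{pause_ms / 1000:.3f}"
    return [
        "ffmpeg", "-y",
        "-i", str(first),
        # anullsrc generates silence; -t placed before -i limits its duration
        "-f", "lavfi", "-t", pause_s, "-i", f"anullsrc=r={sample_rate}:cl=mono",
        "-i", str(second),
        # concat the three audio inputs in order into one output stream
        "-filter_complex", "[0:a][1:a][2:a]concat=n=3:v=0:a=1[out]",
        "-map", "[out]",
        str(output),
    ]


print(" ".join(build_pair_command(Path("en.mp3"), Path("de.mp3"), Path("pair.mp3"))))
```

The command is only constructed here, not executed, so the sketch runs without ffmpeg installed.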
Available as both a Claude Desktop MCP server (ask Claude to generate audio in conversation) and a CLI with identical functionality. Currently supports AWS Polly; ElevenLabs and OpenAI TTS backends are planned so you can pick the provider that fits your setup.
Features
- Single synthesis — convert text to MP3 in any supported language
- Batch synthesis — synthesize multiple texts, optionally merged into one file
- Pair synthesis — stitch two languages together: [English] [pause] [L2]
- Pair batch — batch-process vocabulary lists as stitched pairs
- Auto-play — MCP tools play audio immediately after synthesis
- Configurable speech rate — default 90% for learner-friendly pacing
- 93 voices, 41 languages — any AWS Polly voice works out of the box
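Amazon Polly accepts SSML, and its prosody tag takes a percentage rate, which is one way a 90% default could be expressed. The helper below is a hypothetical sketch (the name build_ssml is not from the project) and may not match how langlearn-tts applies the rate internally.

```python
from xml.sax.saxutils import escape


def build_ssml(text: str, rate_percent: int = 90) -> str:
    """Wrap text in SSML that sets the speaking rate as a percentage."""
    if rate_percent <= 0:
        raise ValueError("rate_percent must be positive")
    # escape() handles &, < and > so the input text cannot break the markup
    return f'<speak><prosody rate="{rate_percent}%">{escape(text)}</prosody></speak>'


print(build_ssml("Guten Morgen"))
print(build_ssml("Привет", rate_percent=70))
```

When sending markup like this to Polly, the request has to declare the text type as SSML rather than plain text.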
Quick Start
1. Install uv (Python package manager)
If you don't have uv yet:
# macOS / Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
uv manages Python versions automatically — you don't need to install Python separately.
2. Install langlearn-tts
uv tool install langlearn-tts
This installs the langlearn-tts CLI and langlearn-tts-server MCP server globally.
3. Install ffmpeg
Required for audio stitching (pairs, merged batches). Single synthesis works without it.
# macOS (requires Homebrew — install from https://brew.sh if needed)
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg
# Windows
winget install ffmpeg
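To confirm that ffmpeg is reachable before trying pair or merged output, a quick PATH lookup is enough. This is a minimal sketch using only the standard library; find_tool is an illustrative name, and the doctor command described later performs the project's own check.

```python
import shutil
from typing import Optional


def find_tool(name: str = "ffmpeg") -> Optional[str]:
    """Return the path of an executable found on PATH, or None if missing."""
    return shutil.which(name)


path = find_tool()
print(f"ffmpeg found at {path}" if path else "ffmpeg not found; stitching will not work")
```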
4. Configure AWS credentials
The tool uses AWS Polly, which requires an AWS account with polly:SynthesizeSpeech and polly:DescribeVoices permissions.
Option A — AWS CLI (recommended):
Install the AWS CLI, then:
aws configure
Enter your Access Key ID, Secret Access Key, and region (e.g., us-east-1).
Option B — Environment variables:
export AWS_ACCESS_KEY_ID=your-key
export AWS_SECRET_ACCESS_KEY=your-secret
export AWS_DEFAULT_REGION=us-east-1
Option C — Credentials file (~/.aws/credentials):
[default]
aws_access_key_id = your-key
aws_secret_access_key = your-secret
region = us-east-1
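If it is unclear which of these options is active on a machine, the following sketch reports where static keys would be found without contacting AWS. It assumes the usual lookup order in which environment variables take precedence over the shared credentials file; credential_source is a hypothetical helper, not part of langlearn-tts.

```python
import configparser
import os
from pathlib import Path
from typing import Mapping, Optional


def credential_source(
    env: Mapping[str, str] = os.environ,
    credentials_file: Path = Path.home() / ".aws" / "credentials",
    profile: str = "default",
) -> Optional[str]:
    """Report where static AWS keys would come from, or None if not found."""
    # Environment variables are checked before the shared credentials file.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment"
    parser = configparser.ConfigParser()
    parser.read(credentials_file)  # a missing file is skipped silently
    if parser.has_option(profile, "aws_access_key_id"):
        return f"credentials file [{profile}]"
    return None


print(credential_source() or "no static credentials found")
```

This only checks that keys exist, not that they are valid or carry the Polly permissions listed above.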
5. Verify
langlearn-tts doctor
All required checks should show ✓. Fix any that show ✗ before continuing.
From source (development)
git clone https://github.com/jmf-pobox/langlearn-tts-mcp.git
cd langlearn-tts-mcp
uv sync --all-extras
uv run langlearn-tts --help
Claude Desktop Setup
Automatic (recommended)
langlearn-tts install
This registers the MCP server with Claude Desktop. Options:
- --output-dir PATH — custom audio output directory (default: ~/Claude-Audio)
- --uvx-path PATH — override the uvx binary path
Restart Claude Desktop after running install.
Manual
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (the macOS location; on Windows the file is %APPDATA%\Claude\claude_desktop_config.json):
{
"mcpServers": {
"langlearn-tts": {
"command": "/absolute/path/to/uvx",
"args": ["--from", "langlearn-tts", "langlearn-tts-server"],
"env": {
"LANGLEARN_TTS_OUTPUT_DIR": "/absolute/path/to/output/directory"
}
}
}
}
Claude Desktop does not inherit your shell PATH. All paths must be absolute. Find your uvx path with which uvx.
The LANGLEARN_TTS_OUTPUT_DIR environment variable sets the default output directory. If unset, files are saved to ~/Claude-Audio/.
Restart Claude Desktop after editing the config.
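For anyone scripting the manual setup, the entry above can be generated rather than typed. The sketch below builds the same JSON structure and merges it into an existing config dictionary without dropping other servers. The helper names are illustrative, and writing the result back to the config file is left out on purpose.

```python
import json
import shutil
from pathlib import Path


def server_entry(uvx_path: str, output_dir: Path) -> dict:
    """Build the langlearn-tts entry for claude_desktop_config.json."""
    return {
        "command": uvx_path,
        "args": ["--from", "langlearn-tts", "langlearn-tts-server"],
        "env": {"LANGLEARN_TTS_OUTPUT_DIR": str(output_dir)},
    }


def merge_entry(config: dict, entry: dict) -> dict:
    """Return a copy of config with the entry added, keeping other servers."""
    servers = {**config.get("mcpServers", {}), "langlearn-tts": entry}
    return {**config, "mcpServers": servers}


# Falls back to the placeholder from the manual example if uvx is not on PATH.
uvx = shutil.which("uvx") or "/absolute/path/to/uvx"
entry = server_entry(uvx, Path.home() / "Claude-Audio")
print(json.dumps(merge_entry({}, entry), indent=2))
```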
AI Tutor Prompts
langlearn-tts ships with 28 ready-made AI tutor prompts — one for each combination of 7 languages and 4 levels. Paste a prompt into a Claude Desktop Project's Instructions field, and Claude becomes a language tutor that generates audio during lessons.
Browse prompts
# List all available prompts
langlearn-tts prompt list
# Print a prompt (pipe to clipboard with pbcopy on macOS)
langlearn-tts prompt show german-high-school | pbcopy
Set up a Claude Desktop Project
1. In Claude Desktop, click Projects in the sidebar
2. Click Create Project and name it (e.g., "German with Herr Schmidt")
3. Open the project and click Set custom instructions
4. Paste the prompt content into the Instructions field
5. Start a new conversation within that project
Using a Project keeps the tutor persona scoped to language learning. Other conversations are unaffected.
Available languages and levels
| Language | High School | 1st Year | 2nd Year | Advanced |
|---|---|---|---|---|
| German | Herr Schmidt | Professorin Weber | Professor Hartmann | Professor Becker |
| Spanish | Profesora Elena | Profesor Garcia | Profesora Carmen | Profesora Reyes |
| French | Madame Moreau | Professeur Laurent | Professeur Dubois | Professeur Beaumont |
| Russian | Irina Petrovna | Professor Dmitri | Professor Natasha | Professor Mikhail |
| Korean | Kim-seonsaengnim | Professor Park | Professor Kim | Professor Yoon |
| Japanese | Tanaka-sensei | Yamamoto-sensei | Suzuki-sensei | Mori-sensei |
| Chinese | Laoshi Wang | Professor Chen | Professor Zhang | Professor Wei |
Each prompt creates a tutor persona calibrated to the student's level, based on Mollick & Mollick's "Assigning AI" framework. Customize any prompt by adjusting student background, voice selection, speech rate, or focus areas.
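The count of 28 follows from crossing the 7 languages with the 4 levels in the table. The sketch below enumerates that grid. Only german-high-school is confirmed as a prompt name by the example earlier in this section; the other slugs are guesses at the naming pattern, so rely on langlearn-tts prompt list for the real names.

```python
from itertools import product

LANGUAGES = ["german", "spanish", "french", "russian", "korean", "japanese", "chinese"]
# Only "high-school" appears in the documented example; the other slugs are assumed.
LEVELS = ["high-school", "1st-year", "2nd-year", "advanced"]

prompt_names = [f"{lang}-{level}" for lang, level in product(LANGUAGES, LEVELS)]

print(len(prompt_names))  # 28
print(prompt_names[0])    # german-high-school
```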
Troubleshooting
langlearn-tts doctor
Checks Python version, ffmpeg, AWS credentials, Polly access, uvx, Claude Desktop config, and output directory. Required checks must pass (exit code 1 on failure); optional checks show ○ markers.
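The required/optional distinction can be sketched as a small check runner. This is a guess at the behavior described, not the project's code: it assumes ○ marks an optional check that did not pass, and the Check class and run_checks function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    name: str
    run: Callable[[], bool]
    required: bool = True


def run_checks(checks: list[Check]) -> tuple[list[str], int]:
    """Return one report line per check plus a process exit code."""
    lines: list[str] = []
    exit_code = 0
    for check in checks:
        if check.run():
            marker = "✓"
        elif check.required:
            marker = "✗"
            exit_code = 1  # any failed required check fails the whole run
        else:
            marker = "○"  # assumed meaning: optional check not satisfied
        lines.append(f"{marker} {check.name}")
    return lines, exit_code


report, code = run_checks([
    Check("required check that passes", lambda: True),
    Check("optional check that fails", lambda: False, required=False),
])
```

Here code is 0 because only an optional check failed, which is the behavior a script wrapping doctor would want to branch on.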
Voices
Any voice from the AWS Polly voice list is supported. Voice names are case-insensitive. The tool queries the Polly API on first use and caches the result.
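A case-insensitive lookup with a fetch-once cache could look roughly like the sketch below. It uses a stand-in fetch function instead of a real Polly call so it runs offline; the dictionary keys mirror fields returned by Polly's DescribeVoices API, while VoiceCatalog and fake_fetch are hypothetical names.

```python
from typing import Callable, Optional


def fake_fetch() -> list[dict]:
    """Stand-in for a real DescribeVoices call."""
    return [
        {"Id": "Joanna", "LanguageCode": "en-US", "SupportedEngines": ["neural", "standard"]},
        {"Id": "Tatyana", "LanguageCode": "ru-RU", "SupportedEngines": ["standard"]},
    ]


class VoiceCatalog:
    def __init__(self, fetch: Callable[[], list[dict]]) -> None:
        self._fetch = fetch
        self._by_name: Optional[dict[str, dict]] = None  # filled on first lookup

    def get(self, name: str) -> Optional[dict]:
        """Look up a voice by name, ignoring case; fetch the list only once."""
        if self._by_name is None:
            self._by_name = {v["Id"].casefold(): v for v in self._fetch()}
        return self._by_name.get(name.casefold())


catalog = VoiceCatalog(fake_fetch)
print(catalog.get("JOANNA")["LanguageCode"])  # en-US
```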
Common voices for language learning:
| Voice | Language | Engine |
|---|---|---|
| joanna | English (US) | neural |
| matthew | English (US) | neural |
| daniel | German | neural |
| vicki | German (female) | neural |
| lucia | Spanish (European) | neural |
| lupe | Spanish (US) | neural |
| léa | French | neural |
| tatyana | Russian | standard |
| seoyeon | Korean | neural |
| takumi | Japanese | neural |
| zhiyu | Chinese (Mandarin) | neural |
The engine (neural, standard, generative, long-form) is selected automatically — neural preferred when available.
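Automatic selection can be thought of as walking a preference list against the engines a voice supports. In this sketch only "neural first" comes from the text above; the order of the remaining engines is an assumption, and pick_engine is an illustrative name.

```python
# "neural" first is documented; the order of the other engines is assumed.
ENGINE_PREFERENCE = ("neural", "generative", "long-form", "standard")


def pick_engine(supported: list[str]) -> str:
    """Return the most preferred engine that the voice supports."""
    for engine in ENGINE_PREFERENCE:
        if engine in supported:
            return engine
    raise ValueError(f"no known engine in {supported!r}")


print(pick_engine(["standard", "neural"]))  # neural
print(pick_engine(["standard"]))            # standard
```

This matches the table: tatyana only offers the standard engine, so that is what gets used.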
CLI Usage
# Single synthesis
langlearn-tts synthesize "Guten Morgen" --voice daniel -o morning.mp3
# Custom speech rate (percentage, default 90)
langlearn-tts synthesize "Привет" --voice tatyana --rate 70 -o privet.mp3
# Pair: English + German stitched with a pause
langlearn-tts synthesize-pair "good morning" "Guten Morgen" \
--voice1 joanna --voice2 daniel -o pair.mp3
# Batch from JSON file (["hello", "world", "good morning"])
langlearn-tts synthesize-batch words.json -d output/
# Batch merged into single file
langlearn-tts synthesize-batch words.json -d output/ --merge --pause 800
# Pair batch from JSON file ([["strong", "stark"], ["house", "Haus"]])
langlearn-tts synthesize-pair-batch pairs.json -d output/
# Browse AI tutor prompts
langlearn-tts prompt list
langlearn-tts prompt show german-high-school | pbcopy
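The batch commands above read plain JSON: a flat list of strings for synthesize-batch and a list of two-item lists for synthesize-pair-batch. The sketch below writes both files in those shapes and includes a small shape check. The function names are illustrative, and the validation is not necessarily what the CLI enforces.

```python
import json
from pathlib import Path


def write_batch_files(directory: Path) -> tuple[Path, Path]:
    """Write words.json and pairs.json in the shapes shown in the CLI examples."""
    words = ["hello", "world", "good morning"]
    pairs = [["strong", "stark"], ["house", "Haus"]]
    words_path = directory / "words.json"
    pairs_path = directory / "pairs.json"
    # ensure_ascii=False keeps non-ASCII vocabulary readable in the file
    words_path.write_text(json.dumps(words, ensure_ascii=False, indent=2), encoding="utf-8")
    pairs_path.write_text(json.dumps(pairs, ensure_ascii=False, indent=2), encoding="utf-8")
    return words_path, pairs_path


def is_pair_list(data: object) -> bool:
    """True if data is a list of two-item lists of strings."""
    return isinstance(data, list) and all(
        isinstance(p, list) and len(p) == 2 and all(isinstance(s, str) for s in p)
        for p in data
    )
```

Calling write_batch_files(Path(".")) produces files that can be passed directly to the two batch commands.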
MCP Tools
All four tools are available in Claude Desktop once the server is configured:
| Tool | Description |
|---|---|
| synthesize | Single text to MP3 |
| synthesize_batch | Multiple texts, optionally merged |
| synthesize_pair | Two texts stitched with a pause |
| synthesize_pair_batch | Multiple pairs, optionally merged |
Each tool accepts auto_play (default: true) to play audio immediately after synthesis.
Roadmap
Provider Abstraction Layer
A TTSProvider protocol that decouples CLI/MCP tools from any specific backend. Enables --provider flag, provider auto-detection from API keys, and provider-specific doctor checks.
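Since this layer is still planned, the following is purely a speculative sketch of what such a protocol and key-based auto-detection might look like. The method signature, the detection order, and the "polly" fallback are all assumptions; only the environment variable names are taken from the backend entries in this roadmap.

```python
from typing import Mapping, Protocol


class TTSProvider(Protocol):
    """Hypothetical interface; the project's eventual protocol may differ."""

    name: str

    def synthesize(self, text: str, voice: str, rate_percent: int) -> bytes:
        """Return encoded audio (for example MP3 bytes) for the given text."""
        ...


# Variable names come from the roadmap; the checking order is an assumption.
_KEY_TO_PROVIDER = (
    ("ELEVENLABS_API_KEY", "elevenlabs"),
    ("OPENAI_API_KEY", "openai"),
)


def detect_provider(env: Mapping[str, str], default: str = "polly") -> str:
    """Pick a provider name from whichever API key is set, else the default."""
    for key, provider in _KEY_TO_PROVIDER:
        if env.get(key):
            return provider
    return default


print(detect_provider({"OPENAI_API_KEY": "example"}))  # openai
```

A structural protocol like this lets a backend satisfy the interface without inheriting from a shared base class, which keeps optional dependencies isolated per provider.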
ElevenLabs Backend
Highest voice quality. 29+ languages, 5,000+ voices, voice cloning. Setup: pip install langlearn-tts[elevenlabs] + ELEVENLABS_API_KEY env var. Free tier: 10K chars/month.
OpenAI TTS Backend
Broadest adoption — most users already have an OpenAI key. 6 built-in voices, 50+ languages. Setup: pip install langlearn-tts[openai] + OPENAI_API_KEY env var. $15/1M chars (tts-1).
Development
# Install with dev dependencies
uv sync --all-extras
# Run tests
uv run pytest tests/ -v
# Linting and formatting
uv run ruff check src/ tests/
uv run ruff format src/ tests/
# Type checking
uv run mypy src/ tests/
uv run pyright src/ tests/
License
MIT