A lightweight, explainable semantic TTS normalizer for Python apps and voice-agent pipelines.
Utterwise 🗣️
Utterwise is a lightweight, deterministic semantic text normalizer for Python voice assistants, TTS apps, tutoring tools, and small voice-agent pipelines.
It exists because compact TTS models often read useful assistant text badly: dates, money, temperatures, URLs, versions, equations, and ambiguous numbers can sound awkward unless the text is normalized before it reaches the speech model.
from utterwise import normalize
normalize("Call 911 if it reaches 25°C on 03/04/2026.")
# "Call nine one one if it reaches twenty five degrees Celsius on third of April twenty twenty six"
📚 Contents
- 📦 Install
- ✨ Why Use Utterwise?
- 🤔 Why Not Regex?
- ✅ Supported Features
- 📖 Usage and Examples
- 🖥️ CLI
- ⚠️ Limitations
- 🛠️ Development
📦 Install
pip install utterwise
Math and LaTeX parser support is optional:
pip install "utterwise[math]"
For local development (with uv):
uv sync --all-extras
✨ Why Use Utterwise?
- Deterministic by design → The same input produces the same output every time.
- Lightweight by default → No heavy NLP dependency is required for the core path.
- Fast detection gate → Processors only run when the text needs them.
- Semantic before speech → Tokens are classified before they are verbalized.
- Context-aware ambiguity handling → 911 can be an emergency number, a flight number, or a cardinal.
- Explainable output → Every token can report its rule, confidence, candidates, and span.
- Practical for voice assistants → Covers dates, currency, temperature, URLs, email, math, and more.
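The "fast detection gate" idea can be pictured with a small, self-contained sketch. This is not Utterwise's internal code: `GATE`, `spell_digits`, and `normalize_gated` are hypothetical names invented for illustration, and the gate pattern is an assumption about which characters typically need verbalizing.

```python
import re

# Hypothetical gate pattern: digits or symbols that usually need verbalizing.
GATE = re.compile(r"[\d$€£%°@^\\]")

DIGIT_NAMES = "zero one two three four five six seven eight nine".split()


def spell_digits(text: str) -> str:
    """Toy processor: replace each digit with its spoken name."""
    spoken = re.sub(r"\d", lambda m: f" {DIGIT_NAMES[int(m.group())]} ", text)
    # Collapse the extra spaces introduced around each digit name.
    return " ".join(spoken.split())


def normalize_gated(text: str, processors=(spell_digits,)) -> str:
    """Return text untouched unless the cheap gate says work is needed."""
    if not GATE.search(text):
        return text  # fast path: no processor runs
    for process in processors:
        text = process(text)
    return text


print(normalize_gated("Hello there"))  # unchanged, gate did not fire
print(normalize_gated("Call 911"))     # digits are spelled out
```

The design point is that a single cheap scan decides whether any of the more expensive processors run, so plain sentences pass through at almost no cost.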
🤔 Why Not Regex?
Regex is useful for finding patterns, but speech normalization also needs meaning. The same characters can have different spoken forms depending on context:
Call 911 immediately -> nine one one
Flight 911 departs at 6 -> nine eleven
911 divided by 3 is 303 -> nine hundred eleven
Utterwise keeps scored candidates, checks nearby words, resolves ambiguity, and then verbalizes the winning interpretation. That is the core difference between a cleanup script and a speech-focused normalizer.
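As a rough illustration of scoring candidates against nearby words, here is a minimal sketch. It is an assumption about how such a resolver could look, not the library's implementation: `Candidate`, `CUES`, `BASE`, `resolve_911`, and all score values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    kind: str     # hypothetical label, e.g. "emergency"
    spoken: str   # spoken form if this interpretation wins
    score: float  # rule confidence, not a statistical probability


# Hypothetical cue words that raise a candidate's score when found nearby.
CUES = {
    "emergency": {"call", "dial", "emergency"},
    "flight": {"flight", "departs", "gate"},
    "cardinal": {"divided", "plus", "minus", "times"},
}

# Hypothetical base scores before any context is considered.
BASE = [
    Candidate("emergency", "nine one one", 0.4),
    Candidate("flight", "nine eleven", 0.3),
    Candidate("cardinal", "nine hundred eleven", 0.3),
]


def resolve_911(sentence: str) -> Candidate:
    """Pick the highest-scoring reading of 911 given the sentence's words."""
    words = {w.strip(".,!?").lower() for w in sentence.split()}
    scored = [
        Candidate(c.kind, c.spoken, c.score + 0.5 * len(words & CUES[c.kind]))
        for c in BASE
    ]
    # max() returns the first maximum, so ties resolve deterministically.
    return max(scored, key=lambda c: c.score)


for text in ("Call 911 immediately", "Flight 911 departs at 6", "911 divided by 3 is 303"):
    print(text, "->", resolve_911(text).spoken)
```

Keeping every candidate with its score, rather than returning the first regex match, is what makes the result explainable: the losing interpretations remain available for inspection.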
✅ Supported Features
| Feature | Status | Example |
|---|---|---|
| Numbers and large cardinals | Supported | 42, 9999999 |
| Decimals and leading zeroes | Supported | 3.145, 007 |
| Years and ambiguity | Supported | 1998, 911 |
| URLs and emails | Supported | openai.com, hello@example.com |
| Versions | Supported | Python 3.12, v0.1.0 |
| Phones and flight numbers | Supported | +1-800-555-0100, Flight 911 |
| Acronyms | Supported | NASA, HTTP |
| Percentages | Supported | 12.5% |
| Temperatures | Supported | 25°C, 98.6°F |
| Currency | Supported | $12.50, €45, £9.99 |
| Dates | Supported | Jan 3, 2026, 2026-04-03, 03/04/2026 |
| Math and LaTeX | Supported (Optional) | x^2, \sqrt{x+1} |
| SSML | Minimal | <speak>...</speak> |
📖 Usage and Examples
See USAGE.md for API examples, runtime configuration, and CLI output.
🖥️ CLI
utterwise "Call 911 immediately"
utterwise --ssml "hello@example.com"
utterwise --pretty "Python 3.12 costs $12.50"
⚠️ Limitations
- Utterwise is rule-based; confidence scores are rule confidence, not statistical probabilities.
- Slash dates default to day/month/year unless configured as month/day/year.
- SSML output is intentionally minimal for now.
- Policy names are accepted, but style-specific policy output is not implemented yet.
- Locale-specific speech styles, rich SSML policies, and chemistry normalization are planned later.
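The slash-date limitation is easy to see with a small sketch. The `parse_slash_date` helper and its `day_first` flag are hypothetical and only illustrate the ambiguity; they are not Utterwise's configuration API, which is described in USAGE.md.

```python
from datetime import date


def parse_slash_date(text: str, day_first: bool = True) -> date:
    """Parse 'a/b/yyyy'; day_first is a hypothetical switch for illustration."""
    first, second, year = (int(part) for part in text.split("/"))
    day, month = (first, second) if day_first else (second, first)
    return date(year, month, day)


# Same characters, two different dates depending on the convention.
print(parse_slash_date("03/04/2026"))                   # 2026-04-03 (3 April)
print(parse_slash_date("03/04/2026", day_first=False))  # 2026-03-04 (4 March)
```

If your input uses the month/day/year convention, configure that explicitly instead of relying on the default.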
🛠️ Development
Run tests with uv, or directly from a Windows virtual environment:
uv run --extra dev pytest
.venv\Scripts\python.exe -m pytest
Run the interactive development menu:
.venv\Scripts\python.exe tests\menu.py