A lightweight, explainable semantic TTS normalizer for Python apps and voice-agent pipelines.

Utterwise 🗣️

Supports Python 3.10+.

Utterwise is a lightweight, deterministic semantic text normalizer for Python voice assistants, TTS apps, tutoring tools, and small voice-agent pipelines.

It exists because compact TTS models often read useful assistant text badly: dates, money, temperatures, URLs, versions, equations, and ambiguous numbers can sound awkward unless the text is normalized before it reaches the speech model.

from utterwise import normalize

normalize("Call 911 if it reaches 25°C on 03/04/2026.")
# "Call nine one one if it reaches twenty five degrees Celsius on third of April twenty twenty six"

📦 Install

pip install utterwise

Math and LaTeX parser support is optional:

pip install "utterwise[math]"

For local development (with uv):

uv sync --all-extras

✨ Why Use Utterwise?

  • Deterministic by design: The same input produces the same output every time.
  • Lightweight by default: No heavy NLP dependency is required for the core path.
  • Fast detection gate: Processors only run when the text needs them.
  • Semantic before speech: Tokens are classified before they are verbalized.
  • Context-aware ambiguity handling: 911 can be an emergency number, a flight number, or a cardinal.
  • Explainable output: Every token can report its rule, confidence, candidates, and span.
  • Practical for voice assistants: Covers dates, currency, temperature, URLs, email, math, and more.
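
The "fast detection gate" idea can be illustrated with a minimal sketch. This is not Utterwise's internal code: the names `NEEDS_NORMALIZATION`, `expand_percent`, and `normalize_with_gate` are hypothetical, and the single toy processor stands in for a real processor chain. The point is only that one cheap check lets plain prose skip the heavier work.

```python
import re

# Hypothetical gate (an assumption, not Utterwise's API): one cheap pattern that
# matches any character the processors in this sketch could care about.
NEEDS_NORMALIZATION = re.compile(r"[0-9$€£%°@]")


def expand_percent(text: str) -> str:
    """Toy processor: rewrite '50%' as '50 percent'."""
    return re.sub(r"(\d+(?:\.\d+)?)%", r"\1 percent", text)


def normalize_with_gate(text: str) -> tuple[str, bool]:
    """Return (text, ran_processors); plain prose skips the processors entirely."""
    if NEEDS_NORMALIZATION.search(text) is None:
        return text, False
    return expand_percent(text), True


print(normalize_with_gate("Hello there, how are you?"))
print(normalize_with_gate("Battery at 50%"))
```

The gate costs a single regex scan per input, so text with no digits or symbols is returned unchanged without touching any processor.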

🤔 Why Not Regex?

Regex is useful for finding patterns, but speech normalization also needs meaning. The same characters can have different spoken forms depending on context:

Call 911 immediately             -> nine one one
Flight 911 departs at 6          -> nine eleven
911 divided by 3 is about 303.67 -> nine hundred eleven

Utterwise keeps scored candidates, checks nearby words, resolves ambiguity, and then verbalizes the winning interpretation. That is the core difference between a cleanup script and a speech-focused normalizer.
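
The candidate-scoring approach described above can be sketched in a few lines. Everything here is an assumption made for illustration: the `Candidate` class, the cue-word sets, and the score values are hypothetical and are not taken from Utterwise's source. The sketch only handles the token 911 and shows the general shape: collect candidates, score them from nearby words, pick the winner, and keep the rule that produced it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    kind: str     # interpretation label, e.g. "emergency"
    spoken: str   # how this interpretation would be read aloud
    score: float  # rule confidence, not a statistical probability
    rule: str     # which rule proposed this candidate


# Hypothetical cue words per interpretation (illustrative only).
CUES = {
    "emergency": ({"call", "dial", "emergency"}, "nine one one"),
    "flight": ({"flight", "departs", "gate"}, "nine eleven"),
}
DEFAULT = Candidate("cardinal", "nine hundred eleven", 0.5, "default-cardinal")


def candidates_for_911(text: str) -> list[Candidate]:
    """Score every interpretation of '911' from the words around it."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    found = [DEFAULT]
    for kind, (cues, spoken) in CUES.items():
        hits = words & cues
        if hits:
            rule = f"cue:{'+'.join(sorted(hits))}"
            found.append(Candidate(kind, spoken, 0.5 + 0.2 * len(hits), rule))
    return sorted(found, key=lambda c: c.score, reverse=True)


def resolve_911(text: str) -> Candidate:
    """Return the highest-scoring candidate."""
    return candidates_for_911(text)[0]


for sentence in ("Call 911 immediately", "Flight 911 departs at 6", "911 divided by 3"):
    winner = resolve_911(sentence)
    print(f"{sentence!r}: {winner.spoken} ({winner.rule})")
```

Because the losing candidates are kept in the sorted list rather than discarded, the same structure can report why an interpretation won, which a single regex substitution cannot do.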

✅ Supported Features

| Feature | Status | Example |
| --- | --- | --- |
| Numbers and large cardinals | Supported | 42, 9999999 |
| Decimals and leading zeroes | Supported | 3.145, 007 |
| Years and ambiguity | Supported | 1998, 911 |
| URLs and emails | Supported | openai.com, hello@example.com |
| Versions | Supported | Python 3.12, v0.1.0 |
| Phones and flight numbers | Supported | +1-800-555-0100, Flight 911 |
| Acronyms | Supported | NASA, HTTP |
| Percentages | Supported | 12.5% |
| Temperatures | Supported | 25°C, 98.6°F |
| Currency | Supported | $12.50, €45, £9.99 |
| Dates | Supported | Jan 3, 2026, 2026-04-03, 03/04/2026 |
| Math and LaTeX | Supported (Optional) | x^2, \sqrt{x+1} |
| SSML | Minimal | <speak>...</speak> |

See USAGE.md for API examples, runtime configuration, and CLI output.

🖥️ CLI

utterwise "Call 911 immediately"
utterwise --ssml "hello@example.com"
utterwise --pretty "Python 3.12 costs $12.50"

⚠️ Limitations

  • Utterwise is rule-based; confidence scores are rule confidence, not statistical probabilities.
  • Slash dates default to day/month/year unless configured as month/day/year.
  • SSML output is intentionally minimal for now.
  • Policy names are accepted, but style-specific policy output is not implemented yet.
  • Locale-specific speech styles, rich SSML policies, and chemistry normalization are planned for later releases.
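
The slash-date limitation is easy to see with a small standalone sketch. The function `parse_slash_date` and its `day_first` parameter are hypothetical names used only for illustration; they are not Utterwise's configuration option, which is documented in USAGE.md. The sketch shows how the same string yields two different dates depending on the assumed field order.

```python
from datetime import date


def parse_slash_date(text: str, day_first: bool = True) -> date:
    """Parse 'NN/NN/YYYY', reading the first field as the day unless day_first is False.

    `day_first` is a hypothetical switch for this sketch, not a real Utterwise option.
    """
    first, second, year = (int(part) for part in text.split("/"))
    day, month = (first, second) if day_first else (second, first)
    return date(year, month, day)


print(parse_slash_date("03/04/2026"))                   # day/month/year reading
print(parse_slash_date("03/04/2026", day_first=False))  # month/day/year reading
```

With the day-first default, 03/04/2026 is the third of April, which matches the example at the top of this page; a month-first reading gives the fourth of March instead.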

🛠️ Development

Run tests (with uv, or with the virtual environment's interpreter on Windows):

uv run --extra dev pytest
.venv\Scripts\python.exe -m pytest

Run the interactive development menu (Windows path shown):

.venv\Scripts\python.exe tests\menu.py
