Transcribe Esperanto text into phonetic Polish for use in professional TTS engines.

vocx

vocx transcribes Esperanto text into phonetic Polish for use in professional text-to-speech (TTS) engines.

Background

Commercial TTS engines rarely support minority languages, and constructed languages such as Esperanto even less so. Esperanto shares many of its sounds with Polish, so transcribing Esperanto text into Polish spelling lets a commercial Polish TTS voice produce a good approximation of spoken Esperanto.
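
To make the idea concrete, here is a minimal sketch of the letter-level correspondence, written independently of vocx. The table ESPERANTO_TO_POLISH and the function rough_transcribe are hypothetical names. The pairs ĉ→cz, ŝ→sz and v→w match the example output shown under Usage; the remaining pairs are assumed from standard Esperanto and Polish pronunciation and may differ from vocx's default rules, which also handle cases this sketch ignores.

```python
# Hypothetical illustration only; this is not vocx's actual rule table.
ESPERANTO_TO_POLISH = {
    "ĉ": "cz",  # as in the README example
    "ŝ": "sz",  # as in the README example
    "v": "w",   # as in the README example
    "ĝ": "dż",  # assumed
    "ĥ": "ch",  # assumed
    "ĵ": "ż",   # assumed
    "ŭ": "ł",   # assumed
}


def rough_transcribe(text: str) -> str:
    """Lowercase the text and swap each mapped letter for its Polish spelling."""
    return "".join(ESPERANTO_TO_POLISH.get(ch, ch) for ch in text.lower())


print(rough_transcribe("Ĉu vi ŝatas?"))  # czu wi szatas?
```

Note that this prints "wi" where vocx produces "wij": a plain letter table is not enough, which is why vocx layers further rules on top.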

This is a Python port of the original Go library.

Installation

pip install vocx

Usage

Library

from vocx import Transcriber

t = Transcriber()
t.transcribe("Ĉu vi ŝatas Esperanton? Esperanto estas facila lingvo.")
# "czu wij szatas esperanton? esperanto estas fatssila lijngwo."

Custom rules

To override the default rules used during transcription, call load_rules, passing a custom JSON rules document. See src/vocx/default_rules.py for the correct structure.

from vocx import Transcriber

t = Transcriber()
t.load_rules(my_rules_json)

A rules document has four sections:

  • letters — single-character substitutions (applied lowercased).
  • fragments — ordered regular-expression replacements applied to each word.
  • overrides — whole-word replacements (surrounding punctuation is preserved).
  • numbers — the words used when transcribing numeric tokens.
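
The sketch below shows one way the four sections could interact. It is not vocx's implementation: the RULES layout, the helper names, the order in which sections are applied, and the digit-by-digit handling of numbers are all assumptions made for illustration. Check src/vocx/default_rules.py for the structure vocx actually expects.

```python
import re

# Hypothetical rules in an assumed layout (not the real vocx schema).
RULES = {
    "letters": {"ĉ": "cz", "ŝ": "sz", "v": "w"},
    "fragments": [(r"i$", "ij")],      # ordered (pattern, replacement) pairs
    "overrides": {"ktp": "kotopo"},    # whole-word replacements
    "numbers": {
        "0": "nul", "1": "unu", "2": "du", "3": "tri", "4": "kwar",
        "5": "kwin", "6": "ses", "7": "sep", "8": "ok", "9": "nał",
    },
}

# Split a token into leading punctuation, core word, trailing punctuation.
TOKEN = re.compile(r"^(\W*)(.*?)(\W*)$")


def transcribe_word(token: str, rules: dict) -> str:
    lead, core, trail = TOKEN.match(token).groups()
    core = core.lower()
    if core in rules["overrides"]:
        # Whole-word override; punctuation around the word is kept.
        core = rules["overrides"][core]
    elif core.isascii() and core.isdigit():
        # Assumed behaviour: spell the number out digit by digit.
        core = " ".join(rules["numbers"][d] for d in core)
    else:
        # Single-character substitutions, then ordered regex fragments.
        core = "".join(rules["letters"].get(ch, ch) for ch in core)
        for pattern, replacement in rules["fragments"]:
            core = re.sub(pattern, replacement, core)
    return lead + core + trail


def transcribe(text: str, rules: dict = RULES) -> str:
    return " ".join(transcribe_word(tok, rules) for tok in text.split())


print(transcribe("Ĉu vi ŝatas?"))  # czu wij szatas?
print(transcribe("ktp."))          # kotopo.
```

Checking overrides first keeps abbreviations and other exceptions out of the regex rules, and applying fragments in list order lets a later pattern rely on the output of an earlier one.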

Command line

# Transcribe arguments
vocx "Saluton, kiel vi fartas?"
# saluton, kijel wij fartas?

# Transcribe stdin
echo "Saluton" | vocx

# Use a custom rules file
vocx --rules my_rules.json "Saluton"

Development

uv sync
uv run pytest

Download files

Source Distribution

vocx-0.1.0.tar.gz (11.6 kB view details)

Uploaded Source

Built Distribution

vocx-0.1.0-py3-none-any.whl (8.0 kB view details)

Uploaded Python 3

File details

Details for the file vocx-0.1.0.tar.gz.

File metadata

  • Download URL: vocx-0.1.0.tar.gz
  • Size: 11.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vocx-0.1.0.tar.gz
Algorithm Hash digest
SHA256 2ca3f6aa84833e6c2586daa9328479b804527cd323c17b09c7e5f4ed70219e67
MD5 107dca88cf08cf2e08d50636af11b40f
BLAKE2b-256 a41d6abf520795fa303ec16ad47fe7950e5e63afb24d8bf27ecd60ceb45491af

Provenance

The following attestation bundles were made for vocx-0.1.0.tar.gz:

Publisher: publish-pypi.yml on eugenzor/vocx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vocx-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: vocx-0.1.0-py3-none-any.whl
  • Size: 8.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vocx-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 de0fb0f394f9e588349592b6c3d7cb4d14266cbeaf42a4fd2ab8bed5a5799c6a
MD5 eb77995c70f5e2223a38aae040b9f483
BLAKE2b-256 dfd4212f77d3ebf727bade00bf829b6d89cb46bd3bf79bdcf6b4abe5f3743dbc

Provenance

The following attestation bundles were made for vocx-0.1.0-py3-none-any.whl:

Publisher: publish-pypi.yml on eugenzor/vocx
