Multilingual G2P (Grapheme-to-Phoneme) for TTS — eSpeak-ng free, MIT licensed

piper-plus-g2p

Multilingual G2P (Grapheme-to-Phoneme) for TTS. eSpeak-ng free. MIT licensed. 8 languages.

Standalone package: usable as a G2P library on its own, without the piper-plus TTS engine. It can be combined with any TTS engine.

Why piper-plus-g2p?

  • MIT licensed -- no eSpeak-ng (GPL) dependency in your TTS pipeline
  • 8 languages -- JA, EN, ZH, KO, ES, FR, PT, SV with consistent IPA output
  • IPA-first design -- returns pure IPA token sequences, ready for any TTS model

Comparison

Feature             piper-plus-g2p  phonemizer       gruut  Misaki
License             MIT             GPL (eSpeak-ng)  MIT    Apache-2.0
Languages           8               100+             20+    EN only
eSpeak-ng required  No              Yes              No     No
IPA output          Yes             Yes              Yes    Yes

Installation

pip install piper-plus-g2p                 # Rule-based languages only (ES, FR, PT, SV)
pip install "piper-plus-g2p[ja,en]"        # Japanese + English
pip install "piper-plus-g2p[all]"          # All 8 languages

Note: The ja extra requires pyopenjtalk-plus, which provides pre-built wheels for Linux, macOS, and Windows. See pyopenjtalk-plus for platform details.
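
Because the language backends are optional extras, it can help to check which ones are importable before requesting a phonemizer. The sketch below is not part of piper-plus-g2p. The helper name available_languages is hypothetical, and the import names in BACKEND_MODULES are assumptions based on each backend's usual module name, so verify them against your environment.

```python
from importlib.util import find_spec

# Assumed import names for each optional backend (not confirmed by this README).
BACKEND_MODULES = {
    "ja": "pyopenjtalk",
    "en": "g2p_en",
    "zh": "pypinyin",
    "ko": "g2pk2",
}

# Languages the README lists as rule-based, with no external dependency.
RULE_BASED = ("es", "fr", "pt", "sv")


def available_languages() -> dict[str, bool]:
    """Report which language codes look usable in the current environment."""
    status = {code: True for code in RULE_BASED}
    for code, module in BACKEND_MODULES.items():
        # find_spec returns None when the top-level module cannot be located.
        status[code] = find_spec(module) is not None
    return status


if __name__ == "__main__":
    for code, ok in sorted(available_languages().items()):
        print(f"{code}: {'available' if ok else 'missing backend'}")
```

A language reported as missing can usually be enabled by installing the matching extra from the table of supported languages.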

Quick Start

from piper_plus_g2p import get_phonemizer

ja = get_phonemizer("ja")
ja.phonemize("こんにちは")
# -> ["k", "o", "[", "N_n", "n", "i", "ch", "i", "w", "a"]

en = get_phonemizer("en")
en.phonemize("Hello world")
# -> ["h", "ə", "l", "ˈ", "o", "ʊ", " ", "w", "ˈ", "ɜ", "ː", "l", "d"]

Supported Languages

Language    Code  Extra               Backend           Notes
Japanese    ja    piper-plus-g2p[ja]  pyopenjtalk-plus  Context-dependent N variants, prosody info
English     en    piper-plus-g2p[en]  g2p-en            CMU-dict + neural fallback
Chinese     zh    piper-plus-g2p[zh]  pypinyin          Pinyin-to-IPA conversion
Korean      ko    piper-plus-g2p[ko]  g2pk2             Optional dependency
Spanish     es    --                  Rule-based        No external dependency
French      fr    --                  Rule-based        No external dependency
Portuguese  pt    --                  Rule-based        No external dependency
Swedish     sv    --                  Rule-based        No external dependency

Advanced Usage

Multilingual (Composite Language Code)

Pass a hyphen-joined code like "ja-en-zh" to get_phonemizer to automatically create a MultilingualPhonemizer. Language detection is Unicode-based, so mixed-script text is handled without explicit tagging.

from piper_plus_g2p import get_phonemizer

multi = get_phonemizer("ja-en-zh")
tokens = multi.phonemize("こんにちは Hello 你好")
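
To illustrate what Unicode-based detection can look like, here is a minimal sketch that splits mixed-script text into runs by code point range. It is not the MultilingualPhonemizer implementation: the function names script_of and split_by_script and the choice of ranges are assumptions made for illustration. It also sidesteps a hard case by labeling Han characters simply as "han", since they are shared between Japanese and Chinese and a real detector needs surrounding context to assign them.

```python
import unicodedata


def script_of(ch: str) -> str:
    """Classify a single character by Unicode block or character name."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x30FF:   # Hiragana and Katakana blocks
        return "kana"
    if 0x4E00 <= cp <= 0x9FFF:   # CJK Unified Ideographs
        return "han"
    if 0xAC00 <= cp <= 0xD7A3:   # Hangul syllables
        return "hangul"
    if "LATIN" in unicodedata.name(ch, ""):
        return "latin"
    return "other"               # spaces, digits, punctuation, other scripts


def split_by_script(text: str) -> list[tuple[str, str]]:
    """Group consecutive characters of the same script into (script, run) pairs."""
    runs: list[tuple[str, str]] = []
    for ch in text:
        script = script_of(ch)
        if runs and (script == "other" or runs[-1][0] == script):
            # Neutral characters and same-script characters extend the current run.
            runs[-1] = (runs[-1][0], runs[-1][1] + ch)
        else:
            runs.append((script, ch))
    return runs


if __name__ == "__main__":
    print(split_by_script("こんにちは Hello 你好"))
```

Each run could then be routed to the phonemizer for its language, which is the general idea behind handling mixed-script input without explicit tags.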

PiperEncoder

Converts IPA token lists into integer phoneme_ids for Piper ONNX models.

from piper_plus_g2p.encode import PiperEncoder, get_phoneme_id_map

id_map = get_phoneme_id_map("ja")
encoder = PiperEncoder(id_map)
phoneme_ids = encoder.encode(["k", "o", "[", "N_n", "n", "i", "ch", "i", "w", "a"])

Piper Model Compatibility

piper-plus-g2p produces phoneme tokens directly compatible with Piper TTS ONNX models. Use PiperEncoder with the model's phoneme_id_map from config.json:

import json
from piper_plus_g2p import get_phonemizer
from piper_plus_g2p.encode import PiperEncoder

with open("model/config.json") as f:
    config = json.load(f)

encoder = PiperEncoder(config["phoneme_id_map"])
phonemizer = get_phonemizer("ja")
tokens = phonemizer.phonemize("こんにちは")
phoneme_ids = encoder.encode(tokens)
# -> ready for ONNX inference

Use strict=True to raise errors on unknown tokens instead of silently skipping them:

encoder = PiperEncoder(config["phoneme_id_map"], strict=True)
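
The difference between the two modes can be sketched with a toy lookup. This is an illustration of the behavior described above, not PiperEncoder's code: encode_tokens and toy_map are hypothetical, KeyError is an arbitrary choice of exception, and the sketch omits any padding or start/end markers that a real encoder may insert. It assumes the usual Piper config layout, in which phoneme_id_map maps each symbol to a list of integer ids.

```python
def encode_tokens(tokens, id_map, strict=False):
    """Map phoneme tokens to ids, skipping or rejecting unknown tokens."""
    ids = []
    for token in tokens:
        if token in id_map:
            # Each symbol maps to a list of ids, so extend rather than append.
            ids.extend(id_map[token])
        elif strict:
            raise KeyError(f"unknown phoneme token: {token!r}")
        # In lenient mode an unknown token is dropped without notice.
    return ids


# Toy map with made-up ids, standing in for config["phoneme_id_map"].
toy_map = {"k": [10], "o": [11], "n": [12], "i": [13]}

if __name__ == "__main__":
    print(encode_tokens(["k", "o", "?", "n", "i"], toy_map))
    try:
        encode_tokens(["k", "o", "?", "n", "i"], toy_map, strict=True)
    except KeyError as exc:
        print("strict mode rejected the input:", exc)
```

Strict mode is the safer choice when pairing a phonemizer with a model trained on a different phoneme inventory, because a silently dropped token shortens the input and changes the audio without any visible error.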

Cross-Platform Consistency

piper-plus-g2p is also available as:

  • Rust crate: piper-plus-g2p on crates.io
  • Go module: github.com/ayutaz/piper-plus/src/go/phonemize
  • npm package: @piper-plus/g2p for browser/WASM

These three implementations share the same PUA (Unicode Private Use Area) mapping table and are validated against a common test fixture.
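
One common use of a Private Use Area table is to give each multi-character token, such as "N_n" or "ch" in the Japanese example, a single code point so that a phoneme sequence can be stored as a plain string. The sketch below shows that idea only. The code point assignments and the names TOKEN_TO_PUA, pack, and unpack are hypothetical; the project's actual table is not reproduced here.

```python
# Hypothetical assignments in the Unicode Private Use Area (U+E000 to U+F8FF).
TOKEN_TO_PUA = {
    "N_n": "\ue000",
    "ch": "\ue001",
}
PUA_TO_TOKEN = {char: token for token, char in TOKEN_TO_PUA.items()}


def pack(tokens: list[str]) -> str:
    """Join tokens into one string, one code point per token.

    Tokens missing from the table are passed through unchanged, so they
    must already be a single code point for the round trip to hold.
    """
    return "".join(TOKEN_TO_PUA.get(token, token) for token in tokens)


def unpack(packed: str) -> list[str]:
    """Recover the token list from a packed string."""
    return [PUA_TO_TOKEN.get(char, char) for char in packed]


if __name__ == "__main__":
    tokens = ["k", "o", "N_n", "n", "i", "ch", "i", "w", "a"]
    packed = pack(tokens)
    print(len(packed), unpack(packed) == tokens)
```

Sharing one such table across implementations matters because two ports that assign different code points to the same token would produce strings that look valid but decode to different phonemes.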

Requirements

License

MIT
