misaki-ja-lightning ⚡

Lightweight Japanese text-to-IPA phoneme converter extracted from the misaki library. This package contains only the Japanese G2P (grapheme-to-phoneme) functionality with minimal dependencies.

Features

  • 🇯🇵 Convert Japanese text (hiragana, katakana, kanji) to IPA phonemes
  • 🔢 Convert numbers to Japanese hiragana, kanji, or romaji, and kanji numerals back to integers
  • ⚡ Lightning-fast with minimal dependencies
  • 🎯 Focused on Japanese language only
  • 🔧 Uses pyopenjtalk for accurate phoneme conversion

Installation

pip install misaki-ja-lightning

Usage

Basic G2P Conversion

from misaki_ja_lightning import JAG2P

# Initialize the converter
g2p = JAG2P()

# Convert Japanese text to IPA phonemes
text = "こんにちは、世界"
phonemes, tokens = g2p(text)

print(phonemes)  # IPA phoneme string with pitch information

Number to Kana Conversion

from misaki_ja_lightning import Convert, ConvertKanji

# Convert Arabic numerals to hiragana
result = Convert(12345, 'hiragana')
print(result)  # いちまんにせんさんびゃくよんじゅうご

# Convert to kanji
result = Convert(12345, 'kanji')
print(result)  # 一万二千三百四十五

# Convert to romaji
result = Convert(12345, 'romaji')
print(result)  # ichi man ni sen san byaku yon juu go

# Supported output formats: 'hiragana', 'kanji', 'romaji'
# Note: 'katakana' is not supported by the num2kana module

# Convert kanji numbers back to Arabic
number = ConvertKanji("一万二千三百四十五")
print(number)  # 12345
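
To make the conversion above concrete, here is a minimal, self-contained sketch of how an integer can be mapped to kanji numerals and back. It is an illustration only and not the num2kana implementation: the function names `int_to_kanji` and `kanji_to_int` are hypothetical, and the sketch assumes the convention visible in the example output (一 is written before 万 but omitted before 千, 百, and 十).

```python
# Illustrative sketch only; NOT the num2kana implementation.
# int_to_kanji / kanji_to_int are hypothetical names.

DIGITS = "零一二三四五六七八九"
SMALL = {"千": 1000, "百": 100, "十": 10}
BIG = {"兆": 10**12, "億": 10**8, "万": 10**4}


def _group_to_kanji(n: int) -> str:
    """Convert 1..9999 to kanji, omitting 一 before 千, 百, and 十."""
    out = []
    for unit, value in SMALL.items():
        digit, n = divmod(n, value)
        if digit == 0:
            continue
        if digit > 1:
            out.append(DIGITS[digit])
        out.append(unit)
    if n:
        out.append(DIGITS[n])
    return "".join(out)


def int_to_kanji(n: int) -> str:
    """Convert a non-negative integer below 10**16 to kanji numerals."""
    if not 0 <= n < 10**16:
        raise ValueError("n must satisfy 0 <= n < 10**16")
    if n == 0:
        return "零"
    out = []
    for unit, value in BIG.items():
        group, n = divmod(n, value)
        if group:
            out.append(_group_to_kanji(group) + unit)
    if n:
        out.append(_group_to_kanji(n))
    return "".join(out)


def kanji_to_int(text: str) -> int:
    """Parse kanji numerals written in the style produced by int_to_kanji."""
    total = 0  # sum of completed 万/億/兆 groups
    group = 0  # value accumulated inside the current four-digit group
    digit = 0  # digit waiting for its unit
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS.index(ch)
        elif ch in SMALL:
            # A bare unit such as 千 means 1 * 1000.
            group += (digit or 1) * SMALL[ch]
            digit = 0
        elif ch in BIG:
            total += (group + digit) * BIG[ch]
            group = digit = 0
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    return total + group + digit


print(int_to_kanji(12345))
print(kanji_to_int("一万二千三百四十五"))
```

The parser keeps three running values so that units inside a four-digit group (千, 百, 十) are folded into the group before a large unit (万, 億, 兆) multiplies the whole group.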

Token-level Processing

from misaki_ja_lightning import JAG2P

g2p = JAG2P()
phonemes, tokens = g2p("今日は良い天気ですね")

for token in tokens:
    print(f"Text: {token.text}")
    print(f"Phonemes: {token.phonemes}")
    print(f"Tag: {token.tag}")
    print(f"Pitch: {token._.pitch}")
    print("---")
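
When only the text-to-phoneme alignment is needed, the loop above can be wrapped in a small helper. The following is a sketch under assumptions: `token_pairs` is a hypothetical helper, not part of the package, and it relies only on the `text` and `phonemes` attributes shown in the example. The demo tokens are stand-ins with placeholder phoneme strings, not real converter output.

```python
from types import SimpleNamespace
from typing import Iterable, List, Optional, Tuple


def token_pairs(tokens: Optional[Iterable]) -> List[Tuple[str, str]]:
    """Return (text, phonemes) pairs, skipping tokens without phonemes.

    Hypothetical helper: works with any objects exposing `text` and
    `phonemes` attributes, and treats None as an empty token list.
    """
    pairs = []
    for token in tokens or []:
        phonemes = getattr(token, "phonemes", None)
        if phonemes:
            pairs.append((token.text, phonemes))
    return pairs


# Stand-in tokens for illustration; real ones would come from g2p(...).
# "PH1" and "PH2" are placeholders, not actual IPA output.
demo_tokens = [
    SimpleNamespace(text="今日", phonemes="PH1"),
    SimpleNamespace(text="は", phonemes="PH2"),
    SimpleNamespace(text="、", phonemes=None),
]

for text, phonemes in token_pairs(demo_tokens):
    print(f"{text}\t{phonemes}")
```

With the real converter, the call would presumably be `token_pairs(tokens)` on the second value returned by `g2p(...)`.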

What's Included

This lightweight package includes only:

  • ja.py - Japanese G2P converter using pyopenjtalk
  • num2kana.py - Number to Japanese kana converter
  • token.py - Token data structure

Differences from Original Misaki

  • ✅ Japanese-only (removed other languages)
  • ✅ Removed cutlet dependency
  • ✅ Removed addict dependency
  • ✅ Simplified token structure
  • ✅ Only pyopenjtalk version (no cutlet option)
  • ✅ Minimal dependencies

Requirements

  • Python >= 3.8
  • pyopenjtalk (forked version with /tmp support for serverless environments)

Note: This package uses a forked version of pyopenjtalk that downloads the dictionary to /tmp instead of the package directory. This allows it to work in serverless environments like Vercel where the filesystem is read-only.
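
To see why a temporary directory helps on a read-only filesystem, the sketch below picks the first writable directory from a list of candidates using only the standard library. It does not reproduce the fork's actual logic: `first_writable_dir` is a hypothetical helper, and the candidate order is an assumption made for illustration.

```python
import os
import tempfile
from typing import Iterable, Optional


def first_writable_dir(candidates: Iterable[str]) -> Optional[str]:
    """Return the first existing directory the process can write to.

    Hypothetical helper for illustration; returns None if no candidate
    is an existing, writable directory.
    """
    for path in candidates:
        if os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    return None


# Assumed candidate order: the directory containing this module's code
# first, then the system temporary directory (typically /tmp on Linux).
package_dir = os.path.dirname(os.path.abspath(tempfile.__file__))
chosen = first_writable_dir([package_dir, tempfile.gettempdir()])
print(f"Would store downloaded data in: {chosen}")
```

On a platform where the installed package directory is read-only, the first candidate is skipped and the temporary directory is selected instead.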

License

MIT License (inherited from original misaki library)

Credits

This package is extracted from misaki by hexgrad. All credit for the original implementation goes to the misaki authors.

The num2kana module is based on Convert-Numbers-to-Japanese by Greatdane (MIT License).

Use Cases

Perfect for:

  • Text-to-speech applications
  • Japanese language learning tools
  • Phoneme-based synthesis
  • Lightweight Japanese text processing

Support

For issues and questions, please visit the GitHub Issues page.
