
Simple Python package for getting Japanese readings (yomigana) using MeCab

MeCab Text Cleaner

This is a simple Python package for getting Japanese readings (yomigana) and accents using MeCab. It does not account for accent changes in compound words, so please also consider using pyopenjtalk (no accents) or pyopenjtalk_g2p_prosody (ESPnet) (with accents).

Installation

Install this via pip or pipx (or your favourite package manager):

pipx install mecab-text-cleaner[unidecode,unidic]
pip install mecab-text-cleaner[unidecode,unidic]

Usage

> mtc いい天気ですね。
イ]ー テ]ンキ デス ネ。
> mtc いい天気ですね。 --ascii
i] te]nki desu ne.
> mtc いい天気ですね --no-add-atype --no-add-blank-between-words
イーテンキデスネ
> mtc いい天気ですね --no-add-atype --no-add-blank-between-words -r kana
イイテンキデスネ

The same conversions are available from Python:

from mecab_text_cleaner import to_reading, to_ascii_clean

assert to_reading("     空、雲。\n雨!(") == "ソ]ラ、 ク]モ。\nア]メ!("
assert to_ascii_clean("      한空、雲。\n雨!(") == "han so]ra, ku]mo. \na]me!("
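
The readings above carry a "]" marker inside some words. As a minimal sketch of post-processing that output, the helpers below strip the marker or report where it sits in each word. They are not part of mecab-text-cleaner: the names strip_accent_marks and accent_positions are hypothetical, and the code assumes that "]" is the only accent marker and that words are separated by whitespace, as in the sample output. The input is the literal string from the example above, so the snippet runs without MeCab installed.

```python
from typing import List, Tuple

# Assumption: "]" is the accent marker emitted by to_reading (see the sample output above).
ACCENT_MARK = "]"


def strip_accent_marks(reading: str) -> str:
    """Remove accent markers, keeping kana, punctuation and whitespace."""
    return reading.replace(ACCENT_MARK, "")


def accent_positions(reading: str) -> List[Tuple[str, int]]:
    """Return (word, position) pairs for a whitespace-separated reading.

    position is the number of characters before the marker, or 0 when the
    word has no marker. Characters are counted, not morae, and punctuation
    stays attached to the word it follows.
    """
    result = []
    for token in reading.split():
        index = token.find(ACCENT_MARK)
        word = token.replace(ACCENT_MARK, "")
        result.append((word, index if index != -1 else 0))
    return result


sample = "ソ]ラ、 ク]モ。\nア]メ!("  # output string taken from the example above
print(strip_accent_marks(sample))
print(accent_positions(sample))
```

If only the plain reading is needed, the --no-add-atype flag shown earlier avoids the marker at the source, so a helper like this is mainly useful when both forms are wanted from a single call.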

Contributors ✨

This project follows the all-contributors specification. Contributions of any kind welcome!
