Simple Python package for getting Japanese readings (yomigana) using MeCab

MeCab Text Cleaner

This is a simple Python package for getting Japanese readings (yomigana) and accents using MeCab. Consider pyopenjtalk (no accents) or pyopenjtalk_g2p_prosody from ESPnet (with accents) as alternatives, because this package does not account for accent changes in compound words.

Installation

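Install this via pip or pipx (or your favourite package manager). The extras are quoted so that shells such as zsh do not treat the square brackets as a glob pattern:

pipx install "mecab-text-cleaner[unidecode,unidic]"
pip install "mecab-text-cleaner[unidecode,unidic]"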

Usage

> mtc いい天気ですね。
イ]ー テ]ンキ デス ネ。
> mtc いい天気ですね。 --ascii
i] te]nki desu ne.
> mtc いい天気ですね --no-add-atype --no-add-blank-between-words
イーテンキデスネ
> mtc いい天気ですね --no-add-atype --no-add-blank-between-words -r kana
イイテンキデスネ

from mecab_text_cleaner import to_reading, to_ascii_clean

assert to_reading("     空、雲。\n雨!(") == "ソ]ラ、 ク]モ。\nア]メ!("
assert to_ascii_clean("      한空、雲。\n雨!(") == "han so]ra, ku]mo. \na]me!("
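
The readings above carry a `]` marker inside accented words. As a minimal sketch, a reading string can be post-processed into plain words plus the marker position. It assumes that `]` marks the accent position, which is an interpretation of the sample output rather than documented behaviour. The helper name `parse_accents` is hypothetical and is not part of mecab_text_cleaner; it works on plain strings and does not call MeCab.

```python
from typing import List, Optional, Tuple


def parse_accents(reading: str, marker: str = "]") -> List[Tuple[str, Optional[int]]]:
    """Split a reading into (word, marker_index) pairs.

    marker_index is the number of characters before the first marker in the
    word, or None when the word carries no marker. Hypothetical helper, not
    part of mecab_text_cleaner.
    """
    pairs = []
    # Words are separated by whitespace in the default output;
    # punctuation stays attached to the word it follows.
    for word in reading.split():
        pos = word.find(marker)  # -1 when the marker is absent
        index = pos if pos >= 0 else None
        pairs.append((word.replace(marker, ""), index))
    return pairs


print(parse_accents("ソ]ラ、 ク]モ。"))
```

The index counts characters, not morae, so ASCII output such as `te]nki` yields 2 even though the marker follows a single mora. This only works when blanks between words are kept, that is, without `--no-add-blank-between-words`.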

Contributors ✨

Thanks go to these wonderful people (emoji key):

This project follows the all-contributors specification. Contributions of any kind are welcome!
