Variant Character (Itaiji) Normalization Module
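Converts variant characters (itaiji) such as 「髙」 (hashigo-daka, the "ladder" form of 高) and 「﨑」 (tatsu-saki, the 立 form of 崎) into their standard forms, that is, characters included in the JIS character set.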
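As a rough illustration of what "included in the JIS character set" can mean in practice, the sketch below checks whether a string can be encoded with Python's built-in `iso2022_jp` codec. Using ISO-2022-JP encodability as the criterion is an assumption made for illustration only (it is suggested by the name of the bundled resource file), and the helper `is_jis_encodable` is hypothetical and not part of this package.

```python
def is_jis_encodable(text: str) -> bool:
    """Return True if every character in text can be encoded as ISO-2022-JP.

    Assumption: ISO-2022-JP encodability stands in for "included in the
    JIS character set"; the package itself may use a different criterion.
    """
    try:
        text.encode("iso2022_jp")
    except UnicodeEncodeError:
        # At least one character has no ISO-2022-JP representation.
        return False
    return True


# Print each sample next to the result of the check.
for sample in ("髙橋", "高橋", "山﨑", "山崎"):
    print(sample, is_jis_encodable(sample))
```

Strings for which this check fails are the kind of input that variant normalization is meant to handle.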
Installation
$ pip install ja-cvu-normalizer
Usage
from ja_cvu_normalizer.ja_cvu_normalizer import JaCvuNormalizer
text = "髙橋"
ja_cvu_normalizer = JaCvuNormalizer()
print(ja_cvu_normalizer.normalize(text))
# -> 高橋
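To make the idea behind this kind of normalization concrete, here is a minimal table-lookup sketch that does not depend on the package. It is not the package's implementation: the two-entry mapping `VARIANT_TO_STANDARD` and the function `normalize_variants` are hypothetical names introduced for illustration, and the mapping the package actually uses is not reproduced here.

```python
# Hypothetical two-entry table, for illustration only.
VARIANT_TO_STANDARD = {
    "髙": "高",  # hashigo-daka -> standard form
    "﨑": "崎",  # tatsu-saki -> standard form
}

# str.maketrans turns the single-character mapping into a table
# that str.translate can apply in one pass.
_TRANSLATION = str.maketrans(VARIANT_TO_STANDARD)


def normalize_variants(text: str) -> str:
    """Replace each known variant character; leave every other character as-is."""
    return text.translate(_TRANSLATION)


print(normalize_variants("髙橋"))  # -> 高橋
print(normalize_variants("山﨑"))  # -> 山崎
```

Characters that are absent from the table pass through unchanged, so text that is already in standard form is returned as-is.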
Acknowledgments
resource/ISO-2022-JP.txt was borrowed from the 異字体変換テーブル (variant character conversion table) repository.