
Everything you need for the Uyghur language.

Project description

Uyghur Text Processing


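Motivation

My research project requires natural language processing of Uyghur text. Uyghur is written right to left, which often makes manual handling awkward (selecting and copying, for example; manual handling should be avoided, but some cases still call for it). Mixing it with left-to-right text causes further problems: for example, two numbers that both appear on the right side of the display may actually sit one at the start of the sentence and one at the end. In addition, many letters take different shapes in initial, medial, final, and isolated positions, so systems without a font that supports joining need a separately defined character for each form. To make life easier for a humble beginner like me, I needed a small tool that converts Old Uyghur script (the Arabic-based script) into Latin Uyghur script.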
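The two issues described above can be observed with Python's standard library alone. The following is a minimal sketch, not part of this package: the helper names `bidi_classes` and `fold_presentation_forms` are hypothetical, and the Arabic letter BEH is used only because its positional presentation forms are easy to cite. Text is stored in logical order, so digits that a renderer places on the right may be the first characters of the string, and the four positional shapes of a letter exist as separate compatibility code points that NFKC normalization maps back to the base letter.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, bidirectional class) pairs in logical (stored) order."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


def fold_presentation_forms(text):
    """Map positional presentation forms back to their base letters via NFKC."""
    return unicodedata.normalize("NFKC", text)


# Digits, an Arabic-script letter (BEH, U+0628), then digits again.
# The stored order is what a program sees, regardless of how it is displayed.
mixed = "12 \u0628 34"
for ch, cls in bidi_classes(mixed):
    print(repr(ch), cls)

# U+FE8F..U+FE92 are the isolated, final, initial and medial forms of BEH.
forms = "\ufe8f\ufe90\ufe91\ufe92"
print(fold_presentation_forms(forms) == "\u0628" * 4)
```

The digits report class `EN` and the letter reports `AL`, which is the information a bidirectional renderer uses to reorder text for display. Note that NFKC also folds other compatibility characters (full-width forms, for instance), so it may be too aggressive for some pipelines.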

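Terminology

UEY: Old Uyghur script (Arabic-based), the only official alphabet in Xinjiang, China, used in public media and in daily life.

UKY: Cyrillic Uyghur script, used in Central Asia, especially Kazakhstan.

ULY: Latin Uyghur script, introduced in 2008 and used only as an auxiliary writing system in computing contexts. Now that UEY keyboards are widely available on all devices, it has largely fallen out of use.

UYY: New Uyghur script (also called Pinyin Yeziⱪi or UPNY). This alphabet is also Latin-based, but most people who want to type in Latin letters now use ULY instead.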

Installation

$ pip install uyghur

Usage

from uyghur.conversion import uey2uly

# Convert a UEY (Arabic-script) string to ULY (Latin-script) and print it.
print(uey2uly('پلام، جهان'))
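To give a feel for what a script conversion of this kind involves, here is a toy, table-driven sketch. It is not the package's implementation and does not follow the full conversion rules: `TOY_UEY_TO_ULY` and `toy_uey2uly` are hypothetical names, the table covers only a handful of letters, and context-dependent rules (such as the hamza carrier or digraph disambiguation) are ignored.

```python
# Hypothetical, deliberately partial letter table (illustration only).
TOY_UEY_TO_ULY = {
    "\u0627": "a",   # ARABIC LETTER ALEF
    "\u0628": "b",   # ARABIC LETTER BEH
    "\u062a": "t",   # ARABIC LETTER TEH
    "\u0633": "s",   # ARABIC LETTER SEEN
    "\u0634": "sh",  # ARABIC LETTER SHEEN
    "\u0644": "l",   # ARABIC LETTER LAM
    "\u0645": "m",   # ARABIC LETTER MEEM
    "\u0646": "n",   # ARABIC LETTER NOON
    "\u060c": ",",   # ARABIC COMMA
}


def toy_uey2uly(text):
    """Replace each known character; pass everything else through unchanged."""
    return "".join(TOY_UEY_TO_ULY.get(ch, ch) for ch in text)


# SEEN, ALEF, LAM, ALEF, MEEM in logical order.
print(toy_uey2uly("\u0633\u0627\u0644\u0627\u0645"))  # prints "salam"
```

A character-by-character lookup like this works on the logical order of the string, so the right-to-left display order never enters the picture. For real text, use `uey2uly` from the package.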

Testing

Tested on CPython 3.8/3.9/3.10 and PyPy 3.8. To run the tests yourself, execute the following command:

tox

TODO

  • UEY2ULY
  • ULY2UEY
  • UEY2UKY
  • UKY2UEY
  • UEY2UYY
  • UYY2UEY
  • TEXT2SPEECH

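References

  1. DB65/T 3690-2015 现行维吾尔文与拉丁维吾尔文编码字符转换规则 (Conversion rules for coded characters between the current Uyghur script and the Latin Uyghur script)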

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

uyghur-0.1.1.tar.gz (43.0 kB view details)

Uploaded Source

Built Distribution

uyghur-0.1.1-py3-none-any.whl (29.2 kB view details)

Uploaded Python 3

File details

Details for the file uyghur-0.1.1.tar.gz.

File metadata

  • Download URL: uyghur-0.1.1.tar.gz
  • Upload date:
  • Size: 43.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.6.0 importlib_metadata/4.8.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.63.1 CPython/3.10.4

File hashes

Hashes for uyghur-0.1.1.tar.gz
Algorithm Hash digest
SHA256 e569c604d317608d4cc4c93902401a8cb18c77ec8bf4f28209d92f4eaf661bff
MD5 eef94e0ce05d2883796c32fe200ec3ac
BLAKE2b-256 747b0ee5307a58a2ae92d2b07fdff748d940badb4e1615e802548748b11d947e

See more details on using hashes here.

File details

Details for the file uyghur-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: uyghur-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 29.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.6.0 importlib_metadata/4.8.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.63.1 CPython/3.10.4

File hashes

Hashes for uyghur-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 1eb6c42da7392f63968f5850e211e8c86453c0ae59f0d1ea501be40044184e7e
MD5 72edc31d54e16739117550b64d7bab01
BLAKE2b-256 4a0a558f2df20e282c05ed5f1d1421b8e1f8155598dc70efde20773699b3d8c8

