汉字五笔转换模块/工具

These details have not been verified by PyPI

Project links

Project description

pywubi - 汉字转五笔编码

python_version

English

一个将汉字转换为五笔编码的 Python 工具库。当前支持 86 版 五笔编码，内置约 21,004 个汉字的码表。

特性

单字转码：将单个汉字转换为五笔编码
词组转码：按五笔词组规则（二字、三字、四字及以上）生成编码
多编码查询：返回汉字的所有可能编码
反查功能：根据五笔编码反查对应汉字
模糊反查：不确定的字根可用 z 代替，进行模糊查询猜字
简码查询：获取汉字的最短简码及简码级别
用户词组词库：自定义词组的添加、查询、删除与反查，持久化到独立文件
混合文本：自动分割汉字与非汉字，标点符号原样保留
零依赖：无任何第三方依赖

安装

pip install pywubi

快速开始

from pywubi import wubi

# 逐字转码（默认模式）
wubi('我爱你')
# ['trnt', 'epdc', 'wqiy']

# 返回所有可能的编码
wubi('我爱你', multicode=True)
# [['trnt', 'trn', 'q'], ['epdc', 'epd', 'ep'], ['wqiy', 'wqi', 'wq']]

# 以词组方式取码
wubi('我爱你', single=False)
# ['tewq']

# 混合文本，标点原样保留
wubi('天气不错，出去走走!')
# ['gdi', 'rnb', 'gii', 'qajg', '，', 'bmt', 'fcu', 'tfht', 'tfht', '!']

API 文档

`wubi(hans, multicode=False, single=True)`

将汉字字符串转换为五笔编码。

参数	类型	默认值	说明
`hans`	`str`	-	汉字字符串
`multicode`	`bool`	`False`	是否返回所有编码
`single`	`bool`	`True`	`True` 表示逐字转码，`False` 表示词组取码

返回：list，五笔编码列表

`single_wubi(han, multicode=False)`

将单个汉字转换为五笔编码。

参数	类型	默认值	说明
`han`	`str`	-	单个汉字
`multicode`	`bool`	`False`	是否返回所有编码

返回：str（单编码）或 list[str]（多编码）

`combine_wubi(hans)`

将词组转换为五笔编码。

参数	类型	说明
`hans`	`str`	汉字词组

返回：str，词组的五笔编码

取码规则：

二字词：各取前两码（共 4 码）
三字词：前两字各取首码 + 第三字取前两码（共 4 码）
四字及以上：取第 1、2、3、末字的首码（共 4 码）

`lookup(char)`

查询单个汉字的所有五笔编码。

from pywubi import lookup

lookup('为')   # ['ylyi', 'yly', 'yl', 'o']
lookup('?')    # []

`reverse_lookup(code)`

根据五笔编码反查汉字。

from pywubi import reverse_lookup

reverse_lookup('trnt')  # ['我']
reverse_lookup('q')     # ['我']
reverse_lookup('ggll')  # ['一']

`fuzzy_reverse_lookup(code, limit=10)`

根据五笔编码模糊查询汉字，不确定的字根可用 z 代替。

五笔 86 版编码仅使用 a-y，z 键天然空闲，可作为通配符匹配任意字根。输入不含 z 时等同于精确反查；输入长度决定匹配编码长度。

参数	类型	默认值	说明
`code`	`str`	-	五笔编码，不确定的位置用 `z`/`Z` 代替
`limit`	`int`	`10`	最大返回数量，`0` 表示不限制

返回：list[tuple[str, str]]，格式为 [(汉字, 匹配编码), ...]，按编码字母序排列

from pywubi import fuzzy_reverse_lookup

fuzzy_reverse_lookup('vz')       # [('姑', 'vd'), ('灵', 'vo'), ...]
fuzzy_reverse_lookup('zzzg')     # 只知道末码是 g，返回末码为 g 的四码字
fuzzy_reverse_lookup('trnt')     # 无 z，退化为精确反查
fuzzy_reverse_lookup('zz', limit=5)  # 限制返回 5 条

`brief_code(char)`

获取汉字的最短简码。

from pywubi import brief_code

brief_code('我')  # 'q'
brief_code('一')  # 'g'
brief_code('?')   # None

`brief_level(char)`

获取汉字的简码级别（1=一级简码，2=二级简码，3=三级简码，4=全码）。

from pywubi import brief_level

brief_level('我')  # 1
brief_level('一')  # 1
brief_level('〇')  # 4
brief_level('?')   # None

`add_user_phrase(word, *, dict_path=None)`

添加用户词组，自动生成五笔编码并保存到用户词组文件。

只需输入中文词组（≥2 字），程序会校验输入、生成编码并持久化保存。用户词组文件与内置码表完全隔离，不会影响官方数据。

参数	类型	默认值	说明
`word`	`str`	-	中文词组（至少 2 个字）
`dict_path`	`str \| None`	`None`	可选，指定用户词组文件路径

返回：str，生成的编码

from pywubi import add_user_phrase

add_user_phrase('爱你')            # 'epwq'
add_user_phrase('中华人民共和国')  # 'kwwl'
add_user_phrase('我爱你')          # 'tewq'

用户词组默认保存位置：

平台	路径
Windows	`%APPDATA%\pywubi\user_phrases.json`
macOS	`~/Library/Application Support/pywubi/user_phrases.json`
Linux	`~/.config/pywubi/user_phrases.json`

也可通过环境变量 PYWUBI_USER_DICT 或 dict_path 参数自定义路径。

`lookup_phrase(word, *, dict_path=None)`

查询用户词组的编码。

from pywubi import add_user_phrase, lookup_phrase

add_user_phrase('爱你')
lookup_phrase('爱你')   # 'epwq'
lookup_phrase('你好')   # None（未添加过）

`remove_user_phrase(word, *, dict_path=None)`

删除用户词组。返回 True 表示删除成功，False 表示词组不存在。

from pywubi import add_user_phrase, remove_user_phrase

add_user_phrase('爱你')
remove_user_phrase('爱你')  # True
remove_user_phrase('爱你')  # False（已不存在）

`list_user_phrases(*, dict_path=None)`

列出所有用户词组。返回 dict[str, str]，格式为 {"词组": "编码", ...}。

from pywubi import add_user_phrase, list_user_phrases

add_user_phrase('爱你')
add_user_phrase('你好')
list_user_phrases()  # {'爱你': 'epwq', '你好': 'wqvb'}

`reverse_lookup_phrase(code, *, dict_path=None)`

根据编码反查用户词组。同一编码可对应多个词组（重码），按列表返回。

from pywubi import add_user_phrase, reverse_lookup_phrase

add_user_phrase('爱你')
reverse_lookup_phrase('epwq')  # ['爱你']
reverse_lookup_phrase('xxxx')  # []

开发

# 安装开发依赖
pip install -e ".[dev]"

# 运行测试
pytest

更新日志

0.3.0

新增用户词组词库功能，支持自定义词组的添加、查询、删除与反查
新增 add_user_phrase() 输入中文词组自动生成编码并保存
新增 lookup_phrase() 查询用户词组编码
新增 remove_user_phrase() 删除用户词组
新增 list_user_phrases() 列出所有用户词组
新增 reverse_lookup_phrase() 按编码反查用户词组
用户词组持久化到独立 JSON 文件，不污染内置码表
支持跨平台默认路径、环境变量和参数自定义路径

0.2.0

码表存储从 Python 源码改为 JSON，加载更快、体积更小
新增懒加载机制，import pywubi 不再立即加载码表
新增 lookup() 查询单字所有编码
新增 reverse_lookup() 编码反查汉字
新增 brief_code() 获取最短简码
新增 brief_level() 获取简码级别
添加完整单元测试

0.1.0

修复 single_seg 尾部非汉字字符丢失的问题
修正拼写错误（utlis -> utils，conbin_wubi -> combine_wubi）
所有导入改为包内相对导入
添加类型注解（Type Hints）
添加 .gitignore，清理 .idea/ 目录
修正 README 拼写错误

0.0.2

初始发布版本

License

MIT License，详见 LICENSE。

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.3.0

Apr 14, 2026

0.2.0

Apr 10, 2026

0.0.2

Jan 2, 2019

0.0.1

Jan 2, 2019

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pywubi-0.3.0.tar.gz (138.4 kB view details)

Uploaded Apr 14, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

pywubi-0.3.0-py3-none-any.whl (134.9 kB view details)

Uploaded Apr 14, 2026 Python 3

File details

Details for the file pywubi-0.3.0.tar.gz.

File metadata

Download URL: pywubi-0.3.0.tar.gz
Upload date: Apr 14, 2026
Size: 138.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for pywubi-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`5f2d36a6490549ff23b1a888ed49ba0e0e8160efd783abfc32e19057bfa84f9e`
MD5	`e5d82d53ad3ebbbab5fdc1e18ad6bc2c`
BLAKE2b-256	`85dafdc7983030700fe9481b657db8277b9d7edda3fd606c793378b4399892d4`

See more details on using hashes here.

File details

Details for the file pywubi-0.3.0-py3-none-any.whl.

File metadata

Download URL: pywubi-0.3.0-py3-none-any.whl
Upload date: Apr 14, 2026
Size: 134.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for pywubi-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b619ef4d29cdda96e1f5b007f3a7318f75287f2682b53c04458ca485cf498470`
MD5	`69bb724ac3f1f6be59e7b8f0e451985a`
BLAKE2b-256	`2a3b391c1ee258f8a06e06ce296d2c8c1f2ff3bc2054d2844909e016b6a93f58`

See more details on using hashes here.

pywubi 0.3.0

Navigation

Verified details

Maintainers

Meta

Unverified details

Project links

Meta

Classifiers

Project description

pywubi - 汉字转五笔编码

特性

安装

快速开始

API 文档

wubi(hans, multicode=False, single=True)

single_wubi(han, multicode=False)

combine_wubi(hans)

lookup(char)

reverse_lookup(code)

fuzzy_reverse_lookup(code, limit=10)

brief_code(char)

brief_level(char)

add_user_phrase(word, *, dict_path=None)

lookup_phrase(word, *, dict_path=None)

remove_user_phrase(word, *, dict_path=None)

list_user_phrases(*, dict_path=None)

reverse_lookup_phrase(code, *, dict_path=None)

开发

更新日志

0.3.0

0.2.0

0.1.0

0.0.2

License

Project details

Verified details

Maintainers

Meta

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

`wubi(hans, multicode=False, single=True)`

`single_wubi(han, multicode=False)`

`combine_wubi(hans)`

`lookup(char)`

`reverse_lookup(code)`

`fuzzy_reverse_lookup(code, limit=10)`

`brief_code(char)`

`brief_level(char)`

`add_user_phrase(word, *, dict_path=None)`

`lookup_phrase(word, *, dict_path=None)`

`remove_user_phrase(word, *, dict_path=None)`

`list_user_phrases(*, dict_path=None)`

`reverse_lookup_phrase(code, *, dict_path=None)`