Chinese character to Wubi conversion module/tool
pywubi — Chinese Character to Wubi Encoding
A Python library for converting Chinese characters to Wubi (五笔) input method encoding. Currently supports the 86-version scheme with a built-in dictionary of ~21,004 characters.
Features
- Single-character encoding: convert individual Chinese characters to Wubi codes
- Phrase encoding: generate codes following Wubi phrase rules (2-char, 3-char, 4+ char)
- Multi-code query: return all possible encodings for a character
- Reverse lookup: find characters by Wubi code
- Fuzzy reverse lookup: use `z` in place of unknown radicals to guess characters
- Brief code query: get the shortest code and its level (1st / 2nd / 3rd / full)
- Mixed text: automatically split Chinese and non-Chinese; punctuation is preserved as-is
- Zero dependencies: no third-party packages required
Installation
pip install pywubi
Quick Start
from pywubi import wubi
# Character-by-character (default)
wubi('我爱你')
# ['trnt', 'epdc', 'wqiy']
# Return all possible codes
wubi('我爱你', multicode=True)
# [['trnt', 'trn', 'q'], ['epdc', 'epd', 'ep'], ['wqiy', 'wqi', 'wq']]
# Phrase mode
wubi('我爱你', single=False)
# ['tewq']
# Mixed text — punctuation preserved
wubi('天气不错,出去走走!')
# ['gdi', 'rnb', 'gii', 'qajg', ',', 'bmt', 'fcu', 'tfht', 'tfht', '!']
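The mixed-text behaviour shown in the last example can be pictured as a segmentation pass that separates runs of Chinese characters from everything else before any lookup happens. The following is a minimal sketch of that idea, not pywubi's actual implementation: the helper name `split_mixed` and the choice of the range U+4E00 to U+9FFF as "Chinese" are assumptions made for illustration.

```python
import re

# Assumption: "Chinese" means the CJK Unified Ideographs block U+4E00..U+9FFF.
# The library's real character test may differ.
_SEGMENT = re.compile(r"[\u4e00-\u9fff]+|[^\u4e00-\u9fff]+")


def split_mixed(text: str) -> list[tuple[str, bool]]:
    """Split text into (segment, is_chinese) runs, preserving their order."""
    return [
        (match.group(), "\u4e00" <= match.group()[0] <= "\u9fff")
        for match in _SEGMENT.finditer(text)
    ]


for segment, is_chinese in split_mixed("天气不错,出去走走!"):
    # Chinese runs would go on to the Wubi lookup; other runs pass through unchanged.
    print(segment, is_chinese)
```

A converter built on this split would encode only the segments flagged as Chinese and copy the remaining segments to the output unchanged.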
API Reference
wubi(hans, multicode=False, single=True)
Convert a Chinese string to Wubi encodings.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `hans` | `str` | (required) | Chinese character string |
| `multicode` | `bool` | `False` | Return all possible codes |
| `single` | `bool` | `True` | `True` for char-by-char, `False` for phrase mode |
Returns: list of Wubi codes. With multicode=True, each character's entry is itself a list of codes, as in the Quick Start example.
single_wubi(han, multicode=False)
Convert a single Chinese character to Wubi encoding.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `han` | `str` | (required) | A single Chinese character |
| `multicode` | `bool` | `False` | Return all possible codes |
Returns: str (single code) or list[str] (multiple codes)
combine_wubi(hans)
Convert a phrase to Wubi encoding.
| Parameter | Type | Description |
|---|---|---|
| `hans` | `str` | Chinese phrase |
Returns: str — Wubi code for the phrase
Encoding rules:
- 2-char phrase: first 2 codes of each character (4 codes total)
- 3-char phrase: 1st code of char 1 & 2 + first 2 codes of char 3 (4 codes total)
- 4+ char phrase: 1st code of char 1, 2, 3, and last (4 codes total)
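The three rules above can be sketched as a small function that works on the full code of each character. This is an illustrative sketch rather than the library's own code: the name `phrase_code` is hypothetical, and the input is assumed to be a list of per-character full codes that has already been looked up.

```python
def phrase_code(full_codes: list[str]) -> str:
    """Combine per-character full codes into a single phrase code.

    `full_codes` holds the full Wubi code of each character, in phrase order.
    """
    count = len(full_codes)
    if count < 2:
        raise ValueError("a phrase needs at least two characters")
    if count == 2:
        # 2-char phrase: first two keys of each character.
        return full_codes[0][:2] + full_codes[1][:2]
    if count == 3:
        # 3-char phrase: first key of chars 1 and 2, first two keys of char 3.
        return full_codes[0][0] + full_codes[1][0] + full_codes[2][:2]
    # 4+ char phrase: first key of chars 1, 2, 3 and of the last character.
    return (
        full_codes[0][0] + full_codes[1][0] + full_codes[2][0] + full_codes[-1][0]
    )


# Full codes for 我, 爱, 你 as listed in Quick Start.
print(phrase_code(["trnt", "epdc", "wqiy"]))  # tewq
```

The result for 我爱你 agrees with the phrase-mode output shown in Quick Start.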
lookup(char)
Look up all Wubi codes for a single character.
from pywubi import lookup
lookup('为') # ['ylyi', 'yly', 'yl', 'o']
lookup('?') # []
reverse_lookup(code)
Reverse-lookup characters by Wubi code.
from pywubi import reverse_lookup
reverse_lookup('trnt') # ['我']
reverse_lookup('q') # ['我']
reverse_lookup('ggll') # ['一']
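One common way to support this kind of query is to invert a character-to-codes table once and then answer lookups from the inverted index. The sketch below assumes that approach; it is not taken from pywubi's source, the names `FORWARD` and `build_reverse_index` are hypothetical, and the toy table only contains entries quoted elsewhere in this README.

```python
from collections import defaultdict

# Toy forward table (character -> codes), using entries quoted in this README.
FORWARD = {
    "我": ["trnt", "trn", "q"],
    "为": ["ylyi", "yly", "yl", "o"],
}


def build_reverse_index(forward: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert a character -> codes table into a code -> characters table."""
    reverse = defaultdict(list)
    for char, codes in forward.items():
        for code in codes:
            reverse[code].append(char)
    return dict(reverse)


REVERSE = build_reverse_index(FORWARD)
print(REVERSE.get("q", []))     # ['我']
print(REVERSE.get("zzzz", []))  # [] for a code that is not in the table
```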
fuzzy_reverse_lookup(code, limit=10)
Fuzzy reverse lookup of characters by Wubi code, using z for unknown radical keys.
Wubi 86 uses only the keys a to y, so the unused z key serves as a wildcard that matches any single radical key. When the input contains no z, the function behaves the same as an exact reverse lookup. The input length determines the length of the matched codes: a two-key pattern such as vz matches only two-key codes.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `code` | `str` | (required) | Wubi code; use z/Z for unknown positions |
| `limit` | `int` | `10` | Max results to return; 0 for unlimited |
Returns: list[tuple[str, str]] — [(character, matched_code), ...] sorted by code
from pywubi import fuzzy_reverse_lookup
fuzzy_reverse_lookup('vz') # [('姑', 'vd'), ('灵', 'vo'), ...]
fuzzy_reverse_lookup('zzzg') # only last key is 'g', find all 4-code chars ending in g
fuzzy_reverse_lookup('trnt') # no z — degrades to exact reverse lookup
fuzzy_reverse_lookup('zz', limit=5) # limit to 5 results
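The wildcard semantics described for this function can be illustrated by turning the pattern into a regular expression in which each `z` stands for any key from `a` to `y`. This is a sketch under stated assumptions, not the library's implementation: `fuzzy_match` and `TABLE` are hypothetical names, and the toy table only holds codes that appear in this README.

```python
import re

# Toy reverse table (code -> characters); entries come from examples in this README.
TABLE = {
    "vd": ["姑"],
    "vo": ["灵"],
    "q": ["我"],
    "trnt": ["我"],
}


def fuzzy_match(code: str, table: dict[str, list[str]], limit: int = 10):
    """Return (character, matched_code) pairs sorted by code; 'z' is a wildcard."""
    # Each 'z' becomes "any key a-y"; every other key must match literally.
    pattern = re.compile(
        "".join("[a-y]" if key == "z" else re.escape(key) for key in code.lower())
    )
    hits = [
        (char, candidate)
        for candidate in sorted(table)
        if pattern.fullmatch(candidate)  # fullmatch enforces equal length
        for char in table[candidate]
    ]
    return hits if limit == 0 else hits[:limit]


print(fuzzy_match("vz", TABLE))    # [('姑', 'vd'), ('灵', 'vo')]
print(fuzzy_match("trnt", TABLE))  # [('我', 'trnt')]
```

Because `fullmatch` has to consume the whole candidate, a two-key pattern never matches a four-key code, which mirrors the rule that input length fixes the matched code length.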
brief_code(char)
Get the shortest (brief) code for a character.
from pywubi import brief_code
brief_code('我') # 'q'
brief_code('一') # 'g'
brief_code('?') # None
brief_level(char)
Get the brief-code level (1 = 1st-level, 2 = 2nd-level, 3 = 3rd-level, 4 = full code).
from pywubi import brief_level
brief_level('我') # 1
brief_level('一') # 1
brief_level('〇') # 4
brief_level('?') # None
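The examples for `brief_code` and `brief_level` are consistent with a simple reading in which the brief code is the shortest code a character has and the level is that code's length. The sketch below encodes that reading only; it is an assumption, not the library's confirmed logic, and `shortest_code` and `code_level` are hypothetical helpers. In particular, how pywubi reports a character whose full code has fewer than four keys is not stated in this README.

```python
from typing import Optional


def shortest_code(codes: list[str]) -> Optional[str]:
    """Return the shortest code in the list, or None when there are no codes."""
    return min(codes, key=len) if codes else None


def code_level(codes: list[str]) -> Optional[int]:
    """Assumed rule: the level equals the length of the shortest code."""
    shortest = shortest_code(codes)
    return None if shortest is None else len(shortest)


# Codes for 为 as returned by lookup('为') above.
codes_for_wei = ["ylyi", "yly", "yl", "o"]
print(shortest_code(codes_for_wei))  # o
print(code_level(codes_for_wei))     # 1
print(code_level([]))                # None, as for an unknown character
```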
Development
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
Changelog
0.2.0
- Dictionary storage changed from Python source to JSON, for faster loading and smaller size
- Added lazy loading: `import pywubi` no longer loads the full dictionary immediately
- Added `lookup()` to query all codes for a character
- Added `reverse_lookup()` to find characters by code
- Added `brief_code()` to get the shortest code
- Added `brief_level()` to get the brief-code level
- Added comprehensive unit tests
0.1.0
- Fixed `single_seg` bug where trailing non-Chinese characters were lost
- Fixed typos (`utlis` → `utils`, `conbin_wubi` → `combine_wubi`)
- Switched to relative imports within the package
- Added type hints
- Added `.gitignore`, removed `.idea/` from tracking
- Fixed README typos
0.0.2
- Initial release
PyPI Account Verification
I am the owner of the PyPI account "sfyc23" and the maintainer of this repository: https://github.com/sfyc23/python-wubi
I am currently requesting account recovery for the PyPI project/package "pywubi".
This note is added to help PyPI administrators verify that I still control the source repository associated with the package.
GitHub profile: https://github.com/sfyc23
PyPI project: https://pypi.org/project/pywubi/
Date: 2026-03-30
License
MIT License — see LICENSE for details.
File details
Details for the file pywubi-0.2.0.tar.gz.
File metadata
- Download URL: pywubi-0.2.0.tar.gz
- Upload date:
- Size: 134.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `9221527c9242f88e396a2de0e7d56bbb95d981ae340b43b30f981a888d032c51` |
| MD5 | `b5903266d7d1acdf4116b7616f7e98d2` |
| BLAKE2b-256 | `0ff678d337e9f9db96eafb1aa1422a6d16f36e05785b2f918a7995a57af65214` |
File details
Details for the file pywubi-0.2.0-py3-none-any.whl.
File metadata
- Download URL: pywubi-0.2.0-py3-none-any.whl
- Upload date:
- Size: 131.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `a33bd1c300e8ee44a7ae1de3b56833ef01e8a81e3723e91dd44e8eadc829f548` |
| MD5 | `7ce73e6d38f524e8836126ceb2b3c6b6` |
| BLAKE2b-256 | `a5e2bcde610417b20a8987c1bd210c981f206a6a8c0bbd88a537a7b47ec96f19` |