Unicode To CP932 Transcoder
UCP9 - Unicode To CP932 Transcoder
A small Python package that helps transcode cp932-incompatible kanji characters to their cp932-compatible equivalents.
This module provides a transcoding service:
- FROM: An arbitrary Unicode character.
- TO: A cp932-compatible, semantically similar but differently encoded version of the same character.
Usage:
import ucp9
ucp9.convert(string, option)
[string]: a string that may contain cp932-incompatible characters.
[option]: how to handle cp932-inconvertible characters.
- "keep": keep the cp932-inconvertible characters. NOTE: the returned string will NOT be cp932-compatible if any such characters are present.
- "remove": remove the cp932-inconvertible characters.
- "replace": (default behaviour) replace the cp932-inconvertible characters with "?".
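The three options can be illustrated with a minimal standard-library sketch. It is not ucp9's implementation and it performs no transcoding: the function name handle_inconvertible and its helper are hypothetical, and every character that fails to encode is simply treated as inconvertible.

```python
def _encodes_to_cp932(ch: str) -> bool:
    """Return True if the single character can be encoded with Python's cp932 codec."""
    try:
        ch.encode("cp932")
        return True
    except UnicodeEncodeError:
        return False


def handle_inconvertible(text: str, option: str = "replace") -> str:
    """Hypothetical stand-in showing what "keep", "remove" and "replace" mean.

    Assumption: any character that does not encode is inconvertible.
    The real package first tries to find a cp932-compatible equivalent.
    """
    if option not in ("keep", "remove", "replace"):
        raise ValueError(f"unknown option: {option!r}")

    out = []
    for ch in text:
        if _encodes_to_cp932(ch) or option == "keep":
            out.append(ch)       # encodable, or kept on request
        elif option == "replace":
            out.append("?")      # substitute a placeholder
        # option == "remove": drop the character
    return "".join(out)


sample = "a\U0001F600b"  # U+1F600 has no cp932 encoding
for opt in ("keep", "remove", "replace"):
    print(opt, ascii(handle_inconvertible(sample, opt)))
```

Only the "remove" and "replace" results are guaranteed to pass through str.encode("cp932") without raising an error.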
Currently supported Unicode character blocks:
- Kangxi Radicals
- Print Standard Character
- Old type
- CJK Radicals Supplement
- Katakana Phonetic Extensions
- CJK Unified Ideographs
- CJK Compatibility Ideographs
- CJK Compatibility Ideographs Supplement
Note:
- cp932-incompatible: characters that cannot be encoded to cp932 with str.encode(), but may have an equivalent cp932-encodable form.
- cp932-inconvertible: characters that cannot be encoded to cp932 and have no cp932-encodable equivalent.
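The distinction between the two terms can be sketched with the standard library alone. This sketch assumes Unicode NFKC normalization as a stand-in for finding an "equivalent" character. That is an illustration only: ucp9's own mapping is not described here and may cover characters that NFKC does not. The name classify is hypothetical.

```python
import unicodedata


def _encodes_to_cp932(text: str) -> bool:
    """Return True if the text can be encoded with Python's cp932 codec."""
    try:
        text.encode("cp932")
        return True
    except UnicodeEncodeError:
        return False


def classify(ch: str) -> str:
    """Classify one character using the definitions in the note above.

    Assumption: NFKC normalization approximates "has an equivalent version".
    """
    if _encodes_to_cp932(ch):
        return "encodable"
    if _encodes_to_cp932(unicodedata.normalize("NFKC", ch)):
        return "incompatible"   # fails as-is, but an equivalent form encodes
    return "inconvertible"      # no encodable form found by this sketch


print(classify("\u4e00"))      # CJK Unified Ideograph U+4E00
print(classify("\u2f00"))      # Kangxi Radical One, NFKC maps it to U+4E00
print(classify("\U0001F600"))  # emoji, no cp932 counterpart
```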