Look up a Unicode character name or code point label, and search Unicode character names.
unicode-charnames
This package is built for Unicode version 16.0, released in September 2024.
The library provides:
- A function to retrieve the character name (the normative “Name” property) or the code point label (for characters without names) for any Unicode character.
- A function to get the code point (in the usual 4- to 6-digit hexadecimal format) for a given Unicode character name. The search is case-sensitive and requires an exact match.
- A function to search for characters by name. The search is case-insensitive but requires an exact substring match.
The generic term “character name” refers to the Unicode character “Name” property value for an encoded character. Code points that do not have character names (unassigned or reserved code points and other special code point types, such as control, private-use, surrogate, and noncharacter code points) are instead identified in the Unicode Standard by constructed code point labels in angle brackets, for example <control-0002>.
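To make the idea of a constructed code point label concrete, here is a minimal sketch that builds one from the general category reported by the standard library's `unicodedata` module. It is not this package's implementation: the helper name `code_point_label` is hypothetical, and the data bundled with your Python interpreter may be older than Unicode 16.0, so the set of unassigned code points can differ.

```python
import unicodedata


def code_point_label(char: str) -> str:
    """Build a label such as <control-0002> for a code point without a name.

    Hypothetical helper for illustration; it relies on the Unicode data
    shipped with the running Python interpreter.
    """
    cp = ord(char)
    category = unicodedata.category(char)
    if category == "Cc":
        kind = "control"
    elif category == "Co":
        kind = "private-use"
    elif category == "Cs":
        kind = "surrogate"
    elif category == "Cn":
        # Noncharacters are U+FDD0..U+FDEF plus the last two code points
        # of every plane; any other unassigned code point is "reserved".
        if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
            kind = "noncharacter"
        else:
            kind = "reserved"
    else:
        raise ValueError(f"U+{cp:04X} has a character name, not a label")
    return f"<{kind}-{cp:04X}>"


for ch in "\u0002\ue000\uffff":
    print(code_point_label(ch))
```

The label always embeds the code point in the same 4- to 6-digit uppercase hexadecimal form used elsewhere in this document.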
Installation and updates
To install the package, run:
pip install unicode-charnames
To upgrade to the latest version, run:
pip install unicode-charnames --upgrade
Unicode character database (UCD) version
To retrieve the version of the Unicode character database in use:
>>> from unicode_charnames import UCD_VERSION
>>> UCD_VERSION
'16.0.0'
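The version matters because Python's built-in `unicodedata` module ships its own, possibly older, copy of the database. The sketch below compares the two versions; the helper names `version_tuple` and `stdlib_is_older` are hypothetical, and the package version is passed in as a plain string (for example the value of `UCD_VERSION`) so that the snippet runs without the package installed.

```python
import unicodedata


def version_tuple(version: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '16.0.0' into (16, 0, 0)."""
    return tuple(int(part) for part in version.split("."))


def stdlib_is_older(package_version: str) -> bool:
    """Return True if the interpreter's Unicode data predates package_version."""
    stdlib_version = version_tuple(unicodedata.unidata_version)
    return stdlib_version < version_tuple(package_version)


# '16.0.0' is the value shown above; pass UCD_VERSION in real use.
print(unicodedata.unidata_version, stdlib_is_older("16.0.0"))
```

Comparing tuples of integers rather than raw strings avoids ordering '9.0.0' after '16.0.0'.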
Example usage
from unicode_charnames import charname, codepoint, search_charnames
# charname
for char in '龠💓\u00E5\u0002':
    print(charname(char))
# CJK UNIFIED IDEOGRAPH-9FA0
# BEATING HEART
# LATIN SMALL LETTER A WITH RING ABOVE
# <control-0002>
# codepoint
for name in [
    'LATIN CAPITAL LETTER E WITH ACUTE',
    'SQUARE ERA NAME REIWA',
    'SUPERCALIFRAGILISTICEXPIALIDOCIOUS'
]:
    print(codepoint(name))
# 00C9
# 32FF
# None
# search_charnames
for x in search_charnames('break'):
    print('\t'.join(x))
# 00A0 NO-BREAK SPACE
# 2011 NON-BREAKING HYPHEN
# 202F NARROW NO-BREAK SPACE
# 4DEA HEXAGRAM FOR BREAKTHROUGH
# FEFF ZERO WIDTH NO-BREAK SPACE
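For readers who want to see what a case-insensitive substring search over character names involves, the following is a rough sketch built on the standard library's `unicodedata` module. It is an assumption-laden stand-in, not how `search_charnames` is implemented: the function name `search_names` and its `limit` parameter are hypothetical, and results depend on the Unicode version bundled with your Python interpreter.

```python
import unicodedata


def search_names(query: str, limit: int = 0x110000) -> list[tuple[str, str]]:
    """Return (hex code point, name) pairs whose name contains query.

    Hypothetical helper: scans code points below `limit` and matches
    case-insensitively by upper-casing the query, since Unicode
    character names are written in upper case.
    """
    needle = query.upper()
    matches = []
    for cp in range(limit):
        # An empty default marks code points that have no character name.
        name = unicodedata.name(chr(cp), "")
        if name and needle in name:
            matches.append((f"{cp:04X}", name))
    return matches


# Restrict the scan to the Basic Multilingual Plane to keep it quick.
for code, name in search_names("break", limit=0x10000):
    print(f"{code}\t{name}")
```

A linear scan like this is simple but slow for repeated queries; a prebuilt name table is the more likely design in practice.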
Related resource
This library is based on Section 4.8, “Name,” in the Unicode Core Specification, version 16.0.0.
Licenses
The code is licensed under the MIT license.
Usage of Unicode data files is subject to the UNICODE TERMS OF USE. Additional rights and restrictions regarding Unicode data files and software are outlined in the Unicode Data Files and Software License, a copy of which is included as UNICODE-LICENSE.