Python bindings for ICU4C
icupy
Python bindings for ICU4C using pybind11.
Changes from ICU4C
-
Naming Conventions
Renamed functions, methods, and C++ enumerators to conform to PEP 8.
- Function Names: Use lower_case_with_underscores style.
- Method Names: Use lower_case_with_underscores style. Also, use one leading underscore only for protected methods.
- C++ Enumerators: Use UPPER_CASE_WITH_UNDERSCORES style without a leading "k" (e.g., kDateOffset → DATE_OFFSET).
- APIs that match Python reserved words: e.g., with() → with_()
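The renaming rules above can be illustrated with a small standalone sketch. The helper names (to_snake_case, python_function_name, python_enumerator_name) are hypothetical and are not part of icupy; the regular expressions are a common camelCase-to-snake_case idiom and are only assumed to approximate how the bindings were named, so they may not reproduce every icupy name.

```python
import keyword
import re


def to_snake_case(name: str) -> str:
    """Convert a camelCase identifier to lower_case_with_underscores."""
    # Split before a capitalized word, e.g. "createTime" -> "create_Time".
    s = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    # Split between a lowercase letter or digit and an uppercase letter.
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s)
    return s.lower()


def python_function_name(icu_name: str) -> str:
    """Apply the function/method rule, appending "_" to reserved words."""
    snake = to_snake_case(icu_name)
    return snake + "_" if keyword.iskeyword(snake) else snake


def python_enumerator_name(icu_name: str) -> str:
    """Apply the enumerator rule: drop a leading "k" and use upper case."""
    if len(icu_name) > 1 and icu_name[0] == "k" and icu_name[1].isupper():
        icu_name = icu_name[1:]
    return to_snake_case(icu_name).upper()


print(python_function_name("createTimeZone"))  # → create_time_zone
print(python_function_name("with"))            # → with_
print(python_enumerator_name("kDateOffset"))   # → DATE_OFFSET
```

When looking for the Python counterpart of an ICU4C API, applying these rules by hand is usually enough; names containing runs of capital letters are best confirmed against the icupy tests.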
-
Error Handling
-
Unlike the C/C++ APIs, icupy does not receive a UErrorCode error code. Instead, it raises the icupy.icu.ICUError exception if an error code indicates a failure.
You can access the icu::ErrorCode object from ICUError.args[0]. For example:
    from icupy import icu

    try:
        ...
    except icu.ICUError as e:
        print(e.args[0])  # → icupy.icu.ErrorCode
        print(e.args[0].get())  # → icupy.icu.UErrorCode
-
Examples
-
icu::UnicodeString with error callback
    from icupy import icu

    cnv = icu.ucnv_open('utf-8')
    action = icu.UCNV_TO_U_CALLBACK_ESCAPE
    context = icu.ConstVoidPtr(icu.UCNV_ESCAPE_C)
    icu.ucnv_set_to_ucall_back(cnv, action, context)
    utf8 = b'\x61\xfe\x62'  # Impossible bytes
    s = icu.UnicodeString(utf8, -1, cnv)
    str(s)  # → 'a\\xFEb'

    # Same input, escaped as an XML decimal character reference
    action = icu.UCNV_TO_U_CALLBACK_ESCAPE
    context = icu.ConstVoidPtr(icu.UCNV_ESCAPE_XML_DEC)
    icu.ucnv_set_to_ucall_back(cnv, action, context)
    s = icu.UnicodeString(utf8, -1, cnv)
    str(s)  # → 'a&#254;b'
-
icu::UnicodeString with user callback
    from icupy import icu

    def _to_callback(
        _context: object,
        _args: icu.UConverterToUnicodeArgs,
        _code_units: bytes,
        _length: int,
        _reason: icu.UConverterCallbackReason,
        _error_code: icu.UErrorCode,
    ) -> icu.UErrorCode:
        if _reason == icu.UCNV_ILLEGAL:
            _source = ''.join(['%{:02X}'.format(x) for x in _code_units])
            icu.ucnv_cb_to_uwrite_uchars(_args, _source, len(_source), 0)
            _error_code = icu.U_ZERO_ERROR
        return _error_code

    cnv = icu.ucnv_open('utf-8')
    action = icu.UConverterToUCallbackPtr(_to_callback)
    context = icu.ConstVoidPtr(None)
    icu.ucnv_set_to_ucall_back(cnv, action, context)
    utf8 = b'\x61\xfe\x62'  # Impossible bytes
    s = icu.UnicodeString(utf8, -1, cnv)
    str(s)  # → 'a%FEb'
-
icu::DateFormat
    from icupy import icu

    tz = icu.TimeZone.create_time_zone('America/Los_Angeles')
    fmt = icu.DateFormat.create_instance_for_skeleton('yMMMMd', icu.Locale.get_english())
    fmt.set_time_zone(tz)
    dest = icu.UnicodeString()
    s = fmt.format(0, dest)
    str(s)  # → 'December 31, 1969'
-
icu::MessageFormat
    from icupy import icu

    fmt = icu.MessageFormat(
        "At {1,time,::jmm} on {1,date,::dMMMM}, "
        "there was {2} on planet {0,number}.",
        icu.Locale.get_us(),
    )
    tz = icu.TimeZone.get_gmt()
    subfmts = fmt.get_formats()
    subfmts[0].set_time_zone(tz)
    subfmts[1].set_time_zone(tz)
    date = 1637685775000.0  # 2021-11-23T16:42:55Z
    obj = icu.Formattable(
        [
            icu.Formattable(7),
            icu.Formattable(date, icu.Formattable.IS_DATE),
            icu.Formattable(icu.UnicodeString('a disturbance in the Force')),
        ]
    )
    dest = icu.UnicodeString()
    s = fmt.format(obj, dest)
    str(s)  # → 'At 4:42 PM on November 23, there was a disturbance in the Force on planet 7.'
-
icu::number::NumberFormatter
    from icupy import icu

    fmt = icu.number.NumberFormatter.with_().unit(icu.MeasureUnit.get_meter()).per_unit(icu.MeasureUnit.get_second())
    print(fmt.locale(icu.Locale.get_us()).format_double(3000).to_string())  # → '3,000 m/s'
    print(fmt.locale(icu.Locale.get_france()).format_double(3000).to_string())  # → '3 000 m/s'
    print(fmt.locale('ar').format_double(3000).to_string())  # → '٣٬٠٠٠ م/ث'
-
icu::BreakIterator
    from icupy import icu

    text = icu.UnicodeString('In the meantime Mr. Weston arrived with his small ship.')
    bi = icu.BreakIterator.create_sentence_instance(icu.Locale('en'))
    bi.set_text(text)
    list(bi)  # → [20, 55]

    # filter based on common English language abbreviations
    bi = icu.BreakIterator.create_sentence_instance(icu.Locale('en@ss=standard'))
    bi.set_text(text)
    list(bi)  # → [55]
-
icu::IDNA (UTS #46)
    from icupy import icu

    uts46 = icu.IDNA.create_uts46_instance(icu.UIDNA_NONTRANSITIONAL_TO_ASCII)
    dest = icu.UnicodeString()
    info = icu.IDNAInfo()
    uts46.name_to_ascii(icu.UnicodeString('faß.ExAmPlE'), dest, info)
    info.get_errors()  # → 0
    str(dest)  # → 'xn--fa-hia.example'
-
For more examples, see tests.
Installation
Prerequisites
- Python >=3.9
- ICU4C (ICU - The International Components for Unicode) (>=70 recommended)
- C++17 compatible compiler (see supported compilers)
- CMake >=3.7
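Before building, part of this list can be checked from Python itself. The following is a minimal sketch, not part of icupy: the function names are hypothetical, and it only checks the interpreter version and whether a cmake executable is on PATH. It does not check the CMake version, the compiler, or ICU4C.

```python
import shutil
import sys


def python_is_supported(version_info=sys.version_info, minimum=(3, 9)):
    """Return True if the interpreter meets the minimum Python version."""
    return tuple(version_info[:2]) >= minimum


def cmake_on_path():
    """Return True if a `cmake` executable can be found on PATH."""
    return shutil.which("cmake") is not None


if __name__ == "__main__":
    print("Python >= 3.9:", python_is_supported())
    print("CMake on PATH:", cmake_on_path())
```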
Installing prerequisites
-
Windows
Install the following dependencies.
- Python >=3.9
- Pre-built ICU4C binary package (>=70 recommended)
- Visual Studio 2015 Update 3 or newer. Visual Studio 2019 or newer recommended
- CMake >=3.7
- Note: Add CMake to the system PATH.
-
Linux
To install dependencies, run the following command:
-
Ubuntu/Debian:
    sudo apt install g++ cmake libicu-dev python3-dev python3-pip
-
Fedora:
    sudo dnf install gcc-c++ cmake icu libicu-devel python3-devel
If your system's ICU is out of date, consider building ICU4C from source or installing a pre-built ICU4C binary package.
-
Building icupy from source
-
Configuring environment variables:
-
Windows:
-
Set the ICU_ROOT environment variable to the root of the ICU installation (default is C:\icu). For example, if the ICU is located in C:\icu4c:
    set ICU_ROOT=C:\icu4c
or in PowerShell:
    $env:ICU_ROOT = "C:\icu4c"
-
To verify settings using icuinfo (64-bit):
    %ICU_ROOT%\bin64\icuinfo
or in PowerShell:
    & $env:ICU_ROOT\bin64\icuinfo
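The same lookup can be sketched in Python, for example in a build helper script. This is an illustration only: icuinfo_path is a hypothetical function, and treating an unset ICU_ROOT as C:\icu simply mirrors the default stated above rather than icupy's actual build logic.

```python
import os
from pathlib import PureWindowsPath


def icuinfo_path(env=os.environ, default_root=r"C:\icu"):
    """Return the expected 64-bit icuinfo location under ICU_ROOT."""
    root = env.get("ICU_ROOT", default_root)
    # PureWindowsPath keeps Windows path semantics on any platform.
    return PureWindowsPath(root) / "bin64" / "icuinfo"


print(icuinfo_path({"ICU_ROOT": r"C:\icu4c"}))  # → C:\icu4c\bin64\icuinfo
print(icuinfo_path({}))                         # → C:\icu\bin64\icuinfo
```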
-
-
Linux:
-
If the ICU is located in a non-standard location, set the PKG_CONFIG_PATH and LD_LIBRARY_PATH environment variables. For example, if the ICU is located in /usr/local:
    export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:$PKG_CONFIG_PATH
    export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
-
To verify settings using pkg-config:
    $ pkg-config --cflags --libs icu-uc
    -I/usr/local/include -L/usr/local/lib -licuuc -licudata
-
-
-
Installing from PyPI:
    pip install icupy
Optionally, CMake environment variables are available. For example, using the Ninja build system and Clang:
    CMAKE_GENERATOR=Ninja CXX=clang++ pip install icupy
Alternatively, install the development version from the git repository:
    pip install git+https://github.com/miute/icupy.git
Usage
-
Configuring environment variables:
-
Windows:
-
Set the ICU_ROOT environment variable to the root of the ICU installation (default is C:\icu). For example, if the ICU is located in C:\icu4c:
    set ICU_ROOT=C:\icu4c
or in PowerShell:
    $env:ICU_ROOT = "C:\icu4c"
-
-
Linux:
-
If the ICU is located in a non-standard location, set the LD_LIBRARY_PATH environment variable. For example, if the ICU is located in /usr/local:
    export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
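On Linux, one rough way to check whether the ICU shared libraries are discoverable is the standard library's ctypes.util.find_library. The sketch below is not part of icupy; find_icu_libraries is a hypothetical helper, the library names are taken from the usual ICU4C layout (icuuc, icui18n, icudata), and find_library is only a heuristic, so a None result suggests a problem without proving one.

```python
from ctypes.util import find_library


def find_icu_libraries(names=("icuuc", "icui18n", "icudata")):
    """Map each ICU library name to what find_library reports (or None)."""
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, found in find_icu_libraries().items():
        print(f"{name}: {found or 'not found'}")
```

If a library is reported as not found, revisit the LD_LIBRARY_PATH setting before importing icupy.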
-
-
-
Using icupy:
    import icupy.icu as icu
    # or
    from icupy import icu
License
This project is licensed under the MIT License.