
Python bindings for ICU4C


icupy


Python bindings for ICU4C using pybind11.

Changes from ICU4C

  • Naming Rules

    • Renamed C functions and C++ class methods from mixed case to snake case (e.g., findAndReplace() → find_and_replace()).
    • Renamed C++ enumerators to upper snake case without the "k" prefix (e.g., kDateOffset → DATE_OFFSET).
    • Renamed APIs that match Python reserved words, e.g.,
      • with() → with_()
  • Error Handling

    • Unlike the C/C++ APIs, which report failures through a UErrorCode output parameter, icupy raises an ICUError exception when an error code indicates a failure.

      You can access the icu::ErrorCode object from ICUError.args[0]. For example:

      from icupy import icu
      try:
          ...
      except icu.ICUError as e:
          print(e.args[0])  # → icupy.icu.ErrorCode
          print(e.args[0].get())  # → icupy.icu.UErrorCode
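
The naming rules listed above can be sketched as a small standard-library helper. This is only an illustration of the stated rules, not how icupy generates its names (the bindings are written by hand with pybind11), and the function names to_snake_case and to_enum_name are hypothetical. Names with unusual capitalization may be mapped differently by icupy.

```python
import keyword
import re

# A word boundary is any position where a lowercase letter or digit
# is directly followed by an uppercase letter.
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def to_snake_case(name: str) -> str:
    """Convert a mixed-case function or method name to snake case.

    A trailing underscore is appended when the result collides with a
    Python reserved word.
    """
    snake = _BOUNDARY.sub("_", name).lower()
    return snake + "_" if keyword.iskeyword(snake) else snake


def to_enum_name(name: str) -> str:
    """Convert a C++ enumerator name to upper snake case, dropping a "k" prefix."""
    if len(name) > 1 and name[0] == "k" and name[1].isupper():
        name = name[1:]
    return _BOUNDARY.sub("_", name).upper()


print(to_snake_case("findAndReplace"))  # → find_and_replace
print(to_enum_name("kDateOffset"))      # → DATE_OFFSET
print(to_snake_case("with"))            # → with_
```

A helper like this is mainly useful for guessing the icupy name of an API found in the ICU4C documentation; the tests remain the reference for the actual names.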
      

Examples

  • icu::UnicodeString with error callback

    from icupy import icu
    cnv = icu.ucnv_open('utf-8')
    action = icu.UCNV_TO_U_CALLBACK_ESCAPE
    context = icu.ConstVoidPtr(icu.UCNV_ESCAPE_C)
    icu.ucnv_set_to_ucall_back(cnv, action, context)
    utf8 = b'\x61\xfe\x62'  # Impossible bytes
    s = icu.UnicodeString(utf8, -1, cnv)
    str(s)  # → 'a\\xFEb'
    
    action = icu.UCNV_TO_U_CALLBACK_ESCAPE
    context = icu.ConstVoidPtr(icu.UCNV_ESCAPE_XML_DEC)
    icu.ucnv_set_to_ucall_back(cnv, action, context)
    s = icu.UnicodeString(utf8, -1, cnv)
    str(s)  # → 'a&#254;b'
    
  • icu::UnicodeString with user callback

    from icupy import icu
    def _to_callback(
        _context: object,
        _args: icu.UConverterToUnicodeArgs,
        _code_units: bytes,
        _length: int,
        _reason: icu.UConverterCallbackReason,
        _error_code: icu.UErrorCode,
    ) -> icu.UErrorCode:
        if _reason == icu.UCNV_ILLEGAL:
            _source = ''.join(['%{:02X}'.format(x) for x in _code_units])
            icu.ucnv_cb_to_uwrite_uchars(_args, _source, len(_source), 0)
            _error_code = icu.U_ZERO_ERROR
        return _error_code
    
    cnv = icu.ucnv_open('utf-8')
    action = icu.UConverterToUCallbackPtr(_to_callback)
    context = icu.ConstVoidPtr(None)
    icu.ucnv_set_to_ucall_back(cnv, action, context)
    utf8 = b'\x61\xfe\x62'  # Impossible bytes
    s = icu.UnicodeString(utf8, -1, cnv)
    str(s)  # → 'a%FEb'
    
  • icu::DateFormat

    from icupy import icu
    tz = icu.TimeZone.create_time_zone('America/Los_Angeles')
    fmt = icu.DateFormat.create_instance_for_skeleton('yMMMMd', icu.Locale.get_english())
    fmt.set_time_zone(tz)
    dest = icu.UnicodeString()
    s = fmt.format(0, dest)
    str(s)  # → 'December 31, 1969'
    
  • icu::MessageFormat

    from icupy import icu
    fmt = icu.MessageFormat(
        "At {1,time,::jmm} on {1,date,::dMMMM}, "
        "there was {2} on planet {0,number}.",
        icu.Locale.get_us(),
    )
    tz = icu.TimeZone.get_gmt()
    subfmts = fmt.get_formats()
    subfmts[0].set_time_zone(tz)
    subfmts[1].set_time_zone(tz)
    date = 1637685775000.0  # 2021-11-23T16:42:55Z
    obj = icu.Formattable(
        [
            icu.Formattable(7),
            icu.Formattable(date, icu.Formattable.IS_DATE),
            icu.Formattable(icu.UnicodeString('a disturbance in the Force')),
        ]
    )
    dest = icu.UnicodeString()
    s = fmt.format(obj, dest)
    str(s)  # → 'At 4:42 PM on November 23, there was a disturbance in the Force on planet 7.'
    
  • icu::number::NumberFormatter

    from icupy import icu
    fmt = icu.number.NumberFormatter.with_().unit(icu.MeasureUnit.get_meter()).per_unit(icu.MeasureUnit.get_second())
    print(fmt.locale(icu.Locale.get_us()).format_double(3000).to_string())  # → '3,000 m/s'
    print(fmt.locale(icu.Locale.get_france()).format_double(3000).to_string())  # → '3 000 m/s'
    print(fmt.locale('ar').format_double(3000).to_string())  # → '٣٬٠٠٠ م/ث'
    
  • icu::BreakIterator

    from icupy import icu
    text = icu.UnicodeString('In the meantime Mr. Weston arrived with his small ship.')
    bi = icu.BreakIterator.create_sentence_instance(icu.Locale('en'))
    bi.set_text(text)
    list(bi)  # → [20, 55]
    # filter based on common English language abbreviations
    bi = icu.BreakIterator.create_sentence_instance(icu.Locale('en@ss=standard'))
    bi.set_text(text)
    list(bi)  # → [55]
    
  • icu::IDNA (UTS #46)

    from icupy import icu
    uts46 = icu.IDNA.create_uts46_instance(icu.UIDNA_NONTRANSITIONAL_TO_ASCII)
    dest = icu.UnicodeString()
    info = icu.IDNAInfo()
    uts46.name_to_ascii(icu.UnicodeString('faß.ExAmPlE'), dest, info)
    info.get_errors()  # → 0
    str(dest)  # → 'xn--fa-hia.example'
    
  • For more examples, see tests.
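
The date values in the DateFormat and MessageFormat examples above (0 and 1637685775000.0) are ICU UDate values, that is, milliseconds since 1970-01-01T00:00:00Z stored as a floating-point number. A minimal sketch for converting between UDate and Python datetime objects, using only the standard library, might look like this. The helper names udate_to_datetime and datetime_to_udate are hypothetical and are not part of icupy.

```python
from datetime import datetime, timedelta, timezone

# The epoch that ICU's UDate counts milliseconds from.
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def udate_to_datetime(udate_ms: float, tz: timezone = timezone.utc) -> datetime:
    """Convert milliseconds since the epoch to an aware datetime in tz."""
    return (_EPOCH + timedelta(milliseconds=udate_ms)).astimezone(tz)


def datetime_to_udate(dt: datetime) -> float:
    """Convert an aware datetime to milliseconds since the epoch."""
    return (dt - _EPOCH) / timedelta(milliseconds=1)


print(udate_to_datetime(1637685775000.0).isoformat())  # → 2021-11-23T16:42:55+00:00

# Los Angeles was at UTC-8 on this date, so the epoch falls on the previous day there.
pacific_standard = timezone(timedelta(hours=-8))
print(udate_to_datetime(0, pacific_standard).date())  # → 1969-12-31
```

This explains why formatting 0 with the America/Los_Angeles time zone yields 'December 31, 1969' rather than January 1, 1970.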

Installation

Prerequisites

Installing prerequisites

  • Windows

    Install the following dependencies.

    • Python >=3.7
    • Pre-built ICU4C binary package (>=64 recommended)
    • Visual Studio 2015 Update 3 or newer (Visual Studio 2019 recommended)
    • CMake >=3.7
      • Note: Add CMake to the system PATH.
  • Linux

    To install dependencies, run the following command:

    • Ubuntu/Debian:

      sudo apt install g++ cmake libicu-dev python3-dev python3-pip
      
    • Fedora:

      sudo dnf install gcc-c++ cmake icu libicu-devel python3-devel
      

    If your system's ICU is out of date, consider building ICU4C from source or installing a pre-built ICU4C binary package.
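
To judge whether the system's ICU is out of date, one option is to ask pkg-config for the installed version. The sketch below assumes that pkg-config is installed and that ICU registers the icu-uc module with it. The threshold of 64 is borrowed from the recommendation for the pre-built package in the list above, so treating it as a minimum on Linux is an assumption. All function names here are hypothetical.

```python
import shutil
import subprocess
from typing import Optional

# Assumption: reuse the ">=64 recommended" figure from the prerequisites list.
RECOMMENDED_ICU_MAJOR = 64


def parse_icu_major(version: str) -> int:
    """Return the major version from a string such as '70.1'."""
    return int(version.strip().split(".")[0])


def installed_icu_version() -> Optional[str]:
    """Return the ICU version reported by pkg-config, or None if unavailable."""
    if shutil.which("pkg-config") is None:
        return None
    result = subprocess.run(
        ["pkg-config", "--modversion", "icu-uc"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def is_recent_enough(version: str, minimum: int = RECOMMENDED_ICU_MAJOR) -> bool:
    """Check whether the major version meets the chosen minimum."""
    return parse_icu_major(version) >= minimum


version = installed_icu_version()
if version is None:
    print("ICU was not found through pkg-config")
else:
    print("ICU", version, "recent enough:", is_recent_enough(version))
```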

Building icupy from source

  1. Configuring environment variables:

    • Windows:

      • Set the ICU_ROOT environment variable to the root of the ICU installation (default is C:\icu). For example, if ICU is installed in C:\icu4c:

        set ICU_ROOT=C:\icu4c
        

        or in PowerShell:

        $env:ICU_ROOT = "C:\icu4c"
        
      • To verify settings using icuinfo (64-bit):

        %ICU_ROOT%\bin64\icuinfo
        

        or in PowerShell:

        & $env:ICU_ROOT\bin64\icuinfo
        
    • Linux:

      • If ICU is installed in a non-standard location, set the PKG_CONFIG_PATH and LD_LIBRARY_PATH environment variables. For example, if ICU is installed in /usr/local:

        export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:$PKG_CONFIG_PATH
        export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
        
      • To verify settings using pkg-config:

        $ pkg-config --cflags --libs icu-uc
        -I/usr/local/include -L/usr/local/lib -licuuc -licudata
        
  2. Installing from PyPI:

    pip install icupy
    

    Optionally, you can set CMake environment variables for the build. For example, to use the Ninja build system and Clang:

    CMAKE_GENERATOR=Ninja CXX=clang++ pip install icupy
    

    Alternatively, install the development version from the Git repository:

    pip install git+https://github.com/miute/icupy.git
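
The VAR=value prefix used in the Ninja and Clang command above is POSIX shell syntax and does not work in cmd.exe or PowerShell. As a portable alternative, the same build can be driven from Python. This is a minimal sketch: the helper names are hypothetical, and the only environment variables it sets are CMAKE_GENERATOR and CXX, which are the ones named above.

```python
import os
import subprocess
import sys
from typing import Dict, List, Mapping, Optional


def build_environment(
    base_env: Mapping[str, str],
    generator: Optional[str] = None,
    cxx: Optional[str] = None,
) -> Dict[str, str]:
    """Return a copy of base_env with the optional build variables added."""
    env = dict(base_env)
    if generator:
        env["CMAKE_GENERATOR"] = generator
    if cxx:
        env["CXX"] = cxx
    return env


def pip_install_command(requirement: str = "icupy") -> List[str]:
    """Build a pip command that runs under the current interpreter."""
    return [sys.executable, "-m", "pip", "install", requirement]


def install(requirement: str = "icupy", **build_options: str) -> None:
    """Run pip with the adjusted environment (not called automatically)."""
    env = build_environment(os.environ, **build_options)
    subprocess.check_call(pip_install_command(requirement), env=env)


print(pip_install_command())
```

Calling install(generator="Ninja", cxx="clang++") would then correspond to the shell command shown above.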
    

How to import icupy

  1. Configuring environment variables:

    • Windows:

      • Set the ICU_ROOT environment variable to the root of the ICU installation (default is C:\icu). For example, if ICU is installed in C:\icu4c:

        set ICU_ROOT=C:\icu4c
        

        or in PowerShell:

        $env:ICU_ROOT = "C:\icu4c"
        
    • Linux:

      • If ICU is installed in a non-standard location, set the LD_LIBRARY_PATH environment variable. For example, if ICU is installed in /usr/local:

        export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
        
  2. To use icupy:

    import icupy.icu as icu
    # or
    from icupy import icu
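
If the import fails, the environment variables from step 1 are the first thing to check. The following sketch wraps the import and prints a hint based on those variables. The function names and the hint wording are hypothetical, and the checks cover only ICU_ROOT and LD_LIBRARY_PATH as described above.

```python
import importlib
import os
import sys
from typing import Mapping, Optional


def environment_hint(platform: str, env: Mapping[str, str]) -> str:
    """Describe the environment variable that matters on the given platform."""
    if platform.startswith("win"):
        if not env.get("ICU_ROOT"):
            return "ICU_ROOT is not set; point it at the root of the ICU installation."
        return "ICU_ROOT is set to " + env["ICU_ROOT"] + "; check that the path is correct."
    if platform.startswith("linux"):
        if not env.get("LD_LIBRARY_PATH"):
            return (
                "LD_LIBRARY_PATH is not set; set it if ICU is installed "
                "in a non-standard location."
            )
        return "LD_LIBRARY_PATH is set to " + env["LD_LIBRARY_PATH"] + "."
    return "No hint is available for this platform."


def import_icu(platform: str = sys.platform, env: Optional[Mapping[str, str]] = None):
    """Return the icupy.icu module, or None after printing a hint on failure."""
    if env is None:
        env = os.environ
    try:
        return importlib.import_module("icupy.icu")
    except ImportError as exc:
        print("Could not import icupy.icu:", exc, file=sys.stderr)
        print(environment_hint(platform, env), file=sys.stderr)
        return None
```

On Linux, LD_LIBRARY_PATH is read by the dynamic loader when the process starts, so it needs to be exported in the shell before launching Python rather than assigned through os.environ.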
    

License

This project is licensed under the MIT License.

