
Python bindings for ICU4C


icupy


Python bindings for ICU4C using pybind11.

Changes from ICU4C

  • Naming Conventions

    Renamed functions, methods, and C++ enumerators to conform to PEP 8.

    • Function Names: Use lower_case_with_underscores style.
    • Method Names: Use lower_case_with_underscores style. Also, use one leading underscore only for protected methods.
    • C++ Enumerators: Use UPPER_CASE_WITH_UNDERSCORES style without a leading "k" (e.g., kDateOffset → DATE_OFFSET).
    • APIs that match Python reserved words: e.g.,
      • with() → with_()
  • Error Handling

    • Unlike the C/C++ APIs, which report failures through a UErrorCode output parameter, icupy raises the icupy.icu.ICUError exception when an error code indicates a failure.

      You can access the icu::ErrorCode object from ICUError.args[0]. For example:

      from icupy import icu
      try:
          ...
      except icu.ICUError as e:
          print(e.args[0])  # → icupy.icu.ErrorCode
          print(e.args[0].get())  # → icupy.icu.UErrorCode
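
The naming rules above can be sketched in plain Python. This is only an illustration: the helpers `to_snake_case`, `to_enumerator`, and `to_method_name` are hypothetical and not part of icupy, and the regular expression is an assumption that happens to reproduce the names shown in this README; it may not match every binding.

```python
import keyword
import re


def to_snake_case(name: str) -> str:
    """Insert "_" before a capital that follows a lowercase letter or digit."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def to_enumerator(name: str) -> str:
    """Drop a leading "k" and use UPPER_CASE_WITH_UNDERSCORES."""
    if len(name) > 1 and name[0] == "k" and name[1].isupper():
        name = name[1:]
    return to_snake_case(name).upper()


def to_method_name(name: str) -> str:
    """Convert to snake_case; append "_" if the result is a Python keyword."""
    snake = to_snake_case(name)
    return snake + "_" if keyword.iskeyword(snake) else snake


print(to_snake_case("createTimeZone"))       # → create_time_zone
print(to_snake_case("ucnv_setToUCallBack"))  # → ucnv_set_to_ucall_back
print(to_enumerator("kDateOffset"))          # → DATE_OFFSET
print(to_method_name("with"))                # → with_
```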
      

Examples

  • icu::UnicodeString with error callback

    from icupy import icu
    cnv = icu.ucnv_open('utf-8')
    action = icu.UCNV_TO_U_CALLBACK_ESCAPE
    context = icu.ConstVoidPtr(icu.UCNV_ESCAPE_C)
    icu.ucnv_set_to_ucall_back(cnv, action, context)
    utf8 = b'\x61\xfe\x62'  # Impossible bytes
    s = icu.UnicodeString(utf8, -1, cnv)
    str(s)  # → 'a\\xFEb'
    
    action = icu.UCNV_TO_U_CALLBACK_ESCAPE
    context = icu.ConstVoidPtr(icu.UCNV_ESCAPE_XML_DEC)
    icu.ucnv_set_to_ucall_back(cnv, action, context)
    s = icu.UnicodeString(utf8, -1, cnv)
    str(s)  # → 'a&#254;b'
    
  • icu::UnicodeString with user callback

    from icupy import icu
    def _to_callback(
        _context: object,
        _args: icu.UConverterToUnicodeArgs,
        _code_units: bytes,
        _length: int,
        _reason: icu.UConverterCallbackReason,
        _error_code: icu.UErrorCode,
    ) -> icu.UErrorCode:
        if _reason == icu.UCNV_ILLEGAL:
            _source = ''.join(['%{:02X}'.format(x) for x in _code_units])
            icu.ucnv_cb_to_uwrite_uchars(_args, _source, len(_source), 0)
            _error_code = icu.U_ZERO_ERROR
        return _error_code
    
    cnv = icu.ucnv_open('utf-8')
    action = icu.UConverterToUCallbackPtr(_to_callback)
    context = icu.ConstVoidPtr(None)
    icu.ucnv_set_to_ucall_back(cnv, action, context)
    utf8 = b'\x61\xfe\x62'  # Impossible bytes
    s = icu.UnicodeString(utf8, -1, cnv)
    str(s)  # → 'a%FEb'
    
  • icu::DateFormat

    from icupy import icu
    tz = icu.TimeZone.create_time_zone('America/Los_Angeles')
    fmt = icu.DateFormat.create_instance_for_skeleton('yMMMMd', icu.Locale.get_english())
    fmt.set_time_zone(tz)
    dest = icu.UnicodeString()
    s = fmt.format(0, dest)
    str(s)  # → 'December 31, 1969'
    
  • icu::MessageFormat

    from icupy import icu
    fmt = icu.MessageFormat(
        "At {1,time,::jmm} on {1,date,::dMMMM}, "
        "there was {2} on planet {0,number}.",
        icu.Locale.get_us(),
    )
    tz = icu.TimeZone.get_gmt()
    subfmts = fmt.get_formats()
    subfmts[0].set_time_zone(tz)
    subfmts[1].set_time_zone(tz)
    date = 1637685775000.0  # 2021-11-23T16:42:55Z
    obj = icu.Formattable(
        [
            icu.Formattable(7),
            icu.Formattable(date, icu.Formattable.IS_DATE),
            icu.Formattable(icu.UnicodeString('a disturbance in the Force')),
        ]
    )
    dest = icu.UnicodeString()
    s = fmt.format(obj, dest)
    str(s)  # → 'At 4:42 PM on November 23, there was a disturbance in the Force on planet 7.'
    
  • icu::number::NumberFormatter

    from icupy import icu
    fmt = icu.number.NumberFormatter.with_().unit(icu.MeasureUnit.get_meter()).per_unit(icu.MeasureUnit.get_second())
    print(fmt.locale(icu.Locale.get_us()).format_double(3000).to_string())  # → '3,000 m/s'
    print(fmt.locale(icu.Locale.get_france()).format_double(3000).to_string())  # → '3 000 m/s'
    print(fmt.locale('ar').format_double(3000).to_string())  # → '٣٬٠٠٠ م/ث'
    
  • icu::BreakIterator

    from icupy import icu
    text = icu.UnicodeString('In the meantime Mr. Weston arrived with his small ship.')
    bi = icu.BreakIterator.create_sentence_instance(icu.Locale('en'))
    bi.set_text(text)
    list(bi)  # → [20, 55]
    # filter based on common English language abbreviations
    bi = icu.BreakIterator.create_sentence_instance(icu.Locale('en@ss=standard'))
    bi.set_text(text)
    list(bi)  # → [55]
    
  • icu::IDNA (UTS #46)

    from icupy import icu
    uts46 = icu.IDNA.create_uts46_instance(icu.UIDNA_NONTRANSITIONAL_TO_ASCII)
    dest = icu.UnicodeString()
    info = icu.IDNAInfo()
    uts46.name_to_ascii(icu.UnicodeString('faß.ExAmPlE'), dest, info)
    info.get_errors()  # → 0
    str(dest)  # → 'xn--fa-hia.example'
    
  • For more examples, see tests.
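
Some of the expected outputs above can be cross-checked with the Python standard library alone. The sketch below assumes that ICU dates are milliseconds since 1970-01-01T00:00:00Z and that Los Angeles was at UTC-8 at that instant. The helpers `udate_to_datetime` and `to_a_label` are hypothetical, and `to_a_label` only lowercases and applies Punycode; it does not perform the UTS #46 mapping or validation that icu.IDNA does.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def udate_to_datetime(udate_ms: float, tz: timezone = timezone.utc) -> datetime:
    """Convert milliseconds since the Unix epoch to an aware datetime."""
    return (EPOCH + timedelta(milliseconds=udate_ms)).astimezone(tz)


def to_a_label(label: str) -> str:
    """Lowercase a label and Punycode-encode it if it is not pure ASCII."""
    lowered = label.lower()
    if lowered.isascii():
        return lowered
    return "xn--" + lowered.encode("punycode").decode("ascii")


# DateFormat example: date 0 formatted for America/Los_Angeles (fixed UTC-8 here).
pst = timezone(timedelta(hours=-8))
print(udate_to_datetime(0, pst).date().isoformat())  # → 1969-12-31

# MessageFormat example: the date passed to icu.Formattable.
print(udate_to_datetime(1637685775000.0).isoformat())  # → 2021-11-23T16:42:55+00:00

# IDNA example: label-by-label conversion of 'faß.ExAmPlE'.
print(".".join(to_a_label(part) for part in "faß.ExAmPlE".split(".")))  # → xn--fa-hia.example
```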

Installation

Prerequisites

Installing prerequisites

  • Windows

    Install the following dependencies.

  • Linux

    To install dependencies, run the following command:

    • Ubuntu/Debian:

      sudo apt install g++ cmake libicu-dev python3-dev python3-pip
      
    • Fedora:

      sudo dnf install gcc-c++ cmake icu libicu-devel python3-devel
      

    If your system's ICU is out of date, consider building ICU4C from source or installing a pre-built ICU4C binary package.

Building icupy from source

  1. Configuring environment variables:

    • Windows:

      • Set the ICU_ROOT environment variable to the root of the ICU installation (default is C:\icu). For example, if ICU is installed in C:\icu4c:

        set ICU_ROOT=C:\icu4c
        

        or in PowerShell:

        $env:ICU_ROOT = "C:\icu4c"
        
      • To verify settings using icuinfo (64-bit):

        %ICU_ROOT%\bin64\icuinfo
        

        or in PowerShell:

        & $env:ICU_ROOT\bin64\icuinfo
        
    • Linux:

      • If ICU is installed in a non-standard location, set the PKG_CONFIG_PATH and LD_LIBRARY_PATH environment variables. For example, if ICU is installed in /usr/local:

        export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:$PKG_CONFIG_PATH
        export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
        
      • To verify settings using pkg-config:

        $ pkg-config --cflags --libs icu-uc
        -I/usr/local/include -L/usr/local/lib -licuuc -licudata
        
  2. Installing from PyPI:

    pip install icupy
    

    Optionally, CMake environment variables are available. For example, using the Ninja build system and Clang:

    CMAKE_GENERATOR=Ninja CXX=clang++ pip install icupy
    

    Alternatively, install the development version from the git repository:

    pip install git+https://github.com/miute/icupy.git
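
On Linux, the pkg-config check from step 1 can also be scripted before running pip. The following is a minimal sketch, assuming pkg-config may or may not be on PATH; the helper `icu_version_from_pkg_config` is hypothetical and not part of icupy.

```python
import shutil
import subprocess
from typing import Optional


def icu_version_from_pkg_config(module: str = "icu-uc") -> Optional[str]:
    """Return the version pkg-config reports for `module`, or None if unavailable."""
    if shutil.which("pkg-config") is None:
        return None  # pkg-config itself is not installed
    result = subprocess.run(
        ["pkg-config", "--modversion", module],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None  # module not found on PKG_CONFIG_PATH
    return result.stdout.strip()


version = icu_version_from_pkg_config()
if version is None:
    print("ICU not found via pkg-config; check PKG_CONFIG_PATH")
else:
    print("Found ICU", version)
```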
    

Usage

  1. Configuring environment variables:

    • Windows:

      • Set the ICU_ROOT environment variable to the root of the ICU installation (default is C:\icu). For example, if ICU is installed in C:\icu4c:

        set ICU_ROOT=C:\icu4c
        

        or in PowerShell:

        $env:ICU_ROOT = "C:\icu4c"
        
    • Linux:

      • If ICU is installed in a non-standard location, set the LD_LIBRARY_PATH environment variable. For example, if ICU is installed in /usr/local:

        export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
        
  2. Using icupy:

    import icupy.icu as icu
    # or
    from icupy import icu
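
If the import fails, the environment variables from step 1 are the first thing to check. The sketch below is one possible preflight check, not something icupy provides: `icu_root_from_env` and `try_import_icupy` are hypothetical helpers, and the fallback value simply mirrors the default stated above (C:\icu).

```python
import importlib
import os


def icu_root_from_env(environ=os.environ) -> str:
    """Return ICU_ROOT, falling back to the default stated in this README."""
    return environ.get("ICU_ROOT", r"C:\icu")


def try_import_icupy():
    """Import icupy.icu, returning None (with a hint) if it cannot be loaded."""
    try:
        return importlib.import_module("icupy.icu")
    except ImportError as exc:
        print("Could not import icupy.icu:", exc)
        if os.name == "nt":
            print("ICU_ROOT resolves to:", icu_root_from_env())
        else:
            print("LD_LIBRARY_PATH is:", os.environ.get("LD_LIBRARY_PATH", "(unset)"))
        return None


icu = try_import_icupy()
if icu is not None:
    print(str(icu.UnicodeString("icupy is ready")))
```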
    

License

This project is licensed under the MIT License.
