
Python access to ICU text collation

Project Description

This package provides a Python interface to the International Components for Unicode (ICU) library.

Change History

1.0.2 (2006-10-16)

Fixed setup file problems.

1.0.1 (2006-10-16)

Added missing import to setup.py.

1.0 (2006-10-16)

Initial version.

Installation

zope.ucol is installed via setup.py in the usual way.

You must have ICU installed. If ICU isn’t installed in the usual places for include files and libraries on your system, you can provide command-line options to setup.py when building the extensions, as in:

python2.4 setup.py build_ext \
  -I/home/jim/p/z4i/jim-icu/var/opt/icu/include \
  -L/home/jim/p/z4i/jim-icu/var/opt/icu/lib \
  -R/home/jim/p/z4i/jim-icu/var/opt/icu/lib

python2.4 setup.py install

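Note that if the libraries are in an unusual place, you will want to specify their location using the -R option. This records the library path in the built extension, so you don’t have to supply it at run time (for example, through an environment variable such as LD_LIBRARY_PATH).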

Detailed Documentation

Locale-based text collation using ICU

The zope.ucol package provides a minimal Pythonic wrapper around the u_col C API of the International Components for Unicode (ICU) library. It provides locale-based text collation.

To perform collation, you need to create a collator key factory for your locale. We’ll use the special “root” locale in this example:

>>> import zope.ucol
>>> collator = zope.ucol.Collator("root")

The collator has a key method for creating collation keys from Unicode strings. This method can be passed as the key argument to list.sort or to the built-in sorted function:

>>> sorted([u'Sam', u'sally', u'Abe', u'alice', u'Terry', u'tim',
...        u'\U00023119', u'\u62d5'], key=collator.key)
[u'Abe', u'alice', u'sally', u'Sam', u'Terry', u'tim',
 u'\u62d5', u'\U00023119']
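
The same key-function pattern extends to sorting records by one of their fields. The following is a minimal sketch, not part of zope.ucol: sort_by_name and the sample records are hypothetical, and str.casefold is used only as a placeholder for collator.key so that the example runs without ICU installed.

```python
def sort_by_name(records, key_func):
    """Sort dict records by their "name" field using the given key function.

    key_func is expected to behave like collator.key: it takes a string
    and returns an object that compares in the desired order.
    """
    return sorted(records, key=lambda record: key_func(record["name"]))


# Hypothetical sample data, for illustration only.
people = [
    {"name": "Sam", "id": 1},
    {"name": "alice", "id": 2},
    {"name": "Abe", "id": 3},
]

# str.casefold is a placeholder; with ICU available one would pass
# collator.key here instead.
ordered = sort_by_name(people, str.casefold)
names = [record["name"] for record in ordered]
```

With a real collator, the call would be sort_by_name(people, collator.key).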

There is also a cmp method for comparing two Unicode strings, which can likewise be used when sorting:

>>> sorted([u'Sam', u'sally', u'Abe', u'alice', u'Terry', u'tim',
...        u'\U00023119', u'\u62d5'], collator.cmp)
[u'Abe', u'alice', u'sally', u'Sam', u'Terry', u'tim',
 u'\u62d5', u'\U00023119']

Note that it is almost always more efficient to pass the key method to sorting functions than the cmp method, because a key is computed once per string, whereas cmp is called once per comparison. The cmp method is more efficient only in the special case where the strings are few and long and tend to differ near their beginnings. In that case, computing an entire key can be much more expensive than a comparison that can determine the order by examining a small portion of each string.
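
The difference between the two approaches can be made concrete by counting calls. The following is a minimal sketch written for Python 3, where sorted no longer accepts a comparison function positionally and functools.cmp_to_key is needed instead. CountingCollator is a hypothetical stand-in, not zope.ucol.Collator: it orders strings by a crude case-folded tuple rather than by ICU rules, and exists only so the example runs without ICU installed.

```python
import functools


class CountingCollator:
    """Hypothetical stand-in for a collator; does NOT use ICU rules."""

    def __init__(self):
        self.key_calls = 0
        self.cmp_calls = 0

    def key(self, text):
        # Called once per string when passed as the sort key.
        self.key_calls += 1
        return (text.casefold(), text)

    def cmp(self, a, b):
        # Called once per comparison made by the sort algorithm.
        self.cmp_calls += 1
        ka, kb = (a.casefold(), a), (b.casefold(), b)
        return (ka > kb) - (ka < kb)


words = ['Sam', 'sally', 'Abe', 'alice', 'Terry', 'tim']
fake = CountingCollator()

# Key-based sort: one key computation per element.
by_key = sorted(words, key=fake.key)

# Comparison-based sort: wrap cmp so Python 3's sorted can use it.
by_cmp = sorted(words, key=functools.cmp_to_key(fake.cmp))

print(by_key == by_cmp, fake.key_calls, fake.cmp_calls)
```

Both sorts produce the same order, but key runs exactly once per string, while the number of cmp calls grows with the number of comparisons the sort performs. With a real collator, the equivalent Python 3 call would presumably be sorted(words, key=functools.cmp_to_key(collator.cmp)), assuming the installed release supports Python 3.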

Collator attributes

You can ask a collator for its locale:

>>> collator.locale
'root'

and you can find out whether default collation information was used:

>>> collator.used_default_information
0
>>> collator = zope.ucol.Collator("eek")
>>> collator.used_default_information
1
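
One possible use of this attribute is to try several locale names and keep the first collator that did not fall back to default information. The following is a minimal sketch under stated assumptions: first_specific_collator and fake_factory are hypothetical names, not part of zope.ucol, and the stand-in factory merely mimics the attribute values shown above so that the example runs without ICU installed.

```python
from types import SimpleNamespace


def first_specific_collator(factory, locales):
    """Return the first collator that did not use default information.

    factory is any callable that takes a locale name and returns an
    object with a used_default_information attribute. Returns None if
    every candidate fell back to default information.
    """
    for name in locales:
        collator = factory(name)
        if not collator.used_default_information:
            return collator
    return None


def fake_factory(name):
    # Hypothetical stand-in: reports 0 for "root" and 1 for anything
    # else, mirroring the two values shown in the example above.
    return SimpleNamespace(
        locale=name,
        used_default_information=int(name != "root"),
    )


chosen = first_specific_collator(fake_factory, ["eek", "root"])
nothing = first_specific_collator(fake_factory, ["eek"])
```

With ICU available, zope.ucol.Collator could be passed as the factory in place of fake_factory.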