Text sorting function for the Czech language

Project Description

This is a pure-Python library for Czech-language alphabetical sorting.

Quick Use

From Python:

>>> import czech_sort

>>> czech_sort.sorted(['sídliště', 'shoda', 'schody'])
['shoda', 'schody', 'sídliště']

>>> sorted(['sídliště', 'shoda', 'schody'], key=czech_sort.key)
['shoda', 'schody', 'sídliště']

On the command line:

$ python -m czech_sort < file.txt
shoda
schody
sídliště

Why another sorting library?

To sort Python strings in the Czech language, there are three other options:

  • Use PyICU. This can sort really well, and do all kinds of wonderful, standards-compliant Unicode things. Perfect for publication-quality results. Unfortunately, ICU can be a major pain to install, making it overkill if you just want to sort a list of strings.
  • Set the locale, then use locale.strxfrm. (Yes, strxfrm! Try saying that ten times fast!) This depends on the Czech POSIX locale being available, so it’s hardly portable.
  • Just use Python’s built-in string sort. This sorts lexicographically by Unicode codepoints. It might be good enough for you? Maybe?
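To make the last two options concrete, here is a minimal sketch using only the standard library. The locale name `cs_CZ.UTF-8` and the helper `sort_with_locale` are assumptions for illustration; the locale name varies between platforms and the locale may not be installed at all, in which case the helper returns None.

```python
import locale

words = ["sídliště", "shoda", "schody"]

# Built-in sort compares Unicode code points, so "schody" comes before
# "shoda" ('c' < 'h'), although Czech treats "ch" as a letter after "h".
codepoint_order = sorted(words)
print(codepoint_order)  # ['schody', 'shoda', 'sídliště']


def sort_with_locale(strings, locale_name="cs_CZ.UTF-8"):
    """Sort with locale.strxfrm; return None if the locale is unavailable.

    Hypothetical helper; the default locale name is an assumption.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        return sorted(strings, key=locale.strxfrm)
    finally:
        # Restore the previous collation setting for the rest of the program.
        locale.setlocale(locale.LC_COLLATE, previous)


print(sort_with_locale(words))
```

The code point result differs from the order shown in Quick Use, where "shoda" precedes "schody". Note also that `locale.setlocale` changes process-wide state, which is one more reason the locale approach can be awkward in library code.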

Scope

The czech-sort library is a compromise. It should give you good results in the 99% case.

Do not use this if you need proper sorting of symbols, non-Latin scripts, or diacritics other than Czech/Slovak.

Any other deviation from the relevant standard, ČSN 97 6030, should be considered a bug. However, neither the author nor the community at large has access to the standard, which makes finding such bugs somewhat difficult.

Full API

czech_sort.sorted(iterable)

Takes an iterable of strings, and returns a list of them, sorted.

czech_sort.key(s)

Returns a sort key object for a given string.

This function is suitable as the key for functions like the built-in sorted or list.sort.

Compatibility

The czech-sort library can be used with Python 2.6+ and 3.3+.

Under Python 2, it only accepts unicode strings.

Installation

Install this into your virtualenv by running:

pip install czech-sort

Contribute

Bug reports and comments are welcome at GitHub.

Patches are also welcome! Source code is hosted at GitHub:

$ git clone http://github.com/encukou/czech-sort

To run the included tests:

$ pip install pytest
$ python -m pytest

If you would like to contribute, but are confused by the above, then please e-mail encukou at gmail dot com.

License

The project is licensed under the MIT license. May it serve you well.

Release History

  • 0.4 (this version)
  • 0.3
  • 0.2
  • 0.1

Download Files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

File Name                        Size    Python Version  File Type  Upload Date
czech_sort-0.4-py3-none-any.whl  8.7 kB  3.4             Wheel      Sep 5, 2015
czech-sort-0.4.tar.gz            6.3 kB                  Source     Sep 5, 2015
