Text sorting function for the Czech language
This is a pure-Python library for Czech-language alphabetical sorting.
    >>> import czech_sort
    >>> czech_sort.sorted(['sídliště', 'shoda', 'schody'])
    ['shoda', 'schody', 'sídliště']
    >>> sorted(['sídliště', 'shoda', 'schody'], key=czech_sort.key)
    ['shoda', 'schody', 'sídliště']
On the command line::
    $ python -m czech_sort < file.txt
    shoda
    schody
    sídliště
Why another sorting library?
To sort Python strings in the Czech language, there are three other options:
- PyICU. This can sort really well, and do all kinds of wonderful, standards-compliant Unicode things. Perfect for publication-quality results. Unfortunately, ICU can be a major pain to install, making it overkill if you just want to sort a list of strings.
- Set the locale, then use locale.strxfrm as the sort key. (strxfrm! Try saying that ten times fast!) This depends on the Czech POSIX locale being available, so it's hardly portable.
- Just use Python's built-in string sort. This sorts lexicographically by Unicode codepoints. It might be good enough for you? Maybe?
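To make the last two options concrete, here is a minimal sketch using only the standard library. The locale name cs_CZ.UTF-8 and the helper locale_sorted are assumptions for illustration; the actual locale name and its availability vary between systems.

```python
import locale

words = ['sídliště', 'shoda', 'schody']

# Built-in sort: compares Unicode code points, so "sc" lands before "sh"
# and the Czech digraph "ch" is not treated as a single letter.
codepoint_order = sorted(words)
print(codepoint_order)  # ['schody', 'shoda', 'sídliště']


def locale_sorted(strings, locale_name='cs_CZ.UTF-8'):
    """Sort with locale.strxfrm; return None if the locale is unavailable.

    The default locale name is an assumption and may differ on your system.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # remember current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None  # the requested locale is not installed here
    try:
        return sorted(strings, key=locale.strxfrm)
    finally:
        # Restore the previous collation locale so other code is unaffected.
        locale.setlocale(locale.LC_COLLATE, previous)


print(locale_sorted(words))
```

The code-point result differs from the Czech order shown at the top of this README, where shoda comes before schody. The locale-based result depends entirely on what the operating system provides.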
The czech-sort library is a compromise. It should give you good results in
the 99% case.
Do not use this if you need proper sorting of symbols, non-Latin scripts, or diacritics other than Czech/Slovak.
Any other deviation from the relevant standard,
ČSN 97 6030, should be
considered a bug. However, neither the author nor the community at large
has access to the standard, which makes finding such bugs somewhat difficult.
czech_sort.sorted
    Takes an iterable of strings, and returns a list of them, sorted.
czech_sort.key
    Returns a sort key object for a given string.
    This function is suitable as the key for functions like the built-in sorted.
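Since the key function is an ordinary callable, it should also work anywhere Python accepts a key argument, such as min or sorting records by one field. The sketch below illustrates this. The helper names and sample records are hypothetical, and the key function is passed in as a parameter so the code runs even without czech-sort installed.

```python
def sort_by_name(records, key):
    """Return (name, city) tuples ordered by name, applying `key` to each name."""
    return sorted(records, key=lambda record: key(record[0]))


def first_alphabetically(strings, key):
    """Return the string that sorts first under `key`."""
    return min(strings, key=key)


# Hypothetical sample data.
people = [('Šimon', 'Brno'), ('Hana', 'Praha'), ('Cyril', 'Olomouc')]

# With czech-sort installed, pass its key function:
#     import czech_sort
#     sort_by_name(people, key=czech_sort.key)
#     first_alphabetically(['shoda', 'schody'], key=czech_sort.key)
```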
czech_sort.bytes_key
    Returns a sort key for a given string, as bytes.
This is suitable as a DB-API custom function like the built-in
WARNING: Do not store the results of this function. The format can change
in future versions of czech-sort.
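One way to use a bytes key from SQL is sketched below, under several assumptions: the DB-API module is the standard library's sqlite3, and the SQL function name czech_sort_key, the table, and the helper are made up for illustration. The key is computed at query time and never written to the database, in line with the warning above.

```python
import sqlite3


def czech_ordered_names(names, bytes_key):
    """Return `names` ordered via ORDER BY on a registered key function.

    `bytes_key` is meant to be czech_sort.bytes_key; it is a parameter here
    only so the sketch stays runnable without the library installed.
    """
    conn = sqlite3.connect(':memory:')
    try:
        # Expose the Python function to SQL under a hypothetical name.
        conn.create_function('czech_sort_key', 1, bytes_key)
        conn.execute('CREATE TABLE words (name TEXT)')
        conn.executemany('INSERT INTO words VALUES (?)', [(n,) for n in names])
        # The key is computed per row while sorting; nothing is stored.
        rows = conn.execute(
            'SELECT name FROM words ORDER BY czech_sort_key(name)')
        return [row[0] for row in rows]
    finally:
        conn.close()


# With czech-sort installed:
#     import czech_sort
#     czech_ordered_names(['sídliště', 'shoda', 'schody'], czech_sort.bytes_key)
```

SQLite compares the returned bytes as BLOB values, byte by byte, which is why a bytes key rather than an arbitrary Python object is needed here.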
The czech-sort library can be used with Python 2.6+ and 3.5+.
Under Python 2, it only accepts
Install this into your
virtualenv by running:
$ pip install czech-sort
Bug reports and comments are welcome at GitHub.
Patches are also welcome! Source code is hosted at GitHub:
$ git clone http://github.com/encukou/czech-sort
To run the included tests:
$ python -m pip install -e.[test]
$ python -m pytest
If you would like to contribute, but are confused by the above,
then please e-mail encukou
The project is licensed under the MIT license. May it serve you well.
- bytes_key (Thanks to @honzajavorek!)
- Drop support for Python 2
- Fix bug that prevented sorting strings that contain 'Ł' and/or 'Ø'. (Thanks to @dark-light-cz for reporting and @jiri-one for the PR!)
No code changes. Since this has been stable for five years I decided to call it 1.0.
- Packaging improvements
- Tested with Python 2.7 and 3.5-3.9
- First general release