Text sorting function for the Czech language
Czech Sort
This is a pure-Python library for Czech-language alphabetical sorting.
Quick Use
From Python:
>>> import czech_sort
>>> czech_sort.sorted(['sídliště', 'shoda', 'schody'])
['shoda', 'schody', 'sídliště']
>>> sorted(['sídliště', 'shoda', 'schody'], key=czech_sort.key)
['shoda', 'schody', 'sídliště']
On the command line:
$ python -m czech_sort < file.txt
shoda
schody
sídliště
Why another sorting library?
To sort Python strings in the Czech language, there are three other options:
- Use PyICU. This can sort really well, and do all kinds of wonderful, standards-compliant Unicode things. Perfect for publication-quality results. Unfortunately, ICU can be a major pain to install, making it overkill if you just want to sort a list of strings.
- Set the locale, then use locale.strxfrm. (Yes, strxfrm! Try saying that ten times fast!) This depends on the Czech POSIX locale being available, so it's hardly portable.
- Just use Python's built-in string sort. This sorts lexicographically by Unicode codepoints. It might be good enough for you? Maybe?
Scope
The czech-sort library is a compromise. It should give you good results in the 99% case.
Do not use this if you need proper sorting of symbols, non-Latin scripts, or diacritics other than Czech/Slovak.
Any other deviation from the relevant standard, ČSN 97 6030, should be considered a bug. However, neither the author nor the community at large has access to the standard, which makes finding such bugs somewhat difficult.
Full API
czech_sort.sorted(iterable)
Takes an iterable of strings, and returns a list of them, sorted.
czech_sort.key(s)
Returns a sort key object for a given string.
This function is suitable as the key for functions like the built-in sorted or list.sort.
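As a rough illustration of using the key with list.sort on something other than bare strings, the sketch below sorts a list of records by one field. The records and the czech_key name are made up for this example, and the ImportError fallback exists only so the snippet runs where the library is not installed; the fallback does not give Czech ordering.

```python
try:
    import czech_sort
    czech_key = czech_sort.key
except ImportError:
    # Assumption for this sketch only: fall back to plain code-point
    # order so the example still runs without the library installed.
    def czech_key(s):
        return s

# Hypothetical records; only the 'name' field is used for ordering.
entries = [
    {'name': 'sídliště', 'page': 12},
    {'name': 'shoda', 'page': 7},
    {'name': 'schody', 'page': 3},
]

# list.sort sorts in place. Its key function receives a whole record,
# so wrap czech_key to pick out the string that should be compared.
entries.sort(key=lambda entry: czech_key(entry['name']))

names = [entry['name'] for entry in entries]
print(names)
```

With czech-sort installed, the names should come out in the same order as in the Quick Use example.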
Compatibility
The czech-sort library can be used with Python 2.6+ and 3.5+.
Under Python 2, it only accepts unicode strings.
Installation
Install this into your virtualenv by running:
$ pip install czech-sort
Contribute
Bug reports and comments are welcome at GitHub.
Patches are also welcome! Source code is hosted at GitHub:
$ git clone http://github.com/encukou/czech-sort
To run the included tests:
$ python -m pip install -e.[test]
$ python -m pytest
If you would like to contribute, but are confused by the above, then please e-mail encukou at gmail dot com.
License
The project is licensed under the MIT license. May it serve you well.
Changelog
1.0.0 (2020-09-14)
No code changes. Since this has been stable for five years I decided to call it 1.0.
- Packaging improvements
- Tested with Python 2.7 and 3.5-3.9
0.4 (2015-09-05)
- First general release