Unicode casefold support for Python 2.

py2casefold

Python 3 has str.casefold(). Python 2 doesn’t. py2casefold brings casefolding support to Python 2.

Installation

pip install py2casefold

Usage

>>> from py2casefold import casefold
>>> print casefold(u"tschüß")
tschüss
>>> casefold(u"ΣίσυφοςfiÆ") == casefold(u"ΣΊΣΥΦΟσFIæ") == u"σίσυφοσfiæ"
True

Note that casefold does not normalize the string. Casefolding and normalization are different operations. For more information, see http://www.w3.org/International/wiki/Case_folding and http://www.w3.org/TR/charmod-norm/.

If you are comparing strings for similarity, you will probably also want to apply one of the Unicode normalization forms (NFC, NFKC, NFD, NFKD) available through Python’s built-in unicodedata.normalize().
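
As a rough illustration of why the two operations are separate, the sketch below compares a precomposed "É" with its decomposed form ("E" followed by a combining acute accent). Casefolding alone leaves them unequal; adding NFC normalization makes them match. The helper name normalized_casefold is hypothetical and not part of py2casefold, and the fallback to str.casefold() is an assumption for Python 3 environments where py2casefold is not installed.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import unicodedata

try:
    from py2casefold import casefold
except ImportError:
    # Assumption: on Python 3 without py2casefold, use the built-in method.
    def casefold(text):
        return text.casefold()


def normalized_casefold(text, form="NFC"):
    """Hypothetical helper: normalize, casefold, then normalize again."""
    folded = casefold(unicodedata.normalize(form, text))
    return unicodedata.normalize(form, folded)


composed = "\u00c9"      # LATIN CAPITAL LETTER E WITH ACUTE
decomposed = "E\u0301"   # LATIN CAPITAL LETTER E + COMBINING ACUTE ACCENT

# Casefolding alone does not reconcile the two representations.
print(casefold(composed) == casefold(decomposed))

# Normalizing as well makes them compare equal.
print(normalized_casefold(composed) == normalized_casefold(decomposed))
```

Which normalization form is appropriate depends on the application; NFC is used here only as an example, and NFKC or NFKD may suit looser matching better.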

Speed

At the moment, this pure Python casefold implementation is significantly (> 20x) slower than the optimized C implementation in Python 3. This can be improved later, but it is currently more than sufficient for basic case folding. As a rough estimate, case folding 100 characters takes ~25μs on an old developer laptop.
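
To get a comparable figure on your own machine, a minimal timing sketch with the standard timeit module might look like the following. The 100-character sample string and the helper name microseconds_per_call are illustrative assumptions, not part of py2casefold, and the fallback to str.casefold() only applies on Python 3 when py2casefold is not installed (in which case it measures the built-in, not this package).

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import timeit

try:
    from py2casefold import casefold
except ImportError:
    # Assumption: on Python 3 without py2casefold, time the built-in method.
    def casefold(text):
        return text.casefold()

# Illustrative 100-character sample: "Tschüß ABC" (10 characters) repeated 10 times.
SAMPLE = "Tsch\u00fc\u00df ABC" * 10


def microseconds_per_call(func, text, number=10000):
    """Hypothetical helper: average time of func(text) in microseconds."""
    total_seconds = timeit.timeit(lambda: func(text), number=number)
    return total_seconds / number * 1e6


print("casefold: %.2f us per call" % microseconds_per_call(casefold, SAMPLE))
```

Results will vary with hardware, interpreter version, and the mix of characters in the input.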

Tests

To run the tests on all supported Python versions, simply use tox.

tox

You will need to have Python 2.7, Python 3.4, Python 3.5 and Python 3.6 installed.

License

BSD and the Unicode license agreement. This module includes data from the Unicode Consortium, which should include the appropriate notice (see http://unicode.org/copyright.html).

See LICENSE file for details.
