Encode and decode text using UTF-9.

Description

On April 1st 2005, RFC 4042, “UTF-9 and UTF-18 Efficient Transformation Formats of Unicode”, was published:

The current representation formats for Unicode (UTF-7, UTF-8, UTF-16) are not storage and computation efficient on platforms that utilize the 9 bit nonet as a natural storage unit instead of the 8 bit octet.

Since few architectures use 9-bit nonets as natural storage units, and the release date was April Fools’ Day, the beautiful UTF-9 was forgotten and no Python implementation is available.

This Python module is here to fill this gap! ;)

Usage

There are only two functions:

  • utf9encode(string): takes a string and returns a utf9-encoded version.
  • utf9decode(data): takes utf9-encoded data and returns the corresponding string.

Example

>>> import utf9
>>> encoded = utf9.utf9encode(u'ႹЄLᒪo, 🌍ǃ')
>>> print repr(encoded)
'p\xe0\xb7-\x0c!1\xc3\x92\xd5\x1b\xc5\x82\x07n\x83x\xed\xdecX\xf80'
>>> print utf9.utf9decode(encoded)
ႹЄLᒪo, 🌍ǃ
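
For readers curious about what happens at the nonet level, here is a minimal sketch of the scheme as RFC 4042 describes it: each code point is split into octets, each octet goes into the low 8 bits of a nonet, and the ninth (high) bit marks that more nonets follow. The names to_nonets and from_nonets are hypothetical and are not part of this module's API. The sketch assumes Python 3 and returns plain integers rather than packed bytes, so its output is not expected to match the byte string shown above.

```python
def to_nonets(text):
    """Return a list of UTF-9 nonets (integers 0..0x1FF) for text."""
    nonets = []
    for ch in text:
        cp = ord(ch)
        # Split the code point into octets, least significant first,
        # stopping once no non-zero high-order octets remain.
        octets = []
        while True:
            octets.append(cp & 0xFF)
            cp >>= 8
            if cp == 0:
                break
        octets.reverse()  # most significant octet first
        # Every nonet except the last carries the continuation bit 0x100.
        for octet in octets[:-1]:
            nonets.append(0x100 | octet)
        nonets.append(octets[-1])
    return nonets


def from_nonets(nonets):
    """Rebuild the string from a list of UTF-9 nonets."""
    chars = []
    cp = 0
    for nonet in nonets:
        cp = (cp << 8) | (nonet & 0xFF)
        if not nonet & 0x100:  # no continuation bit: code point complete
            chars.append(chr(cp))
            cp = 0
    return "".join(chars)
```

With this sketch, to_nonets(u'A\u2262\u0391.') gives the nonets 101 442 142 403 221 56 when written in octal, and from_nonets(to_nonets(s)) returns s for any string of valid code points.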

Download files

Files for utf9, version 0.3.1
Filename, size: utf9-0.3.1.tar.gz (2.1 kB)
File type: Source
Python version: None