translitcodec

Unicode to 8-bit charset transliteration codec

Project description

This package contains codecs for transliterating ISO 10646 texts into best-effort representations using smaller coded character sets (ASCII, ISO 8859, etc.). The translation tables used by the codecs are from the transtab collection by Markus Kuhn.

Three types of transliterating codecs are provided:

“long”, using as many characters as needed to make a natural replacement. For example, U+00E4 LATIN SMALL LETTER A WITH DIAERESIS (ä) will be replaced with ae.

“short”, using the minimum number of characters to make a replacement. For example, U+00E4 LATIN SMALL LETTER A WITH DIAERESIS (ä) will be replaced with a.

“one”, performing only single-character replacements. Characters that cannot be transliterated with a single character are passed through unchanged. For example, U+2639 WHITE FROWNING FACE (☹) will be passed through unchanged.
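
The difference between the three modes can be illustrated with a small Python 3 sketch. The tables below are hypothetical, hand-written stand-ins for the transtab data, and demo_translit is an illustrative helper that is not part of translitcodec. Deriving the “one” table by filtering the short table is also an assumption of this sketch.

```python
# Hypothetical miniature tables for illustration only; the real codecs
# use the much larger transtab collection.
LONG_TABLE = {"\u00e1": "a", "\u00e4": "ae", "\u20ac": "EUR", "\u263a": ":-)"}
SHORT_TABLE = {"\u00e1": "a", "\u00e4": "a", "\u20ac": "E", "\u263a": ":-)"}
# Assumption: "one" keeps only the single-character replacements.
ONE_TABLE = {src: dst for src, dst in SHORT_TABLE.items() if len(dst) == 1}

TABLES = {"long": LONG_TABLE, "short": SHORT_TABLE, "one": ONE_TABLE}


def demo_translit(text, mode="long"):
    """Replace each character found in the chosen table; pass others through."""
    table = TABLES[mode]
    return "".join(table.get(ch, ch) for ch in text)


sample = "f\u00e1cil \u20ac \u263a"  # 'fácil € ☺'
print(demo_translit(sample, "long"))   # facil EUR :-)
print(demo_translit(sample, "short"))  # facil E :-)
print(demo_translit(sample, "one"))    # facil E ☺ (smiley passed through)
```

With these tables, the “one” mode leaves ☺ in place because its only available replacement is three characters long.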

Using the codecs is simple:

>>> import translitcodec
>>> u'fácil € ☺'.encode('translit/long')
u'facil EUR :-)'
>>> u'fácil € ☺'.encode('translit/short')
u'facil E :-)'

The codecs return Unicode by default. To receive a bytestring back, either chain the output of encode() to another codec, or append the name of the desired byte encoding to the codec name:

>>> u'fácil € ☺'.encode('translit/one').encode('ascii', 'replace')
'facil E ?'
>>> u'fácil € ☺'.encode('translit/one/ascii', 'replace')
'facil E ?'
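
The equivalence of the two spellings can be sketched in Python 3 without the package installed. Here encode_by_name and ONE_TABLE are hypothetical names used for illustration, the table is a hand-written stand-in, and the name parsing is an assumption about how a combined name might be interpreted, not translitcodec's actual lookup code. Under Python 3 the byte results are bytes objects.

```python
# Hypothetical single-character table; not the package's real data.
ONE_TABLE = {"\u00e1": "a", "\u20ac": "E"}


def translit_one(text):
    """Apply single-character replacements; pass other characters through."""
    return "".join(ONE_TABLE.get(ch, ch) for ch in text)


def encode_by_name(text, name, errors="strict"):
    """Interpret 'translit/one' or 'translit/one/<charset>' (sketch only)."""
    parts = name.split("/")
    if parts[:2] != ["translit", "one"] or len(parts) > 3:
        raise LookupError("unsupported codec name in this sketch: %r" % name)
    result = translit_one(text)
    if len(parts) == 3:
        # A trailing charset name means: also encode the result to bytes.
        return result.encode(parts[2], errors)
    return result


sample = "f\u00e1cil \u20ac \u263a"  # 'fácil € ☺'
chained = encode_by_name(sample, "translit/one").encode("ascii", "replace")
combined = encode_by_name(sample, "translit/one/ascii", "replace")
print(chained == combined)  # True
```

Both paths yield the same bytes because the combined name only folds the second encoding step into the first call.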

The package also supplies a ‘transliterate’ codec, an alias for ‘translit/long’.

Project details

Release history

0.7.0, May 8, 2021
0.6.0, Dec 13, 2020
0.5.2, Jan 19, 2020
0.4.0, May 11, 2015
0.3, Feb 13, 2012
0.2, Jan 28, 2011 (this version)
0.1, Dec 28, 2008

Download files

Source Distribution

translitcodec-0.2.tar.gz (47.3 kB)

Uploaded Jan 28, 2011 Source

File details

Details for the file translitcodec-0.2.tar.gz.

File metadata

Download URL: translitcodec-0.2.tar.gz
Upload date: Jan 28, 2011
Size: 47.3 kB
Tags: Source
Uploaded using Trusted Publishing? No

File hashes

Hashes for translitcodec-0.2.tar.gz
Algorithm	Hash digest
SHA256	`bccba9bf795cc644d7e7d09e4dfc081656aa357712f94c9881ea8aba69544b01`
MD5	`37bf6635275d4d45c26ece6e3b5b33bd`
BLAKE2b-256	`de30e8bbf80b6e1a5f05c1669eb39a717c35036d185e71570e85216f82af26f1`
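
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard hashlib module. This is a minimal sketch: sha256_of and matches_expected are hypothetical helper names, and the path passed to them is whatever local location the file was saved to.

```python
import hashlib

# SHA256 digest published for translitcodec-0.2.tar.gz.
EXPECTED_SHA256 = "bccba9bf795cc644d7e7d09e4dfc081656aa357712f94c9881ea8aba69544b01"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Compare a file's digest with the expected hex string, ignoring case."""
    return sha256_of(path) == expected.lower()
```

Calling matches_expected on the saved archive should return True for an intact download.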
