US-ASCII transliterations of Unicode text
    from unidecode import unidecode

    print(unidecode(u"\u5317\u4EB0"))
    # That prints: Bei Jing
It often happens that you have non-Roman text data in Unicode, but you can’t display it – usually because you’re trying to show it to a user via an application that doesn’t support Unicode, or because the fonts you need aren’t accessible. You could represent the Unicode characters as “???????” or “\15BA\15A0\1610…”, but that’s nearly useless to the user who actually wants to read what the text says.
What Unidecode provides is a function, ‘unidecode(…)’, that takes Unicode data and tries to represent it in ASCII characters (i.e., the universally displayable characters between 0x00 and 0x7F). The representation is almost always an attempt at transliteration – i.e., conveying, in Roman letters, the pronunciation expressed by the text in some other writing system. (See the example above.)
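To make the idea of table-driven transliteration concrete, here is a minimal sketch that uses only the standard language features. It is not Unidecode's implementation: the names `TOY_TABLE` and `toy_unidecode`, the three table entries, and the `[?]` placeholder are all assumptions made up for illustration, whereas the real package ships far larger tables.

```python
# Hypothetical, hand-written lookup table. This is NOT Unidecode's real data;
# it only illustrates mapping one Unicode character to an ASCII string.
TOY_TABLE = {
    u"\u5317": "Bei ",   # first character of the example above
    u"\u4eb0": "Jing ",  # second character of the example above
    u"\u00e9": "e",      # LATIN SMALL LETTER E WITH ACUTE
}


def toy_unidecode(text, unknown="[?]"):
    """Return an ASCII-only approximation of `text` using TOY_TABLE."""
    pieces = []
    for ch in text:
        if ord(ch) < 0x80:
            # Characters in the range 0x00-0x7F are already ASCII.
            pieces.append(ch)
        else:
            # Everything else is looked up; unmapped characters get a placeholder.
            pieces.append(TOY_TABLE.get(ch, unknown))
    return "".join(pieces)


print(toy_unidecode(u"\u5317\u4eb0"))  # prints "Bei Jing " (trailing space comes from the toy table)
print(toy_unidecode(u"caf\u00e9"))     # prints "cafe"
```

The per-character lookup keeps the function simple, at the cost of ignoring context: each character is rendered the same way wherever it appears.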
This is a Python port of the Text::Unidecode Perl module by Sean M. Burke <email@example.com>.
|Filename|Size|File type|Python version|
|Unidecode-0.04.1-py2.6.egg|397.1 kB|Egg|2.6|
|Unidecode-0.04.1.tar.gz|167.0 kB|Source|None|
|Unidecode-0.04.5.tar.gz|186.6 kB|Source|None|