US-ASCII transliterations of Unicode text

Unihandecode

ASCII transliterations of Unicode text
that recognize complex CJKV characters


EXAMPLE USE
-----------

from unihandecode import Unihandecoder

# Create a decoder for Chinese readings.
d = Unihandecoder(lang='ch')
print(d.decode(u"\u660e\u5929\u660e\u5929\u7684\u98ce\u5439"))

# That prints: Ming Tian Ming Tian De Feng Chui

# Create a second decoder for Japanese readings.
d = Unihandecoder(lang='ja')
print(d.decode(
    u'\u660e\u65e5\u306f\u660e\u65e5\u306e\u98a8\u304c\u5439\u304f'))

# That prints: Ashita ha Ashita no Kaze ga Fuku



DESCRIPTION
-----------

It often happens that you have non-Roman text data in Unicode, but
you can't display it -- usually because you're trying to show it
to a user via an application that doesn't support Unicode, or
because the fonts you need aren't accessible. You could represent
the Unicode characters as "???????" or "\15BA\15A0\1610...", but
that's nearly useless to the user who actually wants to read what
the text says.

What Unihandecode provides is a method, 'decode(...)', that
takes Unicode data and tries to represent it in ASCII characters
(i.e., the universally displayable characters between 0x00 and 0x7F).
The representation is almost always an attempt at *transliteration*
-- i.e., conveying, in Roman letters, the pronunciation expressed by
the text in some other writing system. (See the example above.)
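
To make the idea of table-based transliteration concrete, here is a minimal
sketch. It is not Unihandecode's implementation: TOY_TABLE and toy_decode are
hypothetical names, and the table holds only a few hand-picked entries. ASCII
characters pass through unchanged, and anything missing from the table becomes
a placeholder.

```python
# Hypothetical miniature table (not the library's data):
# code point -> ASCII replacement.
TOY_TABLE = {
    0x00e9: "e",    # LATIN SMALL LETTER E WITH ACUTE
    0x00df: "ss",   # LATIN SMALL LETTER SHARP S
    0x0416: "Zh",   # CYRILLIC CAPITAL LETTER ZHE
}


def toy_decode(text, table=TOY_TABLE, unknown="?"):
    """Return an ASCII-only approximation of `text` using `table`."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x80:
            # Already in the ASCII range 0x00-0x7F: keep as is.
            out.append(ch)
        else:
            # Look up a replacement; fall back to a placeholder.
            out.append(table.get(code, unknown))
    return "".join(out)


print(toy_decode(u"caf\u00e9"))      # cafe
print(toy_decode(u"Stra\u00dfe"))    # Strasse
```

A real table covers far more of Unicode; the sketch only shows the lookup
step that turns each non-ASCII character into Roman letters.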

The two sentences in the example above have the same meaning:
"明天明天的风吹" in Chinese and "明日は明日の風が吹く" in Japanese.
The character "明" is converted to "Ming" in Chinese. In Japanese, "明日"
is converted to "Ashita", but the single character "明" is converted to "Mei".

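One way to get this word-versus-character behaviour is a longest-match lookup:
try a word dictionary first and fall back to per-character readings. The
sketch below only illustrates that idea and does not claim to reproduce
Unihandecode's internals. TOY_WORDS, TOY_CHARS and toy_ja_decode are
hypothetical, and the tables contain only the two readings mentioned above.

```python
# Hypothetical toy dictionaries holding only the readings named in the text.
TOY_WORDS = {u"\u660e\u65e5": "Ashita"}   # the two-character word
TOY_CHARS = {u"\u660e": "Mei"}            # the single character


def toy_ja_decode(text, words=TOY_WORDS, chars=TOY_CHARS):
    """Transliterate `text`, preferring whole-word readings."""
    max_len = max(len(w) for w in words)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so words win over characters.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in words:
                out.append(words[chunk])
                i += size
                break
        else:
            # No word matched here: use the per-character reading,
            # or keep the character unchanged if it is unknown.
            ch = text[i]
            out.append(chars.get(ch, ch))
            i += 1
    # The toy joins every piece with a space.
    return " ".join(out)


print(toy_ja_decode(u"\u660e\u65e5"))   # Ashita
print(toy_ja_decode(u"\u660e"))         # Mei
```
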
This is an improved version of Python unidecode,
which is a Python port of the Text::Unidecode Perl module by
Sean M. Burke <sburke@cpan.org>.

REQUIREMENTS
------------

There are no requirements other than the standard Python library.
Python 3 support is still under development, so use it
with Python 2.x (>2.6).


INSTALLATION
------------

You install Unihandecode, as you would install any Python module,
by running these commands:

python setup.py gendict
python setup.py genmap
python setup.py install
python setup.py test

If you have an egg package, you can install it with easy_install:
$ easy_install Unihandecode-0.42-py2.7.egg

BUILD
------

To build an egg package, generate the dictionaries and maps first:

python setup.py gendict
python setup.py genmap
python setup.py bdist_egg

LIMITATION
----------

This library stores its dictionaries with pickle, whose format
depends on the platform and Python version.
You should re-create the dictionaries for each Python version.
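
As background on why pickled data can differ between interpreters, the sketch
below writes and reads a small table with an explicit pickle protocol. It
assumes nothing about how Unihandecode's own 'gendict' step works: save_table
and load_table are hypothetical helpers. Protocol 2 is readable by both
Python 2 and Python 3, but pinning it does not replace the advice above to
re-create the dictionaries.

```python
import pickle


def save_table(table, path, protocol=2):
    """Pickle `table` to `path` with an explicit protocol number."""
    with open(path, "wb") as f:
        pickle.dump(table, f, protocol)


def load_table(path):
    """Load a table previously written by save_table()."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Without the explicit protocol argument, each Python version falls back to its
own default, which is one way files written by one interpreter become
unreadable in another.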


SUPPORT
--------

Questions, bug reports, useful code bits, and suggestions for
Unihandecode are handled on github.com/miurahr/unihandecode


AVAILABILITY
------------

The latest version of Unihandecode is available from
the Git repository on GitHub:

https://github.com/miurahr/unihandecode

WARNING: There was a Bazaar repository on launchpad.net named
unhandecode.
It is NOT maintained; development has moved entirely to GitHub.


COPYRIGHT
---------

Unicode Character Database:
Date: 2010-09-23 09:29:58 UDT [JHJ]
Unicode version: 6.0.0

Copyright (c) 1991-2010 Unicode, Inc.
For terms of use, see http://www.unicode.org/terms_of_use.html
For documentation, see http://www.unicode.org/reports/tr44/

Unidecode's character transliteration tables:

Copyright 2001, Sean M. Burke <sburke@cpan.org>, all rights reserved.

Python code:

Copyright 2010,2011, Hiroshi Miura <miurahr@linux.com>
Copyright 2009, Tomaz Solc <tomaz@zemanta.com>


LICENSE
-------

Unihandecode
Copyright 2010-2013 Hiroshi Miura

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

Download files

Source Distribution

Unihandecode-0.42.tar.gz (2.8 MB, source)

Built Distributions

Unihandecode-0.42-py3.2.egg (3.1 MB, egg)
Unihandecode-0.42-py2.7.egg (2.6 MB, egg)
Unihandecode-0.40-py3.2.egg (3.1 MB, egg)
Unihandecode-0.40-py2.7.egg (2.5 MB, egg)

File details

Details for the file Unihandecode-0.42.tar.gz.

File metadata

  • Download URL: Unihandecode-0.42.tar.gz
  • Upload date:
  • Size: 2.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for Unihandecode-0.42.tar.gz
Algorithm Hash digest
SHA256 2b9f870a94e192b5796c46b7be3e8668b1c3e6d8aef33792567de3d86d1d82c0
MD5 fe5d5962013b5e64874a7c4ec01540d2
BLAKE2b-256 1e37ec7730f3fc7cc5e6149064c8db11c0b980470b07ece2cfb3a593089cbbbd
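
A downloaded file can be checked against the SHA256 digest listed above with
the standard hashlib module. This is a minimal sketch: sha256_of and matches
are hypothetical helper names, and the file is assumed to be in the current
directory.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large archives need not fit in memory.
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's SHA256 equals `expected_hex` (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example call, using the digest published above:
# matches("Unihandecode-0.42.tar.gz",
#         "2b9f870a94e192b5796c46b7be3e8668b1c3e6d8aef33792567de3d86d1d82c0")
```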

File details

Details for the file Unihandecode-0.42-py3.2.egg.

File metadata

File hashes

Hashes for Unihandecode-0.42-py3.2.egg
Algorithm Hash digest
SHA256 0cc86ceeb3c905587d32b650689b0803b48f16aa4480e32073d18d9a1fc9e86f
MD5 2bfebef1d3490e9f260d827954c22f0d
BLAKE2b-256 1c74b8a4d82dd61033aaf2b1d5d98dd531d7ca76bcb092b6d573af224d5f579d

File details

Details for the file Unihandecode-0.42-py2.7.egg.

File metadata

File hashes

Hashes for Unihandecode-0.42-py2.7.egg
Algorithm Hash digest
SHA256 2c748a2d302f3bec4b659d4fe18d99618ac6b3222d9322e2e0c690e0a7b9fb42
MD5 56e67160b2de44ef20c4ad5172966919
BLAKE2b-256 a8c4a43e389d5cfb295793d9a94bf5788a0f56200a0e3dfa19864fba7fc6fb94

File details

Details for the file Unihandecode-0.40-py3.2.egg.

File metadata

File hashes

Hashes for Unihandecode-0.40-py3.2.egg
Algorithm Hash digest
SHA256 33c919b6d3a48518ced6d36bd74598090bec07bb84b7e20139b5cb7a9cdb5adf
MD5 e3b0c8f6ccecd44c41675b254bd06b56
BLAKE2b-256 755233d2d97c277e5e830b63e3d92a817f6bffd4cca0d5d49e9275056074d7a6

File details

Details for the file Unihandecode-0.40-py2.7.egg.

File metadata

File hashes

Hashes for Unihandecode-0.40-py2.7.egg
Algorithm Hash digest
SHA256 00200c9061f75af108ca7902e7f06522d7ed364b910bf69dbfa58f5d3e426611
MD5 042d89df7def366886dcad03af9dfac7
BLAKE2b-256 b818b2735c4f39e9442860693cc69cea5cb3a6951965154e0ca492a964fec2b2
