Unicode to ASCII transliteration

Any Ascii

Glossary

  • Unicode: The universal character set, a global standard to support all the world's languages. Consists of 100,000+ characters used by 150 writing systems. Typically encoded into bytes using UTF-8.
  • ASCII: The most compatible character set. A subset of Unicode/UTF-8 consisting of 128 characters, each encoded in 7 bits, in the range 0x00 - 0x7F. The visible characters are English letters, digits, and punctuation in the range 0x20 - 0x7E; the rest are control characters.
  • Transliteration: A mapping from one writing system into another, typically done one character at a time using predictable rules. Transliteration into the Latin script used by English is known as romanization.
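
The glossary terms can be made concrete with a short Python sketch. It checks the ASCII ranges given above and applies a character-by-character mapping. The names is_ascii, is_visible_ascii, TOY_TABLE, and toy_transliterate are hypothetical and exist only for this illustration; the table covers just the letters of one Greek word and is not the table used by Any Ascii.

```python
import unicodedata


def is_ascii(text):
    """True if every code point lies in the ASCII range 0x00 - 0x7F."""
    return all(ord(ch) <= 0x7F for ch in text)


def is_visible_ascii(text):
    """True if every code point lies in the visible range 0x20 - 0x7E."""
    return all(0x20 <= ord(ch) <= 0x7E for ch in text)


# Hypothetical toy table: only the letters needed for one example word.
# This is NOT the mapping shipped with Any Ascii.
TOY_TABLE = {
    '\u03ac': 'a',   # ά
    '\u03bd': 'n',   # ν
    '\u03b8': 'th',  # θ
    '\u03c1': 'r',   # ρ
    '\u03c9': 'o',   # ω
    '\u03c0': 'p',   # π
    '\u03bf': 'o',   # ο
    '\u03b9': 'i',   # ι
}


def toy_transliterate(text):
    """Map one character at a time; characters missing from the table pass through."""
    # NFC composes a base letter and a combining accent into a single code point,
    # so the accented alpha matches its table key however it was typed.
    text = unicodedata.normalize('NFC', text)
    return ''.join(TOY_TABLE.get(ch, ch) for ch in text)


word = '\u03ac\u03bd\u03b8\u03c1\u03c9\u03c0\u03bf\u03b9'  # άνθρωποι
print(is_ascii(word))           # False
print(toy_transliterate(word))  # anthropoi
```

A single input character may map to several output characters (θ becomes th here), so the result can be longer than the input.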

Java

String s = AnyAscii.transliterate("άνθρωποι");
// anthropoi

Java 6+ compatible

Available through JitPack

Maven

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>
<dependency>
    <groupId>com.hunterwb</groupId>
    <artifactId>any-ascii</artifactId>
    <version>${version}</version>
</dependency>

Gradle

repositories {
    maven { url 'https://jitpack.io' }
}
dependencies {
    implementation "com.hunterwb:any-ascii:${version}"
}

Python

from anyascii import anyascii

s = anyascii('άνθρωποι')
# anthropoi

Python 3.3+ compatible
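
As one possible use, transliterated text can be turned into a URL-friendly slug. The sketch below is an illustration under assumptions, not part of the library: slugify is a hypothetical helper that accepts any str-to-ASCII callable, so it runs whether or not the package is installed.

```python
import re


def slugify(text, transliterate):
    """Build a lowercase, hyphen-separated slug from transliterated text.

    `transliterate` is any callable mapping str to an ASCII str,
    for example the anyascii function shown above.
    """
    ascii_text = transliterate(text)
    # Collapse each run of characters other than ASCII letters and digits into one hyphen.
    slug = re.sub(r'[^A-Za-z0-9]+', '-', ascii_text)
    return slug.strip('-').lower()


try:
    from anyascii import anyascii  # present only once the package is installed
except ImportError:
    anyascii = None

if anyascii is not None:
    print(slugify('άνθρωποι', anyascii))
```

Passing the transliterator as an argument keeps the helper independent of any particular library.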

Install from GitHub

pip install https://github.com/hunterwb/any-ascii/archive/master.zip#subdirectory=python

Node.js

const anyAscii = require('any-ascii');

const s = anyAscii('άνθρωποι');
// anthropoi

Node.js 4+ compatible

See Also

ALA-LC Romanization
BGN/PCGN Romanization
Compart: Unicode Charts
ICAO 9303: Machine Readable Passports
ISO 9: Cyrillic Romanization
Sean M. Burke: Unidecode
Sean M. Burke: Unidecode, Perl Journal
UNGEGN Romanization
Unicode CLDR: Transliteration Guidelines
Unicode Unihan Database
