Unicode to ASCII transliteration

Any Ascii

Glossary

  • Unicode: The universal character set, a global standard to support all the world's languages. It consists of more than 100,000 characters used by 150 writing systems, and is typically encoded into bytes using UTF-8.
  • ASCII: The most compatible character set. A subset of Unicode/UTF-8 consisting of 128 characters that fit in 7 bits, in the range 0x00 - 0x7F. The visible characters are English letters, digits, and punctuation in the range 0x20 - 0x7E; the remaining characters are control characters.
  • Transliteration: A mapping from one writing system into another, typically done one character at a time using predictable rules. Transliteration into the Latin script used by English is known as romanization.
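
The glossary terms above can be illustrated with a small sketch. The names `TOY_TABLE`, `is_ascii`, `is_visible_ascii`, and `toy_transliterate` are hypothetical and exist only for this example; the eight-entry table is a hand-written assumption, not the data or the algorithm used by this library.

```python
import unicodedata

# Hypothetical toy table covering only the letters of one Greek word.
# It is NOT the mapping data shipped with any real library.
TOY_TABLE = {
    '\u03ac': 'a',  # alpha with tonos
    'ν': 'n',
    'θ': 'th',
    'ρ': 'r',
    'ω': 'o',
    'π': 'p',
    'ο': 'o',
    'ι': 'i',
}


def is_ascii(text):
    """Return True if every character is in the 7-bit range 0x00 - 0x7F."""
    return all(ord(ch) <= 0x7F for ch in text)


def is_visible_ascii(text):
    """Return True if every character is in the visible range 0x20 - 0x7E."""
    return all(0x20 <= ord(ch) <= 0x7E for ch in text)


def toy_transliterate(text):
    """Map one character at a time; ASCII passes through, unknown characters become '?'."""
    # Normalize so accented letters arrive as single precomposed characters.
    text = unicodedata.normalize('NFC', text)
    out = []
    for ch in text:
        if ord(ch) <= 0x7F:
            out.append(ch)
        else:
            out.append(TOY_TABLE.get(ch, '?'))
    return ''.join(out)


print(toy_transliterate('άνθρωποι'))  # anthropoi
```

A single character may map to several ASCII characters (here 'θ' becomes 'th'), so the output can be longer than the input.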

Java

String s = AnyAscii.transliterate("άνθρωποι");
// anthropoi

Java 6+ compatible

Available through JitPack

Maven

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>
<dependency>
    <groupId>com.hunterwb</groupId>
    <artifactId>any-ascii</artifactId>
    <version>${version}</version>
</dependency>

Gradle

repositories {
    maven { url 'https://jitpack.io' }
}
dependencies {
    implementation "com.hunterwb:any-ascii:${version}"
}

Python

from anyascii import anyascii

s = anyascii('άνθρωποι')
# anthropoi

Python 3.3+ compatible
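
Callers who want to confirm that a result really is ASCII could wrap the call as sketched below. The helper `to_checked_ascii` is hypothetical and not part of this package; it accepts any transliteration function as a parameter, so `anyascii` can be passed in once installed. Because `str.isascii()` only exists from Python 3.7, the sketch relies on `str.encode('ascii')`, which is available on older versions as well.

```python
def to_checked_ascii(text, transliterate):
    """Apply a transliteration function and verify the result is pure ASCII.

    `transliterate` is any callable taking and returning a str.
    Raises UnicodeEncodeError if the result contains a non-ASCII character.
    """
    result = transliterate(text)
    # encode() raises UnicodeEncodeError on the first character above 0x7F.
    result.encode('ascii')
    return result


# Intended usage, assuming the anyascii package is installed:
#   from anyascii import anyascii
#   s = to_checked_ascii('άνθρωποι', anyascii)
```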

Install from GitHub

pip install https://github.com/hunterwb/any-ascii/archive/master.zip#subdirectory=python

Node.js

const anyAscii = require('any-ascii');

const s = anyAscii('άνθρωποι');
// anthropoi

Node.js 4+ compatible

See Also

ALA-LC Romanization
BGN/PCGN Romanization
Compart: Unicode Charts
ICAO 9303: Machine Readable Passports
ISO 9: Cyrillic Romanization
Sean M. Burke: Unidecode
Sean M. Burke: Unidecode, Perl Journal
UNGEGN Romanization
Unicode CLDR: Transliteration Guidelines
Unicode Unihan Database
