Any Ascii

Unicode to ASCII transliteration

Glossary

  • Unicode: The universal character set, a global standard that supports all the world's languages. It consists of more than 100,000 characters used by about 150 writing systems and is typically encoded into bytes using UTF-8.
  • ASCII: The most widely compatible character set. It is a subset of Unicode consisting of 128 characters, each encoded in 7 bits in the range 0x00 - 0x7F; UTF-8 encodes these characters as the same single bytes. The printable characters are English letters, digits, punctuation, and the space, in the range 0x20 - 0x7E; the rest are control characters.
  • Transliteration: A mapping from one writing system into another, typically done one character at a time using predictable rules. Transliteration into the Latin script used by English is known as romanization.
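
The ranges in the glossary can be checked directly. The following is a minimal Python sketch using only the standard library; it is not part of Any Ascii, and the helper names `is_ascii` and `is_printable_ascii` are hypothetical.

```python
def is_ascii(text):
    """Return True if every character is in the ASCII range 0x00 - 0x7F."""
    return all(ord(ch) <= 0x7F for ch in text)


def is_printable_ascii(text):
    """Return True if every character is in the printable range 0x20 - 0x7E."""
    return all(0x20 <= ord(ch) <= 0x7E for ch in text)


# ASCII text produces identical bytes whether encoded as ASCII or UTF-8.
assert "anthropoi".encode("utf-8") == "anthropoi".encode("ascii")

# A Greek letter such as theta (U+03B8) lies outside ASCII
# and needs two bytes in UTF-8.
theta = "\u03b8"
print(is_ascii(theta))             # False
print(len(theta.encode("utf-8")))  # 2

# A newline (0x0A) is ASCII but is a control character, not printable.
print(is_ascii("a\n"), is_printable_ascii("a\n"))  # True False
```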

Java

String s = AnyAscii.transliterate("άνθρωποι");
// anthropoi

Java 6+ compatible

Available through JitPack

Maven

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>
<dependency>
    <groupId>com.hunterwb</groupId>
    <artifactId>any-ascii</artifactId>
    <version>${version}</version>
</dependency>

Gradle

repositories {
    maven { url 'https://jitpack.io' }
}
dependencies {
    implementation "com.hunterwb:any-ascii:${version}"
}

Python

from anyascii import anyascii

s = anyascii('άνθρωποι')
# anthropoi

Python 3.3+ compatible

Install from GitHub

pip install https://github.com/hunterwb/any-ascii/archive/master.zip#subdirectory=python
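
To illustrate the character-at-a-time mapping described in the glossary, here is a toy sketch that does not use the library. The table `TOY_TABLE` and the function `toy_transliterate` are hypothetical and cover only the letters of the example word; Any Ascii's actual tables and lookup logic are not shown here and may work differently.

```python
import unicodedata

# Hypothetical eight-entry table for illustration only.
TOY_TABLE = {
    "\u03ac": "a",   # GREEK SMALL LETTER ALPHA WITH TONOS
    "\u03bd": "n",   # nu
    "\u03b8": "th",  # theta: one character may map to several
    "\u03c1": "r",   # rho
    "\u03c9": "o",   # omega
    "\u03c0": "p",   # pi
    "\u03bf": "o",   # omicron
    "\u03b9": "i",   # iota
}


def toy_transliterate(text):
    """Map each character through TOY_TABLE, leaving ASCII unchanged."""
    out = []
    # NFC composes a base letter plus combining accent into one code point,
    # so the accented alpha matches the table in either input form.
    for ch in unicodedata.normalize("NFC", text):
        if ord(ch) < 0x80:
            out.append(ch)                      # already ASCII
        else:
            out.append(TOY_TABLE.get(ch, "?"))  # "?" marks unmapped characters
    return "".join(out)


print(toy_transliterate("άνθρωποι"))  # anthropoi
```

The `"?"` fallback is a choice made for this sketch only; it makes gaps in the toy table visible rather than silently dropping characters.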

Node.js

const anyAscii = require('any-ascii');

const s = anyAscii('άνθρωποι');
// anthropoi

Node.js 4+ compatible

See Also

ALA-LC Romanization
BGN/PCGN Romanization
Compart: Unicode Charts
ICAO 9303: Machine Readable Passports
ISO 9: Cyrillic Romanization
Sean M. Burke: Unidecode
Sean M. Burke: Unidecode, Perl Journal
UNGEGN Romanization
Unicode CLDR: Transliteration Guidelines
Unicode Unihan Database

