Unicode to ASCII transliteration
Any Ascii
Glossary
- Unicode: The universal character set, a global standard to support all the world's languages. Consists of 100,000+ characters used by 150 writing systems. Typically encoded into bytes using UTF-8.
- ASCII: The most compatible character set. A subset of Unicode/UTF-8 consisting of 128 characters using 7 bits in the range 0x00-0x7F. The visible characters are English letters, digits, and punctuation in the range 0x20-0x7E, with the remaining being control characters.
- Transliteration: A mapping from one writing system into another, typically done one character at a time using predictable rules. Transliteration into the Latin script used by English is known as romanization.
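The glossary terms can be made concrete with a small Python sketch. It is illustrative only: `is_ascii`, `is_visible_ascii`, `toy_transliterate`, and `TOY_TABLE` are hypothetical names, and the eight-entry table is an assumption made for this example, not the library's actual data or algorithm.

```python
import unicodedata


def is_ascii(text: str) -> bool:
    """True if every code point lies in the 7-bit range 0x00-0x7F."""
    return all(ord(ch) <= 0x7F for ch in text)


def is_visible_ascii(text: str) -> bool:
    """True if every code point lies in 0x20-0x7E (no control characters)."""
    return all(0x20 <= ord(ch) <= 0x7E for ch in text)


# Hypothetical, deliberately tiny table (assumption for illustration).
# Keys are NFC-normalized so they are single precomposed code points.
TOY_TABLE = {
    unicodedata.normalize("NFC", key): value
    for key, value in {
        "ά": "a", "ν": "n", "θ": "th", "ρ": "r",
        "ω": "o", "π": "p", "ο": "o", "ι": "i",
    }.items()
}


def toy_transliterate(text: str, missing: str = "?") -> str:
    """Map one character at a time; ASCII passes through unchanged."""
    out = []
    for ch in unicodedata.normalize("NFC", text):
        if ord(ch) <= 0x7F:
            out.append(ch)
        else:
            # Characters absent from the toy table become the placeholder.
            out.append(TOY_TABLE.get(ch, missing))
    return "".join(out)


print(is_ascii("άνθρωποι"))           # False
print(toy_transliterate("άνθρωποι"))  # anthropoi
```

A real transliterator needs a table covering the whole of Unicode; the per-character lookup shown here only illustrates the "predictable rules" idea from the glossary.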
Java
String s = AnyAscii.transliterate("άνθρωποι");
// anthropoi
Java 6+ compatible
Available through JitPack
Maven
<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

<dependency>
    <groupId>com.hunterwb</groupId>
    <artifactId>any-ascii</artifactId>
    <version>${version}</version>
</dependency>
Gradle
repositories {
    maven { url 'https://jitpack.io' }
}

dependencies {
    implementation "com.hunterwb:any-ascii:${version}"
}
Python
from anyascii import anyascii
s = anyascii('άνθρωποι')
# anthropoi
Python 3.3+ compatible
Install from GitHub
pip install https://github.com/hunterwb/any-ascii/archive/master.zip#subdirectory=python
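One common use for an ASCII transliteration is building URL-safe slugs. The sketch below is a hedged illustration, not part of the package: `ascii_slug` is a hypothetical helper that takes the transliteration function as a parameter, so it runs without the package installed. With the package available, `anyascii` itself can be passed as the second argument.

```python
import re
from typing import Callable


def ascii_slug(text: str, transliterate: Callable[[str], str]) -> str:
    """Build a lowercase, hyphen-separated ASCII slug from arbitrary text.

    `transliterate` is any str -> str function that returns ASCII,
    for example `anyascii` from this package (assumed to be installed).
    """
    ascii_text = transliterate(text)
    # Collapse every run of characters outside a-z and 0-9 into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    # Drop hyphens left over at either end.
    return slug.strip("-")


# Stand-in transliterator for already-ASCII input (identity function).
print(ascii_slug("Hello, World!", lambda s: s))  # hello-world
```

Passing the function in, rather than importing it inside the helper, keeps the slug logic testable on its own and makes it easy to swap in a different transliterator.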
Node.js
const anyAscii = require('any-ascii');
const s = anyAscii('άνθρωποι');
// anthropoi
Node.js 4+ compatible
See Also
ALA-LC Romanization
BGN/PCGN Romanization
Compart: Unicode Charts
ICAO 9303: Machine Readable Passports
ISO 9: Cyrillic Romanization
Sean M. Burke: Unidecode
Sean M. Burke: Unidecode, Perl Journal
UNGEGN Romanization
Unicode CLDR: Transliteration Guidelines
Unicode Unihan Database
File details
Details for the file anyascii-0.1.0.tar.gz.
File metadata
- Download URL: anyascii-0.1.0.tar.gz
- Upload date:
- Size: 154.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.0.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ac9dc957ff3237801ff57035e911dce0b5d19b0bcf1fa78720efc92f25e356f1 |
| MD5 | 20a947f5ca86cec0c5b9f89f5b23a24b |
| BLAKE2b-256 | 95a05227f6aed59cf588c168dddfe408d8374daeadcb91a651e6d8c1020e0d55 |
File details
Details for the file anyascii-0.1.0-py3-none-any.whl.
File metadata
- Download URL: anyascii-0.1.0-py3-none-any.whl
- Upload date:
- Size: 235.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.0.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 61c31969a12025b430ef28d77548fb469ce21e4e35e730248a968801ade74b73 |
| MD5 | 4788b7643e152e530e5a7453fe3a73ef |
| BLAKE2b-256 | bba61c90e21989ee806d151d136ba18fffbb5b5259be1b538a42471961fe52ba |
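A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using only the Python standard library; `sha256_hex` and `matches_digest` are hypothetical helper names, and the file path in the usage note is an assumption about where the download was saved.

```python
import hashlib
import hmac


def sha256_hex(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_hex(path), expected)
```

For example, `matches_digest("anyascii-0.1.0.tar.gz", "ac9dc957ff3237801ff57035e911dce0b5d19b0bcf1fa78720efc92f25e356f1")` should return `True` for an intact copy of the source distribution, assuming the file sits in the current directory.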