Any Ascii

Unicode to ASCII transliteration

Examples

| Script | Input | Output | Actual |
| --- | --- | --- | --- |
| | résumé | resume | |
| | © ㎧ Æ № | (C) m/s AE No | |
| Mandarin Chinese | 深圳 | ShenZhen | Shenzhen |
| Cantonese Chinese | 深水埗 | ShenShuiBu | Sham Shui Po |
| Russian Cyrillic | Борис Николаевич Ельцин | Boris Nikolaevich El'tsin | Boris Nikolayevich Yeltsin |
| Korean Hangul | 반기문 | bangimun | Ban Ki-Moon |
| Japanese Hiragana | さいたま | saitama | Saitama |
| Japanese Kanji | 埼玉県 | QiYuXian | Saitama-ken |
| Ancient Greek | Φειδιππίδης | Feidippidis | Pheidippides |
| Modern Greek | Δημήτρης Φωτόπουλος | Dimitris Fotopoylos | Dimitris Fotopoulos |
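The gap between the Output and Actual columns is what one would expect from a context-free, character-by-character mapping: each input character gets a fixed ASCII replacement regardless of language or neighbouring characters. The following is a toy sketch of that idea only. `TOY_TABLE` and `toy_transliterate` are hypothetical names, the table holds just a handful of entries taken from the examples in this document, and dropping unmapped characters is an assumption of the sketch, not a description of the library's data or behaviour.

```python
# Toy, hypothetical mapping table: a few entries only, NOT the library's data.
TOY_TABLE = {
    "\u00a9": "(C)",  # ©
    "\u00c6": "AE",   # Æ
    "\u2116": "No",   # №
    "\u03ac": "a",    # ά
    "\u03b8": "th",   # θ
    "\u03b9": "i",    # ι
    "\u03bd": "n",    # ν
    "\u03bf": "o",    # ο
    "\u03c0": "p",    # π
    "\u03c1": "r",    # ρ
    "\u03c9": "o",    # ω
}


def toy_transliterate(text, table=TOY_TABLE):
    """Replace each character independently, with no look at its context."""
    pieces = []
    for ch in text:
        if ord(ch) < 0x80:
            pieces.append(ch)  # already ASCII: keep unchanged
        else:
            # Assumption of this sketch: unmapped characters are dropped.
            pieces.append(table.get(ch, ""))
    return "".join(pieces)


greek = "\u03ac\u03bd\u03b8\u03c1\u03c9\u03c0\u03bf\u03b9"  # άνθρωποι
print(toy_transliterate(greek))  # anthropoi
```

Because every character is handled in isolation, a scheme like this cannot know that the same ideograph should be read differently in Mandarin, Cantonese, or Japanese, which matches the differences shown in the table.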

Implementations

Java

String s = AnyAscii.transliterate("άνθρωποι");
// anthropoi

Java 6+ compatible

Available through JitPack

Maven:

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>
<dependency>
    <groupId>com.hunterwb</groupId>
    <artifactId>any-ascii</artifactId>
    <version>0.1.0</version>
</dependency>

Gradle:

repositories {
    maven { url 'https://jitpack.io' }
}
dependencies {
    implementation 'com.hunterwb:any-ascii:0.1.0'
}

Python

from anyascii import anyascii

s = anyascii('άνθρωποι')
# anthropoi

Python 3.3+ compatible

Install latest release: pip install anyascii

Install from master: pip install https://github.com/hunterwb/any-ascii/archive/master.zip#subdirectory=python
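After installing, a quick sanity check might look like the sketch below. It relies only on the `anyascii` function shown above. The helper `transliterate_checked` is a hypothetical name introduced here for illustration, and the import is guarded so the script still runs (returning `None`) when the package is not installed.

```python
try:
    from anyascii import anyascii  # provided by `pip install anyascii`
except ImportError:
    anyascii = None  # package not installed in this environment


def transliterate_checked(text):
    """Hypothetical helper: transliterate, then verify the result is pure ASCII.

    Returns None when the anyascii package cannot be imported.
    """
    if anyascii is None:
        return None
    result = anyascii(text)
    if any(ord(ch) > 0x7F for ch in result):
        raise ValueError("non-ASCII character in output: %r" % result)
    return result


if __name__ == "__main__":
    for sample in ["plain ascii", "r\u00e9sum\u00e9"]:
        print(repr(sample), "->", repr(transliterate_checked(sample)))
```

The `%` formatting is used instead of f-strings so the sketch stays within the Python 3.3+ range stated above.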

Node.js

const anyAscii = require('any-ascii');

const s = anyAscii('άνθρωποι');
// anthropoi

Node.js 4+ compatible

Install latest release: npm install any-ascii

Install from master: npm install hunterwb/any-ascii

Glossary

  • Unicode: The universal character set, a global standard to support all the world's languages. Consists of 100,000+ characters used by 150 writing systems. Typically encoded into bytes using UTF-8.
  • ASCII: The most compatible character set. A subset of Unicode/UTF-8 consisting of 128 characters using 7 bits in the range 0x00 - 0x7F. The printable characters are English letters, digits, and punctuation in the range 0x20 - 0x7E; the rest are control characters.
  • Transliteration: A mapping from one writing system into another, typically done one character at a time using predictable rules. Transliteration generally preserves the spelling of words, while translation preserves the meaning, and transcription preserves the sound. Transliteration into the Latin script used by English is known as romanization.
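The ranges in the ASCII entry can be checked directly with the standard library. This is a minimal sketch: the helper names `is_ascii` and `is_printable_ascii` are made up for this example and are not part of any package described here.

```python
def is_ascii(text):
    """True if every code point fits in 7 bits (0x00 - 0x7F)."""
    return all(ord(ch) <= 0x7F for ch in text)


def is_printable_ascii(text):
    """True if every code point is in the printable range 0x20 - 0x7E."""
    return all(0x20 <= ord(ch) <= 0x7E for ch in text)


samples = ["resume", "r\u00e9sum\u00e9", "tab\there"]
for s in samples:
    utf8_len = len(s.encode("utf-8"))
    print(repr(s), is_ascii(s), is_printable_ascii(s), len(s), utf8_len)
```

For ASCII-only text the UTF-8 encoding is byte-for-byte identical to the ASCII encoding, so the character count and byte count agree. A character such as é (U+00E9) takes two bytes in UTF-8, and a tab is ASCII but falls outside the printable range.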

See Also

ALA-LC Romanization
BGN/PCGN Romanization
Compart: Unicode Charts
ICAO 9303: Machine Readable Passports
ISO 9: Cyrillic Romanization
Sean M. Burke: Unidecode
Sean M. Burke: Unidecode, Perl Journal
UNGEGN Romanization
Unicode CLDR: Transliteration Guidelines
Unicode Unihan Database
Wikipedia: Romanization of Greek
