Detect confusable usage of Unicode homoglyphs and prevent homograph attacks.

Project description

A homoglyph is one of two or more graphemes, characters, or glyphs with shapes that appear identical or very similar (Wikipedia: Homoglyph).

Unicode homoglyphs can be a nuisance on the web. Your most popular client, AlaskaJazz, might be upset to be impersonated by a trickster who deliberately chose the username ΑlaskaJazz.

  • AlaskaJazz is single script: only Latin characters.

  • ΑlaskaJazz is mixed-script: the first character is a Greek letter.
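
The distinction between the two usernames can be approximated with the standard library alone. The sketch below is not this library's implementation: it assumes that the first word of a character's Unicode name (for example LATIN or GREEK) is a good enough script label, and the helper names `script_hint`, `scripts_used`, and `looks_mixed_script` are made up for illustration.

```python
import unicodedata


def script_hint(ch):
    """Rough script label: the first word of the character's Unicode name."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def scripts_used(text):
    """Set of rough script labels for the alphabetic characters in text."""
    return {script_hint(ch) for ch in text if ch.isalpha()}


def looks_mixed_script(text):
    """True when the alphabetic characters come from more than one script."""
    return len(scripts_used(text)) > 1


latin_name = "AlaskaJazz"
spoofed_name = "\u0391laskaJazz"  # starts with GREEK CAPITAL LETTER ALPHA

print(scripts_used(latin_name), looks_mixed_script(latin_name))
print(scripts_used(spoofed_name), looks_mixed_script(spoofed_name))
```

The name-prefix heuristic is only an approximation of real script data, which is why the library ships data extracted from the Unicode files instead.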

You might also want to avoid people being tricked into entering their password on www.microsоft.com or www.faϲebook.com instead of www.microsoft.com or www.facebook.com. Here is a utility to play with these confusable homoglyphs.
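
To see why these host names are deceptive, it can help to list the characters that are not plain ASCII. This is a minimal standard-library sketch, not part of this library; `non_ascii_report` is a hypothetical helper name, and the two spoofed hosts are written with escapes so the substituted characters are visible in the source.

```python
import unicodedata


def non_ascii_report(host):
    """List (index, character, Unicode name) for every non-ASCII character."""
    return [
        (index, ch, unicodedata.name(ch, "UNKNOWN"))
        for index, ch in enumerate(host)
        if ord(ch) > 127
    ]


spoofed_hosts = [
    "www.micros\u043eft.com",  # U+043E in place of the Latin "o"
    "www.fa\u03f2ebook.com",   # U+03F2 in place of the Latin "c"
]

for host in spoofed_hosts:
    for index, ch, name in non_ascii_report(host):
        print(f"{host}: position {index} is {name}")
```

A genuine host such as www.microsoft.com produces an empty report, since every character is ASCII.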

Not all mixed-script strings have to be ruled out, though. You could exclude only those mixed-script strings that contain characters which might be confused with a character from Unicode blocks of your choosing.

  • Allo and ρττ are fine: single script.

  • AlloΓ is fine when our preferred script alias is ‘latin’: mixed script, but Γ is not confusable.

  • Alloρ is dangerous: mixed script and ρ could be confused with p.
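
The rule behind these three cases can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `SAMPLE_CONFUSABLES` is a tiny hand-picked table standing in for the real Unicode confusables data, the script label is guessed from the first word of the character's Unicode name, and `looks_dangerous` is a hypothetical function name.

```python
import unicodedata

# Hypothetical, hand-picked sample. The real data comes from the Unicode files.
SAMPLE_CONFUSABLES = {
    "\u03c1": "p",  # GREEK SMALL LETTER RHO looks like Latin p
    "\u0391": "A",  # GREEK CAPITAL LETTER ALPHA looks like Latin A
    "\u043e": "o",  # CYRILLIC SMALL LETTER O looks like Latin o
}


def script_hint(ch):
    """Rough script label: the first word of the character's Unicode name."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def looks_dangerous(text, preferred="LATIN"):
    """Mixed script AND some foreign character resembles a preferred-script one."""
    letters = [ch for ch in text if ch.isalpha()]
    if len({script_hint(ch) for ch in letters}) < 2:
        return False  # single script: nothing to flag
    for ch in letters:
        if script_hint(ch) == preferred:
            continue
        look_alike = SAMPLE_CONFUSABLES.get(ch)
        if look_alike is not None and script_hint(look_alike) == preferred:
            return True
    return False


for sample in ["Allo", "\u03c1\u03c4\u03c4", "Allo\u0393", "Allo\u03c1"]:
    print(sample, looks_dangerous(sample))
```

With this sample table, only the last string is flagged: it is mixed script and its final character has a Latin look-alike.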

This library is compatible with Python 2 and Python 3.

API documentation

Is the data up to date?

Yep.

The Unicode block aliases and names for each character are extracted from this file provided by the Unicode Consortium.

The matrix of which characters can be confused with which other characters is built using this file provided by the Unicode Consortium.

This data is stored in two JSON files: categories.json and confusables.json. If you delete them, both are recreated by downloading and parsing the two files mentioned above, and the result is stored as JSON again.
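
The delete-and-regenerate behaviour follows a common load-or-rebuild caching pattern. The Python 3 sketch below shows that pattern only; it does not reproduce the library's internals. `load_or_rebuild` and `fake_rebuild` are hypothetical names, and `fake_rebuild` stands in for the download-and-parse step so the example needs no network access.

```python
import json
import os
import tempfile


def load_or_rebuild(path, rebuild):
    """Return the cached JSON data, regenerating the file when it is missing."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    data = rebuild()
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    return data


def fake_rebuild():
    # Stand-in for downloading and parsing the Unicode source files.
    return {"\u03c1": ["p"]}


cache_dir = tempfile.mkdtemp()
cache_path = os.path.join(cache_dir, "confusables.json")

first = load_or_rebuild(cache_path, fake_rebuild)  # file missing: rebuilt and written
second = load_or_rebuild(cache_path, dict)         # file present: read from disk
print(first == second)
```

The second call never invokes its rebuild function, because the JSON file written by the first call already exists.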

History

1.0.0 (2016)

Initial release.

Download files

Source Distribution

confusable_homoglyphs-2.0.0.tar.gz (31.2 kB)

Uploaded Source

Built Distribution

confusable_homoglyphs-2.0.0-py2.py3-none-any.whl (10.3 kB)

Uploaded Python 2 Python 3

File hashes

Hashes for confusable_homoglyphs-2.0.0.tar.gz

Algorithm    Hash digest
SHA256       b3a456b545df4e553c681e2d306c5605ae33e4a1a41a8de81df75d416b4c168f
MD5          ea713167933bcc285e05049c88100ebf
BLAKE2b-256  5866e9237750f3683e9413e913f4271a3cdfd611f1262047eba2eff870d48191

Hashes for confusable_homoglyphs-2.0.0-py2.py3-none-any.whl

Algorithm    Hash digest
SHA256       8186725491537939025f129228802f9e4e8495de045e4eddd77db2e3b11d3d3a
MD5          2e312d4aa7461500bfd77b45fd0b0a34
BLAKE2b-256  c8cda4cc6ac38d556ff339c762c445b80cd1adc0c56369f3fed6b3992abba374