
text_scrambler, a tool to scramble texts

Project description

Using Unicode confusable characters and a few other tricks, we can transform a text into one that looks the same to a human reader but is different to a machine.
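The core idea can be illustrated with a minimal sketch. It assumes a small hand-picked table of look-alike letters; the names `HOMOGLYPHS` and `swap_homoglyphs` are hypothetical and are not part of text_scrambler, whose internal tables and algorithm may differ.

```python
import random

# Illustrative subset of Latin letters and look-alike Greek/Cyrillic
# code points (an assumption for this sketch, not the package's table).
HOMOGLYPHS = {
    "A": "\u0391\u0410",  # Greek capital Alpha, Cyrillic capital A
    "H": "\u0397\u041d",  # Greek capital Eta, Cyrillic capital En
    "M": "\u039c\u041c",  # Greek capital Mu, Cyrillic capital Em
    "T": "\u03a4\u0422",  # Greek capital Tau, Cyrillic capital Te
    "a": "\u0430",        # Cyrillic small a
    "e": "\u0435",        # Cyrillic small ie
    "i": "\u0456",        # Cyrillic small Byelorussian-Ukrainian i
    "o": "\u03bf\u043e",  # Greek small omicron, Cyrillic small o
    "p": "\u0440",        # Cyrillic small er
    "s": "\u0455",        # Cyrillic small dze
}


def swap_homoglyphs(text, rate=0.5, rng=None):
    """Replace each mappable character with a look-alike with probability `rate`."""
    rng = rng or random.Random()
    out = []
    for ch in text:
        candidates = HOMOGLYPHS.get(ch)
        if candidates and rng.random() < rate:
            out.append(rng.choice(candidates))
        else:
            out.append(ch)
    return "".join(out)


scrambled = swap_homoglyphs("Herman Melville", rate=1.0, rng=random.Random(0))
print(scrambled)
print(scrambled == "Herman Melville")            # False
print(len(scrambled) == len("Herman Melville"))  # True
```

Each replacement is a single code point, so the length of the text does not change; only the code points do.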

Examples

Randomly replacing Latin characters with look-alike Greek or Cyrillic letters and inserting zero-width joiners and non-joiners (ZWJ/ZWNJ).

Original text:

Herman Melville (August 1, 1819 – September 28, 1891) was an American novelist, short story writer, and poet of the American Renaissance period. Among his best-known works are Moby-Dick (1851), Typee (1846), a romanticized account of his experiences in Polynesia, and Billy Budd, Sailor, a posthumously published novella. Although his reputation was not high at the time of his death, the centennial of his birth in 1919 was the starting point of a Melville revival and Moby-Dick grew to be considered one of the great American novels.

Scrambled text (looks the same but is made of different characters):

Неrman Μelvillе (Аugust 1, 1819 – Sерtеmbеr 28, 1891) waѕ аn Amerіcan nοvеliѕt, shοrt stоry wrіtеr, and рoеt οf thе Amеriсаn Rеnaissаnсе реrіοd. Amοng his bеѕt-knοwn works arе Мoby-Diсk (1851), Τyрee (1846), а romаntiсized aсcοunt of his ехperienсеs in Pоlynеѕіа, and Віlly Βudd, Sаilоr, а роѕthumοuѕly рublіshed nοvella. Аlthοugh hiѕ rеputatiоn wаs nоt hіgh аt the tіme оf hіѕ dеath, thе centеnnіаl οf hіѕ bіrth іn 1919 was thе startіng pοint οf a Мelvillе rеvіval аnd Mοby-Dісk grеw to be cоnsіdеrеd оne οf thе grеаt Αmerican novеls.

Note that search engines cannot match the scrambled text to the original web page, and neither can free online plagiarism checkers. Searching for Μelvillе (copy and paste it) on Google returns no match, whereas the original word Melville does.
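A quick way to see why the match fails is to look at the code points. The sketch below assumes the scrambled word starts with a Greek capital Mu and ends with a Cyrillic small ie; the exact substitutions in any given output are random, so this pair is only an illustration. It uses only the standard `unicodedata` module.

```python
import unicodedata

original = "Melville"
# Assumed look-alike spelling: Greek capital Mu first, Cyrillic small ie last.
lookalike = "\u039celvill\u0435"


def describe(word):
    """Return one 'U+XXXX NAME' string per character of `word`."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in word]


print(original == lookalike)  # False

# Collect the positions where the two spellings use different characters.
differences = [
    (unicodedata.name(a), unicodedata.name(b))
    for a, b in zip(original, lookalike)
    if a != b
]
for latin_name, other_name in differences:
    print(latin_name, "->", other_name)
# LATIN CAPITAL LETTER M -> GREEK CAPITAL LETTER MU
# LATIN SMALL LETTER E -> CYRILLIC SMALL LETTER IE
```

A search index that compares code points or bytes treats the two spellings as unrelated words, even though they render almost identically.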

Using all of the Unicode confusable characters (see [the Unicode confusable characters][1]), we can generate weird-looking text worthy of old spam messages:

𝚮‍𝒆‌𝕣‍m‍𝓪‍n‍ ‍𝝡‍ҽ‌𝟙‍∨‍𝘪‍𝘐‌𞺀‍𝓮‍ ‍﴾‍𝓐‍𝞄‍𝓰‍ꞟ‌𑣁‍t‌ ‌1‌,‌ ‍1‍8‌1‍Ⳋ‌ ‍–‍ ‌Ꮥ‌𝖊‍𝞺‌𝐭‍𝖾‌m‍Ƅ‌𝔢‌𝔯‌ ‍Ƨ‍𐌚‌ꓹ‌ ‍1‍ଃ‌𝟿‍1‍]‌ ‍𝘸‍𝐚‍𝚜‍ ‍𝖺‌𝔫‍ ‍Α‍m‌ℯ‌𝔯‌𝓲‌ꮯ‌𝒶‌𝓷‌ ‍n‌ം‍𝝼‍𝔢‍𝙸‌i‌s‌𝖙‍؍‍ ‍𐑈‌𝖍‌ꬽ‍ꭇ‍𝓽‍ ‌𝓼‌𝖙‍ⲟ‌r‌𑣜‍ ‍𝐰‌𝓻‌і‍𝒕‍е‍𝕣‍٫‍ ‍α‌𝒏‌𝕕‍ ‍𝙥‌𝜊‍e‍𝕥‍ ‍ﮨ‍f‌ ‌𝘵‍h‍𝗲‌ ‌Α‌m‍𝐞‍𝐫‌ꙇ‌𝒸‍a‍n‌ ‍𖼵‍𝘦‍𝑛‌𝐚‌𝒾‌𝑠‌𑣁‌𝜶‌𝕟‌𝗰‌𝒆‍ ‌𝟈‍𝖾‌r‍⍳‌ﮫ‌ᑯ‌𐩐‌ ‍Α‌m‍o‍𝓃‌𝖌‍ ‌𝓱‌Ꭵ‌𝐬‍ ‌Ꮟ‍𝙚‌𝗌‍𝕥‌۔‍𝖐‌𝖓‌o‌𝑤‍𝐧‍ ‌𑜎‌о‌ꮁ‍𝐤‌𝗌‍ ‌𝜶‍𝗿‍𝖾‌ ‌𝕸‍໐‍Ꮟ‍𝙮‍Ⲻ‍𝖣‍𝑖‍𝔠‌𝒌‌ ‍〔‍1‌𝟪‌5‍1‍〕‌ꓹ‌ ‌𝖳‍𝗒‌𝓹‍𝘦‌𝚎‌ ‌〔‍1‍🯸‌𝟜‌6‍❳‍ꓹ‌ ‍𝖆‍ ‌𝕣‌ꬽ‍m‍⍺‌𝘯‌𝘵‌і‌ꮯ‌𝛊‍𝐳‍ⅇ‍𝙙‍ ‍𝕒‌c‍ᴄ‌ჿ‌𝚞‍𝚗‌𝐭‍ ‍𞹤‍𝔣‍ ‍𝚑‌ӏ‌𝓈‌ ‍𝕖‍𝑥‌𝙥‍𝔢‍𝗿‍ꙇ‌e‌𝓷‍c‌℮‍ꮪ‌ ‌𝖎‍𝚗‍ ‌𝙋‍𝘰‌Ӏ‍γ‌𝓷‍𝖾‍𝔰‍𝚒‌𝗮‌؍‍ ‌𝛼‍𝔫‍𝖉‌ ‍𝔅‌Ꭵ‌𝖑‌l‌𝔂‌ ‌𝓑‍𝐮‌𝖉‌𝒹‌‚‌ ‍Ꮥ‌а‌ꙇ‌𝘭‍𝝈‍𝗋‌,‍ ‌α‍ ‍𝑝‍ꬽ‍𐑈‍𝓽‌һ‍𝛖‍m‍𞺄‌ᴜ‍𝔰‍𝗹‌𝑦‍ ‌𝖕‍ᴜ‍Ꮟ‍𝝞‌𝜄‌s‍h‍𝗲‍ꓒ‌ ‌𝓃‍𝗈‌𝓋‍𝒆‌𐌉‌ו‌𝞪‍꘎‍ ‍𖽀‍𝜤‍𝑡‍һ‍𝙤‍𝑢‌ց‍𝘩‌ ‌𝒉‌ι‍ѕ‌ ‌𝖗‌𝒆‌𝛠‍𝚞‍𝐭‌𝓪‌𝙩‌ɪ‍ﮨ‍𝓷‍ ‌𑜊‍𝖺‍s‌ ‍𝘯‍𞹤‍𝚝‌ ‌𝐡‌𝜄‌ᶃ‍𝕙‍ ‍𝖆‍𝘁‍ ‌𝙩‍h‍ꬲ‌ ‍𝓉‌𝔦‍m‍е‍ ‌𝞼‍ẝ‍ ‍ℎ‌ı‍ƽ‍ ‌𝐝‌𝕖‍𝖆‍𝚝‌𝔥‌ꓹ‌ ‍𝙩‌Ꮒ‌ꬲ‍ ‌𝗰‌ⅇ‌𝗻‌𝔱‍𝖊‌𝖓‌n‍𝛊‍𝙖‌𐌠‌ ‍ﻫ‍𝘧‌ ‌𝒽‍𝖎‍𝘴‍ ‍b‍ı‌𝚛‌𝓽‌𝘩‌ ‌i‌𝐧‍ ‍1‍𑣖‌1‍𝟵‌ ‍𑜏‌α‌𝗌‌ ‌𝗍‌𝐡‌ҽ‍ ‍𝕤‍𝑡‍𝛂‌r‍𝓉‍Ꭵ‌𝚗‍ᶃ‍ ‌𝛒‍ס‌𝜾‍𝗻‌𝖙‌ ‌𝜊‌𝖋‌ ‍𝙖‌ ‍ꓟ‍𝙚‌ⵏ‌𝛎‍˛‍І‍𝘭‍ҽ‌ ‌𝔯‍𝐞‌v‌𝞲‌𝚟‌𝖆‍l‍ ‍ɑ‍𝘯‍𝖽‍ ‍𝑀‌ං‌𝒃‍𝚢‌‐‍𝐷‍ͺ‌𝚌‌𝗸‍ ‌𝓰‌ꭈ‌е‌ᴡ‌ ‍𝓉‌ﮭ‌ ‌ᑲ‍ℯ‍ ‌c‍ℴ‍𝙣‌𝔰‌𑣃‍d‍ⅇ‍𝔯‌℮‌ⅾ‍ ‍ﻬ‌𝓃‌℮‍ ‌੦‌𝙛‌ ‍𝙩‌𝔥‍𝔢‍ ‌𝚐‍ꮁ‌ℯ‍𝜶‍𝙩‍ ‍𝞐‍m‍𝘦‍ᴦ‌𝜾‌𝙘‌𝕒‍𝐧‍ ‍𝓃‌o‌𝓿‌ⅇ‍|‍𝒔‍ꓸ
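The Unicode consortium publishes its confusable data as a text file (`confusables.txt`) in which each data line has the form `source ; target ; type # comment`, with code points written in hexadecimal. The following is a minimal sketch of turning such lines into a "letter to look-alikes" table. The sample lines are written in that format for illustration, and `parse_confusables` is a hypothetical helper; whether text_scrambler builds its table this way is an assumption.

```python
from collections import defaultdict

# A few lines in the confusables.txt layout (illustrative sample only;
# the real file is far larger).
SAMPLE = """\
# source ; target ; type
0430 ; 0061 ; MA # CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A
0435 ; 0065 ; MA # CYRILLIC SMALL LETTER IE -> LATIN SMALL LETTER E
03BF ; 006F ; MA # GREEK SMALL LETTER OMICRON -> LATIN SMALL LETTER O
043E ; 006F ; MA # CYRILLIC SMALL LETTER O -> LATIN SMALL LETTER O
"""


def parse_confusables(lines):
    """Map each target string to the list of source strings that resemble it."""
    table = defaultdict(list)
    for line in lines:
        # Drop the trailing comment and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) < 2:
            continue
        # Each field is one or more space-separated hexadecimal code points.
        source = "".join(chr(int(cp, 16)) for cp in fields[0].split())
        target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
        table[target].append(source)
    return dict(table)


table = parse_confusables(SAMPLE.splitlines())
print(sorted(table))    # ['a', 'e', 'o']
print(len(table["o"]))  # 2
```

The file maps each confusable character to a prototype, so scrambling needs the inverted mapping built here: from a plain letter to every character that resembles it.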

API

Python

>>> from text_scrambler import Scrambler
>>> scr = Scrambler()
>>> text = "This is an example"
>>> text_1 = scr.scramble(text, level=1)
>>> # adding only zwj/zwnj characters
>>> print(text, text_1)
This is an example This is an example
>>> assert text != text_1
>>> print(text_1)
This is an example
>>> print(len(text), len(text_1))
18 35
>>> text_2 = scr.scramble(text, level=2)
>>> # replacing some Latin letters by their Cyrillic/Greek equivalent
>>> print(text_2)
Тhіѕ iѕ an еxample
>>> for char, char_2 in zip(text, text_2):
...     if char != char_2:
...         print(char, char_2)
...
T Т
i і
s ѕ
s ѕ
e е
>>> text_4 = scr.scramble(text, level=4)
>>> # replacing all characters by any look-alike Unicode character
>>> print(text_4)
𝕋h𝗌𝝸𝘴‍‍ 𝛼n‍‍ 𝖊𝙭𝐚m𝜌𝐞
>>> versions = scr.generate(text, 10, level=4)
>>> for txt in versions:
...     print(txt)
...
𝘛h𝚒𝓼‍‌ͺ‌s𝛂ոҽ𝕩𝚊m𝒑‌𞣇‍𝒆
𐊗𝘩ı𝚜𝚒𐑈𝚊𝓃𝔢𝖺m𝗉𝟣𝑒
𝕿𝓱𝚒𝗂𝗮𝙣𝖊𝑥𝛂m𝜌𝕴𝖾
⊤‍𝐡𝓲𝞲𝔰𝐚𝚗ҽ𝓍𝚊mρ‌׀‌
𝕿𝚑іs 𝜾ѕ𝔞𝕟𝑒𝘹𝛼m𝟈
𝗧𝐡𝚒𝘪𝗌 𝔞ո𝕖𝘹𝘢m𝜌𝗅
𝕋𝗁𝔰 𝕚𝒔𝓪𝘯𝙚𝗮m𝝔۱
𝖳𝖍ӏ𝗌ι𑣁α𝒏𝖊𝘹𝛼m𝗽𝜤e
𝔗𝓱ɪ𑣁𝒾𝒔 𝛼𝓷‌‍𝖾𝔵𝖺m𝝔𝒍e
𝚻𝕙ɪ𝕤𝕤‍‌𝛂𝔫 𝓮‌⍺‍m‌⍴‍𝐈𝒆
>>> versions = scr.generate(text, 1000, level=2)
>>> assert len(versions) == len(set(versions))
>>> # all unique

>>> text = "A cranial nerve nucleus is a collection of neurons in the brain stem that is associated with one or more of the cranial nerves."
>>> texts = scr.generate(text, 1000, level=1)
>>> assert texts[0] != text
>>> for scrambled_text in texts:
...     assert text != scrambled_text
...
>>> print(texts[0])
A cranial nerve nucleus is a collection of neurons in the brain stem that is associated with one or more of the cranial nerves.
>>> # different from the original text
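The lengths 18 and 35 printed in the session above are consistent with one zero-width character being inserted between each pair of adjacent characters (18 + 17 = 35). Here is a minimal sketch under that assumption; `insert_zero_width` and `strip_zero_width` are hypothetical helpers, not functions of text_scrambler.

```python
import random

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER


def insert_zero_width(text, rng=None):
    """Insert a random ZWJ or ZWNJ between every pair of adjacent characters."""
    rng = rng or random.Random()
    out = [text[:1]]
    for ch in text[1:]:
        out.append(rng.choice((ZWNJ, ZWJ)))
        out.append(ch)
    return "".join(out)


def strip_zero_width(text):
    """Remove every ZWJ and ZWNJ, undoing insert_zero_width."""
    return text.replace(ZWNJ, "").replace(ZWJ, "")


text = "This is an example"
hidden = insert_zero_width(text, random.Random(0))
print(len(text), len(hidden))            # 18 35
print(hidden == text)                    # False
print(strip_zero_width(hidden) == text)  # True
```

Because the inserted characters normally have no visible glyph, the two strings usually print identically, yet a plain string comparison tells them apart. The last line also shows how easily this particular trick is reversed.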

Command line interface (CLI)

To scramble the text of a file from the command line, run the module as a script:

$ python -m text_scrambler
usage: Usage : python -m text_scrambler file

Replace/insert the charaters of the file using the unicode confusable characters

positional arguments:
  file                  encoded in UTF-8

optional arguments:
  -h, --help            show this help message and exit
  -l LEVEL, --level LEVEL

                                1: insert non printable characters within the text
                                2: replace some latin letters to their Greek or Cyrilic equivalent
                                3: insert non printable characters and change the some latin  to their Greek or Cyrilic equivalent
                                4: insert non printable chraracters change all possible letter to a randomly picked unicode letter equivalent
                                default=1
  -n N, --generate N
                                Scramble n times the string
                                default=1
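For readers who want to script against the same options, here is a hypothetical `argparse` parser that accepts the arguments listed in the help text above. It is a sketch only: the integer types are an assumption, `notes.txt` is a made-up file name, and the package's real parser may be defined differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="python -m text_scrambler")
    parser.add_argument("file", help="encoded in UTF-8")
    # Assumed to be an integer from 1 to 4, as described in the help text.
    parser.add_argument("-l", "--level", type=int, default=1)
    # Number of scrambled versions to produce (assumed integer).
    parser.add_argument("-n", "--generate", type=int, default=1, metavar="N")
    return parser


args = build_parser().parse_args(["notes.txt", "--level", "3", "-n", "5"])
print(args.file, args.level, args.generate)  # notes.txt 3 5
```

With these options, an invocation such as `python -m text_scrambler notes.txt -l 3 -n 5` would request five level-3 versions of the file's text.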
