
text_scrambler, a tool to scramble texts

Project description

Using Unicode confusable characters and other tricks, we can transform a text into another one that looks exactly the same to a human reader but is different from a machine's point of view.
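
The two tricks can be illustrated with a minimal sketch. This is not the library's implementation: the function names `insert_joiners` and `swap_homoglyphs` and the tiny `HOMOGLYPHS` table are hypothetical and exist only for illustration, and the sketch assumes that one joiner is placed between every pair of characters.

```python
import random

ZWJ = "\u200d"   # ZERO WIDTH JOINER
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# Hypothetical, deliberately tiny table: Latin letter -> Cyrillic look-alike.
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "s": "\u0455",  # CYRILLIC SMALL LETTER DZE
    "x": "\u0445",  # CYRILLIC SMALL LETTER HA
    "T": "\u0422",  # CYRILLIC CAPITAL LETTER TE
}


def insert_joiners(text, rng):
    """Put a randomly chosen ZWJ or ZWNJ between every pair of characters."""
    if not text:
        return text
    pieces = [text[0]]
    for char in text[1:]:
        pieces.append(rng.choice((ZWJ, ZWNJ)))
        pieces.append(char)
    return "".join(pieces)


def swap_homoglyphs(text, rng, rate=0.5):
    """Replace each mappable letter by its look-alike with probability `rate`."""
    return "".join(
        HOMOGLYPHS[char] if char in HOMOGLYPHS and rng.random() < rate else char
        for char in text
    )


rng = random.Random(0)  # seeded so the sketch is reproducible
original = "This is an example"
with_joiners = insert_joiners(original, rng)
with_swaps = swap_homoglyphs(original, rng, rate=1.0)

print(len(original), len(with_joiners))  # the joiners add invisible characters
print(with_swaps == original)            # same look, different code points
```

Removing the two joiner characters from `with_joiners` gives back the original string, which is why the change is invisible on screen yet obvious to any byte-level comparison.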

Examples

Randomly replacing Latin characters with look-alike Greek or Cyrillic letters and inserting zero-width joiners and non-joiners (ZWJ/ZWNJ).

Original text:

Herman Melville (August 1, 1819 – September 28, 1891) was an American novelist, short story writer, and poet of the American Renaissance period. Among his best-known works are Moby-Dick (1851), Typee (1846), a romanticized account of his experiences in Polynesia, and Billy Budd, Sailor, a posthumously published novella. Although his reputation was not high at the time of his death, the centennial of his birth in 1919 was the starting point of a Melville revival and Moby-Dick grew to be considered one of the great American novels.

Scrambled text with ZW(N)J added (it looks the same but is entirely different):

H‍e‌r‌m‍a‍n‌ ‌M‍e‌l‌v‌i‍l‍l‍e‍ ‌(‍A‍u‍g‌u‌s‍t‌ ‌1‌,‌ ‍1‍8‌1‍9‌ ‌–‌ ‌S‍e‌p‌t‌e‍m‍b‍e‌r‌ ‍2‌8‍,‍ ‍1‍8‌9‌1‍)‌ ‌w‍a‌s‍ ‌a‍n‌ ‍A‌m‌e‌r‌i‌c‍a‌n‍ ‍n‌o‍v‌e‍l‌i‌s‍t‍,‍ ‌s‍h‌o‍r‍t‌ ‍s‌t‌o‌r‍y‍ ‌w‌r‍i‌t‍e‌r‌,‌ ‍a‍n‍d‍ ‌p‌o‌e‌t‌ ‍o‌f‍ ‌t‌h‍e‌ ‌A‍m‌e‍r‌i‍c‌a‌n‍ ‍R‍e‌n‍a‍i‍s‍s‌a‌n‍c‌e‍ ‌p‍e‌r‍i‌o‌d‌.‍ ‍A‌m‍o‌n‍g‍ ‍h‍i‌s‍ ‌b‍e‌s‍t‍-‌k‍n‌o‍w‌n‍ ‌w‍o‌r‍k‍s‌ ‍a‌r‍e‍ ‍M‍o‌b‍y‌-‍D‌i‍c‍k‍ ‍(‌1‍8‌5‍1‍)‍,‌ ‌T‍y‌p‍e‌e‌ ‍(‌1‍8‌4‌6‍)‌,‍ ‌a‌ ‍r‌o‌m‍a‌n‍t‌i‍c‍i‌z‌e‍d‌ ‍a‍c‌c‍o‌u‍n‌t‌ ‍o‌f‌ ‍h‌i‍s‌ ‍e‍x‍p‍e‌r‌i‌e‌n‍c‌e‍s‌ ‌i‍n‍ ‍P‍o‌l‌y‌n‍e‍s‌i‍a‌,‍ ‍a‍n‍d‍ ‍B‍i‍l‍l‌y‌ ‌B‌u‌d‍d‌,‍ ‍S‍a‌i‌l‌o‍r‌,‍ ‍a‌ ‍p‌o‌s‍t‌h‍u‍m‌o‍u‌s‍l‍y‌ ‌p‍u‌b‍l‍i‌s‌h‍e‌d‍ ‍n‌o‌v‌e‌l‍l‍a‍.‌ ‍A‍l‍t‍h‌o‍u‍g‍h‍ ‍h‌i‍s‌ ‌r‍e‌p‍u‍t‌a‍t‍i‌o‌n‍ ‌w‍a‌s‌ ‍n‌o‌t‌ ‍h‌i‌g‍h‌ ‍a‌t‌ ‌t‌h‌e‌ ‌t‍i‍m‍e‍ ‍o‍f‌ ‌h‌i‍s‍ ‌d‌e‍a‍t‍h‌,‌ ‌t‍h‌e‍ ‌c‍e‌n‍t‌e‍n‌n‌i‍a‍l‌ ‍o‍f‌ ‍h‍i‌s‍ ‍b‍i‌r‌t‍h‍ ‌i‌n‌ ‍1‌9‍1‌9‌ ‌w‍a‍s‍ ‌t‌h‌e‍ ‌s‌t‍a‍r‌t‍i‍n‍g‍ ‍p‍o‌i‌n‌t‌ ‍o‌f‌ ‍a‌ ‍M‍e‌l‌v‌i‍l‌l‌e‍ ‌r‍e‍v‌i‌v‍a‍l‍ ‌a‌n‍d‌ ‍M‍o‍b‌y‍-‌D‌i‌c‌k‍ ‍g‍r‌e‌w‍ ‌t‌o‌ ‌b‍e‍ ‌c‌o‌n‍s‌i‌d‌e‍r‌e‌d‌ ‌o‍n‍e‌ ‍o‌f‌ ‍t‌h‌e‍ ‍g‍r‌e‍a‌t‌ ‌A‌m‍e‌r‌i‍c‍a‍n‍ ‍n‌o‌v‌e‍l‌s‍.

Scrambled text with Latin letters replaced by their Cyrillic/Greek equivalents:

Неrman Melvіllе (Αuguѕt 1, 1819 – Septеmber 28, 1891) wаѕ an Аmеrісаn nοvеlist, shοrt story writеr, and poеt оf the Americаn Rеnaіssanсe pеriоd. Amоng his bеst-known works arе Μoby-Dісk (1851), Τyреe (1846), a rоmаnticizеd accоunt оf hіs eхрerіencеs in Ρоlynеѕiа, аnd Вilly Budd, Ѕаіlοr, а pοsthumously рublіshed nоvеllа. Although hiѕ reputation was nοt hіgh at thе tіme οf hіѕ dеаth, the сentennіаl оf hіs bіrth in 1919 waѕ thе stаrting point οf a Μelville revival and Μοby-Dick grew tο bе соnѕidеrеd οne оf the great American novels.

Scrambled text with both changes:

H‍e‌r‌m‍a‍n‌ ‌Μ‍e‍l‍v‌і‍l‍l‌е‍ ‌(‍А‌u‌g‍u‍ѕ‌t‍ ‌1‍,‍ ‌1‌8‍1‍9‌ ‌–‍ ‍S‍e‌p‌t‌e‌m‍b‍e‍r‍ ‌2‍8‌,‌ ‍1‍8‍9‍1‍)‍ ‍w‍a‍ѕ‌ ‌a‍n‌ ‌Α‍m‍e‌r‌i‍с‌a‌n‍ ‌n‌o‍v‍e‌l‍i‍ѕ‌t‌,‌ ‌s‍h‌ο‍r‍t‍ ‍ѕ‌t‌ο‌r‍y‍ ‍w‌r‍i‍t‌е‌r‌,‍ ‌а‌n‌d‍ ‌p‌о‌е‌t‍ ‌ο‍f‌ ‍t‌h‍e‍ ‍А‍m‌e‌r‍і‌c‍а‍n‍ ‍R‍е‍n‍a‍i‌s‍s‍a‍n‍с‌е‌ ‌p‍е‍r‍i‍о‍d‌.‌ ‌A‍m‍ο‍n‌g‌ ‌h‌i‌ѕ‍ ‍b‍е‌s‍t‌-‍k‌n‌ο‍w‍n‍ ‌w‌о‌r‌k‌ѕ‌ ‍a‌r‌е‌ ‌M‍о‍b‍y‌-‍D‌i‍c‌k‍ ‌(‌1‍8‍5‍1‍)‌,‌ ‌T‍y‍p‌е‍е‍ ‍(‌1‌8‌4‌6‌)‍,‍ ‌a‍ ‍r‍ο‍m‌а‌n‌t‌і‍с‍і‍z‍e‌d‌ ‍a‌с‍c‍о‌u‍n‍t‍ ‌ο‌f‌ ‍h‍і‍s‍ ‍e‌x‌р‍e‍r‌і‌е‍n‍c‌e‍s‌ ‌і‌n‍ ‍Р‍о‍l‌y‌n‍е‍s‍і‍а‌,‌ ‍a‍n‍d‌ ‍В‌i‍l‍l‌y‍ ‌Β‍u‌d‍d‍,‍ ‌Ѕ‌а‍i‌l‌ο‍r‌,‍ ‌a‌ ‌p‍ο‍ѕ‌t‌h‍u‍m‍о‍u‌ѕ‍l‌y‍ ‌p‌u‍b‌l‍i‌ѕ‌h‍е‌d‍ ‌n‌о‌v‍е‍l‌l‍a‍.‍ ‍A‍l‍t‍h‌о‌u‌g‍h‍ ‍h‌i‍s‌ ‌r‌e‌р‌u‌t‍a‍t‌і‌o‍n‌ ‍w‍а‌ѕ‌ ‌n‌о‍t‌ ‍h‍і‌g‌h‍ ‍а‍t‌ ‍t‍h‌е‍ ‌t‍і‍m‍e‍ ‌o‍f‍ ‌h‌і‍s‍ ‌d‍e‍а‍t‍h‍,‍ ‍t‌h‌е‍ ‌с‌e‍n‍t‍e‌n‌n‍і‍a‌l‌ ‍o‍f‌ ‍h‍і‍ѕ‍ ‍b‌i‍r‍t‌h‌ ‌і‌n‌ ‍1‍9‍1‍9‌ ‌w‌а‌s‌ ‌t‌h‍e‍ ‌s‌t‌а‌r‌t‌і‍n‍g‍ ‌р‍ο‍і‍n‌t‍ ‌ο‌f‌ ‍a‌ ‌Μ‍e‌l‍v‍i‍l‌l‍е‌ ‍r‌е‍v‍i‌v‍а‍l‍ ‍a‌n‌d‍ ‍М‍o‌b‍y‌-‍D‌і‌с‌k‍ ‌g‍r‌е‌w‌ ‍t‍ο‍ ‌b‌e‍ ‍с‍o‍n‍s‍i‌d‌e‌r‌e‌d‍ ‍о‍n‍e‌ ‌o‍f‍ ‌t‌h‍е‍ ‌g‍r‌e‌а‍t‍ ‍А‍m‌е‌r‌i‌с‌а‍n‌ ‌n‍o‌v‍e‍l‌s‍.

Note that search engines cannot find the original web page from the scrambled text, and neither can free online plagiarism checkers. Searching Google for Μelvillе (with Cyrillic letters; copy and paste it) returns no match, whereas the original word Melville does.
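
The mismatch is easy to see by looking at the code points. The sketch below uses only the standard `unicodedata` module; the look-alike spelling is an assumption built from two Cyrillic letters (capital Em and small Ie) and is not necessarily the exact string used above.

```python
import unicodedata

plain = "Melville"
# Assumed look-alike spelling: Cyrillic capital Em first, Cyrillic small Ie last.
lookalike = "\u041celvill\u0435"

# Pairs of characters that render alike but are different code points.
differences = [(a, b) for a, b in zip(plain, lookalike) if a != b]

print(plain == lookalike)
for a, b in differences:
    print(f"U+{ord(a):04X} {unicodedata.name(a)} -> "
          f"U+{ord(b):04X} {unicodedata.name(b)}")
```

An exact-match index keyed on the Latin spelling has no entry for the second string, which is consistent with the search behaviour described above.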

Using all of the Unicode confusable characters, we can generate weird-looking text worthy of old spam messages:

𝚮‍𝒆‌𝕣‍m‍𝓪‍n‍ ‍𝝡‍ҽ‌𝟙‍∨‍𝘪‍𝘐‌𞺀‍𝓮‍ ‍﴾‍𝓐‍𝞄‍𝓰‍ꞟ‌𑣁‍t‌ ‌1‌,‌ ‍1‍8‌1‍Ⳋ‌ ‍–‍ ‌Ꮥ‌𝖊‍𝞺‌𝐭‍𝖾‌m‍Ƅ‌𝔢‌𝔯‌ ‍Ƨ‍𐌚‌ꓹ‌ ‍1‍ଃ‌𝟿‍1‍]‌ ‍𝘸‍𝐚‍𝚜‍ ‍𝖺‌𝔫‍ ‍Α‍m‌ℯ‌𝔯‌𝓲‌ꮯ‌𝒶‌𝓷‌ ‍n‌ം‍𝝼‍𝔢‍𝙸‌i‌s‌𝖙‍؍‍ ‍𐑈‌𝖍‌ꬽ‍ꭇ‍𝓽‍ ‌𝓼‌𝖙‍ⲟ‌r‌𑣜‍ ‍𝐰‌𝓻‌і‍𝒕‍е‍𝕣‍٫‍ ‍α‌𝒏‌𝕕‍ ‍𝙥‌𝜊‍e‍𝕥‍ ‍ﮨ‍f‌ ‌𝘵‍h‍𝗲‌ ‌Α‌m‍𝐞‍𝐫‌ꙇ‌𝒸‍a‍n‌ ‍𖼵‍𝘦‍𝑛‌𝐚‌𝒾‌𝑠‌𑣁‌𝜶‌𝕟‌𝗰‌𝒆‍ ‌𝟈‍𝖾‌r‍⍳‌ﮫ‌ᑯ‌𐩐‌ ‍Α‌m‍o‍𝓃‌𝖌‍ ‌𝓱‌Ꭵ‌𝐬‍ ‌Ꮟ‍𝙚‌𝗌‍𝕥‌۔‍𝖐‌𝖓‌o‌𝑤‍𝐧‍ ‌𑜎‌о‌ꮁ‍𝐤‌𝗌‍ ‌𝜶‍𝗿‍𝖾‌ ‌𝕸‍໐‍Ꮟ‍𝙮‍Ⲻ‍𝖣‍𝑖‍𝔠‌𝒌‌ ‍〔‍1‌𝟪‌5‍1‍〕‌ꓹ‌ ‌𝖳‍𝗒‌𝓹‍𝘦‌𝚎‌ ‌〔‍1‍🯸‌𝟜‌6‍❳‍ꓹ‌ ‍𝖆‍ ‌𝕣‌ꬽ‍m‍⍺‌𝘯‌𝘵‌і‌ꮯ‌𝛊‍𝐳‍ⅇ‍𝙙‍ ‍𝕒‌c‍ᴄ‌ჿ‌𝚞‍𝚗‌𝐭‍ ‍𞹤‍𝔣‍ ‍𝚑‌ӏ‌𝓈‌ ‍𝕖‍𝑥‌𝙥‍𝔢‍𝗿‍ꙇ‌e‌𝓷‍c‌℮‍ꮪ‌ ‌𝖎‍𝚗‍ ‌𝙋‍𝘰‌Ӏ‍γ‌𝓷‍𝖾‍𝔰‍𝚒‌𝗮‌؍‍ ‌𝛼‍𝔫‍𝖉‌ ‍𝔅‌Ꭵ‌𝖑‌l‌𝔂‌ ‌𝓑‍𝐮‌𝖉‌𝒹‌‚‌ ‍Ꮥ‌а‌ꙇ‌𝘭‍𝝈‍𝗋‌,‍ ‌α‍ ‍𝑝‍ꬽ‍𐑈‍𝓽‌һ‍𝛖‍m‍𞺄‌ᴜ‍𝔰‍𝗹‌𝑦‍ ‌𝖕‍ᴜ‍Ꮟ‍𝝞‌𝜄‌s‍h‍𝗲‍ꓒ‌ ‌𝓃‍𝗈‌𝓋‍𝒆‌𐌉‌ו‌𝞪‍꘎‍ ‍𖽀‍𝜤‍𝑡‍һ‍𝙤‍𝑢‌ց‍𝘩‌ ‌𝒉‌ι‍ѕ‌ ‌𝖗‌𝒆‌𝛠‍𝚞‍𝐭‌𝓪‌𝙩‌ɪ‍ﮨ‍𝓷‍ ‌𑜊‍𝖺‍s‌ ‍𝘯‍𞹤‍𝚝‌ ‌𝐡‌𝜄‌ᶃ‍𝕙‍ ‍𝖆‍𝘁‍ ‌𝙩‍h‍ꬲ‌ ‍𝓉‌𝔦‍m‍е‍ ‌𝞼‍ẝ‍ ‍ℎ‌ı‍ƽ‍ ‌𝐝‌𝕖‍𝖆‍𝚝‌𝔥‌ꓹ‌ ‍𝙩‌Ꮒ‌ꬲ‍ ‌𝗰‌ⅇ‌𝗻‌𝔱‍𝖊‌𝖓‌n‍𝛊‍𝙖‌𐌠‌ ‍ﻫ‍𝘧‌ ‌𝒽‍𝖎‍𝘴‍ ‍b‍ı‌𝚛‌𝓽‌𝘩‌ ‌i‌𝐧‍ ‍1‍𑣖‌1‍𝟵‌ ‍𑜏‌α‌𝗌‌ ‌𝗍‌𝐡‌ҽ‍ ‍𝕤‍𝑡‍𝛂‌r‍𝓉‍Ꭵ‌𝚗‍ᶃ‍ ‌𝛒‍ס‌𝜾‍𝗻‌𝖙‌ ‌𝜊‌𝖋‌ ‍𝙖‌ ‍ꓟ‍𝙚‌ⵏ‌𝛎‍˛‍І‍𝘭‍ҽ‌ ‌𝔯‍𝐞‌v‌𝞲‌𝚟‌𝖆‍l‍ ‍ɑ‍𝘯‍𝖽‍ ‍𝑀‌ං‌𝒃‍𝚢‌‐‍𝐷‍ͺ‌𝚌‌𝗸‍ ‌𝓰‌ꭈ‌е‌ᴡ‌ ‍𝓉‌ﮭ‌ ‌ᑲ‍ℯ‍ ‌c‍ℴ‍𝙣‌𝔰‌𑣃‍d‍ⅇ‍𝔯‌℮‌ⅾ‍ ‍ﻬ‌𝓃‌℮‍ ‌੦‌𝙛‌ ‍𝙩‌𝔥‍𝔢‍ ‌𝚐‍ꮁ‌ℯ‍𝜶‍𝙩‍ ‍𝞐‍m‍𝘦‍ᴦ‌𝜾‌𝙘‌𝕒‍𝐧‍ ‍𝓃‌o‌𝓿‌ⅇ‍|‍𝒔‍ꓸ

Full documentation at https://text-scrambler.readthedocs.io

Installation

pip install text-scrambler

Quickstart

Python

>>> from text_scrambler import Scrambler
>>> scr = Scrambler()
>>> text = "This is an example"
>>> text_1 = scr.scramble(text, level=1)
>>> #############
>>> # adding only zwj/zwnj characters
>>> print(text, text_1, sep="\n")
This is an example
This is an example
>>> assert text != text_1
>>> print(len(text), len(text_1))
18 35
>>> # though the texts look similar, the second one has more characters
>>> #############
>>> text_2 = scr.scramble(text, level=2)
>>> # replacing some Latin letters with their Cyrillic/Greek equivalents
>>> print(text_2)
Тhiѕ iѕ an ехаmple
>>> for char, char_2 in zip(text, text_2):
...     if char != char_2:
...         print(char, char_2)
...
T Т
s ѕ
s ѕ
e е
x х
a а
>>> #############
>>> text_3 = scr.scramble(text, level=3)
>>> # adding zwj/zwnj characters and replacing latin letters
>>> print(text_3)
Thіs iѕ аn eхаmple
>>> print(text, text_3, sep="\n")
This is an example
Thіs iѕ аn eхаmple
>>> assert text_3 != text
>>> #############
>>> text_4 = scr.scramble(text, level=4)
>>> # replacing every character with a randomly chosen Unicode look-alike
>>> print(text_4)
⊤‌𝒽𝐢𝘴 𝘪𝙨 𝞪ռ 𝙚‍⨯‍𝚊mρ𝟙ҽ
>>> #
>>> # generating several versions
>>> versions = scr.generate(text, 10, level=4)
>>> for txt in versions:
...     print(txt)
...
𝕋𝗵𝕚𝔰 𝙞ѕ ɑ𝗇 𝗑𝒂m𝛠𝚎
𝔗һ𑣃ƽ ‌˛‍ 𝛼𝐧 𝐞𝖝𝛼m𝜌𝟏
𝓲𝔰 𝔰 αn ‌⤬‌αm‌⍴‍𞸀
𝗧𝗵i𝑠 𝖘 ‍⍺‍𝘯 𝗲𝔁аm𝘱𝙸𝔢
⊤‌𝚑𝑖 ɪ𝚜 𝜶𝑛 𝖾𝘅𝒶m𝛒𝑙𝓮
𝘛𝙞 𝗌 𝗮𝐧 𝓪m𝜌‌⏽‍𝓮
𝙏𝕙і𝓈 ı 𝔞𝕟 𝗲𝕩𝛂mр𐌉𝚎
𝕿𝐬 𝗶𝗌 𝛼𝔫 𝗲𝐱𝓪m𝞎𝙡𝖊
⟙‌𝜾 𝘴 𝝰𝒏 𝙚𝗮m𝗽𝗜𝗲
𝖳հ𝒊s 𝕚𝙨 𝖆𝑛 𝘦𝔁аm𝜌𝐈𝗲
>>> versions = scr.generate(text, 1000, level=1)
>>> assert len(versions) == len(set(versions))
>>> # all unique
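
Getting 1000 distinct versions at level 1 is worth a second look: variants drawn independently at random could repeat. One simple way to guarantee uniqueness is to collect variants in a set until enough have been produced. The sketch below is an illustration only, not the library's `generate`; the names `joiner_variant` and `unique_variants` are hypothetical, and it assumes one ZWJ/ZWNJ choice per gap between characters.

```python
import random

JOINERS = ("\u200d", "\u200c")  # ZWJ, ZWNJ


def joiner_variant(text, rng):
    """Follow every character except the last with a random ZWJ or ZWNJ."""
    return "".join(char + rng.choice(JOINERS) for char in text[:-1]) + text[-1:]


def unique_variants(text, n, seed=None):
    """Return n distinct joiner variants of text (hypothetical helper)."""
    rng = random.Random(seed)
    # One binary choice per gap gives 2 ** (len(text) - 1) possible variants.
    limit = 2 ** max(len(text) - 1, 0)
    if n > limit:
        raise ValueError(f"only {limit} distinct variants exist")
    seen = set()
    while len(seen) < n:
        seen.add(joiner_variant(text, rng))  # duplicates are silently dropped
    return sorted(seen)


versions = unique_variants("This is an example", 1000, seed=1)
print(len(versions), len(set(versions)))
```

The set absorbs any repeated draw, so the loop only stops once `n` different strings exist; the upfront check avoids looping forever when more variants are requested than can exist.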

Command line interface (CLI)

The scrambler can also be applied to a file through the CLI:

$ python -m text_scrambler
usage: Usage : python -m text_scrambler file

Replace/insert the charaters of the file using the unicode confusable characters

positional arguments:
  file                  encoded in UTF-8

optional arguments:
  -h, --help            show this help message and exit
  -l LEVEL, --level LEVEL

                                1: insert non printable characters within the text
                                2: replace some latin letters to their Greek or Cyrillic equivalent
                                3: insert non printable characters and change the some latin  to their Greek or Cyrillic equivalent
                                4: insert non printable chraracters change all possible letter to a randomly picked unicode letter equivalent
                                default=1
  -n N, --generate N
                                Scramble n times the string
                                default=1

