Native codecs extension

Introduction

This library extends Python's native codecs library, making it easier to add new custom encodings and character mappings, and provides many additional encodings (static or parametrized, like rot or xor). Its name combines CODecs and EXTension.
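
For context, the native mechanism being extended is the standard library's codec registry. The sketch below uses only the standard `codecs` module to register a toy text-to-text encoding; the codec name `demoreverse` and both helper functions are hypothetical, and this is not codext's own registration helper.

```python
import codecs


def _reverse(text, errors="strict"):
    # A codec function returns (output, number of input items consumed).
    return text[::-1], len(text)


def _search(name):
    # The registry calls this with a normalized (lowercased) codec name.
    if name == "demoreverse":
        return codecs.CodecInfo(encode=_reverse, decode=_reverse, name="demoreverse")
    return None


codecs.register(_search)

encoded = codecs.encode("Test string", "demoreverse")
decoded = codecs.decode(encoded, "demoreverse")
print(encoded)
print(decoded)
```

Once a search function is registered, the new name can be used wherever the native library looks up a codec by name.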

Setup

$ pip install codext

Note: Some encodings are available in Python 3 only.

Usage (CLI tool)

$ codext -i test.txt encode dna-1
GTGAGCGGGTATGTGA
$ echo -en "test" | codext encode morse
- . ... -

Python 3 (includes Ascii85, Base85, Base100 and braille):

$ echo -en "test" | codext encode braille
⠞⠑⠎⠞
$ echo -en "test" | codext encode base100
👫👜👪👫

Using codecs chaining:

$ echo -en "Test string" | codext encode reverse
gnirts tseT
$ echo -en "Test string" | codext encode reverse morse
--. -. .. .-. - ... / - ... . -
$ echo -en "Test string" | codext encode reverse morse dna-2
AGTCAGTCAGTGAGAAAGTCAGTGAGAAAGTGAGTGAGAAAGTGAGTCAGTGAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTTAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTGAGAAAGTC
$ echo -en "Test string" | codext encode reverse morse dna-2 octal
101107124103101107124103101107124107101107101101101107124103101107124107101107101101101107124107101107124107101107101101101107124107101107124103101107124107101107101101101107124103101107101101101107124107101107124107101107124107101107101101101107124124101107101101101107124103101107101101101107124107101107124107101107124107101107101101101107124107101107101101101107124103
$ echo -en "AGTCAGTCAGTGAGAAAGTCAGTGAGAAAGTGAGTGAGAAAGTGAGTCAGTGAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTTAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTGAGAAAGTC" | codext -d dna-2 morse reverse
test string
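
Chaining feeds the output of each codec into the next one, and decoding lists the codecs in the opposite order. A minimal sketch of that idea in plain Python, assuming two stand-in codecs (`reverse` and a 3-digit `octal`); the `CODECS` table and the helper functions are hypothetical and are not codext's API:

```python
from functools import reduce


def reverse(text):
    return text[::-1]


def to_octal(text):
    # Each character becomes a 3-digit octal group of its ordinal.
    return "".join(f"{ord(ch):03o}" for ch in text)


def from_octal(digits):
    return "".join(chr(int(digits[i:i + 3], 8)) for i in range(0, len(digits), 3))


# name -> (encoder, decoder); an assumption made for this illustration only
CODECS = {"reverse": (reverse, reverse), "octal": (to_octal, from_octal)}


def encode_chain(text, *names):
    # Apply the encoders from left to right.
    return reduce(lambda acc, name: CODECS[name][0](acc), names, text)


def decode_chain(text, *names):
    # Names are given in decoding order, as in the CLI example.
    return reduce(lambda acc, name: CODECS[name][1](acc), names, text)


encoded = encode_chain("Test", "reverse", "octal")
decoded = decode_chain(encoded, "octal", "reverse")
print(encoded)
print(decoded)
```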

Usage (Python)

Getting the list of available codecs:

>>> import codext
>>> codext.list()
['ascii85', 'base85', 'base100', 'base122', ..., 'tomtom', 'dna', 'html', 'markdown', 'url', 'resistor', 'sms', 'whitespace', 'whitespace-after-before']

Usage examples:

>>> codext.encode("this is a test", "base58-bitcoin")
'jo91waLQA1NNeBmZKUF'
>>> codext.encode("this is a test", "base58-ripple")
'jo9rA2LQwr44eBmZK7E'
>>> codext.encode("this is a test", "base58-url")
'JN91Wzkpa1nnDbLyjtf'
>>> import codecs  # some examples below call the native module directly
>>> codecs.encode("this is a test", "base100")
'👫👟👠👪🐗👠👪🐗👘🐗👫👜👪👫'
>>> codecs.decode("👫👟👠👪🐗👠👪🐗👘🐗👫👜👪👫", "base100")
'this is a test'
>>> for i in range(8):
        print(codext.encode("this is a test", "dna-%d" % (i + 1)))
GTGAGCCAGCCGGTATACAAGCCGGTATACAAGCAGACAAGTGAGCGGGTATGTGA
CTCACGGACGGCCTATAGAACGGCCTATAGAACGACAGAACTCACGCCCTATCTCA
ACAGATTGATTAACGCGTGGATTAACGCGTGGATGAGTGGACAGATAAACGCACAG
AGACATTCATTAAGCGCTCCATTAAGCGCTCCATCACTCCAGACATAAAGCGAGAC
TCTGTAAGTAATTCGCGAGGTAATTCGCGAGGTAGTGAGGTCTGTATTTCGCTCTG
TGTCTAACTAATTGCGCACCTAATTGCGCACCTACTCACCTGTCTATTTGCGTGTC
GAGTGCCTGCCGGATATCTTGCCGGATATCTTGCTGTCTTGAGTGCGGGATAGAGT
CACTCGGTCGGCCATATGTTCGGCCATATGTTCGTCTGTTCACTCGCCCATACACT
>>> codext.decode("GTGAGCCAGCCGGTATACAAGCCGGTATACAAGCAGACAAGTGAGCGGGTATGTGA", "dna-1")
'this is a test'
>>> codecs.encode("this is a test", "morse")
'- .... .. ... / .. ... / .- / - . ... -'
>>> codecs.decode("- .... .. ... / .. ... / .- / - . ... -", "morse")
'this is a test'
>>> with open("morse.txt", 'w', encoding="morse") as f:
	f.write("this is a test")
14
>>> with open("morse.txt", encoding="morse") as f:
	f.read()
'this is a test'
>>> codext.decode("""
      =            
              X         
   :            
      x         
  n  
    r 
        y   
      Y            
              y        
     p    
         a       
 `          
            n            
          |    
  a          
o    
       h        
          `            
          g               
           o 
   z      """, "whitespace-after+before")
'CSC{not_so_invisible}'
>>> print(codext.encode("An example test string", "baudot-tape"))
***.**
   . *
***.* 
*  .  
   .* 
*  .* 
   . *
** .* 
***.**
** .**
   .* 
*  .  
* *. *
   .* 
* *.  
* *. *
*  .  
* *.  
* *. *
***.  
  *.* 
***.* 
 * .* 
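
The `dna-1` output in the usage examples is consistent with mapping each pair of bits to one nucleotide (00 to A, 01 to G, 10 to C, 11 to T). The sketch below reproduces that sample under this inferred mapping; the mapping is read off the example output rather than taken from the library's source, the other seven rules are not covered, and the function names are hypothetical.

```python
# Bit-pair mapping inferred from the sample "dna-1" output (an assumption).
RULE_1 = {"00": "A", "01": "G", "10": "C", "11": "T"}
REVERSE_RULE_1 = {nucleotide: pair for pair, nucleotide in RULE_1.items()}


def dna1_encode(text):
    # Turn the text into a bit string, then translate it two bits at a time.
    bits = "".join(f"{byte:08b}" for byte in text.encode("latin-1"))
    return "".join(RULE_1[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna1_decode(sequence):
    # Each nucleotide yields two bits; eight bits rebuild one character.
    bits = "".join(REVERSE_RULE_1[nucleotide] for nucleotide in sequence)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("latin-1")


sequence = dna1_encode("this is a test")
print(sequence)
print(dna1_decode(sequence))
```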

List of codecs

Codec | Conversions | Comment
--- | --- | ---
a1z26 | text <-> alphabet order numbers | keeps words whitespace-separated and uses a custom character separator
affine | text <-> affine ciphertext | aka Affine Cipher
ascii85 | text <-> ascii85 encoded text | Python 3 only
atbash | text <-> Atbash ciphertext | aka Atbash Cipher
bacon | text <-> Bacon ciphertext | aka Baconian Cipher
barbie-N | text <-> barbie ciphertext | aka Barbie Typewriter (N belongs to [1,4])
baseXX | text <-> baseXX | see base encodings (incl. base32, 36, 45, 58, 62, 63, 64, 91, 100, 122)
baudot | text <-> Baudot code bits | supports CCITT-1, CCITT-2, EU/FR, ITA1, ITA2, MTK-2 (Python 3 only), UK, ...
bcd | text <-> binary coded decimal text | encodes characters from their (zero-left-padded) ordinals
braille | text <-> braille symbols | Python 3 only
citrix | text <-> Citrix CTX1 ciphertext | aka Citrix CTX1 password encoding
dna | text <-> DNA-N sequence | implements the 8 rules of DNA sequences (N belongs to [1,8])
excess3 | text <-> XS3 encoded text | uses Excess-3 (aka Stibitz code) binary encoding to convert characters from their ordinals
gray | text <-> gray encoded text | aka reflected binary code
gzip | text <-> Gzip-compressed text | standard Gzip compression/decompression
html | text <-> HTML entities | implements entities according to this reference
ipsum | text <-> latin words | aka lorem ipsum
klopf | text <-> klopf encoded text | Polybius square with trivial alphabetical distribution
leetspeak | text <-> leetspeak encoded text | based on minimalistic elite speaking rules
letter-indices | text <-> text with letter indices | encodes consonants and/or vowels with their corresponding indices
lz77 | text <-> LZ77-compressed text | compresses the given data with the 1977 algorithm of Lempel and Ziv
lz78 | text <-> LZ78-compressed text | compresses the given data with the 1978 algorithm of Lempel and Ziv
manchester | text <-> manchester encoded text | XORs each bit of the input with 01
markdown | markdown --> HTML | unidirectional
morse | text <-> morse encoded text | uses whitespace as a separator
navajo | text <-> Navajo | only handles letters (not full words from the Navajo dictionary)
octal | text <-> octal digits | dummy octal conversion (converts to 3-digit groups)
ordinal | text <-> ordinal digits | dummy character ordinals conversion (converts to 3-digit groups)
pkzip_deflate | text <-> deflated text | standard Zip-deflate compression/decompression
pkzip_bzip2 | text <-> Bzipped text | standard BZip2 compression/decompression
pkzip_lzma | text <-> LZMA-compressed text | standard LZMA compression/decompression
radio | text <-> radio words | aka NATO or radio phonetic alphabet
resistor | text <-> resistor colors | aka resistor color codes
rot | text <-> rot(N) ciphertext | aka Caesar cipher (N belongs to [1,25])
rotate | text <-> N-bits-rotated text | rotates characters by the specified number of bits; Python 3 only
scytale | text <-> scytale ciphertext | encrypts with L, the number of letters on the rod (L ≥ 1)
shift | text <-> shift(N) ciphertext | shifts ordinals by N (N belongs to [1,255])
sms | text <-> phone keystrokes | also called T9 code; uses "-" as a separator for encoding, "-" or "_" or whitespace for decoding
southpark | text <-> Kenny's language | converts letters to Kenny's language from South Park (whitespace is also handled)
tomtom | text <-> tom-tom encoded text | similar to morse, using slashes and backslashes
url | text <-> URL encoded text | aka URL encoding
xor | text <-> XOR(N) ciphertext | XOR with a single byte (N belongs to [1,255])
whitespace | text <-> whitespaces and tabs | replaces bits with whitespaces and tabs
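
The parametrized codecs in this list take their parameter N from the codec name. As a rough illustration of what two of them compute, the sketch below implements a Caesar rotation and a single-byte XOR in plain Python; the function names and signatures are hypothetical, and the library's own handling of non-letters or of text versus bytes may differ.

```python
def rot(text, n):
    # Rotate ASCII letters by n positions, leaving other characters untouched.
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + n) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


def xor(data, key):
    # XOR every byte with the same single-byte key (1 to 255).
    return bytes(byte ^ key for byte in data)


rotated = rot("this is a test", 13)
print(rotated)
print(rot(rotated, 13))          # rotating by 26 in total restores the input
print(xor(xor(b"test", 42), 42)) # XOR with the same key is its own inverse
```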

A few variants are also implemented.

Codec | Conversions | Comment
--- | --- | ---
baudot-spaced | text <-> Baudot code groups of bits | groups of 5 bits are whitespace-separated
baudot-tape | text <-> Baudot code tape | outputs a string that looks like a perforated tape
bcd-extended0 | text <-> BCD-extended text | encodes characters from their (zero-left-padded) ordinals using prefix bits 0000
bcd-extended1 | text <-> BCD-extended text | encodes characters from their (zero-left-padded) ordinals using prefix bits 1111
manchester-inverted | text <-> manchester encoded text | XORs each bit of the input with 10
octal-spaced | text <-> octal digits (whitespace-separated) | dummy octal conversion
ordinal-spaced | text <-> ordinal digits (whitespace-separated) | dummy character ordinals conversion
southpark-icase | text <-> Kenny's language | same as southpark but case insensitive
whitespace_after_before | text <-> lines of whitespaces[letter]whitespaces | encodes characters as new characters with whitespaces before and after according to an equation described in the codec name (e.g. "whitespace+2*after-3*before")
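
The two Manchester variants differ only in the 2-bit pattern each input bit is XORed with (01 for `manchester`, 10 for `manchester-inverted`). A minimal sketch of that description follows, assuming the result is represented as a string of "0" and "1" characters; the function names and the output representation are assumptions, not the library's actual behavior.

```python
def manchester_encode(data, inverted=False):
    # Each input bit is XORed with both bits of the clock pattern,
    # so every input bit produces two output bits.
    clock = "10" if inverted else "01"
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(str(int(bit) ^ int(c)) for bit in bits for c in clock)


def manchester_decode(bits, inverted=False):
    # The first bit of each pair, XORed with the first clock bit,
    # gives back the original bit.
    clock = "10" if inverted else "01"
    raw = "".join(str(int(bits[i]) ^ int(clock[0])) for i in range(0, len(bits), 2))
    return bytes(int(raw[i:i + 8], 2) for i in range(0, len(raw), 8))


encoded = manchester_encode(b"test")
print(encoded)
print(manchester_decode(encoded))
```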

Source distribution: codext-1.9.0.tar.gz (86.7 kB)
