Advanced string encoding / decoding toolkit — 24 formats, auto-detection, deep decode, pipelines, plugins
stringshift
Advanced string encoding / decoding toolkit for Python.
24 built-in formats · Auto-detection engine · Deep multi-layer unwrapping · Operation pipelines · Runtime plugin system · Full CLI · Zero dependencies
What's new in v4.0
- 16 new formats: base16, base58, base85, ascii85, rot47, nato, braille, caesar, atbash, vigenere, xor, reverse, unicode_escape, punycode, quoted_printable, uuencode
- magic_decode: rank every possible interpretation of an unknown string
- smart_decode: one-call auto-detection and decoding
- deep_decode: recursively unwrap multi-layer encodings (like CyberChef)
- pipeline: chain encode/decode operations in sequence
- Plugin system: register custom codecs at runtime
- Proper exceptions: DecodeError, EncodeError, UnknownFormatError, PipelineError
- Bug fixes: exceptions.py now contains actual exceptions, and all functions are properly exported
Installation
pip install stringshift
# Optional: smarter byte-encoding detection
pip install "stringshift[full]"
Quick Start
import stringshift
# Encode & decode
stringshift.encode("hello", "base64") # 'aGVsbG8='
stringshift.decode("aGVsbG8=", "base64") # 'hello'
# Don't know what something is? Auto-detect it
stringshift.smart_decode("SGVsbG8=") # 'Hello'
# See every possible interpretation, ranked by confidence
stringshift.magic_decode("SGVsbG8=")
# [{'format': 'base64', 'confidence': 0.89, 'decoded': 'Hello'}, ...]
# Unwrap multi-layer encodings in one call
stringshift.deep_decode("SGVsbG8%3D")
# {
# 'result': 'Hello',
# 'total_layers': 2,
# 'layers': [
# {'layer': 1, 'format': 'url', 'value': 'SGVsbG8='},
# {'layer': 2, 'format': 'base64', 'value': 'Hello'}
# ]
# }
# Chain operations
stringshift.pipeline("hello", ["base64_encode", "url_encode"]) # 'aGVsbG8%3D'
Supported Formats (24 built-in)
| Category | Formats |
|---|---|
| Base encodings | base64 base32 base16 base58 base85 ascii85 |
| Binary / Hex | hex binary |
| Web / Text | url html quoted_printable punycode uuencode |
| Classic ciphers | caesar atbash vigenere rot13 rot47 xor |
| Symbol / Human | morse nato braille |
| Misc | reverse unicode_escape |
CLI
After installation, the stringshift command is available immediately.
# Auto-detect and decode
$ stringshift "SGVsbG8="
Hello
# Encode
$ stringshift "Hello" -e base64
SGVsbG8=
# Decode a specific format
$ stringshift "48656c6c6f" -d hex
Hello
# Show all possible interpretations with confidence scores
$ stringshift "SGVsbG8=" --magic
Confidence Format Decoded
------------------------------------------------------
89% base64 Hello
# Unwrap multi-layer encodings (like CyberChef)
$ stringshift "SGVsbG8%3D" --deep
Layer 1 [url ] SGVsbG8=
Layer 2 [base64 ] Hello
Final result: Hello
# Chain operations via pipeline
$ stringshift "hello" --pipeline base64_encode url_encode
aGVsbG8%3D
# Cipher options
$ stringshift "Hello" -e caesar --shift 3 # Khoor
$ stringshift "Hello" -e vigenere --key secret
$ stringshift "Hello" -e xor --xor-key 99
# Batch process — one item per line from stdin
$ echo -e "aGVsbG8=\nd29ybGQ=" | stringshift --batch -d base64
hello
world
# Process a file
$ stringshift -i encoded.txt -d base64 > decoded.txt
# List every available format
$ stringshift --list
# Benchmark processing time
$ stringshift "SGVsbG8=" --benchmark
# Interactive mode (no arguments)
$ stringshift
stringshift 4.0.1 — interactive mode
Commands: encode <fmt> <text>
decode <fmt> <text>
magic <text>
deep <text>
list
stringshift>
Python API
Encode & Decode
import stringshift
stringshift.encode("hello", "hex") # '68656c6c6f'
stringshift.encode("Hello", "caesar", shift=3) # 'Khoor'
stringshift.encode("Hello", "vigenere", key="secret") # 'Zincs'
stringshift.encode("SOS", "morse") # '... --- ...'
stringshift.encode("ABC", "nato") # 'Alpha Bravo Charlie'
stringshift.encode("hi", "braille") # '⠓⠊'
stringshift.decode("68656c6c6f", "hex") # 'hello'
stringshift.decode("Khoor", "caesar", shift=3) # 'Hello'
stringshift.decode("Zincs", "vigenere", key="secret") # 'Hello'
You can also call individual format functions directly:
from stringshift import encode_base64, decode_morse, encode_braille
encode_base64("hello") # 'aGVsbG8='
decode_morse("... --- ...") # 'SOS'
encode_braille("hello") # '⠓⠑⠇⠇⠕'
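The keyed ciphers above take their parameters as keyword arguments. As a rough illustration of what a `shift` or `key` argument controls, here is a minimal standard-library sketch of the classic Caesar and Vigenère ciphers. The helper names `caesar` and `vigenere` are hypothetical and this is not stringshift's own implementation. It assumes that only ASCII letters are shifted, that case is preserved, and that the key advances only on letters.

```python
import string

def caesar(text: str, shift: int) -> str:
    """Rotate ASCII letters by `shift` places; leave everything else alone."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)

def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Apply a per-letter Caesar shift taken from successive key letters."""
    shifts = [ord(k) - ord("a") for k in key.lower() if k in string.ascii_lowercase]
    if not shifts:
        raise ValueError("key must contain at least one ASCII letter")
    out, used = [], 0
    for ch in text:
        if ch in string.ascii_letters:
            shift = shifts[used % len(shifts)]
            out.append(caesar(ch, -shift if decrypt else shift))
            used += 1  # the key only advances on letters
        else:
            out.append(ch)
    return "".join(out)

print(caesar("Hello", 3))              # Khoor
print(caesar(caesar("Hello", 3), -3))  # Hello
secret = vigenere("Hello", "secret")
print(vigenere(secret, "secret", decrypt=True))  # Hello
```

Decoding is the same operation with the shift negated, which is why one function can serve both directions.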
Auto-Detection
# Best guess — returns a single string
stringshift.smart_decode("68656c6c6f") # 'hello'
stringshift.smart_decode("hello%20world") # 'hello world'
stringshift.smart_decode("... --- ...") # 'SOS'
# All candidates, ranked by confidence
results = stringshift.magic_decode("SGVsbG8=")
for r in results:
    print(f"{r['confidence']:.0%} {r['format']:15s} {r['decoded']}")
# Detection only — no decoding
results = stringshift.detect_format("SGVsbG8=")
# [{'format': 'base64', 'confidence': 0.89, 'decoded': 'Hello'}]
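To make the idea of confidence-ranked candidates concrete, the following is a small standard-library sketch. It is not how stringshift scores formats: the detector functions, the `rank_candidates` name, and the base scores are all assumptions chosen for illustration. Each detector returns a decoded string or `None`, and the score is a fixed per-format weight scaled by the share of printable characters in the result.

```python
import base64
import binascii
import re
from urllib.parse import unquote

def _try_hex(text):
    # An even number of hex digits that also decodes to valid UTF-8
    if re.fullmatch(r"(?:[0-9a-fA-F]{2})+", text):
        try:
            return bytes.fromhex(text).decode("utf-8")
        except UnicodeDecodeError:
            return None
    return None

def _try_base64(text):
    if len(text) % 4 or not re.fullmatch(r"[A-Za-z0-9+/]+={0,2}", text):
        return None
    try:
        return base64.b64decode(text, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None

def _try_url(text):
    # Only treat the input as URL-encoded if it contains a %XX escape
    if re.search(r"%[0-9a-fA-F]{2}", text):
        return unquote(text)
    return None

# (format name, detector, base score); the scores are arbitrary illustration values
DETECTORS = [("hex", _try_hex, 0.9), ("base64", _try_base64, 0.8), ("url", _try_url, 0.7)]

def rank_candidates(text):
    """Return every format that decodes cleanly, best score first."""
    results = []
    for name, detector, base_score in DETECTORS:
        decoded = detector(text)
        if decoded is None:
            continue
        printable = sum(ch.isprintable() for ch in decoded) / max(len(decoded), 1)
        results.append({
            "format": name,
            "confidence": round(base_score * printable, 2),
            "decoded": decoded,
        })
    return sorted(results, key=lambda r: r["confidence"], reverse=True)

for candidate in rank_candidates("SGVsbG8="):
    print(candidate["format"], candidate["decoded"])  # base64 Hello
```

A "best guess" function in this style would simply take the first element of the ranked list and raise an error when the list is empty.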
Deep Decode
Automatically peels every encoding layer off a string, much like CyberChef's "Magic" operation.
# Two layers: url → base64
info = stringshift.deep_decode("SGVsbG8%3D")
print(info["result"]) # 'Hello'
print(info["total_layers"]) # 2
for layer in info["layers"]:
    print(layer["layer"], layer["format"], layer["value"])
# 1 url SGVsbG8=
# 2 base64 Hello
# Three layers: url → base64 → hex
tripled = stringshift.encode(
    stringshift.encode(stringshift.encode("Hi", "hex"), "base64"),
    "url"
)
info = stringshift.deep_decode(tripled)
print(info["result"]) # 'Hi'
print(info["total_layers"]) # 3
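The layer-peeling loop can be sketched with the standard library alone. This is an illustration under stated assumptions, not stringshift's algorithm: the names `_peel_once` and `peel_layers` are hypothetical, only three formats are recognised, and they are tried in a fixed order (url, hex, base64). The loop stops when no detector matches or a safety limit is reached.

```python
import base64
import binascii
import re
from urllib.parse import unquote

def _peel_once(text):
    """Return (format, decoded) for the first layer that matches, else None."""
    if re.search(r"%[0-9a-fA-F]{2}", text):
        return "url", unquote(text)
    if re.fullmatch(r"(?:[0-9a-fA-F]{2})+", text):
        try:
            return "hex", bytes.fromhex(text).decode("utf-8")
        except UnicodeDecodeError:
            pass
    if len(text) % 4 == 0 and re.fullmatch(r"[A-Za-z0-9+/]+={0,2}", text):
        try:
            return "base64", base64.b64decode(text, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            pass
    return None

def peel_layers(text, max_layers=10):
    """Repeatedly decode until nothing matches, recording each layer."""
    layers = []
    current = text
    while len(layers) < max_layers:
        step = _peel_once(current)
        if step is None or step[1] == current:
            break
        fmt, current = step
        layers.append({"layer": len(layers) + 1, "format": fmt, "value": current})
    return {"result": current, "total_layers": len(layers), "layers": layers}

info = peel_layers("SGVsbG8%3D")
print(info["result"], info["total_layers"])  # Hello 2
```

A detector this simple can misread short plain words that happen to look like hex or base64, which is one reason a real implementation would weigh candidates by confidence instead of taking the first match.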
Pipeline
Chain any number of encode/decode steps. Each step name must end with _encode or _decode. To pass keyword arguments to a step (for example a cipher shift), give the step as a (name, kwargs) tuple.
# Simple chain
result = stringshift.pipeline("hello", [
    "base64_encode",
    "url_encode",
])
# 'aGVsbG8%3D'
# Reverse it
stringshift.pipeline(result, ["url_decode", "base64_decode"])
# 'hello'
# With cipher kwargs
stringshift.pipeline("hello", [
    ("caesar_encode", {"shift": 5}),
    "base64_encode",
    "url_encode",
])
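The step-name convention described in this section can be illustrated with a short dispatcher. This is only a sketch: `CODECS`, `_caesar`, and `run_pipeline` are hypothetical names, the registry holds just a few standard-library codecs, and a plain `ValueError` stands in for whatever error type a real pipeline would raise. Splitting on the last underscore keeps multi-word format names such as quoted_printable intact.

```python
import base64
from urllib.parse import quote, unquote

def _caesar(text, shift=13):
    """Rotate ASCII letters by `shift`, preserving case."""
    def rotate(ch):
        if ch.isascii() and ch.isalpha():
            base = ord("a") if ch.islower() else ord("A")
            return chr((ord(ch) - base + shift) % 26 + base)
        return ch
    return "".join(rotate(ch) for ch in text)

# Hypothetical mini-registry: format name -> (encoder, decoder)
CODECS = {
    "base64": (
        lambda s: base64.b64encode(s.encode("utf-8")).decode("ascii"),
        lambda s: base64.b64decode(s).decode("utf-8"),
    ),
    "url": (lambda s: quote(s, safe=""), lambda s: unquote(s)),
    "caesar": (_caesar, lambda s, shift=13: _caesar(s, -shift)),
}

def run_pipeline(text, steps):
    """Apply each step in order; a step is 'fmt_encode', 'fmt_decode', or (name, kwargs)."""
    for index, step in enumerate(steps):
        name, kwargs = step if isinstance(step, tuple) else (step, {})
        fmt, _, direction = name.rpartition("_")
        if direction not in ("encode", "decode") or fmt not in CODECS:
            raise ValueError(f"bad step {name!r} at index {index}")
        encoder, decoder = CODECS[fmt]
        func = encoder if direction == "encode" else decoder
        text = func(text, **kwargs)
    return text

wrapped = run_pipeline("hello", [("caesar_encode", {"shift": 5}), "base64_encode", "url_encode"])
print(run_pipeline(wrapped, ["url_decode", "base64_decode", ("caesar_decode", {"shift": 5})]))  # hello
```

Undoing a pipeline means running the inverse steps in reverse order, with the same keyword arguments.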
Batch Processing
All batch functions use a thread pool internally and scale automatically to your CPU count.
texts = ["hello", "world", "foo"]
# Encode all in parallel
stringshift.batch_process(texts, operation="encode", fmt="base64")
# ['aGVsbG8=', 'd29ybGQ=', 'Zm9v']
# Decode all — explicit format
encoded = [stringshift.encode(t, "hex") for t in texts]
stringshift.batch_process(encoded, operation="decode", fmt="hex")
# ['hello', 'world', 'foo']
# Decode all — auto-detect format per item
mixed = ["SGVsbG8=", "68656c6c6f", "hello%20world"]
stringshift.batch_process(mixed)
# ['Hello', 'hello', 'hello world']
# Control worker threads
stringshift.batch_process(texts, operation="encode", fmt="base64", workers=8)
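For readers curious what a thread-pool batch helper looks like, here is a minimal sketch using `concurrent.futures`. The names `batch_apply` and `encode_base64` are local stand-ins, and defaulting the worker count to `os.cpu_count()` is an assumption about what "scales to your CPU count" means, not a statement about stringshift's internals.

```python
import base64
import os
from concurrent.futures import ThreadPoolExecutor

def encode_base64(text: str) -> str:
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

def batch_apply(func, items, workers=None):
    """Apply `func` to every item on a thread pool, keeping input order."""
    if workers is None:
        workers = os.cpu_count() or 1  # assumed default
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of the inputs
        return list(pool.map(func, items))

print(batch_apply(encode_base64, ["hello", "world", "foo"]))
# ['aGVsbG8=', 'd29ybGQ=', 'Zm9v']
```

Because `Executor.map` preserves input order, the output list lines up index by index with the input list regardless of which worker finishes first.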
Plugin System
Register your own codec at runtime. It immediately becomes available to
encode(), decode(), pipeline(), the CLI, and list_formats().
# Simple functional style
stringshift.register_codec(
    "shout",
    encoder=str.upper,
    decoder=str.lower,
)
stringshift.encode("hello", "shout") # 'HELLO'
stringshift.decode("HELLO", "shout") # 'hello'
# Class decorator style — cleaner for complex codecs
@stringshift.codec("reverse_words")
class ReverseWords:
    def encode(self, text: str) -> str:
        return " ".join(word[::-1] for word in text.split())

    def decode(self, text: str) -> str:
        return self.encode(text)  # self-inverse
stringshift.encode("hello world", "reverse_words") # 'olleh dlrow'
# Use in a pipeline
stringshift.pipeline("hello world", [
    "reverse_words_encode",
    "base64_encode",
])
# See all formats including plugins
stringshift.list_formats()
# {'builtin': ['ascii85', 'atbash', 'base16', ...], 'plugins': ['shout', 'reverse_words']}
Error Handling
import stringshift
from stringshift import (
    DecodeError, EncodeError,
    UnknownFormatError, PipelineError,
)
# Bad input for a known format
try:
    stringshift.decode("not!!valid!!", "base64")
except stringshift.DecodeError as exc:
    print(exc.original)  # the input that failed
    print(exc.error)  # the underlying exception
# Requesting a format that doesn't exist
try:
    stringshift.encode("hello", "made_up")
except stringshift.UnknownFormatError as exc:
    print(exc.fmt)  # 'made_up'
    print(exc.available)  # full list of valid format names
# Pipeline step failure
try:
    stringshift.pipeline("hello", ["badstep"])
except stringshift.PipelineError as exc:
    print(exc.step)  # 'badstep'
    print(exc.index)  # 0 (position in the pipeline)
# Auto-detect on truly unrecognisable input
try:
    stringshift.smart_decode("!@#$%^&*()")
except stringshift.DecodeError:
    print("Could not determine encoding")
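The pattern behind attributes such as `original` and `error` is an exception class that carries its context. The following is a generic standard-library sketch of that pattern, not the library's actual class: `DecodeFailure` and `strict_b64decode` are hypothetical names.

```python
import base64
import binascii

class DecodeFailure(Exception):
    """Hypothetical stand-in for a decode error that keeps its context."""

    def __init__(self, original: str, error: Exception):
        super().__init__(f"could not decode {original!r}: {error}")
        self.original = original  # the input that failed
        self.error = error        # the underlying exception

def strict_b64decode(text: str) -> str:
    try:
        return base64.b64decode(text, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError) as exc:
        # Chain the low-level error so the traceback shows both
        raise DecodeFailure(text, exc) from exc

try:
    strict_b64decode("not!!valid!!")
except DecodeFailure as exc:
    print(exc.original)             # not!!valid!!
    print(type(exc.error).__name__) # Error
```

Wrapping low-level errors this way lets callers catch one exception type while still being able to inspect the root cause.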
Legacy Helpers (v1 compatible)
These functions are kept for backward compatibility.
# decode_all: applies URL + HTML + escape-sequence decoding in one pass
stringshift.decode_all("hello%20world") # 'hello world'
stringshift.decode_all("&lt;b&gt;hi&lt;/b&gt;") # '<b>hi</b>'
stringshift.decode_all("\\x", fallback="[error]") # '[error]' ← invalid escape
# Normalise Unicode
stringshift.normalize_text("café", "NFC")
stringshift.normalize_text("café", "NFD")
# Parallel decode_all over a list
stringshift.batch_decode(["hello%20world", "&amp;foo"])
# ['hello world', '&foo']
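The two ideas in this section, a combined URL-plus-HTML decode and Unicode normalisation, map directly onto standard-library calls. The sketch below is an approximation for illustration: `decode_url_and_html` is a hypothetical helper, and it assumes URL decoding runs before entity decoding. The normalisation lines show why NFC and NFD matter, since the same visible word has different lengths in each form.

```python
import html
import unicodedata
from urllib.parse import unquote

def decode_url_and_html(text: str) -> str:
    """URL-decode first, then resolve HTML entities."""
    return html.unescape(unquote(text))

print(decode_url_and_html("hello%20world"))          # hello world
print(decode_url_and_html("&lt;b&gt;hi&lt;/b&gt;"))  # <b>hi</b>

# "café" written with a combining accent (5 code points) vs. precomposed (4)
composed = unicodedata.normalize("NFC", "cafe\u0301")
decomposed = unicodedata.normalize("NFD", "caf\u00e9")
print(len(composed), len(decomposed))  # 4 5
print(composed == decomposed)          # False
```

Normalising both sides to the same form before comparing avoids treating visually identical strings as different.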
Format Reference
| Format | Encode: "Hi" → | Notes |
|---|---|---|
| base64 | SGk= | RFC 4648, auto-padded |
| base32 | JBUQ==== | uppercase alphabet |
| base16 | 4869 | uppercase hex |
| base58 | 6Wc | Bitcoin alphabet, no 0/O/I/l |
| base85 | NNE | Python base64.b85encode |
| ascii85 | 88/ | Adobe variant |
| hex | 4869 | lowercase, strips 0x/spaces/colons on decode |
| binary | 01001000 01101001 | 8-bit groups, space-separated |
| url | Hi ("Hi!" → Hi%21) | quote(safe="") |
| html | Hi ("<b>" → &lt;b&gt;) | full entity escaping |
| quoted_printable | Hi | email-safe encoding |
| punycode | caf-dma (for café) | IDN domain encoding |
| uuencode | *2&D | classic Unix transfer encoding |
| rot13 | Uv | letter-only, self-inverse |
| rot47 | w: | all printable ASCII, self-inverse |
| caesar | Ij (shift=1) | kwarg: shift (default 13) |
| atbash | Sr | A↔Z substitution, self-inverse |
| vigenere | Rs (key="k") | kwarg: key (default "key") |
| xor | 62 43 | kwarg: key int 0-255 (default 42) |
| morse | .... .. | dots, dashes, / for space |
| nato | Hotel India | full NATO phonetic alphabet |
| braille | ⠓⠊ | Grade 1 Braille |
| unicode_escape | \u0048\u0069 | \uXXXX / \xXX sequences |
| reverse | iH | self-inverse |
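Several rows of the reference table correspond to encodings that Python's standard library also provides, so they can be cross-checked without installing anything. The snippet below is a sanity check written for illustration; it does not call stringshift and covers only formats with a direct standard-library equivalent.

```python
import base64
import codecs
from urllib.parse import quote

text = "Hi"
raw = text.encode("utf-8")

checks = {
    "base64": base64.b64encode(raw).decode("ascii"),
    "base16": base64.b16encode(raw).decode("ascii"),
    "hex": raw.hex(),
    "binary": " ".join(f"{byte:08b}" for byte in raw),
    "url": quote("Hi!", safe=""),                       # the "Hi!" variant from the table
    "rot13": codecs.encode(text, "rot13"),
    "punycode": "café".encode("punycode").decode("ascii"),
    "reverse": text[::-1],
}

for name, value in checks.items():
    print(f"{name:10s} {value}")
```

Running it prints SGk=, 4869, 4869, 01001000 01101001, Hi%21, Uv, caf-dma, and iH for the respective formats.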
Running Tests
pip install pytest
pytest tests/ -v
Contributing
Pull requests are welcome at https://github.com/0xdivin3/stringshift
To add a new codec:
- Add encode_<name> and decode_<name> functions to core.py
- Register them in ENCODE_REGISTRY and DECODE_REGISTRY
- Add a round-trip test in tests/test_core.py
- Update the format table in this README
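The round-trip test mentioned in the steps above might look roughly like the following. Everything here is a placeholder: `encode_example` and `decode_example` stand in for a new codec pair (base64 is used so the sketch runs on its own), and the sample list is an assumption about which inputs are worth covering. pytest collects any function whose name starts with `test_`.

```python
import base64

def encode_example(text: str) -> str:
    """Placeholder encoder; a real contribution would live in core.py."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

def decode_example(text: str) -> str:
    """Placeholder decoder, the inverse of encode_example."""
    return base64.b64decode(text).decode("utf-8")

# Empty input, plain ASCII, punctuation, non-ASCII, and embedded newlines
SAMPLES = ["", "hello", "Hello, World!", "café ☕", "line1\nline2"]

def test_round_trip():
    for sample in SAMPLES:
        assert decode_example(encode_example(sample)) == sample

if __name__ == "__main__":
    test_round_trip()
    print("round trip ok")
```

Including an empty string and non-ASCII text in the samples tends to surface the most common codec bugs early.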
License
MIT — free for personal and commercial use.