Memorable word-based encoding for binary data.
WordyBin
What the world needs is yet another word-based encoding system for binary data. In this case, a 16-bit encoding system, with one of 512 5-letter words standing in for the first 9 bits, and one of 128 3-letter words standing in for the next 7 bits.
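As a rough illustration of the 9-bit/7-bit split, here is a minimal sketch in Python. It is not the package's own code. The wordlists are generated placeholders of the right size and shape (not the real WordyBin lists), and it assumes big-endian 16-bit units with the 5-letter word taking the most significant 9 bits. The names `encode`, `decode`, `FIVE`, and `THREE` are hypothetical.

```python
import itertools
import string


def _placeholder_words(length, count):
    """Generate `count` distinct capitalized pseudo-words of `length` letters."""
    combos = itertools.product(string.ascii_lowercase, repeat=length)
    return ["".join(c).capitalize() for c in itertools.islice(combos, count)]


# Stand-ins only: the real lists contain curated English words.
FIVE = _placeholder_words(5, 512)   # 2**9 entries, one per 9-bit value
THREE = _placeholder_words(3, 128)  # 2**7 entries, one per 7-bit value


def encode(data: bytes) -> str:
    """Encode an even number of bytes as alternating 5- and 3-letter words."""
    if len(data) % 2:
        raise ValueError("this sketch only handles an even number of bytes")
    words = []
    for i in range(0, len(data), 2):
        value = int.from_bytes(data[i:i + 2], "big")  # assumed byte order
        words.append(FIVE[value >> 7])     # top 9 bits
        words.append(THREE[value & 0x7F])  # bottom 7 bits
    return "".join(words)


def decode(text: str) -> bytes:
    """Invert encode(): every 8 characters yield one 16-bit unit."""
    if len(text) % 8:
        raise ValueError("expected a multiple of 8 characters")
    five_index = {w: i for i, w in enumerate(FIVE)}
    three_index = {w: i for i, w in enumerate(THREE)}
    out = bytearray()
    for i in range(0, len(text), 8):
        hi = five_index[text[i:i + 5]]
        lo = three_index[text[i + 5:i + 8]]
        out += ((hi << 7) | lo).to_bytes(2, "big")
    return bytes(out)
```

Because every word has a fixed length, the decoder can slice the string at fixed offsets and needs no separators.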
Why?
- Because the words are fixed length, the encoded string length has a stepwise linear relationship to the length of the source data. This can be advantageous for humans who are trying to eyeball something, and stands in contrast to encodings like BIP39, whose words vary in length.
- Each word can have its accent on the first syllable, to make reading out loud easier.
- Each word can be pronounced uniquely, such that there is reduced
ambiguity when restricted to the built-in list of English words. A
lot of effort has been put into making these words as unambiguous as
possible when heard read aloud.
- Caveats:
- It is not possible to ensure strong phonetic difference across this many words, but we've attempted to provide as much phonetic difference as possible.
- Future versions could redo the wordlist to improve this at the
cost of backward-incompatibility; suggestions backed up by jellyfish
are welcomed since this concept is still in its early stages.
- Caveats:
The words are built on prior art; mostly, this is the BIP39 English
wordlist, filtered to 5- and 3-letter words, then filtered again for
various words that don't fit the above restrictions or that I felt
like dropping for no particular reason. Since this leaves fewer than
512 words, I added some 4-letter words from the BIP39 wordlist that
can have an adjectival version ending in y, plus a couple of others.
There were not enough 3-letter words, and many of them diverged from
the given criteria, so I added quite a few more to get to 128.
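The fixed-length property in the first bullet can be put as a small formula: every 16-bit unit becomes 5 + 3 = 8 characters. The sketch below is illustrative only. The function name `encoded_length` is hypothetical, and it assumes that an odd trailing byte is padded up to a whole 16-bit unit, which the description above does not specify.

```python
def encoded_length(n_bytes: int) -> int:
    """Characters in the encoded string for `n_bytes` of input.

    Assumption: input is padded up to a whole number of 16-bit units,
    which is what makes the relationship stepwise rather than strictly linear.
    """
    units = -(-n_bytes // 2)  # ceiling division: number of 16-bit units
    return units * 8          # one 5-letter word plus one 3-letter word per unit


for n in (2, 4, 8):
    print(n, "bytes ->", encoded_length(n), "characters")
```

For even byte counts this is simply four characters per byte, so a 4-byte value takes 16 characters and an 8-byte value takes 32.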
Why would I actually use this?
There are a lot of cases where we want to represent something
deterministically and uniquely. One of the common cases is to provide a
unique, unopinionated, compressed reference to it. This is sometimes
called a hash.
Hashes have really nice properties, but they also have some not-nice
properties, and perhaps the main one is that they are just a jumble of
characters. For instance, here is a shortened, 8-character hash of a
commit from the BIP39 repo: ce1862ac. That hash contains 32 bits of
entropy, which is sufficient in most cases to uniquely identify a
moment in time in the life of your repository.
What it isn't is memorable, or easy to communicate. But 32 bits is
very easy to communicate using WordyBin, because you can use 4 words
to represent those four bytes. ce1862ac (in hexadecimal) is
SprayCowHandyFee in WordyBin. I bet you can remember that for long
enough to switch browser tabs!
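To make the arithmetic concrete, the following sketch shows how those 32 bits break down into four wordlist indices. It assumes big-endian 16-bit units with the 5-letter word taking the top 9 bits. That ordering is an assumption, not something the description confirms, so the indices are shown without claiming which words they map to.

```python
raw = bytes.fromhex("ce1862ac")  # 8 hex digits = 4 bytes = 32 bits

# Split into 16-bit units (assumed big-endian).
units = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]

# Each unit yields a 9-bit index (5-letter word) and a 7-bit index (3-letter word).
indices = [(u >> 7, u & 0x7F) for u in units]

print(len(raw), "bytes ->", 2 * len(units), "words")
print(indices)  # [(412, 24), (197, 44)]
```

Two units give four words and 2 × 8 = 16 characters, which matches the length of SprayCowHandyFee.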
Installation/Usage
pip install wordybin
- encode:
cat <file> | python -m wordybin
python -m wordybin --input-file <file>
- decode:
echo "DirtyGumCycleGetCrossFoxCrazyFog" | python -m wordybin -d > output.b
python -m wordybin -d --input-file input.b --output-file output.b