base26 ([A-Z]) and base52 ([A-Za-z]) encodings
alphacodings
🌟 overview
transform any string into an alphabetic-only representation with lossless base26 ([A-Z]) and base52 ([A-Za-z]) encodings; useful for transmitting textual data over restrictive channels or for training AI models and tokenizers on simpler vocabularies.
alphacodings is a fast and lightweight library using GMP arithmetic.
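to make the idea concrete, here is a minimal pure-Python sketch of one common way to build such an encoding: treat the UTF-8 bytes as one big integer and rewrite it in base 26 or base 52. this is an illustration only, not the library's GMP-backed implementation; the names `sketch_encode` and `sketch_decode`, the alphabet ordering, and the `0x01` sentinel byte are assumptions made for this example, so its output will not necessarily match `base26_encode` or `base52_encode`.

```python
import string

# Assumed alphabets for this sketch; the library's symbol order may differ.
BASE26 = string.ascii_uppercase                           # A-Z
BASE52 = string.ascii_uppercase + string.ascii_lowercase  # A-Za-z


def sketch_encode(text: str, alphabet: str = BASE26) -> str:
    """Encode text as letters by converting its bytes to a big integer."""
    # A leading 0x01 sentinel keeps leading zero bytes from being lost
    # when the byte string is interpreted as an integer.
    number = int.from_bytes(b"\x01" + text.encode("utf-8"), "big")
    base = len(alphabet)
    digits = []
    while number:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    # Digits were produced least-significant first.
    return "".join(reversed(digits))


def sketch_decode(encoded: str, alphabet: str = BASE26) -> str:
    """Invert sketch_encode for the same alphabet."""
    base = len(alphabet)
    number = 0
    for char in encoded:
        number = number * base + alphabet.index(char)
    raw = number.to_bytes((number.bit_length() + 7) // 8, "big")
    # Drop the sentinel byte added by sketch_encode.
    return raw[1:].decode("utf-8")


if __name__ == "__main__":
    sample = "<p>hello, world!</p>"
    assert sketch_decode(sketch_encode(sample)) == sample
    assert sketch_decode(sketch_encode(sample, BASE52), BASE52) == sample
```

Python's built-in integers are arbitrary precision, so the sketch needs no extra dependency; a GMP-based implementation performs the same kind of radix conversion with faster big-number arithmetic.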
⚙️ installation
python -m pip install alphacodings
🚀 usage
from alphacodings import base26_encode, base26_decode, base52_encode, base52_decode
string = """\
<!DOCTYPE html>
<html>
<head>
<title>sample page</title>
</head>
<body>
<h1>welcome!</h1>
<p>you are reading a sample HTML string.</p>
</body>
</html>
"""
if __name__ == "__main__":
    encoding_base26 = base26_encode(string)
    print(encoding_base26)
    # >>> ["YBPNLKVNQWZQCMDHMLNDTVQCCRKQLNCFGMQPNGQCIXHUUPHFUNKUFEPDLKIGARFOKTDEZKQHXGCPYHDZKKVIUDNFOAYYAUOQFBJFFGSTKAXNWGDPVUJNBARPNXBASHZBXIBSSEFTAIQRPEADSOVVNXUMQXVDWTAIVCIVWQZAHAGYAVZYKGMETJOOUQNOEXMSOOGSKVMFBYZIBZDAITICYVXMJTTCCHPMSCABLYUMFDUNLVSLNKHSBPKCGASXJSFYDHZFAOEQTUACEBIFKQGYC"]
    encoding_base52 = base52_encode(string)
    print(encoding_base52)
    # >>> ["EgcgYRPxckylMQWRLDADNZxPJiJcHaVwYHLnicahBgaotGGANZuvsvcpSSOJFLXvKPjRlNQCJqqdviiIdtnwJyDOnWojsrpkWSTZFHbMIREvREjpsODtSxoLlLjQZOoehsGFzawGQecyuomgpZQNyFnZQLWPiDhzClwxBFCCwdqduGJoshrwFdwHWMtJpSTmjxzaYmNvzOIOwLkJvyQHCaFtrODPhbhBpPBmC"]
    assert base26_decode(encoding_base26) == string
    assert base52_decode(encoding_base52) == string
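the encoded strings are longer than the input because each letter carries fewer bits than a byte. as a rough guide, the sketch below estimates the output length from the information-theoretic bound of 8 / log2(alphabet size) letters per byte, about 1.70 for base26 and 1.40 for base52. the helper `estimated_encoded_length` is hypothetical and not part of alphacodings; the library's actual output length may differ slightly depending on padding or framing.

```python
import math


def estimated_encoded_length(num_bytes: int, alphabet_size: int) -> int:
    """Approximate number of letters needed to encode num_bytes bytes.

    Hypothetical helper: assumes a plain radix conversion with no framing.
    """
    bits = num_bytes * 8
    return math.ceil(bits / math.log2(alphabet_size))


if __name__ == "__main__":
    payload = "some text to send".encode("utf-8")
    print(estimated_encoded_length(len(payload), 26))  # base26 estimate
    print(estimated_encoded_length(len(payload), 52))  # base52 estimate
```

base52 costs less space per byte, so it is the better fit when the channel accepts mixed-case letters; base26 is the fallback when only uppercase survives.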
🧠 motivation
the library is inspired by R. Heaton's base26 implementation and his story of getting data through restrictive network channels on long-distance flights using alphabetic-only encodings and tokenization.
have a look at the original repository and the accompanying blog post, and show him some love.
📊 benchmarking
our implementation is orders of magnitude more efficient on 100k+ strings:
Figure 1: runtime and memory usage performance against Heaton's original implementation with and without automatic chunking and SIMD on variable-length strings with a strict 60-second timeout; average over 5 trials.
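the caption mentions automatic chunking. one plausible reason chunking helps is that converting a single huge integer between bases costs more than linear time in its size, while converting many small fixed-size blocks stays linear overall. the sketch below illustrates that idea under stated assumptions; `CHUNK_BYTES`, the sentinel byte, the fixed block width, and the function names are all hypothetical and do not describe how alphacodings actually chunks its input.

```python
import string

ALPHABET = string.ascii_uppercase  # assumed A-Z ordering for this sketch
CHUNK_BYTES = 64                   # hypothetical chunk size


def _block_width(chunk_bytes: int, base: int) -> int:
    """Smallest number of digits that can hold any chunk plus a sentinel byte."""
    limit = 1 << (8 * (chunk_bytes + 1))
    width, capacity = 0, 1
    while capacity < limit:
        capacity *= base
        width += 1
    return width


WIDTH = _block_width(CHUNK_BYTES, len(ALPHABET))


def chunked_encode(text: str) -> str:
    """Encode each byte chunk independently into a fixed-width letter block."""
    data = text.encode("utf-8")
    blocks = []
    for start in range(0, len(data), CHUNK_BYTES):
        # The 0x01 sentinel preserves leading zero bytes and the chunk length.
        number = int.from_bytes(b"\x01" + data[start:start + CHUNK_BYTES], "big")
        digits = []
        for _ in range(WIDTH):  # always emit WIDTH digits (left-padded with 'A')
            number, remainder = divmod(number, len(ALPHABET))
            digits.append(ALPHABET[remainder])
        blocks.append("".join(reversed(digits)))
    return "".join(blocks)


def chunked_decode(encoded: str) -> str:
    """Invert chunked_encode block by block."""
    data = bytearray()
    for start in range(0, len(encoded), WIDTH):
        number = 0
        for char in encoded[start:start + WIDTH]:
            number = number * len(ALPHABET) + ALPHABET.index(char)
        raw = number.to_bytes((number.bit_length() + 7) // 8, "big")
        data += raw[1:]  # drop the sentinel byte
    return data.decode("utf-8")


if __name__ == "__main__":
    sample = "héllo wörld " * 1000
    assert chunked_decode(chunked_encode(sample)) == sample
```

the trade-off is a slightly longer output, since every block is padded to the same width, in exchange for bounded-size arithmetic per block.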
🤝 contributing
contributions to alphacodings are welcome!
feel free to submit pull requests or open issues on our repository.
📄 license
see the LICENSE file for more details.