Vortex-Codec: neural lossless byte-level codec

Vortex-Codec is a Python library for neural lossless compression using compressive transformers + arithmetic coding. Use it as a package in your projects or via the provided CLI tools.

Installation

# Install in editable/development mode
pip install -e .

# Or install runtime requirements only
pip install -r requirements.txt

Quick usage (library)

import vortex
from vortex.models.optimized_transformer import OptimisedCompressiveTransformer

print(vortex.__version__)
model = OptimisedCompressiveTransformer()

Quick usage (CLI)

# Compress and decompress via installed console scripts
vortex-compress --model PATH_TO_MODEL --input file.bin --output file.vxc --config path/config.yaml
vortex-decompress --model PATH_TO_MODEL --input file.vxc --output recovered.bin --config path/config.yaml

Repository Layout

vortex-codec/
├── vortex/                              # core Python package
│   ├── models/
│   │   ├── __init__.py                  # re-exports all public symbols
│   │   ├── compressive_transformer.py   # base model (CompressiveTransformer)
│   │   └── optimized_transformer.py     # production model (OptimisedCompressiveTransformer)
│   ├── compression/
│   │   └── arithmetic_coding.py         # torchac encode/decode + BPD metric
│   ├── data/
│   │   └── dataset.py                   # make_loaders() for binary / HDF5 files
│   └── utils/
│       ├── training.py                  # LR schedule, checkpointing, EarlyStopping
│       └── zipnn.py                     # Huffman post-training weight compression
├── scripts/
│   ├── train.py                         # full training loop (CATWrapper, AMP, TensorBoard)
│   ├── compress.py                      # file → .vxc bitstream
│   ├── decompress.py                    # .vxc bitstream → file
│   ├── evaluate.py                      # BPD vs gzip / zlib / lzma baselines
│   └── compress_weights.py              # apply ZipNN compression to a checkpoint
├── experiments/
│   ├── atlas_experiment/                # ATLAS FTAG HDF5 -> .bin splits
│   ├── camel_experiment/                # CAMEL HDF5 -> raw + float32 .bin splits
│   ├── hepmc_experiment/                # ATLAS HEPMC tarballs -> .hepmc splits
│   ├── cms_experiment/                  # CMS NanoAOD ROOT -> padded float32 .bin
│   ├── cms_experiment_lg/               # Original large-dataset CMS pipeline
│   └── alice_experiment/                # ALICE ROOT -> padded float32 .bin
├── configs/                             # hardware-specific base configs
│   ├── colab_t4.yaml
│   ├── rtx4070_8gb.yaml
│   ├── default.yaml
│   ├── rtx4090_24gb.yaml
│   └── amd_mi300x.yaml
├── tests/
│   └── test_basic.py
└── docs/
    ├── ARCHITECTURE_COMPARISON.md       # v1 vs v3 component-by-component diff
    └── HARDWARE_GUIDE.md

Architecture

Overview

Vortex-Codec is a byte-level autoregressive model: given a stream of bytes it predicts a probability distribution over the next byte, and uses arithmetic coding (torchac) to encode/decode the stream losslessly. Arithmetic coding spends roughly −log₂ p(s) bits on a symbol the model assigns probability p(s), so lower predicted cross-entropy means better compression.

The codebase contains two model variants, both in vortex/models/:

| Class | File | Use |
|---|---|---|
| CompressiveTransformer | compressive_transformer.py | Reference / lightweight |
| OptimisedCompressiveTransformer | optimized_transformer.py | Production (Flash Attention 2, KV cache, RMSNorm) |
| CATWrapper | optimized_transformer.py | Dynamic chunk scheduler wrapping either model |

compressive_transformer.py — Base Model

TDTEmbedding

Per-type embedding for IEEE-754 float32 byte streams.
Each of the 4 byte positions within a float32 (mantissa-low through sign/exponent-high) gets its own nn.Embedding(256, d_model) lookup table, since they have very different entropy profiles. An additional learnable type_scale vector (softmax-normalised) gates each table's contribution.

byte (0–255) ──► table[ t % 4 ]  (one of 4 typed tables, scale-gated)
                       ↓
                 h  (B, T, d_model)
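
A minimal sketch of the idea (not the library's exact implementation): four typed tables, gated by a softmax over the learnable type_scale vector.

import torch
import torch.nn as nn

class TDTEmbeddingSketch(nn.Module):
    """Sketch: one embedding table per byte position within a float32."""
    def __init__(self, d_model: int, n_types: int = 4, vocab: int = 256):
        super().__init__()
        self.tables = nn.ModuleList(nn.Embedding(vocab, d_model) for _ in range(n_types))
        self.type_scale = nn.Parameter(torch.zeros(n_types))  # softmax-normalised gate
        self.n_types = n_types

    def forward(self, byte_ids: torch.Tensor) -> torch.Tensor:  # (B, T) int64 in [0, 255]
        T = byte_ids.size(1)
        types = torch.arange(T, device=byte_ids.device) % self.n_types  # position within float32
        scale = torch.softmax(self.type_scale, dim=0)
        h = byte_ids.new_zeros(*byte_ids.shape, self.tables[0].embedding_dim, dtype=torch.float)
        for t, table in enumerate(self.tables):
            mask = types == t
            h[:, mask] = scale[t] * table(byte_ids[:, mask])
        return h  # (B, T, d_model)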

LearnableTokenEviction (LTE)

Content-adaptive token selection replacing strided Conv1d downsampling.
A lightweight depthwise + pointwise scorer produces per-token importance scores; the top-k (where k = ceil(T / rate)) tokens are kept in original temporal order. A straight-through soft gate (sigmoid-weighted) keeps the operation end-to-end differentiable. A final Conv1d projection + LayerNorm produces the memory representation.

acts (B, T, D) ──► scorer ──► topk ──► soft-gate ──► proj+norm ──► (B, k, D)
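
A rough sketch of the selection mechanism under assumed shapes; it simplifies the real module, which adds the final Conv1d projection + LayerNorm.

import math
import torch
import torch.nn as nn

class TokenEvictionSketch(nn.Module):
    """Sketch: keep the top-k tokens by learned score, soft-gated for gradients."""
    def __init__(self, d_model: int, rate: int = 4):
        super().__init__()
        self.rate = rate
        self.scorer = nn.Sequential(
            nn.Conv1d(d_model, d_model, kernel_size=3, padding=1, groups=d_model),  # depthwise
            nn.Conv1d(d_model, 1, kernel_size=1),  # pointwise -> one score per token
        )

    def forward(self, acts: torch.Tensor) -> torch.Tensor:  # (B, T, D)
        B, T, D = acts.shape
        k = math.ceil(T / self.rate)
        scores = self.scorer(acts.transpose(1, 2)).squeeze(1)     # (B, T) importance
        idx = scores.topk(k, dim=1).indices.sort(dim=1).values    # keep temporal order
        kept = acts.gather(1, idx.unsqueeze(-1).expand(B, k, D))  # (B, k, D)
        gate = torch.sigmoid(scores.gather(1, idx)).unsqueeze(-1) # soft gate in (0, 1)
        return kept * gate  # scorer stays trainable through the gate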

MemoryManager

Thin wrapper around LearnableTokenEviction. Provides a .compress(acts) method used by attention layers to build compressed memory from past activations.

CompressiveAttention

Multi-head attention with two-tier memory:

  • Local stream: causal scaled_dot_product_attention over the current window (Q, K, V).
  • Memory stream: cross-attention from current queries into compressed past (Km, Vm from MemoryManager).
  • Infini-β gating: a per-head learnable scalar β = sigmoid(infini_beta) mixes the two streams: out = β·out_mem + (1−β)·out_local (see the sketch after this list). infini_beta is initialised at 0, giving β = 0.5 (an even mix) as a stable starting point for training.
  • Compressed memory is accumulated across chunks and capped at window // 2 tokens (oldest dropped).
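
For reference, the gating reduces to a one-line mix; a sketch with assumed tensor shapes:

import torch

def infini_mix(out_local: torch.Tensor, out_mem: torch.Tensor,
               infini_beta: torch.Tensor) -> torch.Tensor:
    # Assumed shapes: out_local/out_mem (B, heads, T, head_dim);
    # infini_beta a learnable (heads, 1, 1) parameter.
    beta = torch.sigmoid(infini_beta)  # per-head weight in (0, 1); small beta = mostly local
    return beta * out_mem + (1.0 - beta) * out_local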

SwiGLU

Gated feed-forward block (Shazeer 2020). No bias, no dropout.
out = down( silu(gate(x)) * up(x) ) — two parallel projections to d_ff, one is SiLU-activated and used as a gate.
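
The formula maps directly to code; a sketch:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUSketch(nn.Module):
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.gate = nn.Linear(d_model, d_ff, bias=False)  # SiLU-activated gate
        self.up   = nn.Linear(d_model, d_ff, bias=False)  # parallel projection
        self.down = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))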

TransformerBlock

LayerNorm → CompressiveAttention → residual add → LayerNorm → SwiGLU → residual add.

CompressiveTransformer

Full byte-level model:

  • Embedding: standard nn.Embedding or TDTEmbedding (use_tdt=True)
  • Sinusoidal PositionalEncoding (max 8192)
  • Stack of TransformerBlock layers
  • Final LayerNorm + linear projection to vocab logits
  • Optional per-layer gradient checkpointing (enable_gradient_checkpointing())

Default config: vocab_size=256, d_model=512, n_layers=8, n_heads=8, d_ff=2048, window=512, compression_rate=4.
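
Putting the defaults together (the keyword names here mirror the config keys above and are assumptions about the constructor):

from vortex.models.compressive_transformer import CompressiveTransformer

model = CompressiveTransformer(vocab_size=256, d_model=512, n_layers=8,
                               n_heads=8, d_ff=2048, window=512,
                               compression_rate=4, use_tdt=True)
model.enable_gradient_checkpointing()  # optional per-layer checkpointing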


optimized_transformer.py — Production Model

All components from compressive_transformer.py are reused (imported directly). The optimised variant swaps or adds:

RMSNorm

Root-Mean-Square normalisation (no mean-centering). ~15 % faster than LayerNorm at the same quality.
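
The usual formulation, as a sketch:

import torch
import torch.nn as nn

class RMSNormSketch(nn.Module):
    def __init__(self, d_model: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(d_model))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Rescale by root-mean-square; no mean subtraction, unlike LayerNorm.
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return x * rms * self.weight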

OptimisedCompressiveAttention

Extends CompressiveAttention with:

  • Flash Attention 2 (flash_attn_func) for causal attention when CUDA is available; falls back to PyTorch scaled_dot_product_attention automatically.
  • KV cache: concatenates previously seen K/V tensors for O(1)-per-step autoregressive inference. Returns new_cache = {"k": K, "v": V} each forward pass.
  • Infini-β init changed to −3.0 (sigmoid → ~0.047) so training starts almost entirely local.
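
The KV-cache mechanics above reduce to a concatenation along the time axis; a sketch assuming (B, heads, T, head_dim) tensors:

import torch

def extend_kv_cache(k_new, v_new, cache=None):
    if cache is not None:  # prepend previously seen keys/values
        k_new = torch.cat([cache["k"], k_new], dim=2)
        v_new = torch.cat([cache["v"], v_new], dim=2)
    return k_new, v_new, {"k": k_new, "v": v_new}  # attention inputs + new_cache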

OptimisedBlock

RMSNorm → OptimisedCompressiveAttention → residual add → RMSNorm → SwiGLU → residual add.
Forward signature: (x, comp_mem, kv_cache) → (x, new_comp, new_cache).

OptimisedCompressiveTransformer

Drop-in replacement for CompressiveTransformer with all optimised components.
Extra method: vram_estimate_gb(batch_size, seq_len) — returns a dict with parameter, activation, optimizer-state, and total VRAM estimates in GB.
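
Example use (the returned dict's exact keys are not documented here, so treat the output as illustrative):

from vortex.models.optimized_transformer import OptimisedCompressiveTransformer

model = OptimisedCompressiveTransformer()
est = model.vram_estimate_gb(batch_size=32, seq_len=512)
print(est)  # parameter / activation / optimizer-state / total estimates in GB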

CATWrapper

Dynamic chunk scheduler wrapping any model.

  • Training: randomly samples chunk size from chunk_sizes=(128, 256, 512) each forward pass, enabling multi-scale learning.
  • Inference: defaults to the largest chunk size; override with chunk_size= argument.
  • Handles sequences longer than the chunk size by iterating and accumulating memories and kv_caches across chunks (detached between chunks to limit graph size).
  • Transparent proxy: delegates parameters(), named_parameters(), state_dict(), load_state_dict(), enable_gradient_checkpointing(), and vram_estimate_gb() to the inner model, so checkpoints are portable without the wrapper.
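
A sketch of the chunk loop described above, assuming the inner model returns (logits, comp_mem, kv_cache):

import random
import torch

def chunked_forward(model, x, chunk_sizes=(128, 256, 512), training=True):
    size = random.choice(chunk_sizes) if training else max(chunk_sizes)
    comp_mem, kv_cache, logits = None, None, []
    for start in range(0, x.size(1), size):
        out, comp_mem, kv_cache = model(x[:, start:start + size], comp_mem, kv_cache)
        if comp_mem is not None:  # detach between chunks to limit graph size
            comp_mem = comp_mem.detach()
        if kv_cache is not None:
            kv_cache = {name: t.detach() for name, t in kv_cache.items()}
        logits.append(out)
    return torch.cat(logits, dim=1)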

vortex/compression/arithmetic_coding.py

Lossless arithmetic coding via torchac:

| Function | Description |
|---|---|
| probs_to_cdf(probs) | Converts model output probabilities to a cumulative distribution function (CDF), with ε-smoothing |
| encode(probs, symbols) | Encodes a (B, T) symbol tensor to bytes |
| decode(bitstring, probs) | Decodes bytes back to (B, T) int16 symbols |
| theoretical_bpd(probs, symbols) | Cross-entropy bits-per-byte — the training objective |
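
A round-trip using the signatures above; the stand-in tensors are placeholders for what the model would produce:

import torch
from vortex.compression.arithmetic_coding import encode, decode, theoretical_bpd

# Stand-ins: in practice probs is the model's softmax output over the bytes.
probs = torch.softmax(torch.randn(1, 16, 256), dim=-1)        # (B, T, vocab)
symbols = torch.randint(0, 256, (1, 16), dtype=torch.int16)   # (B, T) bytes

bitstring = encode(probs, symbols)       # model probs -> compressed bytes
recovered = decode(bitstring, probs)     # same probs -> exact symbols back
assert torch.equal(recovered.long(), symbols.long())  # lossless
print(theoretical_bpd(probs, symbols))   # cross-entropy bits-per-byte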

vortex/utils/zipnn.py — Post-Training Weight Compression

Huffman-based lossless checkpoint size reduction (30–60 % smaller files).
Splits each float32 weight tensor into sign + exponent + mantissa bytes. Exponents and signs are Huffman-coded (low entropy); raw mantissa bytes are stored unmodified (near-random, high entropy). Decompression is exact.

import torch
from vortex.utils.zipnn import compress_model_weights, decompress_model_weights

compressed = compress_model_weights(model)
torch.save(compressed, "weights.zipnn.pt")

model2 = MyModel(...)
decompress_model_weights(model2, compressed)
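
To see why the split helps, here is a hypothetical NumPy view of the byte planes (assuming little-endian float32; the library's actual split may group bits differently):

import numpy as np

# Stand-in weights; real use operates on each tensor in the state dict.
w = np.random.randn(1024).astype(np.float32)
planes = w.view(np.uint8).reshape(-1, 4)  # little-endian: byte 3 = sign + high exponent bits
low_entropy = planes[:, 3]                # sign/exponent bytes -> Huffman-coded
high_entropy = planes[:, :3].ravel()      # mantissa-dominated bytes -> stored raw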

Hardware Configs

| File | GPU | VRAM | Params |
|---|---|---|---|
| colab_t4.yaml | T4 (Colab) | 15 GB | 3.2 M |
| rtx4070_8gb.yaml | RTX 4070 | 8 GB | 8.5 M |
| default.yaml | RTX 3090/80 | 12 GB | 14.8 M |
| rtx4090_24gb.yaml | RTX 4090 | 24 GB | 28 M |
| amd_mi300x.yaml | MI300X | 192 GB | 60 M+ |

Training Details

The training loop in scripts/train.py uses OptimisedCompressiveTransformer wrapped in CATWrapper.
Key features:

  • Mixed precision (torch.amp) with bfloat16 on ROCm/Ampere+, float16 otherwise
  • Cosine LR schedule with linear warmup (vortex.utils.training.cosine_with_warmup; sketched after this list)
  • Gradient clipping (grad_clip=1.0) + AdamW weight decay
  • EarlyStopping on validation BPD (patience=5, min_delta=1e-4)
  • TensorBoard logging + live ASCII scoreboard with BPD trend vs baselines
  • Gradient checkpointing (enabled per config; ~40 % VRAM reduction)
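
The LR schedule typically looks like this (a sketch; the real cosine_with_warmup signature may differ):

import math

def cosine_with_warmup(step, warmup=4000, max_steps=100000, base_lr=3e-4, min_lr=0.0):
    if step < warmup:  # linear warmup
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, max_steps - warmup)  # cosine decay
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))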

Default hyperparameters (configs/default.yaml):

d_model: 512  |  n_layers: 8  |  n_heads: 8  |  d_ff: 2048
window: 512   |  compression_rate: 4          |  dropout: 0.1
batch_size: 32  |  lr: 3e-4  |  warmup: 4000  |  max_steps: 100000

ATLAS Dataset

  • Source: CERN EOS root://eospublic.cern.ch//eos/opendata/atlas/datascience/ATLAS-FTAG-2023-05/
  • Format: HDF5 → extracted to raw binary (atlas.bin) via download.py
  • Benchmark sample: mc-flavtag-ttbar-medium.bin (1 GB) — used for both baseline and Vortex evaluation
  • Structured dtype: 30 fields including pt_btagJes, GN2v01_pb, kinematics, labels
  • See docs/ARCHITECTURE_COMPARISON.md for a detailed v1 → v3 component diff and BPD benchmarks.

