
helix-substrate

Model weight compression and streaming decode library. Compress neural network weights into a compact format (CDNA), then run matrix operations directly from the compressed representation — without ever loading the full weight matrix into memory.

What it does

  1. CDNA Format — Quantize model weights into a 256-entry codebook + uint8 indices, with per-block brotli compression and SHA256 verification.

  2. Streaming Block Decode — Compute Y = X @ W where W is stored in CDNA format. W is never fully loaded. Instead, blocks of rows are decompressed one at a time, multiplied against the corresponding slice of X, and accumulated.

  3. Structural Entropy (Se) Routing — Measure tensor complexity via Se = H × U × D (entropy × unstructuredness × rank depth). The Se score maps to a compute routing decision: simple tensors → CPU, parallel tensors → GPU, complex unstructured tensors → QPU.

  4. Receipts — Every operation produces a tamper-evident receipt with SHA256 input/output hashes, timing, memory usage, and fidelity metrics. If you can't verify it, it didn't happen.
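The streaming decode in step 2 can be sketched in plain NumPy. This is a sketch, not the library's API: `decode_block` is a hypothetical stand-in for the CDNA block decoder, and each iteration materializes only one row-block of W.

```python
import numpy as np

def stream_matmul(X, decode_block, n_blocks, block_rows, n_cols):
    """Y = X @ W without ever holding all of W.

    decode_block(i) stands in for CDNA decompression: it returns the dense
    rows [i*block_rows : (i+1)*block_rows] of W, one block at a time.
    """
    Y = np.zeros((X.shape[0], n_cols), dtype=X.dtype)
    for i in range(n_blocks):
        W_blk = decode_block(i)        # only this block of W is in memory
        r0 = i * block_rows
        r1 = r0 + W_blk.shape[0]
        Y += X[:, r0:r1] @ W_blk       # accumulate the partial product
    return Y

# Toy check against the full matmul
rng = np.random.default_rng(0)
K, N, B = 128, 64, 32
W = rng.standard_normal((K, N)).astype(np.float32)
X = rng.standard_normal((4, K)).astype(np.float32)
Y = stream_matmul(X, lambda i: W[i * B:(i + 1) * B], K // B, B, N)
assert np.allclose(Y, X @ W, atol=1e-3)
```

Peak memory holds X, Y, and a single decompressed block rather than all of W, which is why the savings below grow with matrix size.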

Benchmarks

Measured peak memory for Y = X @ W with streaming vs loading full W:

| Matrix Size | Block Rows | Standard Peak | Streaming Peak | Ratio |
|-------------|-----------:|--------------:|---------------:|------:|
| 64 MB       | 64         | 64 MB         | 18 MB          | 3.5x  |
| 256 MB      | 64         | 256 MB        | 68 MB          | 3.8x  |
| 1 GB        | 64         | 1024 MB       | 137 MB         | 7.5x  |
| 1 GB        | 32         | 1024 MB       | 69 MB          | 14.9x |

Streaming peak memory is roughly constant for a given block size (~68 MB at block-rows 32), so the ratio improves as the matrix grows. At LLM scale (1 GB+ weight matrices), expect a 7-15x peak-memory reduction depending on block size.

Correctness: cosine similarity = 1.000000 (exact match to full-matrix computation).
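A back-of-envelope check on why the streaming peak barely grows: only one decompressed block of W is resident at a time. The numbers below use the benchmark's 1 GB fp32 configuration; the measured figures are higher because they also count X, Y, and decode buffers.

```python
rows, cols, block_rows = 16384, 16384, 32    # the 1 GB fp32 case above
full_mb = rows * cols * 4 / 2**20            # all of W resident: 1024 MB
block_mb = block_rows * cols * 4 / 2**20     # one decoded block: 2 MB
print(f"full W: {full_mb:.0f} MB, one block resident: {block_mb:.0f} MB")
```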

Run the benchmark yourself:

```bash
python tools/bench_memory.py --rows 16384 --cols 16384 --block-rows 32
```

Install

```bash
pip install helix-substrate
```

For HuggingFace model conversion:

```bash
pip install "helix-substrate[hf,brotli]"
```

(The quotes keep shells like zsh from interpreting the brackets.)

Required: numpy>=1.24. Optional: brotli, zstandard (compression), huggingface_hub, safetensors (model conversion).

Quick start

Convert a HuggingFace model

```bash
helix-substrate convert mistralai/Mistral-7B-v0.1 --output ./mistral-cdna
```

This downloads the model, quantizes each weight tensor to CDNA format, and saves to a directory. The model is now ready for streaming inference.

Compress a tensor

```python
import numpy as np
from helix_substrate import encode_tensor_to_cdna, decode_cdna_to_tensor

# Compress
W = np.random.randn(4096, 4096).astype(np.float32)
encode_tensor_to_cdna(W, "weight.cdna", tensor_name="my_layer")

# Decompress
W_decoded = decode_cdna_to_tensor("weight.cdna")
cos = np.dot(W.ravel(), W_decoded.ravel()) / (np.linalg.norm(W) * np.linalg.norm(W_decoded))
print(f"Cosine similarity: {cos:.6f}")
```
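Under the hood, a 256-entry codebook plus uint8 indices amounts to nearest-centroid quantization. A minimal sketch of the idea, using a uniform codebook for brevity (the library fits its codebook with k-means, so this is an illustration, not the CDNA encoder):

```python
import numpy as np

def quantize_256(W):
    """Replace each weight with the index of its nearest codebook entry."""
    codebook = np.linspace(W.min(), W.max(), 256, dtype=np.float32)
    midpoints = (codebook[:-1] + codebook[1:]) / 2
    idx = np.searchsorted(midpoints, W.ravel()).astype(np.uint8).reshape(W.shape)
    return codebook, idx            # 4x smaller than fp32, before brotli

W = np.random.randn(512, 512).astype(np.float32)
codebook, idx = quantize_256(W)
W_hat = codebook[idx]               # dequantize: a pure table lookup
step = (W.max() - W.min()) / 255
assert np.abs(W - W_hat).max() <= step / 2 + 1e-5
```

The reconstruction error is bounded by half a codebook step, which is why the cosine similarity above stays near 1.0 for well-behaved weight distributions.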

Measure tensor complexity

```python
from helix_substrate import compute_tensor_se

result = compute_tensor_se(W)
print(f"Se={result['Se']:.3f} → route to {result['routing_hint']}")
# Se=0.42 → route to gpu
```

Streaming decode (CDNAv2)

```python
from helix_substrate.stream_matmul import stream_xw_from_cdna

# Y = X @ W, but W is never fully loaded
X = np.random.randn(1, 256, 4096).astype(np.float32)
Y, receipt = stream_xw_from_cdna(X, "weight.cdna2.hxz")
print(f"Memory savings: {receipt.savings_factor:.1f}x")
```

Package structure

```
helix_substrate/
├── __init__.py          # Public API
├── cdna_encoder.py      # CDNA v1 encode/decode (k-means quantization)
├── cdna_reader.py       # CDNA v2 reader (block-indexed, brotli, SHA256)
├── sidecar.py           # HXZO outlier sidecar (high-precision corrections)
├── stream_matmul.py     # Core: Y = X @ W from CDNA (streaming, never loads W)
├── stream_attention.py  # Full attention layer (Q,K,V,O all streamed)
├── stream_ffn.py        # Full FFN layer (gate,up,down all streamed)
├── stream_block.py      # Full transformer block (attention + FFN + norms)
├── rope.py              # Rotary Position Embeddings
├── se.py                # Structural Entropy estimator and routing
└── receipt.py           # Tamper-evident execution receipts
```

The Se formula

Structural Entropy decomposes tensor complexity into three independent factors:

| Component            | Measures              | High means                                   |
|----------------------|-----------------------|----------------------------------------------|
| H (entropy)          | Singular value spread | Energy spread across many directions         |
| U (unstructuredness) | Neighbor coherence    | No spatial correlation between adjacent rows |
| D (depth)            | Effective rank ratio  | Many dimensions matter                       |

Se = H × U × D produces a 0-1 score. The 2D routing policy uses (Se, C_struct) jointly:

  • Zone 1: Se < 0.30, structured → CPU
  • Zone 2: 0.30 ≤ Se < 0.70 → GPU
  • Zone 3: Se ≥ 0.70, unstructured → QPU
  • Zone 4: Se ≥ 0.70, structured → GPU
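The zone table translates directly into a small decision function. This is a sketch, not the library's API: `structured` stands in for the C_struct test, and low-Se unstructured tensors (not covered by the table) are assumed here to stay on CPU.

```python
def route(se: float, structured: bool) -> str:
    """Map (Se, structural flag) to a compute target per the zone table."""
    if se < 0.30:
        return "cpu"                          # Zone 1 (unstructured: assumed CPU too)
    if se < 0.70:
        return "gpu"                          # Zone 2
    return "gpu" if structured else "qpu"     # Zones 4 and 3

assert route(0.15, structured=True) == "cpu"
assert route(0.42, structured=False) == "gpu"   # matches the quick-start example
assert route(0.85, structured=False) == "qpu"
```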

Inspiration

The mathematical patterns in this library draw from nature — Fibonacci sequences in the block structure, golden-ratio-inspired codebook initialization, and structural entropy as a measure of order vs chaos in weight matrices. The thesis: nature already solved the compression math we need, because it's the world's largest dataset.

License

MIT
