
fraQtl runtime — drop-in loader for fraQtl-compressed Hugging Face checkpoints. Production LLM inference with calibration-aware compression.

Project description

fraQtl

Runtime KV-cache and weight compression for production LLM inference.

Drop-in. No retraining. Calibration-aware.


What it is

fraqtl-runtime is the runtime loader for fraQtl-compressed model artifacts. It enables:

  • Weight compression: load fraQtl-compressed Hugging Face checkpoints (e.g. fraQtl/Qwen3.6-35B-A3B-compressed) via standard transformers with trust_remote_code=True. The wheel ships the compiled loader that decodes the packed weights at load time.
  • Runtime KV-cache compression (separate, in active validation): a llama.cpp-compatible runtime layer that compresses the V cache at runtime — independent of weight format.

Install

pip install fraqtl-runtime

That's the entire setup. No license token required for loading published artifacts.
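
If you want a quick sanity check that the wheel is installed, reading the distribution metadata avoids importing the loader (which, as noted below, you never do directly):

from importlib.metadata import version

# Checks the installed distribution only; it does not import the loader itself.
print(version("fraqtl-runtime"))   # e.g. "0.1.1"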


Quick start

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

repo = "fraQtl/Qwen3.6-35B-A3B-compressed"

# trust_remote_code=True lets the repo's stub wire in the compiled fraQtl loader.
model = AutoModelForCausalLM.from_pretrained(
    repo, trust_remote_code=True,
    torch_dtype=torch.bfloat16, device_map="auto",
)
tok = AutoTokenizer.from_pretrained(repo)

# Tokenize the prompt, move it to the model's device, and greedy-decode 20 new tokens.
ids = tok("The capital of France is", return_tensors="pt").to(model.device)
print(tok.decode(model.generate(**ids, max_new_tokens=20, do_sample=False)[0]))

trust_remote_code=True pulls a small stub from the model repo that imports the compiled loader from this wheel. You never write import fraqtl directly.
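
If you want to confirm that the class came from the repo's remote code rather than from an installed package, you can inspect it after loading; the exact module path and class name depend on what the repo ships:

# The model class lives in transformers' dynamic-module namespace
# (transformers_modules.*), fetched from the repo, not from this wheel.
print(type(model).__module__, type(model).__name__)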


High-level approach

fraQtl combines two ideas:

  1. Calibration-aware eigenbasis rotation — protect the input directions that matter for the deployment task; quantize the rest. The calibration corpus determines which directions are protected (this is FPT — fraQtl Pullback Theorem). See the toy sketch after this list.
  2. Per-row sign correction primitive — additional precision on top of low-bit quantization where it matters most for reasoning.

Both compose with standard quantization machinery (Lloyd-Max centroids, INT3 packing) and standard inference engines (HF transformers, llama.cpp).
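
As a rough intuition for idea (1), here is a toy sketch in plain PyTorch. It is not fraQtl's FPT algorithm, it skips idea (2) and the Lloyd-Max centroids entirely, and every name in it is illustrative: it just rotates one weight matrix into the eigenbasis of a calibration-input covariance, keeps the highest-variance directions in full precision, and rounds everything else to a crude 3-bit grid.

import torch

torch.manual_seed(0)
d_in, d_out, n_calib = 64, 32, 512

W = torch.randn(d_out, d_in)                                       # one linear layer's weight
X = torch.randn(n_calib, d_in) * torch.linspace(0.05, 3.0, d_in)   # anisotropic "calibration" activations

# 1. Eigenbasis of the calibration input covariance: columns of Q with
#    large eigenvalues are the directions that matter for this corpus.
cov = X.T @ X / n_calib
eigvals, Q = torch.linalg.eigh(cov)          # eigenvalues in ascending order

# 2. Rotate the weight into that basis.
W_rot = W @ Q

# 3. Protect the top-k directions; quantize the rest to 3 bits
#    (naive symmetric rounding stands in for Lloyd-Max centroids).
k = 8
protected = W_rot[:, -k:]                    # kept in full precision
rest = W_rot[:, :-k]
scale = rest.abs().max() / 3
rest_q = (rest / scale).round().clamp(-3, 3) * scale

# 4. Reassemble and rotate back: a drop-in replacement weight.
W_hat = torch.cat([rest_q, protected], dim=1) @ Q.T

# Quantization error now lives mostly in low-variance input directions,
# so outputs on calibration-like inputs barely move.
rel_err = (X @ W.T - X @ W_hat.T).norm() / (X @ W.T).norm()
print(f"relative output error: {rel_err:.4f}")

The real method differs in how the basis, the protected subspace, and the bit allocation are chosen; the point of the sketch is only that error placed in low-variance input directions barely shows up in the layer's outputs.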


Status

  • Public weight-compression artifacts on Hugging Face: huggingface.co/fraQtl
  • Runtime KV-cache compression layer: in active validation; public benchmark numbers will be published after the H100 measurement lock and manual review.
  • Methodology paper in preparation.

Links

  • Hugging Face organization: huggingface.co/fraQtl

License

Proprietary. The compressed model weights and loader are free to install and use for research and evaluation. Production / commercial use: contact fraQtl.

Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distribution


fraqtl_runtime-0.1.1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (857.4 kB)

Uploaded for CPython 3.11, manylinux (glibc 2.17+), x86-64.

File details

Details for the file fraqtl_runtime-0.1.1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.


File hashes

Hashes for fraqtl_runtime-0.1.1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl:

  • SHA256: 212a0e1636e75bd5f417d243805a35e1f66fbfd68ea87a234f4ba426560fec52
  • MD5: d5acd02a3226f8589e74c979a3f66dc3
  • BLAKE2b-256: 5c3c2dd218eaf471fdcb2057cc96b723ef1cc73e9cd55853309a7729cfe064da
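
If you download the wheel by hand, a few lines of Python are enough to check it against the published SHA256 (the digest and file name come from the listing above; adjust the path to wherever your download landed):

import hashlib

expected = "212a0e1636e75bd5f417d243805a35e1f66fbfd68ea87a234f4ba426560fec52"
path = "fraqtl_runtime-0.1.1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl"

with open(path, "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()
assert digest == expected, digest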

