
Extreme compression for large language models. Download pre-compressed models from Hugging Face Hub; self-compress support coming soon.

Reason this release was yanked:

Superseded by 0.1.2 with corrected package metadata. Please install 0.1.2 or later.

Project description

UltraCompress

Extreme compression for large language models. Patent pending — USPTO 64/049,511 + 64/049,517


Run large language models on less hardware. UltraCompress compresses modern transformer LLMs by 26–734× with minimal quality loss. The underlying methods are patent pending; this CLI lets you download pre-compressed reference models and run them locally.

Install

pip install ultracompress

Quickstart

# List pre-compressed models available on the official Hugging Face Hub
uc list

# Download a pre-compressed model (Qwen3-1.7B at 2.798 bpw, ~30% smaller than bnb-nf4)
uc pull sipsalabs/qwen3-1.7b-uc2p79

# Inspect what's in a compressed artifact
uc info ./models/qwen3-1.7b-uc2p79

# Benchmark the compressed model against the fp16 teacher
uc eval ./models/qwen3-1.7b-uc2p79 --tasks hellaswag --limit 500

What's available today (v0.1 — alpha)

  • uc list — browse pre-compressed models from our Hugging Face Hub collection.
  • uc pull <model-id> — download a pre-compressed model locally.
  • uc info <path> — inspect the compression metadata of an artifact.
  • uc eval <path> --tasks <list> — run downstream benchmarks via lm-eval-harness on the compressed model.
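The pull → info → eval sequence above can also be scripted. The sketch below is a minimal, hypothetical wrapper (not part of the package) that builds the same `uc` invocations shown in the Quickstart and runs them via `subprocess`; building argv lists rather than shell strings avoids quoting issues with model IDs and paths.

```python
import shutil
import subprocess

def uc_command(subcommand: str, *args: str) -> list[str]:
    """Build an argv list for the `uc` CLI (hypothetical helper)."""
    return ["uc", subcommand, *args]

def run_workflow(model_id: str, local_path: str) -> None:
    """Pull, inspect, then benchmark a pre-compressed model,
    mirroring the Quickstart commands."""
    if shutil.which("uc") is None:
        raise RuntimeError("uc not found; run `pip install ultracompress` first")
    for cmd in (
        uc_command("pull", model_id),
        uc_command("info", local_path),
        uc_command("eval", local_path, "--tasks", "hellaswag", "--limit", "500"),
    ):
        subprocess.run(cmd, check=True)
```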

What's coming

  • uc compress <hf-model-id> --bpw 2.8 — self-compression (private method, coming once patents clear).
  • uc serve <path> — inference server with OpenAI-compatible API.
  • uc export --format gguf — export to llama.cpp GGUF format.
  • uc export --format coreml — export to Apple CoreML for on-device inference.

Why UltraCompress

On a 6-model × 8-method × 500-sample head-to-head benchmark:

Method                  Bits per weight   Cohort median T1 retention   Catastrophic failures
bitsandbytes int8       8.000             99.75%                       0/6
bitsandbytes nf4        4.000             98.31%                       0/6
HQQ 4-bit g64           4.500             97.72%                       0/6
UltraCompress 2.8 bpw   2.798             95.63%                       0/6
HQQ 3-bit g64           3.500             72.46%                       1/6
HQQ 2-bit g64           2.500              3.46%                       6/6

UltraCompress is the only sub-3-bpw method on this cohort that produces zero catastrophic failures.
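The size claims follow directly from the bits-per-weight figures: weight storage scales linearly with bpw, so against an fp16 baseline (16 bpw) 2.798 bpw is roughly 5.7× smaller, and against bnb-nf4 (4.0 bpw) it is roughly 30% smaller, matching the Quickstart comment. A quick arithmetic check (this sketch ignores metadata and non-weight overhead in the actual artifacts):

```python
FP16_BPW = 16.0

def size_ratio(bpw: float, baseline_bpw: float) -> float:
    """Relative weight-storage size vs. a baseline, assuming size scales
    linearly with bits per weight (ignores metadata overhead)."""
    return bpw / baseline_bpw

uc_bpw, nf4_bpw = 2.798, 4.000
print(f"vs fp16: {FP16_BPW / uc_bpw:.2f}x smaller")             # ~5.72x
print(f"vs nf4:  {(1 - size_ratio(uc_bpw, nf4_bpw)) * 100:.1f}% smaller")  # ~30.1%
```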

Patent status

The UltraCompress compression methods are the subject of pending U.S. patent applications. Pre-compressed models are distributed under a separate licensing arrangement described in LICENSE. The CLI code in this repository is Apache-2.0.

Citation

@misc{ounnar2026ultracompress,
  title   = {UltraCompress: Extreme Compression for Large Language Models},
  author  = {Missipssa Ounnar},
  year    = {2026},
  howpublished = {\url{https://mounnar.vercel.app}}
}

Author

Missipssa Ounnar · mounnar.vercel.app · github.com/mounnar

Built on a dual-RTX-5090 workstation I designed and assembled myself. Patent pending.

Download files

Download the file for your platform.

Source Distribution

ultracompress-0.1.0.tar.gz (876.8 kB)

Uploaded Source

Built Distribution


ultracompress-0.1.0-py3-none-any.whl (13.3 kB)

Uploaded Python 3

File details

Details for the file ultracompress-0.1.0.tar.gz.

File metadata

  • Download URL: ultracompress-0.1.0.tar.gz
  • Size: 876.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for ultracompress-0.1.0.tar.gz

  • SHA256: 32c3a074f14e619aafb604389e195ccb5e32d38fd9bc463a587fbb37f0d6fca7
  • MD5: 8d6c38236ba789113507ad40c72dfc6c
  • BLAKE2b-256: 52370e34ff54364195e6040d23f52ab67eb9e64287a496d2389dbb8d5fb21db2
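To verify a downloaded sdist against the published SHA256, a minimal sketch using only the standard library (the filename assumes the tarball sits in the current directory; streaming in chunks keeps memory flat for large artifacts):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Hex SHA256 of a file, streamed in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

EXPECTED = "32c3a074f14e619aafb604389e195ccb5e32d38fd9bc463a587fbb37f0d6fca7"
# Uncomment after downloading the sdist:
# assert sha256_of("ultracompress-0.1.0.tar.gz") == EXPECTED
```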


File details

Details for the file ultracompress-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: ultracompress-0.1.0-py3-none-any.whl
  • Size: 13.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for ultracompress-0.1.0-py3-none-any.whl

  • SHA256: aead7429babccfff4026f9a37b8f3f571f5058dd381ffe9ead87bee8d7d95d74
  • MD5: 495287ee477d6205b26a29fe65450180
  • BLAKE2b-256: a434c359c8c767354b942e5504ae19753bce6f43705eea7fa30080e099c3d46e

