
Model Compression & Acceleration Module for sageLLM


sagellm-compression

Protocol Compliance (Mandatory)

Requires Python 3.10+.

Inference acceleration tools for LLMs: quantization, sparsity, speculative decoding, kernel fusion, and more.

Features

  • Quantization (INT8/INT4)
  • Sparsity (structured and unstructured pruning)
  • Speculative decoding
  • Kernel fusion
  • Chain-of-Thought acceleration
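
To make the difference between unstructured and structured pruning concrete, here is a minimal pure-Python sketch. It is not taken from sagellm-compression: the function names `prune_unstructured` and `prune_structured_rows` are hypothetical, and magnitude-based selection is just one common pruning criterion chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]


def prune_unstructured(weights: Matrix, sparsity: float) -> Matrix:
    """Zero the smallest-magnitude individual weights anywhere in the matrix."""
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * sparsity)  # number of weights to drop
    if k == 0:
        return [row[:] for row in weights]
    threshold = flat[k - 1]
    # Ties at the threshold are pruned too, so slightly more than k may be zeroed.
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def prune_structured_rows(weights: Matrix, n_rows: int) -> Matrix:
    """Zero whole rows, choosing the n_rows rows with the smallest L1 norm."""
    order = sorted(range(len(weights)), key=lambda i: sum(abs(w) for w in weights[i]))
    drop = set(order[:n_rows])
    return [[0.0] * len(row) if i in drop else row[:] for i, row in enumerate(weights)]


weights = [[0.1, -2.0, 0.3], [0.05, 0.02, -0.01], [1.5, -0.4, 0.9]]
unstructured = prune_unstructured(weights, sparsity=0.5)
structured = prune_structured_rows(weights, n_rows=1)
```

Unstructured pruning removes individual weights wherever they are small, while structured pruning removes entire rows, which is usually easier for hardware to exploit.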

Installation

pip install isagellm-compression

Quick Start

from sagellm_compression import QuantizationConfig, apply_quantization

# `model` is assumed to be a model object you have already loaded.
config = QuantizationConfig(method="int8", per_channel=True)
quantized_model = apply_quantization(model, config)
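
For readers unfamiliar with what `method="int8"` and `per_channel=True` typically mean, the following pure-Python sketch shows symmetric per-channel INT8 quantization in general terms. It is an illustration under stated assumptions, not the library's implementation: the helpers `quantize_int8_per_channel` and `dequantize` are hypothetical, and each row of the weight matrix is treated as one channel.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def quantize_int8_per_channel(weights: Matrix) -> Tuple[List[List[int]], List[float]]:
    """Symmetric INT8 quantization with one scale per channel (row)."""
    q_rows: List[List[int]] = []
    scales: List[float] = []
    for row in weights:
        max_abs = max((abs(w) for w in row), default=0.0)
        # Map the largest magnitude in the row to 127; avoid dividing by zero.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q_rows.append([max(-127, min(127, round(w / scale))) for w in row])
        scales.append(scale)
    return q_rows, scales


def dequantize(q_rows: List[List[int]], scales: List[float]) -> Matrix:
    """Recover approximate float weights from integers and per-row scales."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]


# Two rows with very different ranges: each gets its own scale.
weights = [[0.5, -1.0, 0.25], [10.0, -20.0, 5.0]]
q, scales = quantize_int8_per_channel(weights)
restored = dequantize(q, scales)
```

With a single per-tensor scale, the small values in the first row would share the range set by the second row and lose precision. Per-channel scales keep the rounding error of each row within half of that row's own scale.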

Development

git clone git@github.com:intellistream/sagellm-compression.git
cd sagellm-compression
./quickstart.sh

pip install -e ".[dev]"
pytest tests/ -v

Documentation

License

Private - IntelliStream Research Project


