Production-grade model quantization SDK for enterprise custom models (AWQ, GGUF, and CoreML)

Qwodel - Production-Grade Model Quantization

Qwodel is a production-ready Python package for model quantization across multiple backends (AWQ, GGUF, CoreML). It provides a unified, intuitive API for quantizing large language models with minimal code.

Features

Unified API - Simple interface across all quantization backends
Multiple Backends - AWQ (GPU), GGUF (CPU), CoreML (Apple devices)
Optional Dependencies - Install only what you need
CLI & Python API - Use via command line or programmatically
Type Safe - Full type hints and mypy validation
Well Documented - Comprehensive docs with examples

Quick Start

Installation

Quick Install (All Backends)

pip install "qwodel[all]"

This installs all backends (GGUF, AWQ, CoreML) with PyTorch 2.1.2 (CPU version).

GPU Support (for AWQ only)

If you need GPU quantization with AWQ, install PyTorch with CUDA first:

# 1. Install PyTorch with CUDA 12.1
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121

# 2. Install qwodel
pip install "qwodel[all]"

Note: GGUF and CoreML quantization run on CPU-only PyTorch, so the CUDA build is needed only for AWQ.
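
The choice between the CPU and GPU install paths can be checked from Python before quantizing. The sketch below is illustrative only: `pick_backend` is a hypothetical helper, not part of qwodel, and it assumes that falling back to GGUF is the right default when no CUDA-enabled PyTorch is found.

```python
import importlib.util


def pick_backend() -> str:
    """Return "awq" when a CUDA-enabled PyTorch is usable, otherwise "gguf".

    Hypothetical helper for illustration; qwodel itself may select
    backends differently.
    """
    # Avoid importing torch at all if it is not installed.
    if importlib.util.find_spec("torch") is None:
        return "gguf"
    try:
        import torch
    except ImportError:
        # torch is present but cannot be loaded (for example, a broken install).
        return "gguf"
    # torch.cuda.is_available() is False on CPU-only builds.
    return "awq" if torch.cuda.is_available() else "gguf"


if __name__ == "__main__":
    print(f"Suggested backend: {pick_backend()}")
```

The returned string matches the `backend` values used elsewhere in this README, so it could be passed straight to `Quantizer(backend=...)`.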

Individual Backends

# GGUF only (CPU quantization - most popular!)
pip install "qwodel[gguf]"

# AWQ only (GPU quantization)
pip install "qwodel[awq]"

# CoreML only (Apple devices)
pip install "qwodel[coreml]"

Local Development

# From a local clone of the repository, install in editable mode
cd /path/to/qwodel
pip install -e ".[all]"

Python API

from qwodel import Quantizer

# Create quantizer
quantizer = Quantizer(
    backend="gguf",
    model_path="meta-llama/Llama-2-7b-hf",
    output_dir="./quantized"
)

# Quantize model
output_path = quantizer.quantize(format="Q4_K_M")
print(f"Quantized model saved to: {output_path}")

CLI

# Quantize a model
qwodel quantize \
    --backend gguf \
    --format Q4_K_M \
    --model meta-llama/Llama-2-7b-hf \
    --output ./quantized

# List available formats
qwodel list-formats --backend gguf
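
The CLI can also be driven from a script. The sketch below only assembles the `qwodel quantize` invocation shown above from its four documented flags; `build_quantize_command` and `run_quantize` are hypothetical helper names, and the defaults simply mirror the example values.

```python
import shlex
import subprocess
from typing import List


def build_quantize_command(
    model: str,
    backend: str = "gguf",
    fmt: str = "Q4_K_M",
    output_dir: str = "./quantized",
) -> List[str]:
    """Build the argv list for the `qwodel quantize` call shown above."""
    return [
        "qwodel", "quantize",
        "--backend", backend,
        "--format", fmt,
        "--model", model,
        "--output", output_dir,
    ]


def run_quantize(model: str, **kwargs: str) -> None:
    """Run the CLI; raises CalledProcessError on a non-zero exit code."""
    cmd = build_quantize_command(model, **kwargs)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the command only; call run_quantize(...) to execute it.
    print(shlex.join(build_quantize_command("meta-llama/Llama-2-7b-hf")))
```

Passing the arguments as a list, rather than a single shell string, avoids quoting problems with model paths that contain spaces.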

Supported Backends

GGUF (CPU Quantization)

Use Case: CPU inference, broad compatibility
Formats: Q4_K_M, Q8_0, Q2_K, Q5_K_M, and more
Best For: Most users, CPU-based deployment

AWQ (GPU Quantization)

Use Case: NVIDIA GPU inference
Formats: INT4
Best For: GPU deployments, maximum speed
Requires: CUDA 12.1+

CoreML (Apple Devices)

Use Case: iOS, macOS, iPadOS deployment
Formats: FLOAT16, INT8, INT4
Best For: Apple device deployment
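
The backend and format combinations listed above can be captured in a small lookup table, for example to catch typos before starting a long quantization run. This is a sketch based only on the formats named in this README: `LISTED_FORMATS` and `is_listed` are hypothetical names, and because the GGUF list ends with "and more", `qwodel list-formats` remains the authoritative source.

```python
from typing import Dict, FrozenSet

# Formats named in this README. GGUF supports additional formats
# not enumerated here, so a False result is not proof of invalidity.
LISTED_FORMATS: Dict[str, FrozenSet[str]] = {
    "gguf": frozenset({"Q4_K_M", "Q8_0", "Q2_K", "Q5_K_M"}),
    "awq": frozenset({"INT4"}),
    "coreml": frozenset({"FLOAT16", "INT8", "INT4"}),
}


def is_listed(backend: str, fmt: str) -> bool:
    """Return True if `fmt` is named in the README for `backend`.

    Raises ValueError for a backend the README does not mention.
    """
    key = backend.lower()
    if key not in LISTED_FORMATS:
        raise ValueError(f"Unknown backend: {backend!r}")
    return fmt.upper() in LISTED_FORMATS[key]


if __name__ == "__main__":
    for backend, fmt in [("gguf", "Q4_K_M"), ("awq", "INT8"), ("coreml", "FLOAT16")]:
        print(backend, fmt, is_listed(backend, fmt))
```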

Examples

Batch Processing

from qwodel import quantize

models = ["meta-llama/Llama-2-7b-hf", "meta-llama/Llama-2-13b-hf"]

for model in models:
    quantize(
        model_path=model,
        backend="gguf",
        format="Q4_K_M",
        output_dir="./quantized"
    )
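
In a longer batch, one failing model should not necessarily stop the rest. The sketch below wraps the loop above with per-model error handling. `quantize_many` is a hypothetical helper; the quantization function is passed in as an argument so the pattern can be tried without qwodel installed, and it is an assumption that `quantize` returns the output path (the README shows this only for `Quantizer.quantize`).

```python
from typing import Any, Callable, Dict, Iterable, Tuple


def quantize_many(
    models: Iterable[str],
    quantize_fn: Callable[..., Any],
    **options: Any,
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Quantize each model, collecting successes and failures separately.

    Returns (succeeded, failed): `succeeded` maps model -> result of
    quantize_fn, `failed` maps model -> error message.
    """
    succeeded: Dict[str, Any] = {}
    failed: Dict[str, str] = {}
    for model in models:
        try:
            succeeded[model] = quantize_fn(model_path=model, **options)
        except Exception as exc:  # keep going; report at the end
            failed[model] = f"{type(exc).__name__}: {exc}"
    return succeeded, failed


if __name__ == "__main__":
    # Stand-in for `from qwodel import quantize`, used here so the
    # example runs without downloading any model.
    def fake_quantize(model_path: str, **_: Any) -> str:
        if "13b" in model_path:
            raise RuntimeError("out of disk space")
        return f"./quantized/{model_path.split('/')[-1]}.gguf"

    ok, bad = quantize_many(
        ["meta-llama/Llama-2-7b-hf", "meta-llama/Llama-2-13b-hf"],
        fake_quantize,
        backend="gguf",
        format="Q4_K_M",
        output_dir="./quantized",
    )
    print("succeeded:", ok)
    print("failed:", bad)
```

With qwodel installed, `quantize_fn` would be the `quantize` function imported above, called with the same keyword arguments.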

Custom Progress Callback

from qwodel import Quantizer

def progress_handler(progress: int, stage: str, message: str):
    print(f"[{progress}%] {stage}: {message}")

quantizer = Quantizer(
    backend="gguf",
    model_path="./my-model",
    output_dir="./output",
    progress_callback=progress_handler
)

quantizer.quantize(format="Q4_K_M")
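
A callback can also keep state instead of only printing. The sketch below builds a handler with the same `(progress, stage, message)` signature as above that logs each update and records it for later inspection. `make_progress_recorder` is a hypothetical helper, and how often qwodel invokes the callback is not specified here.

```python
import logging
from typing import Callable, List, Optional, Tuple

ProgressEvent = Tuple[int, str, str]


def make_progress_recorder(
    logger: Optional[logging.Logger] = None,
) -> Tuple[Callable[[int, str, str], None], List[ProgressEvent]]:
    """Return (handler, events); handler appends each update to events."""
    log = logger or logging.getLogger("qwodel.progress")
    events: List[ProgressEvent] = []

    def handler(progress: int, stage: str, message: str) -> None:
        events.append((progress, stage, message))
        log.info("[%3d%%] %s: %s", progress, stage, message)

    return handler, events


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    handler, events = make_progress_recorder()
    # Simulated updates; in real use, pass `handler` as progress_callback.
    handler(0, "load", "reading weights")
    handler(100, "done", "finished")
    print(f"{len(events)} updates recorded")
```

The returned `handler` would be passed as `progress_callback=handler` when constructing the `Quantizer`.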

Documentation

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

License

MIT License - see LICENSE for details.

Acknowledgments

Qwodel builds upon the excellent work of:

llama.cpp for GGUF quantization
llm-compressor for AWQ quantization
CoreMLTools for CoreML conversion

Release history

Version	Release date
1.4.0	Apr 9, 2026
1.3.0	Feb 25, 2026
0.0.13	Feb 21, 2026
0.0.12	Feb 18, 2026
0.0.11	Feb 18, 2026
0.0.10	Feb 18, 2026
0.0.9	Feb 18, 2026
0.0.8	Feb 18, 2026 (this version)
0.0.7	Feb 18, 2026
0.0.6	Feb 18, 2026
0.0.5	Feb 18, 2026
0.0.4	Feb 17, 2026
0.0.0	Feb 18, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

qwodel-0.0.8.tar.gz (208.6 kB view details)

Uploaded Feb 18, 2026 Source

Built Distribution

qwodel-0.0.8-py3-none-any.whl (208.4 kB view details)

Uploaded Feb 18, 2026 Python 3

File details

Details for the file qwodel-0.0.8.tar.gz.

File metadata

Download URL: qwodel-0.0.8.tar.gz
Upload date: Feb 18, 2026
Size: 208.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for qwodel-0.0.8.tar.gz
Algorithm	Hash digest
SHA256	`77efd9c68c8a5dbcd08c53c2c2289ce5099a522f0e9174f5d530797d5135ad19`
MD5	`ca44e9fac8165fef7d33054a67e5049c`
BLAKE2b-256	`25d4397afabf310c15ff5f0674a3edd46c118b416d73bf2b351e0db21bff51dd`
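
A downloaded file can be checked against the SHA256 digest listed above using Python's standard library. This is a minimal sketch: `sha256_of` and `verify_download` are hypothetical helper names, and the file is assumed to sit in the current directory under its published name.

```python
import hashlib
import hmac
from pathlib import Path
from typing import Union

# SHA256 digest published above for qwodel-0.0.8.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "77efd9c68c8a5dbcd08c53c2c2289ce5099a522f0e9174f5d530797d5135ad19"
)


def sha256_of(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Union[str, Path], expected_hex: str) -> bool:
    """Return True if the file's SHA256 matches the expected digest."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


if __name__ == "__main__":
    sdist = Path("qwodel-0.0.8.tar.gz")  # assumed download location
    if sdist.exists():
        print("match" if verify_download(sdist, EXPECTED_SDIST_SHA256) else "MISMATCH")
    else:
        print(f"{sdist} not found; download it first")
```

Reading in chunks keeps memory use constant regardless of file size; the same functions work for the wheel with its own digest.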

File details

Details for the file qwodel-0.0.8-py3-none-any.whl.

File metadata

Download URL: qwodel-0.0.8-py3-none-any.whl
Upload date: Feb 18, 2026
Size: 208.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for qwodel-0.0.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e6fd8e5e8fb1f57e1b8321b291b21743dac698a5a214d3f1962ec0f2d7a273cc`
MD5	`1360044c53efda36fb807c9e42da422a`
BLAKE2b-256	`271378b1a164306900698715fd78ea4dc1bfd1250fa73ac446b933a450c4f415`
