Implementing auto adpq

auto_adpq

Adaptive Post-Training Quantization tooling (replicating AdpQ)

This repository implements tools and reference code to reproduce the ideas from AdpQ: A Zero-shot Calibration Free Adaptive Post Training Quantization Method for LLMs.

This README explains how to install, run tests, build documentation (including multi-version docs), and contribute.

Installation

Install from PyPI (recommended):

python -m pip install auto_adpq

Install the latest development version directly from GitHub:

python -m pip install "git+https://github.com/Tfloow/auto_adpq.git"

To develop locally (editable install):

git clone https://github.com/Tfloow/auto_adpq.git
cd auto_adpq
python -m pip install -e .

Makefile helper:

# Run formatting, linting, coverage and docs targets as defined in Makefile
make

Quick usage

Import the package and use the public API. Example (replace with real API):

from auto_adpq import Auto_AdpQ

Add a short usage snippet here specific to the package functions you expect users to try first.

Running tests & linters

Test coverage: 91%

  • Run tests with pytest:
pytest -q
  • Run full coverage report (Makefile target):
make coverage
  • Format & lint with ruff (Makefile target):
make ruff

Debug mode

The package can emit logs through Python's logging module. To enable them, set the environment variable AUTO_ADPQ_DEBUG:

# Linux / macOS (bash, zsh)
export AUTO_ADPQ_DEBUG=1

# Windows (PowerShell)
$Env:AUTO_ADPQ_DEBUG = "1"
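
The variable can also be set from within Python. The sketch below rests on two assumptions that this README does not confirm: that the package reads AUTO_ADPQ_DEBUG at import time, and that any value other than an empty string or "0" enables logging. The helper debug_enabled is hypothetical and only illustrates that kind of check.

```python
import logging
import os


def debug_enabled(environ=os.environ):
    """Hypothetical check: any value other than '' or '0' counts as enabled."""
    return environ.get("AUTO_ADPQ_DEBUG", "0") not in ("", "0")


# Set the flag before importing auto_adpq so an import-time check can see it.
os.environ["AUTO_ADPQ_DEBUG"] = "1"

if debug_enabled():
    # Standard-library logging setup; the package's own handlers may differ.
    logging.basicConfig(level=logging.DEBUG)
```

Setting the variable in the shell remains the simpler option. The in-Python form is mainly useful in notebooks, where the shell environment is harder to control.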

Documentation

The documentation can be found here.

Building the documentation

This project uses Sphinx for documentation. There are two common workflows:

  • Build a single-version site (useful for local writing and previews):
python -m pip install -r docs/requirements.txt
python -m sphinx -b html docs docs/_build/html
  • Build a multi-version site using sphinx-multiversion (we configure this in docs/conf.py). This produces one static site containing each built branch and tag (useful for publishing versioned docs with a dropdown selector):
python -m pip install -r docs/requirements.txt
sphinx-multiversion docs docs/_build/html-mv

Notes about versions

  • The project includes a small template docs/_templates/versions.html which renders a versions dropdown when the site is built with sphinx-multiversion.
  • Adjust smv_tag_whitelist and smv_branch_whitelist in docs/conf.py to control which tags/branches are included in the build.
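
Both settings are regular expressions matched against ref names. The following sketch shows how a pair of patterns could be checked before they go into docs/conf.py. The patterns, tag names, and branch names are illustrative assumptions, not the values this project uses, and the selected helper is hypothetical.

```python
import re

# Values as they might appear in docs/conf.py (illustrative patterns only).
smv_tag_whitelist = r"^v\d+\.\d+\.\d+$"   # release tags such as v0.3.4
smv_branch_whitelist = r"^(main|dev)$"    # hypothetical branch names


def selected(refs, pattern):
    """Return the refs that the whitelist pattern would keep."""
    return [ref for ref in refs if re.match(pattern, ref)]


tags = ["v0.3.4", "v0.3.4-rc1", "experiment"]
branches = ["main", "dev", "feature/packing"]

print(selected(tags, smv_tag_whitelist))
print(selected(branches, smv_branch_whitelist))
```

Anchoring the patterns with ^ and $ keeps pre-release tags and feature branches out of the published site.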

Tasklist

  • Solve the data packing issue (#1)
  • Optimize the Pydantic model AdpQQuantizedWeights
    • Creating a new object currently incurs a major overhead because every field is validated. Since the class is only used internally, Pydantic could be dropped, provided that proper dump and load functions are kept.
  • Support whole models and integrate with .safetensors
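
As a rough illustration of the Pydantic task above, a standard-library dataclass can offer dump and load without per-field validation. This is a minimal sketch under stated assumptions: the class name and its fields (group_size, scales, codes) are invented for the example and do not reflect the real AdpQQuantizedWeights layout.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class QuantizedWeightsSketch:
    """Hypothetical stand-in for AdpQQuantizedWeights; fields are invented."""

    group_size: int
    scales: list
    codes: list

    def __post_init__(self):
        # One cheap structural check instead of validating every field.
        if self.group_size <= 0:
            raise ValueError("group_size must be positive")

    def dump(self) -> str:
        """Serialize the object to a JSON string."""
        return json.dumps(asdict(self))

    @classmethod
    def load(cls, payload: str) -> "QuantizedWeightsSketch":
        """Rebuild the object from a string produced by dump()."""
        return cls(**json.loads(payload))


original = QuantizedWeightsSketch(group_size=2, scales=[0.5, 0.25], codes=[1, -3, 2, 0])
restored = QuantizedWeightsSketch.load(original.dump())
```

The trade-off is that type errors in the fields are no longer caught at construction time, so the round trip should be covered by tests.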

Quantized models

Pre-quantized models are available in this collection. They are simulated quantized models: the weights are stored as bf16 values rather than in a packed quantized format. Storing them in the custom format would require either an algorithm to reconstruct the full weights at runtime or a custom CUDA kernel, which is a substantial effort.

Nonetheless, these models exhibit the same quality and rounding errors that a truly quantized model would.
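
The idea of a simulated model can be sketched with a generic quantize-then-dequantize step. The example uses plain symmetric round-to-nearest quantization as a stand-in; it is not the AdpQ algorithm, and the function name fake_quantize is hypothetical. Python floats stand in for bf16 here.

```python
def fake_quantize(weights, bits=4):
    """Round-to-nearest symmetric quantization followed by dequantization.

    Generic illustration only, not the AdpQ method.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit signed codes
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Integer codes that a packed format would store.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    # Values a simulated model stores instead: floats on the quantization grid.
    return [c * scale for c in codes]


weights = [0.7, -0.33, 0.1, 0.0]
simulated = fake_quantize(weights)
errors = [abs(w - s) for w, s in zip(weights, simulated)]
```

The simulated weights carry the same rounding error as the integer codes would after dequantization, which is why such a model is still representative for quality evaluation even though it saves no memory.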

Contributing

Contributions are welcome. A suggested workflow:

  1. Fork the repository and create a feature branch.
  2. Add tests for new functionality.
  3. Run ruff to format and lint.
  4. Open a pull request describing the change.

Please include unit tests and keep the public API stable when possible.

Development notes

  • Docs templates: docs/_templates/versions.html — version switcher used by sphinx-multiversion.
  • Makefile targets: make ruff, make coverage, make docs (runs single and multiversion builds).

License

This project is licensed under the Apache License 2.0.
