Package for applying ao techniques to GPU models

Project description

torchao: PyTorch Architecture Optimization

Note: This repository is currently under heavy development - if you have suggestions on the API or use cases you'd like covered, please open a GitHub issue.

Introduction

torchao is a PyTorch native library for optimizing your models using lower precision dtypes, techniques like quantization and sparsity, and performant kernels.

Get Started

To try out our APIs, check out the API examples for quantization (including autoquant), sparsity, and dtypes.
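
The core idea behind autoquant - benchmarking several candidate implementations on a representative input and keeping the fastest - can be sketched in plain Python. This is an illustration of the technique only, not the torchao API; all names here are hypothetical.

```python
import time

def fastest_variant(variants, x, repeats=100):
    """Time each candidate implementation on a sample input and
    return the name of the fastest one (illustrative only)."""
    best_name, best_time = None, float("inf")
    for name, fn in variants.items():
        start = time.perf_counter()
        for _ in range(repeats):
            fn(x)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name

# Two toy "kernels" standing in for, e.g., an fp32 vs. a quantized matmul path.
variants = {
    "baseline": lambda x: [v * 2.0 for v in x],
    "fused":    lambda x: [v + v for v in x],
}
choice = fastest_variant(variants, list(range(256)))
```

In the real library this per-shape tuning happens per layer, which is why results vary across model shapes and GPUs.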

Installation

Note: this library makes liberal use of several new features in PyTorch; it's recommended to use it with the current nightly or the latest stable version of PyTorch.

  1. From PyPI:
pip install torchao
  2. From source:
git clone https://github.com/pytorch-labs/ao
cd ao
pip install -e .

Key Features

The library provides:

  1. Support for lower-precision dtypes such as nf4 and uint4 that are torch.compile friendly
  2. Quantization algorithms such as dynamic quantization, SmoothQuant, and GPTQ that run on CPU, GPU, and mobile:
  • Int8 dynamic activation quantization
  • Int8 and int4 weight-only quantization
  • Int8 dynamic activation quantization with int4 weight quantization
  • GPTQ and SmoothQuant
  • A high-level autoquant API and kernel auto-tuner targeting SOTA performance across varying model shapes on consumer/enterprise GPUs
  3. Sparsity algorithms such as Wanda that help improve the accuracy of sparse networks
  4. Integration with other PyTorch native libraries like torchtune and ExecuTorch
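
To make the quantization schemes above concrete, here is a plain-Python sketch of the arithmetic behind int8 weight-only quantization (an illustration of the math, not the torchao implementation): weights are scaled into the signed int8 range, stored as integers, and dequantized when used.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: w ~= q * scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard against all-zero weights
    q = [round(w / scale) for w in weights]            # integers in [-127, 127]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from integers and one scale."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
```

Weight-only schemes like this shrink memory traffic (often the bottleneck in inference) while keeping activations in floating point; the dynamic-activation variants additionally quantize activations at runtime.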

Our Goals

torchao embodies PyTorch’s design philosophy, especially "usability over everything else". Our vision for this repository is the following:

  • Composability: Native solutions for optimization techniques that compose with both torch.compile and FSDP
    • For example, composing QLoRA with support for new dtypes
  • Interoperability: Work with the rest of the PyTorch ecosystem such as torchtune, gpt-fast and ExecuTorch
  • Transparent Benchmarks: Regularly run performance benchmarking of our APIs across a suite of Torchbench models and across hardware backends
  • Heterogeneous Hardware: Efficient kernels that can run on CPU/GPU based server (w/ torch.compile) and mobile backends (w/ ExecuTorch).
  • Infrastructure Support: Release packaging solution for kernels and a CI/CD setup that runs these kernels on different backends.

Interoperability with PyTorch Libraries

torchao has been integrated with other repositories to ease adoption:

  • torchtune is integrated with 8- and 4-bit weight-only quantization techniques, with and without GPTQ.
  • ExecuTorch is integrated with GPTQ for both 8da4w (int8 dynamic activation with int4 weight) and int4 weight-only quantization.
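
The 8da4w scheme combines two of the ideas above: activations are quantized to int8 at runtime, weights to int4 ahead of time, and the accumulation happens in integers with a single rescale at the end. A rough plain-Python sketch of that arithmetic (assuming symmetric per-tensor scales; this is not the actual kernel):

```python
def quant(vals, n_bits):
    """Symmetric quantization to a signed n-bit integer range."""
    qmax = 2 ** (n_bits - 1) - 1                       # 127 for int8, 7 for int4
    scale = max(abs(v) for v in vals) / qmax or 1.0    # guard against all-zero input
    return [round(v / scale) for v in vals], scale

def int_dot_8da4w(acts, weights):
    """Dot product with int8 dynamic activations and int4 weights:
    integer multiply-accumulate, then one float rescale at the end."""
    qa, sa = quant(acts, 8)      # activations quantized dynamically, at runtime
    qw, sw = quant(weights, 4)   # weights quantized once, ahead of time
    return sum(a * w for a, w in zip(qa, qw)) * sa * sw

approx = int_dot_8da4w([0.2, -0.5, 1.0], [0.1, 0.4, -0.2])
exact = 0.2 * 0.1 + (-0.5) * 0.4 + 1.0 * (-0.2)
```

The int4 weights keep the model small on mobile backends, while quantizing activations per call adapts the scale to each input's actual range.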

Success stories

Our kernels have been used to achieve SOTA inference performance on:

  1. Image segmentation models with sam-fast
  2. Language models with gpt-fast
  3. Diffusion models with sd-fast

License

torchao is released under the BSD 3-Clause license.

Download files

Source distribution: torchao_nightly-2024.4.26.tar.gz (98.8 kB)

Built distribution: torchao_nightly-2024.4.26-py3-none-any.whl (120.1 kB)
