PPQ is an offline quantization tool.

Minimal PPQ (mPPQ): a pruned version of PPQ

PPQ Readme

This repo continues the development of PPQ, maintaining its features and fixing bugs. I pruned most of the deprecated APIs, sample code, and functions from the original PPQ to make it more extensible and easier to use.

Key Changes

1. TargetPlatform

TargetPlatform in PPQ mixed the concepts of Precision and Device: for example, TargetPlatform.OPENVINO_INT8 existed alongside TargetPlatform.INT8.

In mPPQ, I separate the concepts of Precision and Device, and all uses of TargetPlatform are now replaced by TargetPrecision.

I also removed the platform support shipped with PPQ because it lacked proper maintenance. Instead, I added a new API to register your own platform along with its frontend parser, dispatcher, and quantizer.

2. Extension

Users need to register their own platform to perform a correct quantization. There are no pre-defined platforms in mPPQ, only a sample quantizer as an example.

Here are the steps users need to follow.

  1. Parser and Exporter

I keep onnx_parser and onnxruntime_exporter as the default graph serialization and de-serialization methods. Users can create a new parser by inheriting GraphBuilder and a new exporter by inheriting GraphExporter.

class MyParser(GraphBuilder):
    def build(self, model_object: Any, **kwargs) -> BaseGraph:
        """Parser offers the way how to read from a model object and turn it into PPQ BaseGraph.
        """
        ...

class MyExporter(GraphExporter):
    def export(
        self,
        file_path: str | os.PathLike,
        graph: BaseGraph,
        config_path: Optional[str] = None,
        **kwargs,
    ):
        """Exporter offers the way how to serialize a PPQ BaseGraph into a model object.
        ...

The model object can be an ONNX model or another format, such as an OpenVINO .xml or a QNN .dlc file.

  2. Dispatcher

The dispatcher is a core concept from PPQ: it analyzes the whole graph and decides which op should be quantized at which precision.

In mPPQ, all pre-defined dispatchers from PPQ are still available.

Users can create a new dispatcher by inheriting GraphDispatcher and implementing their own logic.
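As an illustrative sketch of the kind of logic a custom dispatcher implements (the actual GraphDispatcher interface may differ; the names and signature below are assumptions, not mPPQ's real API), a dispatcher could assign a precision to every operation based on its type:

```python
from enum import Enum, auto


class TargetPrecision(Enum):
    """Stand-in for mppq's TargetPrecision; the real enum has more members."""

    FP32 = auto()
    INT8 = auto()
    UNSPECIFIED = auto()


def dispatch_by_type(op_types: dict) -> dict:
    """Toy dispatch logic: quantize compute-heavy ops, keep the rest in FP32.

    In a real dispatcher this decision would be made per Operation object
    inside a GraphDispatcher subclass, not on a plain name->type dict.
    """
    quantizable = {"Conv", "Gemm", "MatMul"}
    return {
        name: TargetPrecision.INT8 if op_type in quantizable else TargetPrecision.FP32
        for name, op_type in op_types.items()
    }


table = dispatch_by_type({"conv1": "Conv", "relu1": "Relu", "fc": "Gemm"})
```

Note that every op receives an explicit precision; as described in the Quantizer section, leaving an op undispatched is an error in mPPQ.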

  3. Quantizer

The quantizer is a fundamental component of PPQ: it controls all the operations needed to quantize a model.

The mPPQ quantizer offers three overridable methods for users to implement:

@abstractmethod
def init_quantize_config(self, operation: Operation) -> OperationQuantizationConfig:
    r"""Return a query to the operation how it should be quantized."""
    raise NotImplementedError

@property
def default_prequant_pipeline(self) -> QuantizationOptimizationPipeline:
    r"""A simplified API to return a default quantization pipeline."""
    return QuantizationOptimizationPipeline([])

@property
def default_quant_pipeline(self) -> QuantizationOptimizationPipeline:
    r"""A simplified API to return a default quantization pipeline."""
    raise NotImplementedError

The quantizer no longer works with the dispatcher: the user's dispatcher (or a builtin one) must now dispatch every operation in the graph to a specific precision. If any operation is left at Precision.UNSPECIFIED, the quantizer raises an error.
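The validation behavior described above can be sketched as follows (a minimal, self-contained illustration; the names are assumptions, not mPPQ's actual internals):

```python
from enum import Enum, auto


class TargetPrecision(Enum):
    """Stand-in for mppq's TargetPrecision enum."""

    FP32 = auto()
    INT8 = auto()
    UNSPECIFIED = auto()


def validate_dispatch(dispatch_table: dict) -> None:
    """Raise if any operation was left undispatched, mirroring the behavior
    described above: the quantizer refuses graphs containing UNSPECIFIED ops."""
    undispatched = [
        name
        for name, precision in dispatch_table.items()
        if precision is TargetPrecision.UNSPECIFIED
    ]
    if undispatched:
        raise ValueError(f"operations not dispatched to a precision: {undispatched}")


# A fully dispatched table passes silently.
validate_dispatch({"conv1": TargetPrecision.INT8, "relu1": TargetPrecision.FP32})
```

This is why, when you design your own dispatcher, it must produce a decision for every op in the graph.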

  4. Platform

In mPPQ, I provide a registry for adding new parsers, exporters, dispatchers, and quantizers to a specific platform from an external codebase.

from mppq.api import load_quantizer, register_platform, export_ppq_graph

MyPlatformID = 1

register_platform(
    MyPlatformID,
    dispatcher={"mydisp": MyDispatcher},
    quantizer={"myquant": MyQuantizer},
    parsers={"myparser": MyParser},
    exporters={"myexporter": MyExporter},
)

quantizer = load_quantizer("mymodel.onnx", MyPlatformID)
quantized_graph = quantizer.quantize()
export_ppq_graph(quantized_graph, "quantized_mymodel.onnx")

Users can use a builtin dispatcher by specifying its name in load_quantizer, but I highly recommend learning the details of your platform and designing your own dispatcher and quantizer.

register_platform(
    MyPlatformID,
    dispatcher={},  # empty: pass a builtin dispatcher name to load_quantizer instead
    quantizer={"myquant": MyQuantizer},
    parsers={},  # use builtin onnx parser
    exporters={},  # use builtin onnx exporter
)

quantizer = load_quantizer("mymodel.onnx", MyPlatformID, dispatcher="allin")

  5. Operation

In mPPQ, most ONNX operators up to opset 19 are supported. To add support for a new operator from your own codebase, register a new operation to a specific platform.

from mppq.api import register_operation

@register_operation("myop", MyPlatformID)
def myop_forward(
    op: Operation,
    values: Sequence[torch.Tensor],
    ctx: Optional[TorchBackendContext] = None,
    **kwargs,
) -> torch.Tensor | Tuple[torch.Tensor, ...]:
    """
    Args:
        op: operation information (precision, attributes, types, etc.)
        values: input tensors
        ctx: execution device information

    Returns:
        a result tensor or a tuple of result tensors
    """
    ...
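As a concrete sketch of a forward function matching the stub above (the op name "Swish" and the assumption that `values` holds exactly one input tensor are illustrative, not part of mPPQ):

```python
from typing import Optional, Sequence

import torch


def swish_forward(
    op,  # Operation metadata (precision, attributes, ...); unused here
    values: Sequence[torch.Tensor],
    ctx=None,  # execution device information; unused here
    **kwargs,
) -> torch.Tensor:
    """Hypothetical forward for a custom element-wise 'Swish' op: x * sigmoid(x)."""
    (x,) = values  # this op takes a single input tensor
    return x * torch.sigmoid(x)


out = swish_forward(None, [torch.zeros(3)])
```

A forward function like this lets the executor run the custom op during calibration and error analysis, just like a builtin operator.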

Contribution

All mPPQ Python code is clean under flake8, black, and pyright (except for mppq/executor/op/default.py).

Test coverage: 51% (v0.7.1)

Acknowledgement

PPQ
