
Smash your AI models



Simply make AI models faster, cheaper, smaller, greener!


Documentation






Introduction

Pruna is a model optimization framework built for developers, enabling you to deliver faster, more efficient models with minimal overhead. It provides a comprehensive suite of compression algorithms, including caching, quantization, pruning, distillation, and compilation, to make your models:

  • Faster: Accelerate inference times through advanced optimization techniques
  • Smaller: Reduce model size while maintaining quality
  • Cheaper: Lower computational costs and resource requirements
  • Greener: Decrease energy consumption and environmental impact

The toolkit is designed with simplicity in mind: optimizing a model requires just a few lines of code. It supports a variety of model types, including LLMs, diffusion and flow matching models, vision transformers, speech recognition models, and more.

Installation

Pruna is currently available for installation on Linux, macOS, and Windows. Note, however, that some algorithms are restricted to certain operating systems and might not be available on all platforms.

Before installing, ensure you have:

  • Python 3.9 or higher
  • Optional: CUDA toolkit for GPU support
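As a quick sanity check, you can verify both prerequisites from the command line (the `nvidia-smi` call assumes an NVIDIA driver is installed and is only relevant for GPU support):

```shell
# Verify the interpreter meets Pruna's minimum Python version (3.9)
python3 -c 'import sys; assert sys.version_info >= (3, 9), sys.version'

# Optional: confirm a CUDA-capable GPU and driver are visible
nvidia-smi || echo "No NVIDIA GPU detected; CPU-only usage is still possible"
```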

Option 1: Install Pruna using pip

Pruna is available on PyPI, so you can install it using pip:

pip install pruna

Option 2: Install Pruna from source

You can also install Pruna directly from source by cloning the repository and installing the package in editable mode:

git clone https://github.com/PrunaAI/pruna.git
cd pruna
pip install -e .

Quick Start

Getting started with Pruna is easy-peasy pruna-squeezy!

First, load any pre-trained model. Here's an example using Stable Diffusion:

from diffusers import StableDiffusionPipeline
base_model = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")

Then, use Pruna's smash function to optimize your model. Pruna provides a variety of optimization algorithms, which you can combine to get the best possible results. You can customize the optimization process using SmashConfig:

from pruna import smash, SmashConfig

# Create and smash your model
smash_config = SmashConfig()
smash_config["cacher"] = "deepcache"
smash_config["compiler"] = "stable_fast"
smashed_model = smash(model=base_model, smash_config=smash_config)

Your model is now optimized and you can use it as you would use the original model:

smashed_model("An image of a cute prune.").images[0]

You can then use our evaluation interface to measure the performance of your model:

from pruna.evaluation.task import Task
from pruna.evaluation.evaluation_agent import EvaluationAgent
from pruna.data.pruna_datamodule import PrunaDataModule

datamodule = PrunaDataModule.from_string("LAION256")
datamodule.limit_datasets(10)
task = Task("image_generation_quality", datamodule=datamodule)
eval_agent = EvaluationAgent(task)
eval_agent.evaluate(smashed_model)

That was the minimal example; looking for the maximal one? Check out our documentation for an overview of all supported algorithms, as well as our tutorials for more use cases and examples.

Algorithm Overview

Since Pruna offers a broad range of optimization algorithms, the following overview describes each technique category available in Pruna. For a detailed description of each algorithm, including its impact on speed, memory, and output quality, have a look at our documentation.

  • batcher: Groups multiple inputs together to be processed simultaneously, improving computational efficiency and reducing processing time.
  • cacher: Stores intermediate results of computations to speed up subsequent operations.
  • compiler: Optimizes the model with instructions for specific hardware.
  • quantizer: Reduces the precision of weights and activations, lowering memory requirements.
  • pruner: Removes less important or redundant connections and neurons, resulting in a sparser, more efficient network.
  • factorizer: Batches several small matrix multiplications into one large fused operation.
  • kernel: Applies specialized GPU routines that speed up parts of the computation.
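Each technique category above corresponds to a SmashConfig key, so techniques can be freely combined in a single configuration. A minimal sketch of such a combined config follows; the key/value pairs mirror the Quick Start example, and which algorithm names are valid for a given category and model type is something to check in the documentation:

```python
from pruna import smash, SmashConfig

# One SmashConfig can hold several technique categories at once;
# each key is a category from the overview, each value an algorithm name.
smash_config = SmashConfig()
smash_config["cacher"] = "deepcache"      # reuse intermediate computations
smash_config["compiler"] = "stable_fast"  # hardware-specific compilation

# smash() then applies every configured algorithm in one call:
# smashed_model = smash(model=base_model, smash_config=smash_config)
```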





FAQ and Troubleshooting

If you cannot find an answer to your question or problem in our documentation, in our FAQs, or in an existing issue, we are happy to help you! You can get help from the Pruna community on Discord, join our Office Hours, or open an issue on GitHub.

Contributors

The Pruna package was made with 💜 by the Pruna AI team and our amazing contributors. Contribute to the repository to become part of the Pruna family!


Citation

If you use Pruna in your research, feel free to cite the project! 💜

@misc{pruna,
    title = {Efficient Machine Learning with Pruna},
    year = {2023},
    note = {Software available from pruna.ai},
    url = {https://www.pruna.ai/}
}

