
Composable model compression for PyTorch: prune, quantize, and ship.


🧠 Lobotomizer

Take any nn.Module. Lobotomize it. Make it smaller, faster, cheaper.

Composable model compression for PyTorch. Apply quantization, pruning, knowledge distillation, and more via a one-liner, an explicit pipeline, or the CLI.

Over time, the project aims to become an easy-to-use collection of popular compression techniques to support R&D in the field.

Installation

pip install lobotomizer

# With optional extras
pip install lobotomizer[all]       # everything
pip install lobotomizer[dev]       # pytest
pip install lobotomizer[pruning]   # torch-pruning
pip install lobotomizer[quantize]  # bitsandbytes

Quick Start

One-liner

import lobotomizer as lob

result = lob.compress(model, recipe="balanced")
print(result.summary())
result.save("compressed/")

Explicit pipeline

import lobotomizer as lob

result = lob.Pipeline([
    lob.Prune(method="l1_unstructured", sparsity=0.4),
    lob.Quantize(method="dynamic"),
]).run(model)

print(result.summary())

CLI

# Compress with a recipe
lobotomize model.pt --recipe balanced --output compressed/

# Compress with explicit options
lobotomize model.pt --prune l1_unstructured --sparsity 0.3 --quantize dynamic -o out/

# Profile only
lobotomize model.pt --profile-only --input-shape "1,3,224,224"

# List available recipes
lobotomize --list-recipes

Summary output

Real results from compressing Whisper-tiny (39M params) with dynamic int8 quantization:

┌───────────────────────┬────────────┬────────────┬────────┐
│ Metric                │ Before     │ After      │ Δ      │
├───────────────────────┼────────────┼────────────┼────────┤
│ param_count           │ 37,760,640 │ 37,760,640 │ +0.0%  │
│ param_count_trainable │ 37,760,640 │ 21,245,568 │ -43.7% │
│ size_mb               │ 144.10     │ 97.02      │ -32.7% │
└───────────────────────┴────────────┴────────────┴────────┘

Available Stages

Pruning

Method               Description
l1_unstructured      Remove weights with smallest L1 magnitude
random_unstructured  Remove random weights
l1_structured        Remove entire channels by L1 norm (Conv2d)
random_structured    Remove random channels (Conv2d)
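
Pruning stages compose in a pipeline like any other stage. A minimal sketch using the Prune API from the Quick Start (the sparsity values are illustrative):

import lobotomizer as lob

result = lob.Pipeline([
    # Structured pass: drop 20% of Conv2d channels by L1 norm
    lob.Prune(method="l1_structured", sparsity=0.2),
    # Unstructured pass: zero out 30% of remaining weights by L1 magnitude
    lob.Prune(method="l1_unstructured", sparsity=0.3),
]).run(model)

print(result.summary())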

Quantization

Method   Description
dynamic  Dynamic int8 quantization (Linear layers)
static   Static int8 quantization (requires calibration data)
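
Dynamic quantization needs no extra data, as in the Quick Start. Static quantization requires representative inputs; the calibration_data keyword in the sketch below is hypothetical (mirroring how Distill receives training_data) and is shown only for illustration:

import lobotomizer as lob

# Dynamic int8: quantizes Linear layers, no calibration pass needed
result = lob.Pipeline([lob.Quantize(method="dynamic")]).run(model)

# Static int8: needs representative batches. `calibration_data` is a
# hypothetical keyword used here purely as a sketch.
result = lob.Pipeline([
    lob.Quantize(method="static"),
]).run(model, calibration_data=calib_loader)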

Knowledge Distillation

Train a compressed student model to mimic the original teacher:

import lobotomizer as lob

# Logit-based distillation (Hinton-style)
result = lob.Pipeline([
    lob.StructuredPrune(sparsity=0.3),
    lob.Distill(method="logit", temperature=4.0, epochs=5, lr=1e-4),
]).run(model, training_data=train_loader)

# Feature matching: align intermediate representations
result = lob.Pipeline([
    lob.Distill(
        method="feature",
        feature_layers={"fc1": "fc1", "fc2": "fc2"},
        epochs=10,
    ),
]).run(model, training_data=train_loader)

# Both logit + feature distillation
result = lob.Pipeline([
    lob.Distill(method="both", alpha=0.7, temperature=4.0, epochs=10),
]).run(model, training_data=train_loader)

Parameter       Description
method          "logit", "feature", or "both"
temperature     Softmax temperature for logit KD (default: 4.0)
alpha           KD loss weight; 1-alpha goes to task loss (default: 1.0)
feature_layers  dict[str, str] mapping student → teacher layer names (auto-matched if None)
teacher         nn.Module, file path, or None (uses original model)
epochs          Training epochs (default: 5)
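
Per the table above, the teacher can also be supplied explicitly instead of defaulting to the original model. A minimal sketch (the checkpoint path is illustrative):

import lobotomizer as lob

result = lob.Pipeline([
    lob.Distill(
        method="both",
        teacher="teacher_checkpoint.pt",  # nn.Module, file path, or None
        alpha=0.7,
        temperature=4.0,
        epochs=5,
    ),
]).run(model, training_data=train_loader)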

YAML recipe:

stages:
  - type: structured_prune
    sparsity: 0.3
  - type: distill
    method: logit
    temperature: 4.0
    epochs: 5

Recipes

Recipes are YAML files that define a sequence of stages:

name: balanced
description: "Structured pruning + dynamic int8 quantization"
stages:
  - type: prune
    method: l1_unstructured
    sparsity: 0.25
  - type: quantize
    method: dynamic
    dtype: qint8

Built-in recipes: balanced

Use custom recipes: lob.compress(model, recipe="path/to/recipe.yaml")
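
Putting it together, a sketch of writing a custom recipe and running it (the file name and stage values are illustrative):

import lobotomizer as lob

# A custom recipe following the schema shown above
recipe = """\
name: aggressive
description: "Heavier pruning, then dynamic int8"
stages:
  - type: prune
    method: l1_unstructured
    sparsity: 0.5
  - type: quantize
    method: dynamic
"""
with open("aggressive.yaml", "w") as f:
    f.write(recipe)

result = lob.compress(model, recipe="aggressive.yaml")
print(result.summary())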

How It Works

Model → [Stage 1] → [Stage 2] → ... → Result
         Prune        Quantize
  1. Pipeline: a list of Stage objects run sequentially
  2. Stages: each stage (Prune, Quantize, ...) transforms a deep copy of the model in place
  3. Profiler: measures param count, size, and FLOPs before/after each stage
  4. Result: holds the compressed model, profiles, and stage history; can save and summarize
  5. Recipes: YAML configs that build pipelines from named stages

The original model is never mutated.
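
To make the stage pattern concrete, here is a plain-PyTorch sketch of the same copy-then-transform flow. This illustrates the pattern, not Lobotomizer's actual internals:

import copy
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
work = copy.deepcopy(model)  # the original is never mutated

# Stage 1: prune 40% of each Linear layer's weights by L1 magnitude
for module in work.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.4)
        prune.remove(module, "weight")  # bake the pruning mask into the tensor

# Stage 2: dynamic int8 quantization of the Linear layers
work = torch.ao.quantization.quantize_dynamic(work, {nn.Linear}, dtype=torch.qint8)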

Examples

See examples/ for complete, runnable scripts:

Script                 What it does
resnet50_edge.py       ResNet50 pruned + quantized for edge deployment
bert_quantize.py       BERT-base quantized for faster CPU inference
whisper_compress.py    Whisper small compressed for on-device transcription
yolo_edge.py           YOLOv8n compressed for real-time edge inference
mobilevit_compress.py  MobileViT further compressed for ultra-constrained devices

Each script is self-contained and falls back to a dummy model if optional dependencies aren't installed. Note: not all of these are fully tested yet.

Roadmap

  • v0.1: Prune, Quantize, Pipeline, profiler, recipes, CLI
  • v0.2: Knowledge distillation (logit, feature)
  • v0.3: Sparsity and low-rank techniques
  • v0.4: Hardware support (ONNX export, hardware-aware profiling)
  • v0.5: Search & automation (sweeps, finding lobotomization pipelines that hit given targets)

Longer term: progressively support and wrap more techniques, layer types, and tools.

Contributing

Contributions welcome! Let's grow the lobotomization movement.

  1. Fork & clone
  2. pip install -e ".[dev]"
  3. pytest
  4. PR

License

MIT

Download files

Download the file for your platform.

Source Distribution

lobotomizer-0.2.0.tar.gz (34.8 kB)


Built Distribution


lobotomizer-0.2.0-py3-none-any.whl (30.9 kB)


File details

Details for the file lobotomizer-0.2.0.tar.gz.

File metadata

  • Download URL: lobotomizer-0.2.0.tar.gz
  • Size: 34.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for lobotomizer-0.2.0.tar.gz:

Algorithm    Hash digest
SHA256       50782d76afa9a913e90720cac5410b6f1f334d4fe260694d76097fdd76e67d56
MD5          17882e1b3a25a85caca809b1eec7b137
BLAKE2b-256  59ed5e97d843198cc264e24e247f0b2371866ddd8c415eebb2b0fb178a47c884


Provenance

The following attestation bundles were made for lobotomizer-0.2.0.tar.gz:

Publisher: publish.yml on usmank13/lobotomizer


File details

Details for the file lobotomizer-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: lobotomizer-0.2.0-py3-none-any.whl
  • Size: 30.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for lobotomizer-0.2.0-py3-none-any.whl:

Algorithm    Hash digest
SHA256       58f8c0a1c8ac7261875ff5d64c48743b411be523c35851bd96f1f6c21a220ca5
MD5          9d0bda7bb47d16bf98fb91437cb457d9
BLAKE2b-256  03459faa72aac6648d8bdcfe484f184842744e8155fe9601ce6d4f34a3b40616


Provenance

The following attestation bundles were made for lobotomizer-0.2.0-py3-none-any.whl:

Publisher: publish.yml on usmank13/lobotomizer

