A library for structured pruning and bias visualization of large language models

Development Status
- 3 - Alpha
Intended Audience
- Science/Research
License
- OSI Approved :: Apache Software License
Operating System
- OS Independent
Programming Language
- Python :: 3
Topic
- Scientific/Engineering :: Artificial Intelligence

Project description

OptiPFair

Optimize LLMs

A Python library for structured pruning and bias visualization of large language models, with a focus on GLU architectures and fairness analysis.

Overview

OptiPFair prunes large language models with the aim of reducing their size while preserving as much of their performance as possible. It implements structured pruning methods, starting with MLP pruning for GLU architectures (as used in models such as LLaMA and Mistral).

Key features:

- GLU architecture-aware pruning that preserves model structure
- Multiple neuron importance calculation methods
- Support for both pruning percentage and target expansion rate
- Simple Python API and CLI
- Progress tracking and detailed statistics
- NEW: Bias visualization tools to analyze and understand fairness issues
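
To illustrate what "GLU architecture-aware" pruning involves, here is a minimal sketch in plain Python. It is not OptiPFair's implementation. It assumes the usual GLU MLP layout, with `gate_proj` and `up_proj` holding one row per intermediate neuron and `down_proj` holding one column per intermediate neuron; the function name `prune_glu_neurons` is hypothetical.

```python
def prune_glu_neurons(gate_proj, up_proj, down_proj, keep):
    """Keep only the intermediate neurons whose indices are in `keep`.

    gate_proj, up_proj: lists of rows, one row per intermediate neuron.
    down_proj: list of rows, one row per hidden unit and one column per
    intermediate neuron.
    """
    keep = sorted(keep)
    # Neuron i owns row i of gate_proj and row i of up_proj ...
    new_gate = [gate_proj[i] for i in keep]
    new_up = [up_proj[i] for i in keep]
    # ... and column i of down_proj, so all three are cut together.
    new_down = [[row[i] for i in keep] for row in down_proj]
    return new_gate, new_up, new_down


# Toy layer (assumed sizes): hidden size 2, intermediate size 3.
gate = [[1, 2], [3, 4], [5, 6]]
up = [[7, 8], [9, 10], [11, 12]]
down = [[1, 2, 3], [4, 5, 6]]

new_gate, new_up, new_down = prune_glu_neurons(gate, up, down, keep=[0, 2])
print(len(new_gate), len(new_up), len(new_down[0]))
```

Removing the same indices from all three matrices keeps their shapes compatible, so the pruned layer still computes a valid GLU forward pass.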

Installation

# From PyPI 
pip install optipfair

# From source
git clone https://github.com/yourusername/optipfair.git
cd optipfair
pip install -e .

Quick Start

Python API

from transformers import AutoModelForCausalLM
from optipfair import prune_model

# Load a pre-trained model
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

# Prune 20% of neurons from MLP layers using MAW method
pruned_model, stats = prune_model(
    model=model,
    pruning_type="MLP_GLU",
    neuron_selection_method="MAW",
    pruning_percentage=20,
    show_progress=True,
    return_stats=True
)

# Print pruning statistics
print(f"Original parameters: {stats['original_parameters']:,}")
print(f"Pruned parameters: {stats['pruned_parameters']:,}")
print(f"Reduction: {stats['reduction']:,} parameters ({stats['percentage_reduction']:.2f}%)")

# Save pruned model
pruned_model.save_pretrained("./pruned-llama-model")

Bias Visualization

from transformers import AutoModelForCausalLM, AutoTokenizer
from optipfair.bias import visualize_bias

# Load model and tokenizer
model_name = "meta-llama/Llama-3.2-1B"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Define prompt pairs that differ in demographic attributes
prompt_pairs = [
    ("The white man walked through the neighborhood. The police officer thought he", 
     "The Black man walked through the neighborhood. The police officer thought he")
]

# Generate visualizations and get bias metrics
_, metrics = visualize_bias(
    model, 
    tokenizer,
    prompt_pairs=prompt_pairs,
    visualization_types=["mean_diff", "pca"],
    layers="first_middle_last",
    output_dir="./bias_analysis"
)

# Print overall bias metrics
overall = metrics["pair_1"]["metrics"]["overall_metrics"]
print(f"Mean activation difference: {overall['mean_difference']:.6f}")
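
As a rough intuition for what a mean activation difference measures, the sketch below compares two activation vectors element by element. It assumes the metric is a mean of absolute differences, which may not match how OptiPFair computes `mean_difference`; the function name and the toy values are hypothetical.

```python
def mean_abs_difference(acts_a, acts_b):
    """Mean absolute element-wise difference between two activation vectors."""
    if len(acts_a) != len(acts_b):
        raise ValueError("activation vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(acts_a, acts_b)) / len(acts_a)


# Toy activations for the two prompts of one pair (made-up numbers).
acts_prompt_1 = [0.5, -1.0, 2.0, 0.0]
acts_prompt_2 = [0.5, -0.5, 1.0, 0.5]

difference = mean_abs_difference(acts_prompt_1, acts_prompt_2)
print(f"Mean activation difference: {difference:.6f}")
```

A value near zero would mean the model represents both prompts almost identically at that layer, while larger values point to layers where the demographic attribute changes the internal representation.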

Command-Line Interface

# Prune a model with default settings (10% pruning, MAW method)
optipfair prune --model-path meta-llama/Llama-3.2-1B --output-path ./pruned-model

# Prune with custom settings
optipfair prune \
  --model-path meta-llama/Llama-3.2-1B \
  --pruning-type MLP_GLU \
  --method MAW \
  --pruning-percentage 20 \
  --output-path ./pruned-model

# Use expansion rate instead of pruning percentage
optipfair prune \
  --model-path meta-llama/Llama-3.2-1B \
  --expansion-rate 140 \
  --output-path ./pruned-model

# Analyze a model's architecture
optipfair analyze --model-path meta-llama/Llama-3.2-1B

Neuron Selection Methods

OptiPFair supports three methods for calculating neuron importance:

- MAW (Maximum Absolute Weight) - Default method that identifies influential neurons based on the magnitude of their connections. Typically provides the best pruning results.
- VOW (Variance of Weights) - Identifies neurons based on the variance of their weights. May be useful for specific architectures.
- PON (Product of Norms) - Uses the product of L1 norms to identify important neurons. This method may be applicable in certain contexts.
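
The three scores can be sketched in plain Python as follows. This is an illustration under stated assumptions, not OptiPFair's code: each neuron is represented by its row in the gate and up projections, MAW and VOW are assumed to sum the per-row score of both projections, and PON is assumed to multiply the two L1 norms. The names `maw`, `vow`, `pon`, and `rank_neurons` are hypothetical.

```python
from statistics import pvariance


def maw(row):
    # Maximum Absolute Weight: largest magnitude among the row's weights.
    return max(abs(w) for w in row)


def vow(row):
    # Variance of Weights: population variance of the row's weights.
    return pvariance(row)


def l1_norm(row):
    return sum(abs(w) for w in row)


def pon(gate_row, up_row):
    # Product of Norms: L1 norm of the gate row times L1 norm of the up row.
    return l1_norm(gate_row) * l1_norm(up_row)


def rank_neurons(gate_proj, up_proj, method="MAW"):
    """Return neuron indices ordered from most to least important."""
    scores = []
    for gate_row, up_row in zip(gate_proj, up_proj):
        if method == "MAW":
            scores.append(maw(gate_row) + maw(up_row))  # assumed combination
        elif method == "VOW":
            scores.append(vow(gate_row) + vow(up_row))  # assumed combination
        elif method == "PON":
            scores.append(pon(gate_row, up_row))
        else:
            raise ValueError(f"unknown method: {method}")
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy projections: three neurons, two weights each (made-up numbers).
gate = [[4, 0], [2, 2], [1, -1]]
up = [[1, 0], [1, 1], [3, -3]]

for method in ("MAW", "VOW", "PON"):
    print(method, rank_neurons(gate, up, method))
```

In this toy example the three methods produce three different rankings, which shows why the choice of method can change which neurons are pruned.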

Documentation

Complete documentation is available at https://peremartra.github.io/optipfair/.

Supported Models

At the moment, OptiPFair is designed to work with transformer-based language models that use a GLU architecture in their MLP layers, including:

- LLaMA family (LLaMA, LLaMA-2, LLaMA-3)
- Mistral, Qwen, and Gemma models
- Other models with a similar GLU architecture
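
A quick way to check whether a model's MLP block follows this layout is to look for the three GLU projections. The sketch below assumes the attribute names `gate_proj`, `up_proj`, and `down_proj` used by Hugging Face's LLaMA-style implementations; OptiPFair's own compatibility check may differ, and `looks_like_glu_mlp` is a hypothetical helper.

```python
from types import SimpleNamespace

# Attribute names assumed from LLaMA-style MLP modules.
GLU_PROJECTIONS = ("gate_proj", "up_proj", "down_proj")


def looks_like_glu_mlp(mlp):
    """Return True if the module exposes all three GLU projections."""
    return all(hasattr(mlp, name) for name in GLU_PROJECTIONS)


# Stand-ins for real modules, so the example runs without downloading a model.
glu_like = SimpleNamespace(gate_proj=object(), up_proj=object(), down_proj=object())
plain_mlp = SimpleNamespace(fc1=object(), fc2=object())

print(looks_like_glu_mlp(glu_like))
print(looks_like_glu_mlp(plain_mlp))
```

With a loaded model, the same function could be applied to each decoder layer's MLP module before attempting to prune it.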

Expansion Rate vs Pruning Percentage

OptiPFair supports two ways to specify the pruning target:

- Pruning Percentage - Directly specify what percentage of neurons to remove (e.g., 20%)
- Expansion Rate - Specify the target expansion rate (ratio of intermediate size to hidden size) as a percentage (e.g., 140% instead of the default 400%)

The expansion rate approach is often more intuitive when comparing across different model scales.
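
The two targets are related by simple arithmetic, sketched below. The function name and the layer sizes are illustrative assumptions, and OptiPFair may round the target size differently.

```python
def pruning_for_expansion_rate(hidden_size, intermediate_size, target_rate):
    """Convert a target expansion rate (in percent) into a pruning percentage.

    Returns (pruning_percentage, target_intermediate_size).
    """
    current_rate = 100 * intermediate_size / hidden_size
    if target_rate >= current_rate:
        raise ValueError("target expansion rate must be below the current rate")
    # Intermediate size that yields the requested ratio (rounded to an integer).
    target_intermediate = round(hidden_size * target_rate / 100)
    removed = intermediate_size - target_intermediate
    return 100 * removed / intermediate_size, target_intermediate


# Assumed sizes: hidden 2048, intermediate 8192, i.e. a 400% expansion rate.
percentage, new_size = pruning_for_expansion_rate(2048, 8192, target_rate=140)
print(f"Keep {new_size} neurons per layer, prune {percentage:.2f}%")
```

Going from 400% to 140% therefore removes roughly 65% of the intermediate neurons, whatever the hidden size, which is what makes the expansion rate comparable across model scales.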

Future Roadmap

- Support for attention layer pruning
- Whole block pruning
- Integrated evaluation benchmarks
- Bias visualizations

Citation

If you use OptiPFair in your research, please cite:

@software{optipfair2025,
  author = {Pere Martra},
  title = {OptiPFair: A Library for Structured Pruning of Large Language Models},
  year = {2025},
  url = {https://github.com/yourusername/optipfair}
}

License

Apache 2.0

Contributing

Contributions are welcome! Please check out our contributing guidelines for details.

Release history

- 0.4.0: Apr 17, 2026
- 0.3.0: Mar 2, 2026
- 0.2.4: Jan 10, 2026
- 0.2.3: Dec 4, 2025
- 0.2.2: Nov 26, 2025
- 0.2.1: Nov 24, 2025
- 0.2.0: Oct 27, 2025
- 0.1.5: Sep 24, 2025
- 0.1.4: Jul 18, 2025
- 0.1.3: Apr 21, 2025 (this version)
- 0.1.2: Apr 13, 2025
- 0.1.1: Apr 13, 2025
- 0.1.0: Apr 12, 2025

Download files

Source Distribution

optipfair-0.1.3.tar.gz (44.4 kB)

Uploaded Apr 21, 2025 Source

Built Distribution

optipfair-0.1.3-py3-none-any.whl (34.5 kB)

Uploaded Apr 21, 2025 Python 3

File details

Details for the file optipfair-0.1.3.tar.gz.

File metadata

Download URL: optipfair-0.1.3.tar.gz
Upload date: Apr 21, 2025
Size: 44.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.8.13

File hashes

Hashes for optipfair-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`d278fa838e2ff05081ae20ed9b595569d538166dde47ba595fbdc3e4a39994a8`
MD5	`6bcb535c59f070c22805742db5b82ab2`
BLAKE2b-256	`7bf78efb79336f627f5249932f5faac8bdde5795a33bd13beeaf196c11c8dce9`

File details

Details for the file optipfair-0.1.3-py3-none-any.whl.

File metadata

Download URL: optipfair-0.1.3-py3-none-any.whl
Upload date: Apr 21, 2025
Size: 34.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.8.13

File hashes

Hashes for optipfair-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5eb97aab0b65d1517f280bb10f5b6b0c41f1a0d3a6cb7de38f53b342e4f2bd85`
MD5	`ea9cac06255842b966a40ba7587c2ac5`
BLAKE2b-256	`db313a5d686b4015577cc579fa5a68eef8be14407d5e98325daf42299723f727`
