fmchisel is a library that aims to improve the effectiveness and efficiency of LLM training and inference from an algorithmic perspective, using quantization, pruning, efficient optimizers, and related techniques.

Maintainers

immrata kvignesh1420

fmchisel – Efficient Foundation Model Algorithms

State-of-the-art compression & distillation recipes for Large Language Models

✨ Overview

fmchisel (Foundation Model Chisel) is an open-source research library that makes it simple to:

Compress LLMs with cutting-edge pruning and quantization techniques.
Distill knowledge from larger models to smaller ones.
Accelerate inference on consumer hardware by combining sparse + low-bit weight formats.
Train efficiently with advanced optimizers such as schedule-free AdamW.
Prototype new compression ideas rapidly.

fmchisel is built on PyTorch and integrates seamlessly with 📚 🤗 Transformers.

📦 Installation

PyPI Package

pip install "fmchisel[all]"

Source

Installing from source requires Linux (enforced by the setup script); installation on macOS or Windows fails at setup time:

# Clone the repo
git clone https://github.com/linkedin/fmchisel.git
cd fmchisel

# Base install
pip install -e .

# Optional extras
# - inference: pruning/quantization via llmcompressor
# - train: distillation (Lightning, liger-kernel)
# - all: both of the above
pip install -e ".[inference]"
pip install -e ".[train]"
# or
pip install -e ".[all]"

🚀 Quick Start

Ready-to-run recipes in examples/:

Distillation: bash examples/distillation/run.sh
Unstructured or N:M pruning (ALPS, SparseGPT, Wanda): bash examples/pruning/run.sh
Structured pruning (OSSCAR): bash examples/structured_pruning/run.sh
Quantization (QuantEase via YAML recipes): bash examples/quantization/run_quantization.sh

Tweak the scripts or pass flags to adjust models, datasets, and hyper-parameters.

🗂️ Project Structure

fmchisel/
│
├─ data/               # Calibration & data utilities
├─ distillation/       # Knowledge-distillation components
├─ pruning/            # ALPS + OSSCAR implementations; SparseGPT/Wanda via llmcompressor
├─ quantization/       # QuantEase & helpers
├─ optimizers/         # AdamW schedule-free implementation
├─ utils/              # Callbacks, training helpers
└─ config.py           # Global configuration
examples/              # End-to-end reproducible recipes
tests/                 # PyTest suite

🧪 Research Components

Area	Algorithm(s)	Implementation Module
Pruning	ALPS (unstructured, N:M)	`fmchisel.pruning.alps`
Structured	OSSCAR (MLP/attn-group drop)	`fmchisel.pruning.osscar`
Quantization	QuantEase (weight-only/group)	`fmchisel.quantization.quantease`
Distillation	Per-token KD (e.g., JSD)	`fmchisel.distillation.losses`
Optimization	AdamW Schedule-Free	`fmchisel.optimizers.adamw_schedulefree`

Notes:

SparseGPT and Wanda pruning are available through llmcompressor and wired up in examples/pruning/pruning_utils.py.
Quantization uses llmcompressor pipelines with a QuantEase modifier and YAML recipes.
To combine pruning and quantization, compose both modifiers in a single YAML recipe and pass it to llmcompressor.oneshot. See llmcompressor documentation for composing modifiers. Example composite recipes are not included in this repo.
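For readers unfamiliar with the "AdamW Schedule-Free" entry in the table above, here is a minimal scalar sketch of the schedule-free idea. It is an illustration under stated assumptions, not the code in `fmchisel.optimizers.adamw_schedulefree`: it uses a constant learning rate, uniform iterate averaging, and no warmup, and the function name `schedule_free_adamw` and its parameters are hypothetical.

```python
import math
from typing import Callable, Tuple


def schedule_free_adamw(
    grad_fn: Callable[[float], float],
    x0: float,
    lr: float = 0.1,
    beta1: float = 0.9,
    beta2: float = 0.999,
    eps: float = 1e-8,
    weight_decay: float = 0.0,
    steps: int = 1,
) -> Tuple[float, float]:
    """Simplified scalar schedule-free AdamW (hypothetical helper).

    Keeps two sequences: z (the base iterate that takes Adam-style steps)
    and x (a running average of z, used for evaluation). Gradients are
    taken at y, an interpolation between the two.
    """
    x = z = float(x0)
    v = 0.0  # running average of squared gradients
    for t in range(1, steps + 1):
        # Gradient is evaluated at an interpolation of z and x.
        y = (1.0 - beta1) * z + beta1 * x
        g = grad_fn(y)
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias-corrected step size (no warmup in this sketch).
        lr_t = lr * math.sqrt(1.0 - beta2 ** t)
        # Adam-normalized step plus decoupled weight decay computed at y.
        z = z - lr_t * (g / (math.sqrt(v) + eps) + weight_decay * y)
        # Uniform averaging: x tracks the mean of the z iterates.
        x = x + (z - x) / (t + 1)
    return x, z


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
def toy_grad(w: float) -> float:
    return 2.0 * (w - 3.0)


x_final, z_final = schedule_free_adamw(toy_grad, x0=0.0, steps=2000)
```

The averaged iterate `x_final` is the value one would evaluate or checkpoint; no learning-rate schedule is specified anywhere, which is the point of the method.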

Minimal Python usage (grounded in the repo)

Pruning (ALPS or SparseGPT/Wanda) via oneshot and HFCalibrationDataLoader:

from llmcompressor import oneshot
from transformers import AutoTokenizer
from fmchisel.data.calibration_datautil import HFCalibrationDataLoader
from fmchisel.pruning.alps.base import ALPSModifier

model_id = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
dataset = HFCalibrationDataLoader(
    nsamples=1024,
    tokenizer=tokenizer,
    max_seq_length=tokenizer.model_max_length,
    dataset="allenai/c4",
    data_field="text",
    data_dir="en",
    data_split="train",
).get_tokenized_calibration()

recipe = ALPSModifier(sparsity=0.5, mask_structure="2:4", targets="__ALL_PRUNABLE__")
oneshot(model=model_id, dataset=dataset, recipe=recipe, output_dir="out/pruned")
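To make the `mask_structure="2:4"` argument above concrete, the sketch below builds an N:M mask for a single weight row: at most 2 nonzero weights in every group of 4 consecutive weights. It uses a plain magnitude rule as a stand-in. ALPS chooses which weights to keep through an optimization procedure over calibration data (see the ALPS reference), so this is not the ALPS algorithm, and the helper names are hypothetical.

```python
from typing import List


def nm_magnitude_mask(row: List[float], n: int = 2, m: int = 4) -> List[int]:
    """Return a 0/1 mask keeping the n largest-magnitude weights per group of m."""
    if len(row) % m != 0:
        raise ValueError("row length must be a multiple of m")
    mask = [0] * len(row)
    for start in range(0, len(row), m):
        group = range(start, start + m)
        # Rank positions in this group by absolute value, largest first.
        keep = sorted(group, key=lambda i: abs(row[i]), reverse=True)[:n]
        for i in keep:
            mask[i] = 1
    return mask


def apply_mask(row: List[float], mask: List[int]) -> List[float]:
    """Zero out the weights whose mask entry is 0."""
    return [w * k for w, k in zip(row, mask)]


weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
mask = nm_magnitude_mask(weights)      # 2:4 pattern
pruned = apply_mask(weights, mask)
sparsity = 1 - sum(mask) / len(mask)   # fraction of weights removed
```

A 2:4 pattern always yields 50% sparsity, which is why the recipe above pairs `sparsity=0.5` with `mask_structure="2:4"`.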

Structured pruning (OSSCAR):

from llmcompressor import oneshot
from transformers import AutoTokenizer
from fmchisel.data.calibration_datautil import HFCalibrationDataLoader
from fmchisel.pruning.osscar.base import OSSCARModifier

model_id = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
dataset = HFCalibrationDataLoader(
    nsamples=1024,
    tokenizer=tokenizer,
    max_seq_length=tokenizer.model_max_length,
    dataset="allenai/c4",
    data_field="text",
    data_dir="en",
    data_split="train",
).get_tokenized_calibration()

recipe = OSSCARModifier(num_drop_mlp_neuron=128, num_drop_attn_group=1)
oneshot(model=model_id, dataset=dataset, recipe=recipe, output_dir="out/structured")
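What "dropping MLP neurons" means structurally can be shown on a toy two-layer MLP: removing a hidden neuron deletes one row of the up-projection and the matching column of the down-projection, so the matrices actually shrink. The sketch below picks neurons by smallest L2 norm purely for illustration. OSSCAR selects what to drop via combinatorial optimization (see the OSSCAR reference), and `drop_mlp_neurons` is a hypothetical helper, not part of fmchisel.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def drop_mlp_neurons(w_up: Matrix, w_down: Matrix, num_drop: int) -> Tuple[Matrix, Matrix]:
    """Remove the num_drop hidden neurons with the smallest up-projection norm.

    w_up has shape (hidden, d_in); w_down has shape (d_out, hidden).
    """
    hidden = len(w_up)
    if not 0 <= num_drop < hidden:
        raise ValueError("num_drop must be in [0, hidden)")
    # Squared L2 norm of each hidden neuron's incoming weights.
    norms = [sum(v * v for v in row) for row in w_up]
    drop = set(sorted(range(hidden), key=lambda j: norms[j])[:num_drop])
    keep = [j for j in range(hidden) if j not in drop]
    # Drop rows of w_up and the matching columns of w_down.
    new_up = [w_up[j] for j in keep]
    new_down = [[row[j] for j in keep] for row in w_down]
    return new_up, new_down


w_up = [[1.0, 2.0], [0.1, 0.0], [0.0, 3.0], [0.2, 0.1]]    # hidden=4, d_in=2
w_down = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]      # d_out=2, hidden=4
small_up, small_down = drop_mlp_neurons(w_up, w_down, num_drop=2)
```

Unlike unstructured or N:M sparsity, the result is a smaller dense model that needs no sparse kernels to run faster.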

Quantization (QuantEase) is driven by YAML recipes (see examples/quantization/recipes/*):

bash examples/quantization/run_quantization.sh
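As background for "weight-only/group" quantization, the sketch below shows the simplest baseline: asymmetric round-to-nearest quantization applied independently to each group of weights. This is not QuantEase, which is optimization-based (see the QuantEase reference); it only illustrates the storage format of integer codes plus a per-group scale and offset. All function names are hypothetical.

```python
from typing import List, Tuple


def quantize_group(group: List[float], bits: int = 4) -> Tuple[List[int], float, float]:
    """Round-to-nearest quantization of one group.

    Returns the integer codes and the (scale, offset) needed to dequantize.
    """
    qmax = (1 << bits) - 1
    lo, hi = min(group), max(group)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    codes = [min(qmax, max(0, round((w - lo) / scale))) for w in group]
    return codes, scale, lo


def dequantize_group(codes: List[int], scale: float, lo: float) -> List[float]:
    """Map integer codes back to floating-point weights."""
    return [lo + c * scale for c in codes]


def fake_quantize_row(row: List[float], group_size: int = 4, bits: int = 4) -> List[float]:
    """Quantize then dequantize a row, one group at a time."""
    out: List[float] = []
    for start in range(0, len(row), group_size):
        codes, scale, lo = quantize_group(row[start:start + group_size], bits)
        out.extend(dequantize_group(codes, scale, lo))
    return out


row = [0.0, 0.5, 1.0, 1.5, -3.0, 0.2, 3.0, 1.0]
approx = fake_quantize_row(row, group_size=4, bits=2)
```

Each group gets its own scale, so the first group (range 1.5) is represented much more finely than the second (range 6.0) at the same bit width.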

Distillation with JSD loss (Lightning + FSDP):

bash examples/distillation/run.sh
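For reference, a per-token Jensen-Shannon divergence between student and teacher logits can be sketched in plain Python as follows. This is the textbook symmetric JSD averaged over token positions. The losses in `fmchisel.distillation.losses` may differ (for example in weighting or temperature handling), and the function names here are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one vocabulary-sized vector."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p: List[float], q: List[float]) -> float:
    """KL(p || q), skipping zero-probability entries of p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(student_logits: List[float], teacher_logits: List[float]) -> float:
    """Jensen-Shannon divergence between the two token distributions."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    mix = [0.5 * pi + 0.5 * qi for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


def per_token_jsd(student: List[List[float]], teacher: List[List[float]]) -> float:
    """Mean JSD over token positions (one logits vector per position)."""
    return sum(jsd(s, t) for s, t in zip(student, teacher)) / len(student)


student = [[2.0, 0.0, -1.0], [0.0, 0.0, 0.0]]
teacher = [[0.0, 2.0, -1.0], [0.0, 0.0, 0.0]]
loss = per_token_jsd(student, teacher)
```

JSD is symmetric and bounded above by ln 2, which makes it less sensitive than forward KL to tokens where the student assigns very low probability.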

🛠️ Contributing

Fork & clone the repository.
Install dev dependencies: pip install -e ".[dev]" (note: a Linux system is required).
Run linters/formatters: make checkstyle.
Execute tests: make test.
Open a pull request!

Note: Please open an issue first to discuss major changes.

🔒 License

See LICENSE for details.

📝 Citation

@software{behdin2025,
  author       = {Behdin, Kayhan and Fatahibaarzi, Ata and Yun, Dai and 
                  Song, Qingquan and Kothapalli, Vignesh and Tang, Shao and 
                  Sang, Hejian and Gupta, Aman and Wang, Zhipeng and 
                  Dexter, Gregory and Zhu, Sirou and Zhu, Siyu},
  title        = {fmchisel},
  year         = {2025},
}

Additional references

This library implements compression methods from the following papers:

@article{meng2024alps,
  title={Alps: Improved optimization for highly sparse one-shot pruning for large language models},
  author={Meng, Xiang and Behdin, Kayhan and Wang, Haoyue and Mazumder, Rahul},
  journal={Advances in Neural Information Processing Systems},
  volume={37},
  pages={37594--37625},
  year={2024}
}

@inproceedings{mengosscar,
  title={OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization},
  author={Meng, Xiang and Ibrahim, Shibal and Behdin, Kayhan and Hazimeh, Hussein and Ponomareva, Natalia and Mazumder, Rahul},
  booktitle={Forty-first International Conference on Machine Learning}
}

@article{behdin2023quantease,
  title={QuantEase: Optimization-based quantization for language models},
  author={Behdin, Kayhan and Acharya, Ayan and Gupta, Aman and Song, Qingquan and Zhu, Siyu and Keerthi, Sathiya and Mazumder, Rahul},
  journal={arXiv preprint arXiv:2309.01885},
  year={2023}
}

Release history

0.1.2.dev20251023200538 (pre-release), Oct 23, 2025
0.1.2.dev20251009060326 (pre-release, this version), Oct 9, 2025
0.1.2.dev20250906035053 (pre-release), Sep 6, 2025
0.1.2.dev20250905201231 (pre-release), Sep 5, 2025
0.1.2.dev20250905182440 (pre-release), Sep 5, 2025
0.1.1.dev20250905182016 (pre-release), Sep 5, 2025
0.1.1.dev20250905180027 (pre-release), Sep 5, 2025

Download files

Source Distribution

fmchisel_nightly-0.1.2.dev20251009060326.tar.gz (77.4 kB)

Uploaded Oct 9, 2025 Source

Built Distribution

fmchisel_nightly-0.1.2.dev20251009060326-py3-none-any.whl (62.2 kB)

Uploaded Oct 9, 2025 Python 3

File details

Details for the file fmchisel_nightly-0.1.2.dev20251009060326.tar.gz.

File metadata

Download URL: fmchisel_nightly-0.1.2.dev20251009060326.tar.gz
Upload date: Oct 9, 2025
Size: 77.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for fmchisel_nightly-0.1.2.dev20251009060326.tar.gz
Algorithm	Hash digest
SHA256	`374f9fc3f1cc3ffa34385581b9a188cb565404e721048f09b9a2b69f0c5b5f53`
MD5	`4d7c3d17cf27367c4ab9347ad45722e7`
BLAKE2b-256	`3b7536a306407559534a4e752e68d1a98c4d01e46ce4d791d33c55ede1843f32`
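A downloaded archive can be checked against the SHA256 digest listed above with the standard library. The sketch below assumes the source distribution has been saved locally under its published file name. The helper names `sha256_of` and `verify` are hypothetical.

```python
import hashlib

# SHA256 digest listed above for the source distribution.
EXPECTED_SHA256 = "374f9fc3f1cc3ffa34385581b9a188cb565404e721048f09b9a2b69f0c5b5f53"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected_hex: str) -> bool:
    """True if the file's SHA256 digest matches the expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `verify("fmchisel_nightly-0.1.2.dev20251009060326.tar.gz", EXPECTED_SHA256)` should return `True` for an intact download. pip can perform the same check itself through its hash-checking mode.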

Provenance

The following attestation bundles were made for fmchisel_nightly-0.1.2.dev20251009060326.tar.gz:

Publisher: publish-nightly.yml on linkedin/fmchisel

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fmchisel_nightly-0.1.2.dev20251009060326.tar.gz
- Subject digest: 374f9fc3f1cc3ffa34385581b9a188cb565404e721048f09b9a2b69f0c5b5f53
- Sigstore transparency entry: 597030199
- Sigstore integration time: Oct 9, 2025
Source repository:
- Permalink: linkedin/fmchisel@4f29551b6d6607b737cddcc4a59ee542877ebcfe
- Branch / Tag: refs/heads/main
- Owner: https://github.com/linkedin
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-nightly.yml@4f29551b6d6607b737cddcc4a59ee542877ebcfe
- Trigger Event: push

File details

Details for the file fmchisel_nightly-0.1.2.dev20251009060326-py3-none-any.whl.

File metadata

Download URL: fmchisel_nightly-0.1.2.dev20251009060326-py3-none-any.whl
Upload date: Oct 9, 2025
Size: 62.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for fmchisel_nightly-0.1.2.dev20251009060326-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bf6130a2256752f898ad9d94afb1fe4023e2f887c14de435dc32ca5b191879c7`
MD5	`a2bad12e86200860324828a97e73bcad`
BLAKE2b-256	`f057dcc0e379563e1365516b48a69f6a2bd8e3f958457001c6495669c64b4a8a`

Provenance

The following attestation bundles were made for fmchisel_nightly-0.1.2.dev20251009060326-py3-none-any.whl:

Publisher: publish-nightly.yml on linkedin/fmchisel

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fmchisel_nightly-0.1.2.dev20251009060326-py3-none-any.whl
- Subject digest: bf6130a2256752f898ad9d94afb1fe4023e2f887c14de435dc32ca5b191879c7
- Sigstore transparency entry: 597030217
- Sigstore integration time: Oct 9, 2025
Source repository:
- Permalink: linkedin/fmchisel@4f29551b6d6607b737cddcc4a59ee542877ebcfe
- Branch / Tag: refs/heads/main
- Owner: https://github.com/linkedin
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-nightly.yml@4f29551b6d6607b737cddcc4a59ee542877ebcfe
- Trigger Event: push

fmchisel-nightly 0.1.2.dev20251009060326
