Optimum Library is an extension of the Hugging Face Transformers library, providing a framework to integrate third-party libraries from Hardware Partners and interface with their specific functionality.

These details have not been verified by PyPI

Project links

Homepage

Project description

Hugging Face Optimum

🤗 Optimum is an extension of 🤗 Transformers that provides a set of performance optimization tools for training and running models efficiently on targeted hardware.

The AI ecosystem evolves quickly, and more specialized hardware, each with its own optimizations, emerges every day. Optimum enables users to use any of these platforms efficiently, with the same ease inherent to 🤗 Transformers.

Integration with Hardware Partners

🤗 Optimum aims to broaden the range of hardware users can target to train and fine-tune their models.

To achieve this, we are collaborating with the following hardware manufacturers in order to provide the best transformers integration:

Graphcore IPUs - IPUs are a completely new kind of massively parallel processor to accelerate machine intelligence. More information here.
Habana Gaudi Processor (HPU) - HPUs are designed to maximize training throughput and efficiency. More information here.
Intel - Enabling the usage of Intel tools to accelerate end-to-end pipelines on Intel architectures. More information about Neural Compressor and OpenVINO.
More to come soon! :star:

Optimizing models towards inference

Along with supporting dedicated AI hardware for training, Optimum also provides inference optimizations for various frameworks and platforms.

Optimum enables the usage of popular compression techniques such as quantization and pruning by supporting ONNX Runtime along with Intel Neural Compressor.

Features	ONNX Runtime	Intel Neural Compressor
Post-training Dynamic Quantization	:heavy_check_mark:	:heavy_check_mark:
Post-training Static Quantization	:heavy_check_mark:	:heavy_check_mark:
Quantization Aware Training (QAT)	Stay tuned! :star:	:heavy_check_mark:
Pruning	N/A	:heavy_check_mark:
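To give an intuition for the quantization rows of the table, here is a minimal, library-free sketch of the arithmetic behind 8-bit affine quantization. It is not how ONNX Runtime or Intel Neural Compressor implement quantization; the helper names `quantize` and `dequantize` and the sample weights are hypothetical and serve only as an illustration.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned integers using an affine (scale, zero-point) scheme."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The represented range must contain 0.0 so that zero maps to an exact integer.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(qmin - lo / scale)
    # Round to the nearest integer step, then clamp into the representable range.
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the integer representation."""
    return [(x - zero_point) * scale for x in q]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]  # made-up example values
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error)
```

In this sketch the reconstruction error stays within one quantization step (`scale`), which is the accuracy cost paid for storing each value in 8 bits instead of 32.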

Installation

🤗 Optimum can be installed using pip as follows:

python -m pip install optimum

If you'd like to use the accelerator-specific features of 🤗 Optimum, you can install the required dependencies according to the table below:

Accelerator	Installation
ONNX Runtime	`python -m pip install optimum[onnxruntime]`
Intel Neural Compressor	`python -m pip install optimum[neural-compressor]`
OpenVINO	`python -m pip install optimum[openvino,nncf]`
Graphcore IPU	`python -m pip install optimum[graphcore]`
Habana Gaudi Processor (HPU)	`python -m pip install optimum[habana]`
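After installing one of the extras, it can be handy to check which backends are actually importable in the current environment. The sketch below uses only the standard library; the top-level module names in `BACKEND_MODULES` are assumptions about what each extra installs and are not taken from the table above.

```python
import importlib.util

# Assumed top-level module names of the underlying backends.
BACKEND_MODULES = {
    "ONNX Runtime": "onnxruntime",
    "Intel Neural Compressor": "neural_compressor",
    "OpenVINO": "openvino",
}


def installed_backends(modules=BACKEND_MODULES):
    """Return {backend name: bool} without importing the packages themselves."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in modules.items()
    }


status = installed_backends()
for name, available in status.items():
    print(f"{name}: {'installed' if available else 'missing'}")
```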

If you'd like to play with the examples or need the bleeding edge of the code and can't wait for a new release, you can install the base library from source as follows:

python -m pip install git+https://github.com/huggingface/optimum.git

For the accelerator-specific features, you can install them by appending #egg=optimum[accelerator_type] to the pip command, e.g.

python -m pip install git+https://github.com/huggingface/optimum.git#egg=optimum[onnxruntime]
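The installation variants above follow one pattern, which the following sketch captures in a small helper. The extras are copied from the table; the helper name `pip_command` and its parameters are hypothetical and exist only for this illustration.

```python
# Extras copied from the installation table above.
EXTRAS = {
    "ONNX Runtime": "onnxruntime",
    "Intel Neural Compressor": "neural-compressor",
    "OpenVINO": "openvino,nncf",
    "Graphcore IPU": "graphcore",
    "Habana Gaudi Processor (HPU)": "habana",
}
SOURCE_URL = "git+https://github.com/huggingface/optimum.git"


def pip_command(accelerator=None, from_source=False):
    """Build the pip invocation as an argument list for the chosen accelerator."""
    extras = f"[{EXTRAS[accelerator]}]" if accelerator else ""
    if from_source:
        # Source installs append the extras through the #egg= fragment.
        target = SOURCE_URL + (f"#egg=optimum{extras}" if extras else "")
    else:
        target = f"optimum{extras}"
    return ["python", "-m", "pip", "install", target]


print(" ".join(pip_command("ONNX Runtime")))
print(" ".join(pip_command("ONNX Runtime", from_source=True)))
```

Returning a list rather than a single string means the command can be handed to `subprocess.run` directly, which avoids quoting problems in shells that treat square brackets as glob patterns.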

Quick tour

Check out the examples below to see how 🤗 Optimum can be used to train and run inference on various hardware accelerators.

Accelerated training

Optimum Graphcore

To train transformers on Graphcore's IPUs, 🤗 Optimum provides an IPUTrainer that is very similar to the 🤗 Transformers trainer. Here is a simple example:

- from transformers import Trainer, TrainingArguments
+ from optimum.graphcore import IPUConfig, IPUTrainer, IPUTrainingArguments

  # Download a pretrained model from the Hub
  model = AutoModelForXxx.from_pretrained("bert-base-uncased")

  # Define the training arguments
- training_args = TrainingArguments(
+ training_args = IPUTrainingArguments(
      output_dir="path/to/save/folder/",
+     ipu_config_name="Graphcore/bert-base-ipu", # Any IPUConfig on the Hub or stored locally
      ...
  )

  # Define the configuration to compile and put the model on the IPU
+ ipu_config = IPUConfig.from_pretrained(training_args.ipu_config_name)

  # Initialize the trainer
- trainer = Trainer(
+ trainer = IPUTrainer(
      model=model,
+     ipu_config=ipu_config,
      args=training_args,
      train_dataset=train_dataset,
      ...
  )

  # Use Graphcore IPU for training!
  trainer.train()

Optimum Habana

To train transformers on Habana's Gaudi processors, 🤗 Optimum provides a GaudiTrainer that is very similar to the 🤗 Transformers trainer. Here is a simple example:

- from transformers import Trainer, TrainingArguments
+ from optimum.habana import GaudiTrainer, GaudiTrainingArguments

  # Download a pretrained model from the Hub
  model = AutoModelForXxx.from_pretrained("bert-base-uncased")

  # Define the training arguments
- training_args = TrainingArguments(
+ training_args = GaudiTrainingArguments(
      output_dir="path/to/save/folder/",
+     use_habana=True,
+     use_lazy_mode=True,
+     gaudi_config_name="Habana/bert-base-uncased",
      ...
  )

  # Initialize the trainer
- trainer = Trainer(
+ trainer = GaudiTrainer(
      model=model,
      args=training_args,
      train_dataset=train_dataset,
      ...
  )

  # Use Habana Gaudi processor for training!
  trainer.train()

ONNX Runtime

To train transformers with ONNX Runtime's acceleration features, 🤗 Optimum provides an ORTTrainer that is very similar to the 🤗 Transformers trainer. Here is a simple example:

- from transformers import Trainer, TrainingArguments
+ from optimum.onnxruntime import ORTTrainer, ORTTrainingArguments

  # Download a pretrained model from the Hub
  model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

  # Define the training arguments
- training_args = TrainingArguments(
+ training_args = ORTTrainingArguments(
      output_dir="path/to/save/folder/",
      optim="adamw_ort_fused",
      ...
  )

  # Create an ONNX Runtime Trainer
- trainer = Trainer(
+ trainer = ORTTrainer(
      model=model,
      args=training_args,
      train_dataset=train_dataset,
+     feature="sequence-classification", # The model type to export to ONNX
      ...
  )

  # Use ONNX Runtime for training!
  trainer.train()

Accelerated inference

ONNX Runtime

To accelerate inference with ONNX Runtime, 🤗 Optimum uses configuration objects to define parameters for optimization. These objects are then used to instantiate dedicated optimizers and quantizers.

Before applying quantization or optimization, we first need to export the model to the ONNX format:

from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

model_checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
save_directory = "tmp/onnx/"
# Load a model from transformers and export it to ONNX
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
ort_model = ORTModelForSequenceClassification.from_pretrained(model_checkpoint, from_transformers=True)
# Save the onnx model and tokenizer
ort_model.save_pretrained(save_directory)
tokenizer.save_pretrained(save_directory)

Let's see now how we can apply dynamic quantization with ONNX Runtime:

from optimum.onnxruntime.configuration import AutoQuantizationConfig
from optimum.onnxruntime import ORTQuantizer

# Define the quantization methodology
qconfig = AutoQuantizationConfig.arm64(is_static=False, per_channel=False)
quantizer = ORTQuantizer.from_pretrained(ort_model)
# Apply dynamic quantization on the model
quantizer.quantize(save_dir=save_directory, quantization_config=qconfig)

In this example, we've quantized a model from the Hugging Face Hub, but the checkpoint could also be a path to a local model directory. Calling the quantize() method produces a model_quantized.onnx file that can be used to run inference. Here's an example of how to load an ONNX Runtime model and generate predictions with it:

from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import pipeline, AutoTokenizer

model = ORTModelForSequenceClassification.from_pretrained(save_directory, file_name="model_quantized.onnx")
tokenizer = AutoTokenizer.from_pretrained(save_directory)
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
results = classifier("I love burritos!")
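To see what quantization changed on disk, one can compare the sizes of the ONNX files in the save directory. The sketch below relies only on the standard library. The file name `model_quantized.onnx` comes from the text above, while `model.onnx` as the name of the exported, unquantized model is an assumption; the helper names are hypothetical.

```python
from pathlib import Path


def onnx_file_sizes(directory):
    """Return {file name: size in MB} for every .onnx file directly inside `directory`."""
    return {
        path.name: path.stat().st_size / 1e6
        for path in sorted(Path(directory).glob("*.onnx"))
    }


def size_ratio(directory, original="model.onnx", quantized="model_quantized.onnx"):
    """Return quantized size divided by original size (file names are assumptions)."""
    sizes = onnx_file_sizes(directory)
    return sizes[quantized] / sizes[original]
```

For instance, `onnx_file_sizes("tmp/onnx/")` could be called after the quantization step above to list the files that were written and their sizes.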

Optimum Intel

Here is an example of how to perform inference with the OpenVINO Runtime:

- from transformers import AutoModelForSequenceClassification
+ from optimum.intel.openvino import OVModelForSequenceClassification
  from transformers import AutoTokenizer, pipeline

  # Download a tokenizer and model from the Hub and convert to OpenVINO format
  model_id = "distilbert-base-uncased-finetuned-sst-2-english"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
- model = AutoModelForSequenceClassification.from_pretrained(model_id)
+ model = OVModelForSequenceClassification.from_pretrained(model_id, from_transformers=True)

  # Run inference!
  classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
  results = classifier("He's a dreadful magician.")
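A text-classification pipeline returns a list of dictionaries, each holding a `label` and a `score`. The following sketch shows one way such a result could be post-processed. The helper `top_prediction`, its `threshold` parameter, and the sample data are hypothetical; the score below is made up and is not the output of the model above.

```python
def top_prediction(results, threshold=0.5):
    """Return the label of the highest-scoring entry, or None if it is below `threshold`."""
    best = max(results, key=lambda entry: entry["score"])
    return best["label"] if best["score"] >= threshold else None


# Hand-written stand-in for a pipeline result; the score is invented.
example_results = [{"label": "NEGATIVE", "score": 0.98}]
print(top_prediction(example_results))
```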

Project details

Release history

2.1.0

Dec 19, 2025

2.0.0

Oct 9, 2025

1.27.0

Jul 30, 2025

1.26.1

Jun 13, 2025

1.26.0

Jun 13, 2025

1.25.3

May 16, 2025

1.25.2

May 15, 2025

1.25.1

May 15, 2025

1.25.0

May 13, 2025

1.24.0

Jan 30, 2025

1.23.3

Oct 29, 2024

1.23.2

Oct 22, 2024

1.23.1

Oct 11, 2024

1.23.0

Oct 10, 2024

1.22.0

Sep 10, 2024

1.21.4

Aug 16, 2024

1.21.3

Aug 6, 2024

1.21.2

Jul 5, 2024

1.21.1

Jul 2, 2024

1.21.0

Jul 2, 2024

1.20.0

May 29, 2024

1.19.2

May 9, 2024

1.19.1

Apr 24, 2024

1.19.0

Apr 16, 2024

1.18.1

Apr 9, 2024

1.18.0

Mar 25, 2024

1.17.1

Feb 18, 2024

1.17.0

Feb 16, 2024

1.16.2

Jan 19, 2024

1.16.1

Dec 15, 2023

1.16.0

Dec 13, 2023

1.15.0

Dec 6, 2023

1.14.1

Nov 14, 2023

1.14.0

Nov 6, 2023

1.13.3

Nov 3, 2023

1.13.2

Sep 21, 2023

1.13.1

Sep 8, 2023

1.13.0

Sep 8, 2023

1.12.0

Aug 23, 2023

1.11.2

Aug 17, 2023

1.11.1

Aug 11, 2023

1.11.0

Aug 3, 2023

1.10.1

Jul 27, 2023

1.10.0

Jul 25, 2023

1.9.1

Jul 7, 2023

1.9.0

Jun 30, 2023

1.8.8

Jun 16, 2023

1.8.7

Jun 10, 2023

1.8.6

May 18, 2023

1.8.5

May 11, 2023

1.8.4

May 7, 2023

1.8.3

Apr 28, 2023

1.8.2

Apr 17, 2023

1.8.1

Apr 17, 2023

1.8.0

Apr 17, 2023

1.7.3

Mar 23, 2023

1.7.2

Mar 23, 2023

1.7.1

Mar 3, 2023

1.7.0

Mar 2, 2023

1.6.4

Feb 13, 2023

1.6.3

Jan 25, 2023

1.6.2

Jan 25, 2023

1.6.1

Dec 23, 2022

1.6.0

Dec 23, 2022

1.5.2

Dec 19, 2022

1.5.1

Nov 24, 2022

This version

1.5.0

Nov 17, 2022

1.4.1

Oct 25, 2022

1.4.0

Sep 8, 2022

1.3.0

Jul 12, 2022

1.2.3

Jun 13, 2022

1.2.2

Jun 2, 2022

1.2.1

May 12, 2022

1.2.0

May 10, 2022

1.1.1

Apr 26, 2022

1.1.0

Apr 1, 2022

1.0.0

Feb 23, 2022

0.1.3

Dec 23, 2021

0.1.2

Dec 8, 2021

0.1.1

Nov 5, 2021

0.1.0

Nov 5, 2021

0.0.1

Sep 14, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

optimum-1.5.0.tar.gz (150.6 kB view details)

Uploaded Nov 17, 2022 Source

Built Distribution

optimum-1.5.0-py3-none-any.whl (187.2 kB view details)

Uploaded Nov 17, 2022 Python 3

File details

Details for the file optimum-1.5.0.tar.gz.

File metadata

Download URL: optimum-1.5.0.tar.gz
Upload date: Nov 17, 2022
Size: 150.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.9.12

File hashes

Hashes for optimum-1.5.0.tar.gz
Algorithm	Hash digest
SHA256	`5b660fa64c33e44e7d9e3f670a9c7b04889f679ea780a7a6368079f847db5919`
MD5	`20ab4dd175f429aa63252935896ec605`
BLAKE2b-256	`5a85e0dda0c5b433f0fe23a8301b3b9397fa10dd54608e90c74a5cfddcaf361d`

See more details on using hashes here.
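A downloaded archive can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. This is a minimal sketch: the helper names `sha256_of` and `verify` are hypothetical, and the local file path is assumed to be wherever the archive was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file's SHA256 digest matches `expected_hex`."""
    return sha256_of(path) == expected_hex.lower()


# SHA256 of optimum-1.5.0.tar.gz as listed in the table above.
EXPECTED_SHA256 = "5b660fa64c33e44e7d9e3f670a9c7b04889f679ea780a7a6368079f847db5919"
```

After downloading the source distribution, `verify("optimum-1.5.0.tar.gz", EXPECTED_SHA256)` should return `True` if the file is intact.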

File details

Details for the file optimum-1.5.0-py3-none-any.whl.

File metadata

Download URL: optimum-1.5.0-py3-none-any.whl
Upload date: Nov 17, 2022
Size: 187.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.9.12

File hashes

Hashes for optimum-1.5.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`de642ec67cfa462f7acd7dfc8aa012e904ffff0ccfb6b5c2d0e9abf1e5a12665`
MD5	`b358e5449270ded490f270fcf6b8673d`
BLAKE2b-256	`3965c7e4b18f9afe055023d41ec121d4ac26097883876744a22e15dd35686daf`

See more details on using hashes here.

optimum 1.5.0
