Repository of Intel® Extension for Transformers
Project description
Intel® Extension for Transformers: Accelerating Transformer-based Models on Intel Platforms
Intel® Extension for Transformers is an innovative toolkit for accelerating Transformer-based models on Intel platforms, and is particularly effective on 4th Gen Intel® Xeon® Scalable processors (codenamed Sapphire Rapids). The toolkit provides the following key features and examples:
- Seamless user experience of model compression on Transformer-based models by extending Hugging Face transformers APIs and leveraging Intel® Neural Compressor
- Advanced software optimizations and a unique compression-aware runtime (introduced in the NeurIPS 2022 papers Fast DistilBERT on CPUs and QuaLA-MiniLM: A Quantized Length Adaptive MiniLM, and the NeurIPS 2021 paper Prune Once for All: Sparse Pre-Trained Language Models)
- Accelerated end-to-end Transformer-based applications, such as Stable Diffusion, GPT-J-6B, BLOOM-176B, T5, and SetFit, by leveraging Intel AI software such as Intel® Extension for PyTorch
Installation
Install from PyPI
pip install intel-extension-for-transformers
For more installation methods, please refer to the Installation Page.
Getting Started
Sentiment Analysis with Quantization
Prepare Dataset
from datasets import load_dataset
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

# Load the SST-2 sentiment dataset and tokenize every sentence to a fixed length
raw_datasets = load_dataset("glue", "sst2")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
raw_datasets = raw_datasets.map(
    lambda e: tokenizer(e["sentence"], truncation=True, padding="max_length", max_length=128),
    batched=True,
)
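The `map` call above tokenizes each sentence with `truncation=True` and `padding='max_length'`, so every example ends up exactly 128 tokens long. As a rough, library-free sketch of that fixed-length behavior (the real tokenizer also adds special tokens and an attention mask; `to_fixed_length` is a hypothetical helper for illustration):

```python
def to_fixed_length(token_ids, max_length=128, pad_id=0):
    """Truncate or pad a list of token ids to exactly max_length entries."""
    truncated = token_ids[:max_length]                   # truncation=True
    padding = [pad_id] * (max_length - len(truncated))   # padding='max_length'
    return truncated + padding

print(to_fixed_length([101, 2009, 102], max_length=8))        # padded out to 8 ids
print(len(to_fixed_length(list(range(200)), max_length=8)))   # truncated down to 8
```

This fixed shape is what lets the dataset be batched efficiently during tuning and evaluation.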
Quantization
from intel_extension_for_transformers.optimization import QuantizationConfig, metrics, objectives
from intel_extension_for_transformers.optimization.trainer import NLPTrainer

config = AutoConfig.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english", num_labels=2)
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english", config=config
)
model.config.label2id = {0: 0, 1: 1}
model.config.id2label = {0: 'NEGATIVE', 1: 'POSITIVE'}

# Replace transformers.Trainer with NLPTrainer
# trainer = transformers.Trainer(...)
trainer = NLPTrainer(
    model=model,
    train_dataset=raw_datasets["train"],
    eval_dataset=raw_datasets["validation"],
    tokenizer=tokenizer,
)

# Tune quantization against eval_loss (lower is better)
q_config = QuantizationConfig(metrics=[metrics.Metric(name="eval_loss", greater_is_better=False)])
model = trainer.quantize(quant_config=q_config)

# Run the quantized model on a sample sentence
inputs = tokenizer("I like Intel Extension for Transformers", return_tensors="pt")
output = model(**inputs).logits.argmax().item()
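The last two lines tokenize a sample sentence, run the quantized model, and take the argmax of the logits; `model.config.id2label` then maps that index back to a class name. A minimal, framework-free sketch of that final step (the logit values here are made up for illustration, and `predict_label` is a hypothetical helper, not part of the library API):

```python
id2label = {0: 'NEGATIVE', 1: 'POSITIVE'}

def predict_label(logits, id2label):
    """Return the class name whose logit is largest (argmax, then id2label lookup)."""
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best_index]

print(predict_label([-1.3, 2.7], id2label))  # POSITIVE
print(predict_label([0.9, -0.4], id2label))  # NEGATIVE
```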
For more quick-start samples, please refer to the Get Started Page. For more validated examples, please refer to the Supported Model Matrix.
Documentation
| OVERVIEW | | | |
| :---: | :---: | :---: | :---: |
| Model Compression | Neural Engine | Kernel Libraries | Examples |
| **MODEL COMPRESSION** | | | |
| Quantization | Pruning | Distillation | Orchestration |
| Neural Architecture Search | Export | Metrics/Objectives | Pipeline |
| **NEURAL ENGINE** | | | |
| Model Compilation | Custom Pattern | Deployment | Profiling |
| **KERNEL LIBRARIES** | | | |
| Sparse GEMM Kernels | Custom INT8 Kernels | Profiling | Benchmark |
| **ALGORITHMS** | | | |
| Length Adaptive | Data Augmentation | | |
| **TUTORIALS AND RESULTS** | | | |
| Tutorials | Supported Models | Model Performance | Kernel Performance |
Selected Publications/Events
- Blog published on Medium: MLefficiency — Optimizing transformer models for efficiency (Dec 2022)
- NeurIPS'2022: Fast DistilBERT on CPUs (Nov 2022)
- NeurIPS'2022: QuaLA-MiniLM: a Quantized Length Adaptive MiniLM (Nov 2022)
- Blog published by Cohere: Top NLP Papers—November 2022 (Nov 2022)
- Blog published by Alibaba: Deep learning inference optimization for Address Purification (Aug 2022)
- NeurIPS'2021: Prune Once for All: Sparse Pre-Trained Language Models (Nov 2021)
Download files
Hashes for intel-extension-for-transformers 1.0.0 and 1.0 distributions:

| File | SHA256 | MD5 | BLAKE2b-256 |
| --- | --- | --- | --- |
| intel_extension_for_transformers-1.0.0.tar.gz | 2012224c156e6c144b628eb8019038c0d7de873a40f45f3e84d95c71469eda84 | 392783c954051cbc6ffe9c51a90b1c4f | 93a8473a9d44fc8aaf541c5181613efaec332c0e9365bb5f1b726777df25aef7 |
| intel_extension_for_transformers-1.0.0-cp310-cp310-win_amd64.whl | 7973e10137b271d7731990312b0c3c3ce06c496f4bafbb7f7e4c31bf5c4ceb96 | da6ce95417c550b23e18c106e3f264d2 | bc0c05ad147a2df2a7cb224c820c2e475afb05fbd6abf024fb891f62a5086058 |
| intel_extension_for_transformers-1.0.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | 8a5dbe86bcc9ba360ceaad80b1ed7f07d12fe3bcbc1612d405d0dc2c2e0af44f | ea07cefa8722e9d63c7b32eadb575f77 | 69e2329711392e29d60525e2e237cabae66edf089c91ad301ef69febf8311968 |
| intel_extension_for_transformers-1.0.0-cp39-cp39-win_amd64.whl | 5ffd8256df8ed90b24028cc972f4d1f43bd37bfc2819c142d21e269d35f9ecc1 | 65b973cb3f6882db9de94ad960d18a0d | 3b626428317d62114ad39fb9c850485b04bd6bda8820345cb4eff3300a57ae07 |
| intel_extension_for_transformers-1.0.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | c3ed6f61348c8cda194d56716b102f47dfd3a1f2b24511b8417b389f1535c2b5 | 4b0b706be71a5847506f7cfa08800645 | 408e3c5725f0e788113d716d0628edd73dea1300a31dae4d1931fd2e61f26385 |
| intel_extension_for_transformers-1.0.0-cp38-cp38-win_amd64.whl | 0dd49fdc26c6142060999c133637bbd5b8b09b625cf1298f6fbdadc8a50a197e | 6c9f418bf0983b996614d8942fabbe48 | 7936296d586160e1ae6f1a30074b1116d5ea1a14662c7c5329a9acdca0d77841 |
| intel_extension_for_transformers-1.0.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | 6d8fae410abd74787dd2d27b48ef6762b6276fa46868e1018268a3bab36fa5b0 | 4cb60340664755ab3a4ba27009b8b41c | 806e0d90f9ee8d60cfd8d6568a356676c35f0b2f0830d7608e459cead4d01dfd |
| intel_extension_for_transformers-1.0.0-cp37-cp37m-win_amd64.whl | d14436f0e9e195f9d36b3c6669be591159aa2084b4a45cfded624fba8ce6ef63 | 223a9e803b63e0d6f1f5c9754de64500 | 945dd8fa991224f4fbad7edef921994a03970b04b4940b787986d2871e5b882b |
| intel_extension_for_transformers-1.0.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | 85279978930f5135475d916c8ac5849f338dee3d236d7321d19cdf00744d0c4a | eb212b4c59b858dccc061569b1f96644 | 28c8202cf1b59392cbcad63d7e308e80b2f2de824e153fe287f150d1aabcf028 |
| intel_extension_for_transformers-1.0-cp310-cp310-win_amd64.whl | 0bbadcfdf18b2952d42810cd651f9ab24e3d3cf90c8c1f6ee517087dd6b9f517 | 86054dc313400e561c4aacda02be6bba | 56b1ca2c301649bda45b676ad19c0cd143a3296504a21a30f41142796ff11d23 |
| intel_extension_for_transformers-1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | 3dedff542df0cd10c58a5d16f5d2fc18dc01684ec795fbd1dcf4cb3b5a80c3d7 | d395ab3e6b53a5f14d2e5be23857c655 | c4a56176cf02a2f21cf13c9ed25d846f95845821affdd3c69c9a2f52ad74e7c9 |
| intel_extension_for_transformers-1.0-cp39-cp39-win_amd64.whl | 06dbcc6841a2c13cf2775fadebcc3a1e96908a51a44c5cd5ee60717dfb42d1ef | e1236668334ce5bbfd2dd3647595114f | cf1a0400bc5b04acbedef56fd49652829b7d4bbb6773f58db3ce5ad780351ee6 |
| intel_extension_for_transformers-1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | eac170e7e85ecf359291f68c1885fa409c6961221fbb291679c6edde83d1739c | 2dc763f9d8c0594f838ad1ec9415a927 | 5234d566d8e1acd84790baff5d06b48587b4e8de8593225fd7b3fd283e3466e1 |
| intel_extension_for_transformers-1.0-cp38-cp38-win_amd64.whl | 8a41cf3bd25294b601baf9e6e721a55ec7393f52a5d1aff558f3fada40afa1e4 | 06063057126b7220afd5c486be9b8458 | 77747082e2464084fa5ce29f0bf9f2115b73404503863096a69135636342f9eb |
| intel_extension_for_transformers-1.0-cp37-cp37m-win_amd64.whl | 2763e678e53db857daf3cd54d6e5c538a9f2cc66a72bd1c5bedfce794dc6b8d5 | d655c3684230edfd0685e7608ac3ebe1 | 5e13b530202570dff667480e331a2f6c5fe76fc38c2a44da3abc64eb1613f32c |
| intel_extension_for_transformers-1.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | fccd209014959a2dc0ebfe419c50d06ea7a5ffb2e9764177814cd1919f2f9d22 | 849110a07d27b10207c5083c9ab5e05c | bfa1631d2351f38dae5909a4d00ea8c742934ae435f652550dff749730e2ad8e |