Intel® Extension for PyTorch*

Project description

CPU 💻main branch   |   🌱Quick Start   |   📖Documentations   |   🏃Installation   |   💻LLM Example
GPU 💻main branch   |   🌱Quick Start   |   📖Documentations   |   🏃Installation   |   💻LLM Example

Intel® Extension for PyTorch* extends PyTorch* with the latest features and optimizations for an extra performance boost on Intel hardware. The optimizations take advantage of Intel® Advanced Vector Extensions 512 (Intel® AVX-512) Vector Neural Network Instructions (VNNI) and Intel® Advanced Matrix Extensions (Intel® AMX) on Intel CPUs, as well as Intel® Xe Matrix Extensions (XMX) AI engines on Intel discrete GPUs. Moreover, Intel® Extension for PyTorch* provides easy GPU acceleration for Intel discrete GPUs through the PyTorch* xpu device.

ipex.llm - Large Language Models (LLMs) Optimization

In the current technological landscape, Generative AI (GenAI) workloads and models have gained widespread attention and popularity. Large Language Models (LLMs) have emerged as the dominant models driving these GenAI applications. Starting with release 2.1.0, Intel® Extension for PyTorch* has introduced specific optimizations for certain LLM models. Check LLM optimizations for details.

Optimized Model List

| MODEL FAMILY | MODEL NAME (Hugging Face hub) | FP32 | BF16 | Static quantization INT8 | Weight-only quantization INT8 | Weight-only quantization INT4 |
| --- | --- | :---: | :---: | :---: | :---: | :---: |
| LLAMA | meta-llama/Llama-2-7b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| LLAMA | meta-llama/Llama-2-13b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| LLAMA | meta-llama/Llama-2-70b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| LLAMA | meta-llama/Meta-Llama-3-8B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| LLAMA | meta-llama/Meta-Llama-3-70B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| LLAMA | meta-llama/Meta-Llama-3.1-8B-Instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| LLAMA | meta-llama/Llama-3.2-3B-Instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| LLAMA | meta-llama/Llama-3.2-11B-Vision-Instruct | 🟩 | 🟩 |  | 🟩 |  |
| GPT-J | EleutherAI/gpt-j-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| GPT-NEOX | EleutherAI/gpt-neox-20b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| DOLLY | databricks/dolly-v2-12b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| FALCON | tiiuae/falcon-7b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| FALCON | tiiuae/falcon-11b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| FALCON | tiiuae/falcon-40b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| OPT | facebook/opt-30b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| OPT | facebook/opt-1.3b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| Bloom | bigscience/bloom-1b7 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| CodeGen | Salesforce/codegen-2B-multi | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| Baichuan | baichuan-inc/Baichuan2-7B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| Baichuan | baichuan-inc/Baichuan2-13B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| Baichuan | baichuan-inc/Baichuan-13B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| ChatGLM | THUDM/chatglm3-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| ChatGLM | THUDM/chatglm2-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| GPTBigCode | bigcode/starcoder | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| T5 | google/flan-t5-xl | 🟩 | 🟩 | 🟩 | 🟩 |  |
| MPT | mosaicml/mpt-7b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| Mistral | mistralai/Mistral-7B-v0.1 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| Mixtral | mistralai/Mixtral-8x7B-v0.1 | 🟩 | 🟩 |  | 🟩 | 🟩 |
| Stablelm | stabilityai/stablelm-2-1_6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| Qwen | Qwen/Qwen-7B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| Qwen | Qwen/Qwen2-7B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| LLaVA | liuhaotian/llava-v1.5-7b | 🟩 | 🟩 |  | 🟩 | 🟩 |
| GIT | microsoft/git-base | 🟩 | 🟩 |  | 🟩 |  |
| Yuan | IEITYuan/Yuan2-102B-hf | 🟩 | 🟩 |  | 🟩 |  |
| Phi | microsoft/phi-2 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| Phi | microsoft/Phi-3-mini-4k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| Phi | microsoft/Phi-3-mini-128k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| Phi | microsoft/Phi-3-medium-4k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| Phi | microsoft/Phi-3-medium-128k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
| Whisper | openai/whisper-large-v2 | 🟩 | 🟩 | 🟩 | 🟩 |  |

Note: The verified models above (including other models in the same model family, such as "codellama/CodeLlama-7b-hf" from the LLAMA family) are well supported with all optimizations, including indirect-access KV cache, fused RoPE, and customized linear kernels. We are working to broaden support for the models in the table across the listed data types, and more models will be optimized in the future.

In addition, since release 2.3.0 Intel® Extension for PyTorch* has provided module-level optimization APIs (a prototype feature). These APIs offer optimized alternatives for several commonly used LLM modules and functionalities, so that niche or customized LLMs can benefit from the same optimizations. Please read the LLM module-level optimization practice guide to learn how to optimize your own LLM and achieve better performance.

Support

The team tracks bugs and enhancement requests using GitHub issues. Before submitting a suggestion or bug report, search the existing GitHub issues to see if your issue has already been reported.

License

Apache License, Version 2.0, as found in the LICENSE file.

Security

See Intel's Security Center for information on how to report a potential security issue or vulnerability.

See also: Security Policy


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
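If a wheel is pinned in a requirements file together with its published digest, pip's hash-checking mode verifies the download automatically; a minimal sketch (version and digest copied from the cp312 entry in the File details sections below):

```
# requirements.txt -- hash-checking install; the digest is the cp312
# SHA256 listed under File details below
intel-extension-for-pytorch==2.5.0 \
    --hash=sha256:564770c03790bf05450612da05008d44045339f5930cdadc97c733dd20f7bc68

# Install with:
#   pip install --require-hashes -r requirements.txt
# Note: in hash-checking mode pip requires every dependency to be
# pinned with a hash as well.
```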

Source Distributions

No source distribution files are available for this release. See the tutorial on generating distribution archives.

Built Distributions

File details

Details for the file intel_extension_for_pytorch-2.5.0-cp312-cp312-manylinux2014_x86_64.whl.

File hashes

Hashes for intel_extension_for_pytorch-2.5.0-cp312-cp312-manylinux2014_x86_64.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 564770c03790bf05450612da05008d44045339f5930cdadc97c733dd20f7bc68 |
| MD5 | bf6a6c4e4b3457e9c8ba4b2c9aaf80ad |
| BLAKE2b-256 | 7f78ab4cd7c7b471b6525f841d78b8fac8c69f53c1fa4f6f109db04f75441d44 |

See more details on using hashes here.
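A downloaded file can be checked against these digests locally using only the standard library; a small sketch (the filename in the usage comment is the cp312 wheel from this section):

```python
# Verify a downloaded wheel against a published SHA256 digest using only
# the standard library.
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large wheels need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


# Usage (digest copied from the table above):
# expected = "564770c03790bf05450612da05008d44045339f5930cdadc97c733dd20f7bc68"
# assert sha256_of(
#     "intel_extension_for_pytorch-2.5.0-cp312-cp312-manylinux2014_x86_64.whl"
# ) == expected
```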

File details

Details for the file intel_extension_for_pytorch-2.5.0-cp311-cp311-manylinux2014_x86_64.whl.

File hashes

Hashes for intel_extension_for_pytorch-2.5.0-cp311-cp311-manylinux2014_x86_64.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | bfcc21b4771679532d65993e7f555212c169dec4c1f07574cf714aa0a43c3977 |
| MD5 | 8d5c78ae0b29fcd9afaf0c7bbe219303 |
| BLAKE2b-256 | d2d66c10ce84424a1a35e6b0dcada645a0307308518aef8fefe0eab04b6cc2a1 |


File details

Details for the file intel_extension_for_pytorch-2.5.0-cp310-cp310-manylinux2014_x86_64.whl.

File hashes

Hashes for intel_extension_for_pytorch-2.5.0-cp310-cp310-manylinux2014_x86_64.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | cf52a72245589a86917adea2c41a0639b66458be4a19d4b2127fc582b95be421 |
| MD5 | 063825f1f12688777bfa6dc1dfeb5814 |
| BLAKE2b-256 | 55e8e23c6127eeb339296db9e2a1501c8f836f207d90b6ae831336a3a908237c |


File details

Details for the file intel_extension_for_pytorch-2.5.0-cp39-cp39-manylinux2014_x86_64.whl.

File hashes

Hashes for intel_extension_for_pytorch-2.5.0-cp39-cp39-manylinux2014_x86_64.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | ae1a33a06c0ead41905f80ca0f485541f230781fe0705429d1d40c77b7d9982b |
| MD5 | dd2027dcef5d8c87dfabef3fe195166c |
| BLAKE2b-256 | f81aab251341f5088afc61b62cb51fb57ec0faf263622bf838859a9d49e52ec0 |

