Intel® Extension for PyTorch*

Project description

CPU: 💻 main branch | 🌱 Quick Start | 📖 Documentation | 🏃 Installation | 💻 LLM Example
GPU: 💻 main branch | 🌱 Quick Start | 📖 Documentation | 🏃 Installation | 💻 LLM Example

Intel® Extension for PyTorch* extends PyTorch* with the latest feature optimizations for an extra performance boost on Intel hardware. These optimizations take advantage of Intel® Advanced Vector Extensions 512 (Intel® AVX-512), Vector Neural Network Instructions (VNNI), and Intel® Advanced Matrix Extensions (Intel® AMX) on Intel CPUs, as well as Intel® Xe Matrix Extensions (XMX) AI engines on Intel discrete GPUs. Moreover, Intel® Extension for PyTorch* provides easy GPU acceleration for Intel discrete GPUs through the PyTorch* xpu device.

ipex.llm - Large Language Models (LLMs) Optimization

In the current technological landscape, Generative AI (GenAI) workloads and models have gained widespread attention and popularity, and Large Language Models (LLMs) have emerged as the dominant models driving these GenAI applications. Starting with release 2.1.0, Intel® Extension for PyTorch* introduces optimizations for specific LLM models. See the LLM optimizations page for details.

Optimized Model List

A long list of LLMs is supported, including notable open-source models like the Llama series, Qwen series, and Phi-3/Phi-4 series, as well as the high-quality reasoning model DeepSeek-R1.

MODEL FAMILY  MODEL NAME (Hugging Face hub)
LLAMA         meta-llama/Llama-2-7b-hf
LLAMA         meta-llama/Llama-2-13b-hf
LLAMA         meta-llama/Llama-2-70b-hf
LLAMA         meta-llama/Meta-Llama-3-8B
LLAMA         meta-llama/Meta-Llama-3-70B
LLAMA         meta-llama/Meta-Llama-3.1-8B-Instruct
LLAMA         meta-llama/Llama-3.2-3B-Instruct
LLAMA         meta-llama/Llama-3.2-11B-Vision-Instruct
GPT-J         EleutherAI/gpt-j-6b
GPT-NEOX      EleutherAI/gpt-neox-20b
DOLLY         databricks/dolly-v2-12b
FALCON        tiiuae/falcon-7b
FALCON        tiiuae/falcon-11b
FALCON        tiiuae/falcon-40b
FALCON        tiiuae/Falcon3-7B-Instruct
OPT           facebook/opt-30b
OPT           facebook/opt-1.3b
Bloom         bigscience/bloom-1b7
CodeGen       Salesforce/codegen-2B-multi
Baichuan      baichuan-inc/Baichuan2-7B-Chat
Baichuan      baichuan-inc/Baichuan2-13B-Chat
Baichuan      baichuan-inc/Baichuan-13B-Chat
ChatGLM       THUDM/chatglm3-6b
ChatGLM       THUDM/chatglm2-6b
GPTBigCode    bigcode/starcoder
T5            google/flan-t5-xl
MPT           mosaicml/mpt-7b
Mistral       mistralai/Mistral-7B-v0.1
Mixtral       mistralai/Mixtral-8x7B-v0.1
Stablelm      stabilityai/stablelm-2-1_6b
Qwen          Qwen/Qwen-7B-Chat
Qwen          Qwen/Qwen2-7B
Qwen          Qwen/Qwen2.5-7B-Instruct
Qwen          Qwen/Qwen3-14B
Qwen          Qwen/Qwen3-30B-A3B
LLaVA         liuhaotian/llava-v1.5-7b
GIT           microsoft/git-base
Yuan          IEITYuan/Yuan2-102B-hf
Phi           microsoft/phi-2
Phi           microsoft/Phi-3-mini-4k-instruct
Phi           microsoft/Phi-3-mini-128k-instruct
Phi           microsoft/Phi-3-medium-4k-instruct
Phi           microsoft/Phi-3-medium-128k-instruct
Phi           microsoft/Phi-4-mini-instruct
Phi           microsoft/Phi-4-multimodal-instruct
Whisper       openai/whisper-large-v2
Whisper       openai/whisper-large-v3
Maira         microsoft/maira-2
Jamba         ai21labs/Jamba-v0.1
DeepSeek      deepseek-ai/DeepSeek-V2.5-1210
DeepSeek      meituan/DeepSeek-R1-Channel-INT8

Per-model support for FP32, BF16, weight-only quantization INT8, and weight-only quantization INT4 is detailed in the documentation.

Note: The verified models above (including other models from the same family, such as "codellama/CodeLlama-7b-hf" from the LLAMA family) are well supported with optimizations like indirect-access KV cache, fused RoPE, and customized linear kernels. Work is in progress to better support the listed models with various data types, and more models will be optimized in the future.

In addition, since release 2.3.0 Intel® Extension for PyTorch* provides module-level optimization APIs (a prototype feature). These APIs offer optimized alternatives for several commonly used LLM modules and functionalities, so that niche or customized LLMs can also benefit from the optimizations. Read the LLM module level optimization practice guide to learn how to optimize your own LLM and achieve better performance.

Support

The team tracks bugs and enhancement requests using GitHub issues. Before submitting a suggestion or bug report, search the existing GitHub issues to see if your issue has already been reported.

License

Apache License, Version 2.0, as found in the LICENSE file.

Security

See Intel's Security Center for information on how to report a potential security issue or vulnerability.

See also: Security Policy


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
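In most cases there is no need to pick a wheel by hand; for a CPU-only setup, pip selects the right one (the pinned version below matches the wheels listed on this page; check the installation guide for the exact command for your environment, especially for XPU builds):

```shell
# Install the CPU build from PyPI; pip picks the wheel matching
# your Python version and platform.
python -m pip install intel-extension-for-pytorch==2.8.0

# Verify the install:
python -c "import intel_extension_for_pytorch as ipex; print(ipex.__version__)"
```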

Source Distributions

No source distribution files are available for this release. See the tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.
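Per PEP 427, a wheel filename encodes the distribution name, version, and compatibility tags, separated by hyphens. The cp313 wheel below parses like this:

```python
# Split a wheel filename into its PEP 427 components:
# {distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl
name = "intel_extension_for_pytorch-2.8.0-cp313-cp313-manylinux_2_28_x86_64.whl"
dist, version, py_tag, abi_tag, plat_tag = name[: -len(".whl")].split("-")

print(dist)      # intel_extension_for_pytorch
print(version)   # 2.8.0
print(py_tag)    # cp313 -> CPython 3.13
print(plat_tag)  # manylinux_2_28_x86_64 -> glibc 2.28+, x86-64
```

(Underscores inside the distribution and platform fields are safe to split on hyphens, since PEP 427 escapes hyphens in those fields as underscores.)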

intel_extension_for_pytorch-2.8.0-cp313-cp313-manylinux_2_28_x86_64.whl (50.0 MB)

Uploaded: CPython 3.13, manylinux: glibc 2.28+, x86-64

intel_extension_for_pytorch-2.8.0-cp312-cp312-manylinux_2_28_x86_64.whl (50.0 MB)

Uploaded: CPython 3.12, manylinux: glibc 2.28+, x86-64

intel_extension_for_pytorch-2.8.0-cp311-cp311-manylinux_2_28_x86_64.whl (49.9 MB)

Uploaded: CPython 3.11, manylinux: glibc 2.28+, x86-64

intel_extension_for_pytorch-2.8.0-cp310-cp310-manylinux_2_28_x86_64.whl (49.9 MB)

Uploaded: CPython 3.10, manylinux: glibc 2.28+, x86-64

intel_extension_for_pytorch-2.8.0-cp39-cp39-manylinux_2_28_x86_64.whl (49.9 MB)

Uploaded: CPython 3.9, manylinux: glibc 2.28+, x86-64

File details

Details for the file intel_extension_for_pytorch-2.8.0-cp313-cp313-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for intel_extension_for_pytorch-2.8.0-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 68a0114aaa793d87609a35b1b39ea4f6323cade48c8664626fb36bc45eb8015c
MD5 2fec319d5ece808315edabee01b14771
BLAKE2b-256 6cfd0da69378e8be2c3e867277d79928cc56bfc9638efe4582d57753d340ba32

See more details on using hashes here.
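To check a downloaded wheel against a published digest, hash the file locally and compare. A minimal sketch, using the SHA256 digest of the cp313 wheel above:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large wheels don't load into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

expected = "68a0114aaa793d87609a35b1b39ea4f6323cade48c8664626fb36bc45eb8015c"
# After downloading the wheel:
# assert sha256_of(
#     "intel_extension_for_pytorch-2.8.0-cp313-cp313-manylinux_2_28_x86_64.whl"
# ) == expected
```

pip can enforce this automatically: pin each requirement with its hash in a requirements file and install with pip's hash-checking mode (pip install --require-hashes -r requirements.txt).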

File details

Details for the file intel_extension_for_pytorch-2.8.0-cp312-cp312-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for intel_extension_for_pytorch-2.8.0-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 4c0d6ba2c61712add4ff7a168cd80326137f7279a5496f2f9ca2922c318e0cdf
MD5 dc21db8f8bbf6bae9123bbf74008b5da
BLAKE2b-256 642acb48cb2a6a0b2917c85f3489af3e4dcf8e26ca74e4ffb99b8ec80498c2d5

See more details on using hashes here.

File details

Details for the file intel_extension_for_pytorch-2.8.0-cp311-cp311-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for intel_extension_for_pytorch-2.8.0-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 e7b8a1daccf23ad4eccb33f4b2f6f110d65feddcecbf8aae70f351c2237fab70
MD5 0e3a36d5136895d5e635aecfefcf3431
BLAKE2b-256 dce0c7d9da40b767e02164ca4e846ee895ef71b3ee35b00eb620931b77907a9b

See more details on using hashes here.

File details

Details for the file intel_extension_for_pytorch-2.8.0-cp310-cp310-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for intel_extension_for_pytorch-2.8.0-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 216bbbc2f195c188890890868f9914f64825f08d831c7c63594cbcfc6c62aceb
MD5 c06e6fb2196b9efb1a7896f3d3c0e215
BLAKE2b-256 0f420e88a2bf7c1b9044524ec9d49fd49b097931bbd0d7f92e6eb9d6f1846d94

See more details on using hashes here.

File details

Details for the file intel_extension_for_pytorch-2.8.0-cp39-cp39-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for intel_extension_for_pytorch-2.8.0-cp39-cp39-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 7e0c48721463ffae7615b3a403ec7e21fdd8f000952240fe74b362b37295216d
MD5 40adb13f0784b80a1031d1d6edf7be81
BLAKE2b-256 408cf91bb72e3167738d01958fdb3022fb02fba9d144c40ceae657c0c2681b9e

See more details on using hashes here.
