
Intel® Extension for PyTorch*

Project description

CPU: 💻 main branch   |   🌱 Quick Start   |   📖 Documentation   |   🏃 Installation   |   💻 LLM Example
GPU: 💻 main branch   |   🌱 Quick Start   |   📖 Documentation   |   🏃 Installation   |   💻 LLM Example

Intel® Extension for PyTorch* extends PyTorch* with up-to-date feature optimizations for an extra performance boost on Intel hardware. These optimizations take advantage of Intel® Advanced Vector Extensions 512 (Intel® AVX-512) Vector Neural Network Instructions (VNNI) and Intel® Advanced Matrix Extensions (Intel® AMX) on Intel CPUs, as well as Intel Xe Matrix Extensions (XMX) AI engines on Intel discrete GPUs. Moreover, Intel® Extension for PyTorch* provides easy GPU acceleration for Intel discrete GPUs through the PyTorch* xpu device.

ipex.llm - Large Language Models (LLMs) Optimization

In the current technological landscape, Generative AI (GenAI) workloads and models have gained widespread attention and popularity. Large Language Models (LLMs) have emerged as the dominant models driving these GenAI applications. Starting from release 2.1.0, specific optimizations for certain LLM models have been introduced in Intel® Extension for PyTorch*. Check LLM optimizations for details.

Optimized Model List

We support a long list of LLMs, including the most notable open-source models such as the Llama series, Qwen series, Phi-3/Phi-4 series, and the high-quality reasoning model DeepSeek-R1.

| MODEL FAMILY | MODEL NAME (Huggingface hub) |
| --- | --- |
| LLAMA | meta-llama/Llama-2-7b-hf |
| LLAMA | meta-llama/Llama-2-13b-hf |
| LLAMA | meta-llama/Llama-2-70b-hf |
| LLAMA | meta-llama/Meta-Llama-3-8B |
| LLAMA | meta-llama/Meta-Llama-3-70B |
| LLAMA | meta-llama/Meta-Llama-3.1-8B-Instruct |
| LLAMA | meta-llama/Llama-3.2-3B-Instruct |
| LLAMA | meta-llama/Llama-3.2-11B-Vision-Instruct |
| GPT-J | EleutherAI/gpt-j-6b |
| GPT-NEOX | EleutherAI/gpt-neox-20b |
| DOLLY | databricks/dolly-v2-12b |
| FALCON | tiiuae/falcon-7b |
| FALCON | tiiuae/falcon-11b |
| FALCON | tiiuae/falcon-40b |
| FALCON | tiiuae/Falcon3-7B-Instruct |
| OPT | facebook/opt-30b |
| OPT | facebook/opt-1.3b |
| Bloom | bigscience/bloom-1b7 |
| CodeGen | Salesforce/codegen-2B-multi |
| Baichuan | baichuan-inc/Baichuan2-7B-Chat |
| Baichuan | baichuan-inc/Baichuan2-13B-Chat |
| Baichuan | baichuan-inc/Baichuan-13B-Chat |
| ChatGLM | THUDM/chatglm3-6b |
| ChatGLM | THUDM/chatglm2-6b |
| GPTBigCode | bigcode/starcoder |
| T5 | google/flan-t5-xl |
| MPT | mosaicml/mpt-7b |
| Mistral | mistralai/Mistral-7B-v0.1 |
| Mixtral | mistralai/Mixtral-8x7B-v0.1 |
| Stablelm | stabilityai/stablelm-2-1_6b |
| Qwen | Qwen/Qwen-7B-Chat |
| Qwen | Qwen/Qwen2-7B |
| Qwen | Qwen/Qwen2.5-7B-Instruct |
| LLaVA | liuhaotian/llava-v1.5-7b |
| GIT | microsoft/git-base |
| Yuan | IEITYuan/Yuan2-102B-hf |
| Phi | microsoft/phi-2 |
| Phi | microsoft/Phi-3-mini-4k-instruct |
| Phi | microsoft/Phi-3-mini-128k-instruct |
| Phi | microsoft/Phi-3-medium-4k-instruct |
| Phi | microsoft/Phi-3-medium-128k-instruct |
| Phi | microsoft/Phi-4-mini-instruct |
| Phi | microsoft/Phi-4-multimodal-instruct |
| Whisper | openai/whisper-large-v2 |
| Maira | microsoft/maira-2 |
| Jamba | ai21labs/Jamba-v0.1 |
| DeepSeek | deepseek-ai/DeepSeek-V2.5-1210 |
| DeepSeek | meituan/DeepSeek-R1-Channel-INT8 |

Note: The verified models above (including other models in the same model family, like "codellama/CodeLlama-7b-hf" from the LLAMA family) are well supported with all optimizations, such as indirect-access KV cache, fused RoPE, and customized linear kernels, across data types including FP32, BF16, and weight-only quantization INT8/INT4. Work is in progress to better support the models in the tables with various data types, and more models will be optimized in the future.

In addition, since release 2.3.0 Intel® Extension for PyTorch* provides module-level optimization APIs (a prototype feature). This feature offers optimized alternatives for several commonly used LLM modules and functionalities, which can be used to optimize niche or customized LLMs. Please read the LLM module level optimization practice to better understand how to optimize your own LLM and achieve better performance.

Support

The team tracks bugs and enhancement requests using GitHub issues. Before submitting a suggestion or bug report, search the existing GitHub issues to see if your issue has already been reported.

License

Apache License, Version 2.0, as found in the LICENSE file.

Security

See Intel's Security Center for information on how to report a potential security issue or vulnerability.

See also: Security Policy


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files are available for this release. See the tutorial on generating distribution archives.

Built Distributions

| File | Size | Uploaded for |
| --- | --- | --- |
| intel_extension_for_pytorch-2.7.0-cp313-cp313-manylinux_2_28_x86_64.whl | 104.7 MB | CPython 3.13, manylinux: glibc 2.28+, x86-64 |
| intel_extension_for_pytorch-2.7.0-cp312-cp312-manylinux_2_28_x86_64.whl | 104.7 MB | CPython 3.12, manylinux: glibc 2.28+, x86-64 |
| intel_extension_for_pytorch-2.7.0-cp311-cp311-manylinux_2_28_x86_64.whl | 104.7 MB | CPython 3.11, manylinux: glibc 2.28+, x86-64 |
| intel_extension_for_pytorch-2.7.0-cp310-cp310-manylinux_2_28_x86_64.whl | 104.7 MB | CPython 3.10, manylinux: glibc 2.28+, x86-64 |
| intel_extension_for_pytorch-2.7.0-cp39-cp39-manylinux_2_28_x86_64.whl | 104.7 MB | CPython 3.9, manylinux: glibc 2.28+, x86-64 |
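Picking the right wheel comes down to matching the `cp3XX` tag in the filename against your running interpreter. A stdlib-only sketch of that check (the helper names are ours, not part of the package; wheel filenames follow the `name-version-pytag-abitag-platform.whl` layout from PEP 427):

```python
import sys

def wheel_python_tag(wheel_filename: str) -> str:
    """Extract the python tag (e.g. 'cp313') from a PEP 427 wheel filename."""
    # Layout: name-version-python_tag-abi_tag-platform_tag.whl
    stem = wheel_filename[: -len(".whl")]
    return stem.split("-")[-3]

def matches_current_interpreter(wheel_filename: str) -> bool:
    """True when the wheel's CPython tag matches the running interpreter."""
    current = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return wheel_python_tag(wheel_filename) == current

wheel = "intel_extension_for_pytorch-2.7.0-cp313-cp313-manylinux_2_28_x86_64.whl"
print(wheel_python_tag(wheel))  # cp313
```

In practice `pip` performs this tag matching automatically; the sketch only illustrates why a `cp39` wheel will not install on Python 3.13.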

File details

Details for the file intel_extension_for_pytorch-2.7.0-cp313-cp313-manylinux_2_28_x86_64.whl.

File hashes

Hashes for intel_extension_for_pytorch-2.7.0-cp313-cp313-manylinux_2_28_x86_64.whl:

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | e2dd12f4e102ac3825a68076c6a901e570d3b4ff9a0588582c80ace9e5c8cb31 |
| MD5 | b06c7ee74667627313346eac9d6245c0 |
| BLAKE2b-256 | da23e4f28f9935bb344b2ecedb17b7c63dcbcd5148df43196239c2fb4b8e024a |

See more details on using hashes here.
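A downloaded wheel can be checked against the published digests before installation. A minimal stdlib-only sketch (the function name is illustrative, not part of the package):

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file in chunks and return its SHA256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Expected digest for the cp313 wheel, taken from the table above.
EXPECTED = "e2dd12f4e102ac3825a68076c6a901e570d3b4ff9a0588582c80ace9e5c8cb31"

# After downloading the wheel, compare the digests:
# assert sha256_of_file(
#     "intel_extension_for_pytorch-2.7.0-cp313-cp313-manylinux_2_28_x86_64.whl"
# ) == EXPECTED
```

Alternatively, `pip install --require-hashes -r requirements.txt` enforces the same check automatically when the requirements file pins `--hash=sha256:...` for each package.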

File details

Details for the file intel_extension_for_pytorch-2.7.0-cp312-cp312-manylinux_2_28_x86_64.whl.

File hashes

Hashes for intel_extension_for_pytorch-2.7.0-cp312-cp312-manylinux_2_28_x86_64.whl:

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 1ea073477c1910633ecbd3dd45fd12a243aec491c7d27d493c63a4bb91823b58 |
| MD5 | d050e53612b563b41a9913619bec33b1 |
| BLAKE2b-256 | 67cce0218398c3a3aecbe9db285e79fdfad4927b5abbb3df4b52d94ad117e2c3 |

File details

Details for the file intel_extension_for_pytorch-2.7.0-cp311-cp311-manylinux_2_28_x86_64.whl.

File hashes

Hashes for intel_extension_for_pytorch-2.7.0-cp311-cp311-manylinux_2_28_x86_64.whl:

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 7f11ddc6bfd4d845ef7215d5c61b3298ca7d9f23a76cfb1e686537cf035b47ec |
| MD5 | 70cbf804f2e39a2cb31cfa8ee5002ad8 |
| BLAKE2b-256 | 5eba04cf1c89cef2ac88dfdf4609badccf49de6a69abebdb92ec88e227eaa61d |

File details

Details for the file intel_extension_for_pytorch-2.7.0-cp310-cp310-manylinux_2_28_x86_64.whl.

File hashes

Hashes for intel_extension_for_pytorch-2.7.0-cp310-cp310-manylinux_2_28_x86_64.whl:

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | aa0a47d0d7def0d0ef7f6f1a2e4978266e62fa54e753685d8533cf98c214b525 |
| MD5 | 28eeac848db397af2987b4a1fa8a0a04 |
| BLAKE2b-256 | d9e068713c90fba13c59838086ec6e8c95d24f956e2018515d80b89069399407 |

File details

Details for the file intel_extension_for_pytorch-2.7.0-cp39-cp39-manylinux_2_28_x86_64.whl.

File hashes

Hashes for intel_extension_for_pytorch-2.7.0-cp39-cp39-manylinux_2_28_x86_64.whl:

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 88e0f6684f0088beb10123c87a07054cef88a85d643ccffa355b0783aaddca35 |
| MD5 | 7a3ac82634bee7dce72acc00d4955912 |
| BLAKE2b-256 | c152f15bcd530e19fa8a67a1def4a39659141024b6d127b0e56397a5c6dab906 |
