Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application.
Intel(R) Extension for Scikit-learn*
Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application. The acceleration is achieved through the use of the Intel(R) oneAPI Data Analytics Library (oneDAL). Patching scikit-learn makes it a well-suited machine learning framework for dealing with real-life problems.
⚠️ Intel(R) Extension for Scikit-learn contains the scikit-learn patching functionality that was originally available in the daal4py package. All future updates for the patches will be available only in Intel(R) Extension for Scikit-learn. We recommend using the scikit-learn-intelex package instead of daal4py. You can learn more about daal4py in the daal4py documentation.
👀 Follow us on Medium
We publish blogs on Medium, so follow us to learn tips and tricks for more efficient data analysis with the help of Intel(R) Extension for Scikit-learn. Here are our latest blogs:
- Intel Gives Scikit-Learn the Performance Boost Data Scientists Need
- From Hours to Minutes: 600x Faster SVM
- Improve the Performance of XGBoost and LightGBM Inference
- Accelerate Kaggle Challenges Using Intel AI Analytics Toolkit
- Accelerate Your scikit-learn Applications
- Accelerate Linear Models for Machine Learning
- Accelerate K-Means Clustering
🔗 Important links
- Documentation
- scikit-learn API and patching
- Benchmark code
- Building from Sources
- About Intel(R) oneAPI Data Analytics Library
- About Intel(R) daal4py
💬 Support
Report issues, ask questions, and provide suggestions using:
You may reach out to project maintainers privately at onedal.maintainers@intel.com
🛠 Installation
Intel(R) Extension for Scikit-learn is available from the Python Package Index and from Anaconda Cloud in the Conda-Forge and Intel channels.
# PyPI (recommended by default)
pip install scikit-learn-intelex
# Anaconda Cloud from Conda-Forge channel (recommended for conda users by default)
conda install scikit-learn-intelex -c conda-forge
# Anaconda Cloud from Intel channel (recommended for Intel® Distribution for Python users)
conda install scikit-learn-intelex -c intel
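After installing, it can be useful to confirm that the package is importable before relying on it. Note that the import name `sklearnex` differs from the distribution name `scikit-learn-intelex`. The sketch below uses only the Python standard library; the helper name `sklearnex_available` is hypothetical and not part of the package.

```python
import importlib.util


def sklearnex_available() -> bool:
    """Return True if the `sklearnex` module can be found in this environment.

    Hypothetical helper for illustration: it only checks that the module is
    discoverable and does not import or patch anything.
    """
    return importlib.util.find_spec("sklearnex") is not None


print("sklearnex installed:", sklearnex_available())
```

A check like this lets a script decide at startup whether to call the patching function or run with stock scikit-learn.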
ℹ️ Supported configurations
📦 PyPI channel
OS / Python version | Python 3.6 | Python 3.7 | Python 3.8 | Python 3.9 |
---|---|---|---|---|
Linux | [CPU, GPU] | [CPU, GPU] | [CPU, GPU] | ❌ |
Windows | [CPU, GPU] | [CPU, GPU] | [CPU, GPU] | ❌ |
macOS | [CPU] | [CPU] | [CPU] | ❌ |
📦 Anaconda Cloud: Conda-Forge channel
OS / Python version | Python 3.6 | Python 3.7 | Python 3.8 | Python 3.9 |
---|---|---|---|---|
Linux | [CPU] | [CPU] | [CPU] | [CPU] |
Windows | [CPU] | [CPU] | [CPU] | [CPU] |
macOS | [CPU] | [CPU] | [CPU] | [CPU] |
📦 Anaconda Cloud: Intel channel
OS / Python version | Python 3.6 | Python 3.7 | Python 3.8 | Python 3.9 |
---|---|---|---|---|
Linux | [CPU, GPU] | [CPU, GPU] | [CPU, GPU] | ❌ |
Windows | [CPU, GPU] | [CPU, GPU] | [CPU, GPU] | ❌ |
macOS | [CPU] | [CPU] | [CPU] | ❌ |
You can build the package from sources as well.
⚡️ Get Started
Intel CPU optimizations patching
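import numpy as np
from sklearnex import patch_sklearn
# Patch scikit-learn before importing any estimators from it
patch_sklearn()
from sklearn.cluster import DBSCAN
X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
# Fit as usual; the patched implementation is used when the parameters are supported
clustering = DBSCAN(eps=3, min_samples=2).fit(X)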
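If the same script must also run where the extension is not installed, the patch call can be wrapped in an import guard. This is a minimal sketch, assuming numpy and scikit-learn are installed; the function name `fit_dbscan` and the `patched` flag are illustrative and not part of either package.

```python
import numpy as np

# Patch only if the extension is present; otherwise use stock scikit-learn.
try:
    from sklearnex import patch_sklearn
    patch_sklearn()
    patched = True
except ImportError:
    patched = False

# Import estimators after the patch attempt so patched classes are picked up.
from sklearn.cluster import DBSCAN


def fit_dbscan(points: np.ndarray) -> np.ndarray:
    """Cluster `points` with the same settings as the example above."""
    return DBSCAN(eps=3, min_samples=2).fit(points).labels_


X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
labels = fit_dbscan(X)
print("patched:", patched, "labels:", labels.tolist())
```

For this data the first three points form one cluster, the next two form a second, and the last point is labelled as noise (`-1`). The `patched` flag only records that the patch call succeeded, not which implementation ran for a given call.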
Intel GPU optimizations patching
import numpy as np
from sklearnex import patch_sklearn
from daal4py.oneapi import sycl_context
# Patch scikit-learn before importing any estimators from it
patch_sklearn()
from sklearn.cluster import DBSCAN
X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
# Run the fit inside the GPU context
with sycl_context("gpu"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
🚀 Scikit-learn patching
Speedups of Intel(R) Extension for Scikit-learn over the original Scikit-learn |
---|
Technical details: float type: float64; HW: Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz, 2 sockets, 28 cores per socket; SW: scikit-learn 0.23.1, Intel® oneDAL (2021.1 Beta 10), benchmark code |
Intel(R) Extension for Scikit-learn patching affects the performance of the specific Scikit-learn functionality listed below. When unsupported parameters are used, the package falls back to the original Scikit-learn. These limitations are described below. If the patching does not cover your scenarios, submit an issue on GitHub.
🔥 Applying the patching will impact the following existing scikit-learn algorithms:
Task | Functionality | Parameters support | Data support |
---|---|---|---|
Classification | SVC | All parameters except kernel = 'poly' and 'sigmoid'. | No limitations. |
Classification | RandomForestClassifier | All parameters except warm_start = True, ccp_alpha != 0, criterion != 'gini'. | Multi-output and sparse data are not supported. |
Classification | KNeighborsClassifier | All parameters except metric != 'euclidean' or 'minkowski' with p != 2. | Multi-output and sparse data are not supported. |
Classification | LogisticRegression / LogisticRegressionCV | All parameters except solver != 'lbfgs' or 'newton-cg', class_weight != None, sample_weight != None. | Only dense data is supported. |
Regression | RandomForestRegressor | All parameters except warm_start = True, ccp_alpha != 0, criterion != 'mse'. | Multi-output and sparse data are not supported. |
Regression | KNeighborsRegressor | All parameters except metric != 'euclidean' or 'minkowski' with p != 2. | Sparse data is not supported. |
Regression | LinearRegression | All parameters except normalize != False and sample_weight != None. | Only dense data is supported, #observations should be >= #features. |
Regression | Ridge | All parameters except normalize != False, solver != 'auto' and sample_weight != None. | Only dense data is supported, #observations should be >= #features. |
Regression | ElasticNet | All parameters except sample_weight != None. | Multi-output and sparse data are not supported, #observations should be >= #features. |
Regression | Lasso | All parameters except sample_weight != None. | Multi-output and sparse data are not supported, #observations should be >= #features. |
Clustering | KMeans | All parameters except precompute_distances and sample_weight != None. | No limitations. |
Clustering | DBSCAN | All parameters except metric != 'euclidean' or 'minkowski' with p != 2, algorithm != 'brute' or 'auto'. | Only dense data is supported. |
Dimensionality reduction | PCA | All parameters except svd_solver != 'full'. | No limitations. |
Dimensionality reduction | TSNE | All parameters except metric != 'euclidean' or 'minkowski' with p != 2. | Sparse data is not supported. |
Unsupervised | NearestNeighbors | All parameters except metric != 'euclidean' or 'minkowski' with p != 2. | Sparse data is not supported. |
Other | train_test_split | All parameters are supported. | Only dense data is supported. |
Other | assert_all_finite | All parameters are supported. | Only dense data is supported. |
Other | pairwise_distances | Only metric = 'cosine' and 'correlation' are supported. | Only dense data is supported. |
Other | roc_auc_score | Parameters average, sample_weight, max_fpr and multi_class are not supported. | No limitations. |
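Each row of the table can be read as a predicate over estimator parameters. As an illustration, the sketch below encodes only the DBSCAN row. The function `dbscan_params_supported` is hypothetical: it mirrors this table rather than the package's internal dispatch logic, ignores the data-support column, and defaults `p` to 2 for simplicity.

```python
def dbscan_params_supported(metric: str = "euclidean",
                            p: float = 2,
                            algorithm: str = "auto") -> bool:
    """Return True if the parameters fall inside the DBSCAN row of the table.

    Supported per the table: metric 'euclidean', or 'minkowski' with p == 2,
    combined with algorithm 'brute' or 'auto'. Illustrative only.
    """
    metric_ok = metric == "euclidean" or (metric == "minkowski" and p == 2)
    algorithm_ok = algorithm in ("brute", "auto")
    return metric_ok and algorithm_ok


print(dbscan_params_supported())                     # default parameters
print(dbscan_params_supported(metric="manhattan"))   # unsupported metric
print(dbscan_params_supported(algorithm="kd_tree"))  # unsupported algorithm
```

A check like this only predicts what the table says; the verbose mode described further down reports which implementation was actually used.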
⚠️ We support optimizations for the last four versions of scikit-learn. The latest release of Intel(R) Extension for Scikit-learn 2021.2.X supports scikit-learn 0.21.X, 0.22.X, 0.23.X and 0.24.X.
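To check an installed scikit-learn version against the supported series listed above, a simple string comparison on the major and minor components is enough. This is a minimal sketch; the names `SUPPORTED_SERIES` and `sklearn_series_supported` are hypothetical, and the parser deliberately treats anything it cannot read (for example a pre-release tag fused to the minor number) as unsupported.

```python
# Release series named in the note above for the 2021.2.X extension release.
SUPPORTED_SERIES = {(0, 21), (0, 22), (0, 23), (0, 24)}


def sklearn_series_supported(version: str) -> bool:
    """Return True if `version` (e.g. '0.23.1') is in a supported series."""
    parts = version.split(".")
    try:
        series = (int(parts[0]), int(parts[1]))
    except (IndexError, ValueError):
        # Too few components or non-numeric parts: treat as unsupported.
        return False
    return series in SUPPORTED_SERIES


print(sklearn_series_supported("0.23.1"))
print(sklearn_series_supported("0.20.4"))
```

In a real script the argument would typically be `sklearn.__version__`.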
📜 Intel(R) Extension for Scikit-learn verbose
To find out which implementation of the algorithm is currently used (Intel(R) Extension for Scikit-learn or original Scikit-learn), set the environment variable:
- On Linux and macOS:
export SKLEARNEX_VERBOSE=INFO
- On Windows:
set SKLEARNEX_VERBOSE=INFO
For example, for DBSCAN you get one of these print statements depending on which implementation is used:
SKLEARNEX INFO: sklearn.cluster.DBSCAN.fit: running accelerated version on CPU
SKLEARNEX INFO: sklearn.cluster.DBSCAN.fit: fallback to original Scikit-learn
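When the verbose output is captured to a log, the two message forms shown above can be told apart programmatically. The sketch below assumes lines follow exactly the format of those two examples; the function name `parse_verbose_line` is hypothetical and not part of the package.

```python
from typing import Optional, Tuple

PREFIX = "SKLEARNEX INFO: "


def parse_verbose_line(line: str) -> Optional[Tuple[str, bool]]:
    """Return (call_name, accelerated) for a verbose line, or None otherwise.

    `accelerated` is True for "running accelerated version ..." messages and
    False for "fallback to original Scikit-learn" messages.
    """
    if not line.startswith(PREFIX):
        return None
    body = line[len(PREFIX):].strip()
    # Split "sklearn.cluster.DBSCAN.fit: message" into name and message.
    name, sep, message = body.partition(": ")
    if not sep:
        return None
    if message.startswith("running accelerated version"):
        return name, True
    if message.startswith("fallback to original Scikit-learn"):
        return name, False
    return None


sample = "SKLEARNEX INFO: sklearn.cluster.DBSCAN.fit: running accelerated version on CPU"
print(parse_verbose_line(sample))
```

Counting the `False` results over a run gives a quick view of how often calls fell back to the original implementation.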