Intel(R) Extension for Scikit-learn* speeds up scikit-learn by providing drop-in patching. Acceleration is achieved through the Intel(R) oneAPI Data Analytics Library (oneDAL), which enables fast use of the framework for data scientists and machine learning users.

Intel(R) Extension for Scikit-learn*

Join the community on GitHub Discussions

⚠️ Intel(R) Extension for Scikit-learn contains the scikit-learn patching functionality originally available in the daal4py package. All future updates to the patching will be available in Intel(R) Extension for Scikit-learn only. Please use this package instead of daal4py.

Running the full latest scikit-learn test suite with Intel(R) Extension for Scikit-learn: CircleCI

👀 Follow us on Medium

We publish blogs on Medium, so follow us to learn tips and tricks for more efficient data analysis with the help of Intel(R) Extension for Scikit-learn. Here are our latest blogs:

🔗 Important links

💬 Support

Report issues, ask questions, and provide suggestions using:

You may reach out to project maintainers privately at onedal.maintainers@intel.com

🛠 Installation

Intel(R) Extension for Scikit-learn is available on the Python Package Index (PyPI) and in the Intel channel on Anaconda Cloud.

# PyPI (recommended by default)
pip install scikit-learn-intelex
# Anaconda Cloud from Intel channel (recommended for Intel® Distribution for Python users)
conda install scikit-learn-intelex -c intel
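
After installing, it can be useful to confirm that the packages are visible to the interpreter you actually run. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of the package. It checks for the `sklearnex` and `daal4py` modules used in the examples on this page, plus `sklearn` itself, without importing them.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report which of the modules used on this page are available.
for name in ("sklearnex", "daal4py", "sklearn"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If `sklearnex` is reported as missing, the most likely cause is that `pip` or `conda` installed into a different environment than the one running the script.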
ℹ️ Supported configurations

📦 PyPI channel

OS / Python version | Python 3.6 | Python 3.7 | Python 3.8 | Python 3.9
Linux | [CPU, GPU] | [CPU, GPU] | [CPU, GPU] | -
Windows | [CPU, GPU] | [CPU, GPU] | [CPU, GPU] | -
OsX | [CPU] | [CPU] | [CPU] | -

📦 Anaconda Cloud: Intel channel

OS / Python version Python 3.6 Python 3.7 Python 3.8 Python 3.9
Linux [CPU, GPU]
Windows [CPU, GPU]
OsX [CPU]

You can build the package from sources as well.

⚡️ Get Started

Intel CPU optimizations patching

import numpy as np
from sklearnex import patch_sklearn
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
clustering = DBSCAN(eps=3, min_samples=2).fit(X)
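
The example above assumes that `sklearnex` is installed. For scripts that must also run where it is not, one possible pattern is to attempt the patching and continue with stock scikit-learn otherwise. This is only a sketch: `enable_acceleration` is a hypothetical helper, and the only library call it relies on is `patch_sklearn()` from the example above.

```python
def enable_acceleration() -> bool:
    """Try to patch scikit-learn; return True if the patch was applied."""
    try:
        from sklearnex import patch_sklearn
    except ImportError:
        # Extension not installed: stock scikit-learn will be used as-is.
        return False
    patch_sklearn()
    return True


accelerated = enable_acceleration()
print("patched" if accelerated else "running stock scikit-learn")

# As in the example above, import estimators only after the patching attempt,
# for instance: from sklearn.cluster import DBSCAN
```

Keeping the estimator imports after the call mirrors the order used in the Get Started example.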

Intel GPU optimizations patching

import numpy as np
from sklearnex import patch_sklearn
from daal4py.oneapi import sycl_context
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
with sycl_context("gpu"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
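
The GPU example imports `sycl_context` from `daal4py.oneapi`, which may not be importable on every machine. A hedged way to keep one code path is to fall back to a no-op context manager when that import fails. The helper `device_context` below is hypothetical; it only wraps the `sycl_context` call shown above and the standard library's `contextlib.nullcontext`.

```python
from contextlib import nullcontext


def device_context(device: str = "gpu"):
    """Return sycl_context(device) if available, else a do-nothing context."""
    try:
        from daal4py.oneapi import sycl_context
    except ImportError:
        # No oneAPI support in this environment: run on the default device.
        return nullcontext()
    return sycl_context(device)


# Usage mirrors the example above:
# with device_context("gpu"):
#     clustering = DBSCAN(eps=3, min_samples=2).fit(X)
```

The sketch handles only a failed import. Whether a specific device is actually present is a separate question that it does not try to answer.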

🚀 Scikit-learn patching

Speedups of Intel(R) Extension for Scikit-learn over the original Scikit-learn
Technical details: float type: float64; HW: Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz, 2 sockets, 28 cores per socket; SW: scikit-learn 0.23.1, Intel® oneDAL (2021.1 Beta 10)

Intel(R) Extension for Scikit-learn patching affects the performance of the specific Scikit-learn functionality listed below. When unsupported parameters are used, the package falls back to the original Scikit-learn. These limitations are described below. If the patching does not cover your scenarios, submit an issue on GitHub.

🔥 Applying the patching will impact the following existing scikit-learn algorithms:

Task | Functionality | Parameters support | Data support
Classification | SVC | All parameters except kernel = 'poly' and 'sigmoid'. | No limitations.
Classification | RandomForestClassifier | All parameters except warm_start = True, ccp_alpha != 0, and criterion != 'gini'. | Multi-output and sparse data are not supported.
Classification | KNeighborsClassifier | All parameters except metric other than 'euclidean' or 'minkowski' with p = 2. | Multi-output and sparse data are not supported.
Classification | LogisticRegression / LogisticRegressionCV | All parameters except solver other than 'lbfgs' or 'newton-cg', class_weight != None, and sample_weight != None. | Only dense data is supported.
Regression | RandomForestRegressor | All parameters except warm_start = True, ccp_alpha != 0, and criterion != 'mse'. | Multi-output and sparse data are not supported.
Regression | KNeighborsRegressor | All parameters except metric other than 'euclidean' or 'minkowski' with p = 2. | Sparse data is not supported.
Regression | LinearRegression | All parameters except normalize != False and sample_weight != None. | Only dense data is supported; the number of observations should be >= the number of features.
Regression | Ridge | All parameters except normalize != False, solver != 'auto', and sample_weight != None. | Only dense data is supported; the number of observations should be >= the number of features.
Regression | ElasticNet | All parameters except sample_weight != None. | Multi-output and sparse data are not supported; the number of observations should be >= the number of features.
Regression | Lasso | All parameters except sample_weight != None. | Multi-output and sparse data are not supported; the number of observations should be >= the number of features.
Clustering | KMeans | All parameters except precompute_distances and sample_weight != None. | No limitations.
Clustering | DBSCAN | All parameters except metric other than 'euclidean' or 'minkowski' with p = 2. | Only dense data is supported.
Dimensionality reduction | PCA | All parameters except svd_solver != 'full'. | No limitations.
Dimensionality reduction | TSNE | All parameters except metric other than 'euclidean' or 'minkowski' with p = 2. | Sparse data is not supported.
Unsupervised | NearestNeighbors | All parameters except metric other than 'euclidean' or 'minkowski' with p = 2. | Sparse data is not supported.
Other | train_test_split | All parameters are supported. | Only dense data is supported.
Other | assert_all_finite | All parameters are supported. | Only dense data is supported.
Other | pairwise_distances | With metric = 'cosine' or 'correlation'. | Only dense data is supported.
Other | roc_auc_score | Parameters average, sample_weight, max_fpr and multi_class are not supported. | No limitations.
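
To make the fallback rule concrete, the DBSCAN row of the table can be modeled as a small predicate. This is an illustrative sketch only: `dbscan_is_accelerated` is a hypothetical helper, not part of the package, and it encodes a conservative reading of the table (it treats `metric='minkowski'` as supported only when `p` is explicitly 2).

```python
def dbscan_is_accelerated(metric: str = "euclidean", p=None, sparse: bool = False) -> bool:
    """Model the DBSCAN row: dense data, euclidean or minkowski with p = 2."""
    if sparse:
        # The table lists only dense data as supported.
        return False
    if metric == "euclidean":
        return True
    # Conservative assumption: minkowski counts only with an explicit p == 2.
    return metric == "minkowski" and p == 2


print(dbscan_is_accelerated())                          # default parameters
print(dbscan_is_accelerated(metric="manhattan"))        # unsupported metric
print(dbscan_is_accelerated(metric="minkowski", p=2))   # minkowski with p = 2
```

For the actual decision made at run time, the verbose output described in the next section is the authoritative source.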

⚠️ We support optimizations for the last four versions of scikit-learn. The latest release of Intel(R) Extension for Scikit-learn 2021.2 supports scikit-learn 0.21.X, 0.22.X, 0.23.X and 0.24.X.
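
A quick way to check an installed scikit-learn version against the series listed above is to compare its major and minor components. The sketch below is an assumption-laden illustration: `is_supported_sklearn` and `SUPPORTED_SERIES` are hypothetical names, and the set covers only the 2021.2 release described here.

```python
import re

# Series listed above for Intel(R) Extension for Scikit-learn 2021.2.
SUPPORTED_SERIES = {(0, 21), (0, 22), (0, 23), (0, 24)}


def is_supported_sklearn(version: str) -> bool:
    """Return True if the version string falls in a supported X.Y series."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return False
    major_minor = (int(match.group(1)), int(match.group(2)))
    return major_minor in SUPPORTED_SERIES


print(is_supported_sklearn("0.24.1"))
print(is_supported_sklearn("0.20.4"))
```

In practice the string would come from `sklearn.__version__`.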

📜 Intel(R) Extension for Scikit-learn verbose

To find out which implementation of the algorithm is currently used (Intel(R) Extension for Scikit-learn or original Scikit-learn), set the environment variable:

  • On Linux and Mac OS: export SKLEARNEX_VERBOSE=INFO
  • On Windows: set SKLEARNEX_VERBOSE=INFO

For example, for DBSCAN you get one of these print statements depending on which implementation is used:

  • INFO: sklearn.cluster.DBSCAN.fit: uses Intel(R) oneAPI Data Analytics Library solver
  • INFO: sklearn.cluster.DBSCAN.fit: uses original Scikit-learn solver
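
The variable can also be set from Python, and the two message formats above are regular enough to parse. The sketch below sets `SKLEARNEX_VERBOSE` through `os.environ` (doing so before importing `sklearnex` is assumed here to be the safe order) and defines a hypothetical helper, `parse_verbose_line`, that reports which implementation a line refers to. The pattern is derived only from the two sample lines shown above, so other message shapes may not match.

```python
import os
import re

# Equivalent of `export SKLEARNEX_VERBOSE=INFO`; set before importing sklearnex.
os.environ["SKLEARNEX_VERBOSE"] = "INFO"

# Pattern based on the two sample messages shown above.
_VERBOSE_LINE = re.compile(r"^INFO: (?P<func>[\w.]+): uses (?P<impl>.+) solver$")


def parse_verbose_line(line: str):
    """Return (function_name, uses_onedal) or None if the line does not match."""
    match = _VERBOSE_LINE.match(line.strip())
    if match is None:
        return None
    uses_onedal = "oneAPI Data Analytics Library" in match.group("impl")
    return match.group("func"), uses_onedal


sample = "INFO: sklearn.cluster.DBSCAN.fit: uses original Scikit-learn solver"
print(parse_verbose_line(sample))
```

A helper like this can be handy in tests that want to flag an unexpected fallback to the original Scikit-learn solver.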

Read more in the documentation.


Built Distributions

  • scikit_learn_intelex-2021.2.2-py38-none-win_amd64.whl (24.6 kB): Python 3.8, Windows x86-64
  • scikit_learn_intelex-2021.2.2-py38-none-macosx_10_15_x86_64.whl (24.9 kB): Python 3.8, macOS 10.15+ x86-64
  • scikit_learn_intelex-2021.2.2-py37-none-win_amd64.whl (24.6 kB): Python 3.7, Windows x86-64
  • scikit_learn_intelex-2021.2.2-py37-none-macosx_10_15_x86_64.whl (24.9 kB): Python 3.7, macOS 10.15+ x86-64
  • scikit_learn_intelex-2021.2.2-py36-none-win_amd64.whl (24.6 kB): Python 3.6, Windows x86-64
  • scikit_learn_intelex-2021.2.2-py36-none-macosx_10_15_x86_64.whl (24.9 kB): Python 3.6, macOS 10.15+ x86-64
