Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application.

Intel(R) Extension for Scikit-learn*

Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application. The acceleration is achieved through the use of the Intel(R) oneAPI Data Analytics Library (oneDAL). Patching makes scikit-learn well suited to real-life machine learning problems.

⚠️ Intel(R) Extension for Scikit-learn contains the scikit-learn patching functionality that was originally available in the daal4py package. All future updates to the patches will be available only in Intel(R) Extension for Scikit-learn. We recommend using the scikit-learn-intelex package instead of daal4py. You can learn more about daal4py in the daal4py documentation.

Running the latest scikit-learn test suite with Intel(R) Extension for Scikit-learn: CircleCI

👀 Follow us on Medium

We publish blogs on Medium, so follow us to learn tips and tricks for more efficient data analysis with the help of Intel(R) Extension for Scikit-learn. Here are our latest blogs:

🔗 Important links

💬 Support

Report issues, ask questions, and provide suggestions using:

You may reach out to project maintainers privately at onedal.maintainers@intel.com

🛠 Installation

Intel(R) Extension for Scikit-learn is available from the Python Package Index and from Anaconda Cloud, in both the Conda-Forge channel and the Intel channel. It is also available as part of the Intel® oneAPI AI Analytics Toolkit (AI Kit).

  • PyPI (recommended by default)
pip install scikit-learn-intelex
  • Anaconda Cloud from Conda-Forge channel (recommended for conda users by default)
conda install scikit-learn-intelex -c conda-forge
  • Anaconda Cloud from Intel channel (recommended for Intel® Distribution for Python users)
conda install scikit-learn-intelex -c intel
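
After running one of the commands above, it can be useful to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the Python standard library (Python 3.8 or later). The helper name `installed_version` is hypothetical, and the sketch assumes the installer recorded standard package metadata under the distribution name `scikit-learn-intelex`.

```python
from importlib import metadata, util
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "sklearnex" is the import name used in the Get Started examples;
# "scikit-learn-intelex" is the distribution name passed to pip and conda.
has_module = util.find_spec("sklearnex") is not None
version = installed_version("scikit-learn-intelex")
print(f"sklearnex importable: {has_module}, distribution version: {version}")
```

If the module is reported as not importable, the package was most likely installed into a different environment than the one running the script.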
ℹ️ Supported configurations

📦 PyPI channel

OS / Python version Python 3.6 Python 3.7 Python 3.8 Python 3.9
Linux [CPU, GPU] [CPU, GPU] [CPU, GPU]
Windows [CPU, GPU] [CPU, GPU] [CPU, GPU]
OsX [CPU] [CPU] [CPU]

📦 Anaconda Cloud: Conda-Forge channel

OS / Python version   Python 3.6   Python 3.7   Python 3.8   Python 3.9
Linux                 [CPU]        [CPU]        [CPU]        [CPU]
Windows               [CPU]        [CPU]        [CPU]        [CPU]
OsX                   [CPU]        [CPU]        [CPU]        [CPU]

📦 Anaconda Cloud: Intel channel

OS / Python version Python 3.6 Python 3.7 Python 3.8 Python 3.9
Linux [CPU, GPU] [CPU, GPU] [CPU, GPU]
Windows [CPU, GPU] [CPU, GPU] [CPU, GPU]
OsX [CPU] [CPU] [CPU]

⚠️ Note: GPU support is optional, so the dependencies it requires are not installed automatically. To use the GPU, install the dpcpp_cpp_rt package manually.

ℹ️ How to install dpcpp_cpp_rt package
  • PyPI
pip install --upgrade dpcpp_cpp_rt
  • Anaconda Cloud
conda install dpcpp_cpp_rt -c intel

You can also build the package from source.

⚡️ Get Started

Intel CPU optimizations patching

import numpy as np
from sklearnex import patch_sklearn

# Apply the patch before importing any scikit-learn estimators.
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
clustering = DBSCAN(eps=3, min_samples=2).fit(X)
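
If the same script has to run on machines where the extension may not be installed, the patch call can be guarded. This is a minimal sketch rather than a pattern prescribed by the project: the helper name `try_patch_sklearn` is hypothetical, and the only library call it relies on is `patch_sklearn()` from the example above.

```python
def try_patch_sklearn() -> bool:
    """Apply the patch if sklearnex is installed; report whether it was applied."""
    try:
        from sklearnex import patch_sklearn
    except ImportError:
        # The extension is not installed, so stock scikit-learn will be used.
        return False
    patch_sklearn()
    return True


accelerated = try_patch_sklearn()
print("Intel optimizations enabled:", accelerated)

# Import scikit-learn estimators only after this point, for example:
# from sklearn.cluster import DBSCAN
```

The rest of the script stays unchanged in either case, because patched and unpatched estimators are imported from the same `sklearn` modules.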

Intel GPU optimizations patching

import numpy as np
from sklearnex import patch_sklearn
from daal4py.oneapi import sycl_context

# Apply the patch before importing any scikit-learn estimators.
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
# Computations started inside this context are offloaded to the GPU.
with sycl_context("gpu"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)

🚀 Scikit-learn patching

Configurations:

  • HW: c5.24xlarge AWS EC2 Instance using an Intel Xeon Platinum 8275CL with 2 sockets and 24 cores per socket
  • SW: scikit-learn version 0.24.2, scikit-learn-intelex version 2021.2.3, Python 3.8

Benchmarks code

ℹ️ Reproduce results
  • With Intel® Extension for Scikit-learn enabled:
python runner.py --configs configs/blogs/skl_conda_config.json --report
  • With the original Scikit-learn:
python runner.py --configs configs/blogs/skl_conda_config.json --report --no-intel-optimized

Intel(R) Extension for Scikit-learn patching affects the performance of specific Scikit-learn functionality. Refer to the list of supported algorithms and parameters for details. When unsupported parameters are used, the package falls back to the original Scikit-learn. If the patching does not cover your scenarios, submit an issue on GitHub.

⚠️ We support optimizations for the last four versions of scikit-learn. The latest release of Intel(R) Extension for Scikit-learn 2021.3.X supports scikit-learn 0.22.X, 0.23.X, 0.24.X and 1.0.X.
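
The support statement above can be turned into a quick check against a version string. This is an illustrative sketch only: the set of release series is copied from the note for release 2021.3.X, and `SUPPORTED_SERIES` and `is_supported` are hypothetical names, not part of the package.

```python
import re

# Release series listed above for Intel(R) Extension for Scikit-learn 2021.3.X.
SUPPORTED_SERIES = {(0, 22), (0, 23), (0, 24), (1, 0)}


def is_supported(sklearn_version: str) -> bool:
    """Return True if the major.minor series of the version is in the list."""
    match = re.match(r"(\d+)\.(\d+)", sklearn_version)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) in SUPPORTED_SERIES


print(is_supported("0.24.2"))  # True
print(is_supported("0.21.3"))  # False
```

In practice the string would come from `sklearn.__version__`. The set has to be updated for other releases of the extension.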

📜 Intel(R) Extension for Scikit-learn verbose mode

To find out which implementation of the algorithm is currently used (Intel(R) Extension for Scikit-learn or original Scikit-learn), set the environment variable:

  • On Linux and macOS: export SKLEARNEX_VERBOSE=INFO
  • On Windows: set SKLEARNEX_VERBOSE=INFO
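
The variable can also be set from inside a Python script rather than in the shell. The sketch below assumes that the extension reads `SKLEARNEX_VERBOSE` when it is imported, so the assignment has to happen first. The helper name `enable_sklearnex_verbose` is hypothetical.

```python
import os


def enable_sklearnex_verbose(level: str = "INFO") -> None:
    """Set the verbosity variable for this process and its child processes."""
    os.environ["SKLEARNEX_VERBOSE"] = level


enable_sklearnex_verbose()

# Import sklearnex and call patch_sklearn() only after this point.
```

Setting the variable in the shell, as shown above, remains the documented route. This variant is convenient mainly in notebooks, where the shell environment is harder to control.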

For example, for DBSCAN you get one of these print statements depending on which implementation is used:

  • SKLEARNEX INFO: sklearn.cluster.DBSCAN.fit: running accelerated version on CPU
  • SKLEARNEX INFO: sklearn.cluster.DBSCAN.fit: fallback to original Scikit-learn
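
When a script calls many estimators, it can help to summarize these messages instead of reading them one by one. The following sketch classifies captured log lines. It assumes the messages follow exactly the two formats shown above, and the names `LINE_RE` and `classify` are hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches lines such as:
# "SKLEARNEX INFO: sklearn.cluster.DBSCAN.fit: running accelerated version on CPU"
LINE_RE = re.compile(r"^SKLEARNEX INFO: (?P<method>[\w.]+): (?P<message>.+)$")


def classify(line: str) -> Optional[Tuple[str, str]]:
    """Return (method, status) for a verbose line, or None for unrelated text."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    message = match.group("message")
    if message.startswith("running accelerated version"):
        status = "accelerated"
    elif message.startswith("fallback to original"):
        status = "fallback"
    else:
        status = "unknown"
    return match.group("method"), status


sample = "SKLEARNEX INFO: sklearn.cluster.DBSCAN.fit: fallback to original Scikit-learn"
print(classify(sample))  # ('sklearn.cluster.DBSCAN.fit', 'fallback')
```

A method that is repeatedly reported as a fallback is a good candidate for checking against the list of supported algorithms and parameters.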

Read more in the documentation.
