KernelForge - Optimized Kernels for ML

License: MIT

I really only care about writing optimized kernel code, so this project will be completed as I find additional time... XD

I'm reviving this project to finish an old project using random Fourier features for kernel ML.

Installation

Quick Start (Recommended)

For most users, install from PyPI:

pip install kernelforge

This installs pre-compiled wheels with optimized BLAS libraries:

  • Linux: OpenBLAS
  • macOS: Apple Accelerate framework

Requirements: Python 3.10+
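To confirm that the interpreter meets the Python 3.10+ requirement and that the wheel actually got installed, something like the following can be run after `pip install kernelforge`. This is a minimal sketch using only the standard library. The helper names `installed_version` and `python_is_supported` are made up for illustration and are not part of KernelForge.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def python_is_supported(min_version=(3, 10)):
    """Check the running interpreter against the stated Python 3.10+ requirement."""
    return sys.version_info[:2] >= min_version


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("kernelforge version:", installed_version("kernelforge"))
```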

Development Installation

Linux

# Create virtual environment with uv
uv venv
source .venv/bin/activate

# Install in editable mode with test dependencies
make install-linux

# Or manually:
CMAKE_ARGS="-DKF_USE_NATIVE=ON" uv pip install -e ".[test]" --verbose

macOS

macOS requires Homebrew LLVM for OpenMP support:

# Install dependencies
brew install llvm libomp

# Create virtual environment
uv venv
source .venv/bin/activate

# Install in editable mode
make install-macos

# Or manually:
CMAKE_ARGS="-DCMAKE_C_COMPILER=/opt/homebrew/opt/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/homebrew/opt/llvm/bin/clang++ -DKF_USE_NATIVE=ON" uv pip install -e ".[test]" --verbose

Note: The -DKF_USE_NATIVE=ON flag enables -march=native/-mcpu=native optimizations for maximum performance on your specific CPU.

Advanced: Custom BLAS/LAPACK Libraries

Intel MKL (Linux)

# Install Intel oneAPI Base Toolkit (requires Intel's oneAPI APT repository to be configured first)
sudo apt install intel-basekit

# Set up environment
source /opt/intel/oneapi/setvars.sh

# Install (MKL will be auto-detected by CMake)
uv pip install -e ".[test]" --verbose

# Optional: Use Intel compilers
CC=icx CXX=icpx uv pip install -e ".[test]" --verbose

Note: In practice, GCC/G++ with OpenBLAS performs similarly to (or better than) the Intel compilers with MKL. On macOS, LLVM with the Accelerate framework is highly optimized for Apple Silicon.

Timings

I've rewritten a few of the kernels from the original QML code completely in C++. There are performance gains in most cases. These are primarily due to better use of BLAS routines, for example calculating Gramian sub-matrices with chunked DGEMM/DSYRK calls. The gradient and Hessian kernels also benefit from some algorithmic improvements and pre-computed terms. Memory usage might be a bit higher, but this could be optimized with more fine-grained chunking if needed. More is coming as I find the time ...
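The chunked-BLAS idea can be illustrated in NumPy. This is only a sketch of the general technique, not the C++ implementation in this package: the function name `gaussian_kernel_chunked`, the `chunk` parameter, and the kernel convention exp(-d^2 / (2 sigma^2)) are assumptions made for the example. Each block of rows costs one matrix product (which NumPy hands to the BLAS GEMM routine), and the squared distances follow from ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y.

```python
import numpy as np


def gaussian_kernel_chunked(X, Y, sigma, chunk=1024):
    """Gaussian kernel K[i, j] = exp(-||X[i] - Y[j]||^2 / (2 sigma^2)), built block by block."""
    X = np.asarray(X, dtype=np.float64)
    Y = np.asarray(Y, dtype=np.float64)
    x_sq = np.einsum("ij,ij->i", X, X)  # squared row norms of X
    y_sq = np.einsum("ij,ij->i", Y, Y)  # squared row norms of Y
    K = np.empty((X.shape[0], Y.shape[0]))
    for start in range(0, X.shape[0], chunk):
        stop = min(start + chunk, X.shape[0])
        # One matrix product per block of rows; NumPy dispatches this to BLAS.
        block = X[start:stop] @ Y.T
        # Turn dot products into squared distances in place.
        block *= -2.0
        block += x_sq[start:stop, None]
        block += y_sq[None, :]
        np.maximum(block, 0.0, out=block)  # clip tiny negatives caused by round-off
        K[start:stop] = np.exp(block / (-2.0 * sigma**2))
    return K
```

Only one `chunk x n` temporary is alive at a time, so a smaller `chunk` trades a little throughput for lower peak memory.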

Some speedups vs the original QML code are shown below:

| Benchmark                                            | QML [s] | KernelForge [s] |
|------------------------------------------------------|---------|-----------------|
| Upper triangle Gaussian kernel (16K x 16K)           | 1.82    | 0.64            |
| 1K FCHL19 descriptors (1K)                           | ?       | 0.43            |
| 1K FCHL19 descriptors+jacobian (1K)                  | ?       | 0.62            |
| FCHL19 Local Gaussian scalar kernel (10K x 10K)      | 76.81   | 18.15           |
| FCHL19 Local Gaussian gradient kernel (1K x 2700K)   | 32.54   | 1.52            |
| FCHL19 Local Gaussian Hessian kernel (5400K x 5400K) | 29.68   | 2.05            |
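For the rows where both timings are available, the speedup factor is simply the QML time divided by the KernelForge time. The short script below does that arithmetic on the numbers copied from the benchmark rows above. The factors come out at roughly 3x for the Gaussian kernel, 4x for the FCHL19 scalar kernel, and 14x to 21x for the Hessian and gradient kernels.

```python
# Timings in seconds, copied from the benchmark rows above: (QML, KernelForge).
timings = {
    "Upper triangle Gaussian kernel": (1.82, 0.64),
    "FCHL19 local Gaussian scalar kernel": (76.81, 18.15),
    "FCHL19 local Gaussian gradient kernel": (32.54, 1.52),
    "FCHL19 local Gaussian Hessian kernel": (29.68, 2.05),
}

# Speedup = old time / new time.
speedups = {name: qml / kf for name, (qml, kf) in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```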

TODO list

The goal is to remove the pain points of existing QML libraries:

  • Removal of Fortran dependencies
    • No Fortran-ordered arrays
    • No Fortran compilers needed
  • Simplified build system
    • No cooked F2PY/Meson build system, just CMake and Pybind11
  • Improved use of BLAS routines, with built-in chunking to avoid memory explosions
  • Better use of pre-computed terms for single-point inference/MD kernels
  • Low overhead with Pybind11 shims and better aligned memory?
  • Simplified entrypoints that are compatible with RDKit, ASE, Scikit-learn, etc.
    • A few high-level functions that do the most common tasks efficiently and correctly
  • Efficient FCHL19 out-of-the-box
    • Fast training with random Fourier features
    • With derivatives
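For readers unfamiliar with random Fourier features, here is a minimal NumPy sketch of the textbook construction for a Gaussian kernel on global descriptors. It is not this package's API: the function names, the `seed` argument, and the kernel convention exp(-||x - y||^2 / (2 sigma^2)) are assumptions for illustration. Frequencies are drawn from a normal distribution with standard deviation 1/sigma, phases uniformly from [0, 2 pi), and the inner product of two feature vectors then approximates the kernel value.

```python
import numpy as np


def sample_rff_basis(n_dims, n_features, sigma, seed=0):
    """Draw random frequencies W and phases b for a Gaussian kernel of width sigma."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / sigma, size=(n_dims, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return W, b


def rff_features(X, W, b):
    """Map descriptors X (n_samples, n_dims) to features Z so that Z @ Z.T approximates K."""
    n_features = W.shape[1]
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```

Training then becomes a linear least-squares problem in the feature space, whose size is set by the number of features rather than the number of training points.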

Priority list for the next months:

  • Finish the inverse-distance kernel and its Jacobian
  • Make Pybind11 interface
    • Finalize the C++ interface
  • Finish the Gaussian kernel
  • Notebook with rMD17 example
  • Finish the Jacobian and Hessian kernels
  • Notebook with rMD17 forces example
  • FCHL19 support:
    • Add FCHL19 descriptors
    • Add FCHL19 kernels (local/elemental)
    • Add FCHL19 descriptor with derivatives
    • Add FCHL19 kernel Jacobian
    • Add FCHL19 kernel Hessian (GDML-style)
    • Improve FCHL19 kernel Jacobian performance (it's poor)
  • Finish the random Fourier features kernel and its Jacobian
    • Parallel random basis sampler
    • RFF kernel for global descriptors
    • SVD and QR solvers for rectangular matrices
    • RFF kernel for local descriptors (FCHL19)
    • RFF kernels with Cholesky solver and chunked DSYRK kernel updates
    • RFF kernels with RFP format with chunked DSFRK kernel updates
    • RFF kernel Jacobian for global descriptors
    • RFF kernel Jacobian for local descriptors (FCHL19)
  • Notebook with rMD17 random Fourier features examples
  • Science:
    • Benchmark full kernel vs RFF on rMD17 and QM7b and QM9
    • Both FCHL19 and inverse-distance matrix
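The "Cholesky solver and chunked DSYRK kernel updates" item can be pictured as follows. This is a rough NumPy sketch under stated assumptions, not the package's code: the function names and the `l2_reg` parameter are hypothetical, and `Zc.T @ Zc` stands in for the symmetric rank-k update that DSYRK performs. The feature matrix is consumed chunk by chunk, so only the small `n_features x n_features` Gram matrix has to stay in memory before the regularized system is solved.

```python
import numpy as np


def accumulate_normal_equations(chunks, n_features):
    """Accumulate G = Z^T Z and r = Z^T y from an iterable of (Z_chunk, y_chunk) pairs."""
    G = np.zeros((n_features, n_features))
    r = np.zeros(n_features)
    for Zc, yc in chunks:
        G += Zc.T @ Zc  # symmetric rank-k update; the role DSYRK plays in BLAS
        r += Zc.T @ yc
    return G, r


def solve_ridge_cholesky(G, r, l2_reg=1e-8):
    """Solve (G + l2_reg * I) w = r using a Cholesky factorization A = L L^T."""
    A = G + l2_reg * np.eye(G.shape[0])
    L = np.linalg.cholesky(A)
    # NumPy has no dedicated triangular solver, so the general solver is used
    # for the forward and backward substitution steps in this sketch.
    tmp = np.linalg.solve(L, r)
    return np.linalg.solve(L.T, tmp)
```

A production version would update only one triangle of `G` and use triangular solves, which is where the DSYRK and RFP-format (DSFRK) variants in the list come in.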

Todos:

  • Housekeeping:
    • Pybind11 bindings and CMake build system
    • Setup CI with GitHub Actions
    • Rewrite existing kernels to C++ (no Fortran)
    • Setup GHA to build PyPI wheels
    • Test Linux build matrices
    • Test macOS build matrices
    • Test Windows build matrices: no.
    • Add builds for all Python versions >=3.10
    • Plan structure for saving models for inference as .npz files
  • Ensure correct linking with optimized BLAS/LAPACK libraries:
    • OpenBLAS (Linux) <- also used in wheels
    • MKL (Linux)
    • Accelerate (macOS)
  • Add global kernels:
    • Gaussian kernel
    • Jacobian/gradient kernel
    • Optimized kernel for single inference (for MD)
    • Hessian kernel
    • GDML-like kernel
    • Full GPR kernel
    • All kernels in RFP format
  • Add local kernels:
    • Gaussian kernel
    • Jacobian/gradient kernel
    • Optimized Jacobian kernel for single inference
    • Hessian kernel (GDML-style)
    • Full GPR kernel
    • Optimized GPR kernel with pre-computed terms for single inference/MD
  • Add random Fourier features kernel code:
    • Fourier-basis sampler in C++ with OpenMP parallelization
    • RFF kernel
    • RFF gradient kernel
    • RFF chunked DSYRK kernel
    • Optimized RFF gradient kernel for single inference/MD
    • The same as above, just for Hadamard features when I find the time?
  • GDML and sGDML kernels:
    • Inverse-distance matrix descriptor
    • Packed Jacobian for inverse-distance matrix
    • GDML kernel (brute-force implemented)
    • sGDML kernel (brute-force implemented)
    • Full GPR kernel
    • Optimized GPR kernel with pre-computed terms for single inference/MD
  • FCHL18 support:
    • Complete rewrite of FCHL18 analytical scalar kernel in C++
    • Stretch goal 1: Add new analytical FCHL18 kernel Jacobian
    • Stretch goal 2: Add new analytical FCHL18 kernel Hessian (+GPR/GDML-style)
    • Stretch goal 3: Attempt to optimize hyperparameters and cut-off functions
  • Add standard solvers:
    • Cholesky in-place solver
      • L2-reg kwarg
      • Toggle destructive vs non-destructive
    • RFP format in-place Cholesky solver
    • QR and/or SVD for non-square matrices
  • Add molecular descriptors with derivatives:
    • Coulomb matrix + misc variants without derivatives
    • FCHL19 + derivatives
    • GDML-like inverse-distance matrix + derivatives
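As a concrete picture of a "descriptor with derivatives", below is a small NumPy sketch of a GDML-like inverse-distance descriptor and its Jacobian with respect to the atomic coordinates. The function name, the pair ordering (upper triangle, i < j), and the dense `(n_pairs, 3 * n_atoms)` Jacobian layout are assumptions for this example. The packed Jacobian mentioned in the list above would store only the non-zero entries.

```python
import numpy as np


def inverse_distance_descriptor(R):
    """Return x[p] = 1 / |r_i - r_j| for all pairs i < j, and the dense Jacobian dx/dR."""
    R = np.asarray(R, dtype=np.float64)
    n_atoms = R.shape[0]
    i, j = np.triu_indices(n_atoms, k=1)   # all atom pairs with i < j
    diff = R[i] - R[j]                     # (n_pairs, 3)
    dist = np.linalg.norm(diff, axis=1)
    x = 1.0 / dist
    # d(1/d_ij)/d r_i = -(r_i - r_j) / d_ij^3, and the opposite sign for r_j.
    grad = -diff / dist[:, None] ** 3
    jac = np.zeros((i.size, n_atoms, 3))
    pairs = np.arange(i.size)
    jac[pairs, i] = grad
    jac[pairs, j] = -grad
    return x, jac.reshape(i.size, 3 * n_atoms)
```

Each row of the Jacobian has only six non-zero entries (three per atom in the pair), which is why a packed layout pays off for larger molecules.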

Stretch goals:

  • Plan RDKit interface
  • Plan Scikit-learn interface
  • Plan ASE interface
