KernelForge - Optimized Kernels for ML

License: MIT

I really only care about writing optimized kernel code, so this project will be completed as I find additional time... XD

I'm reviving this project to finish earlier work on random Fourier features for kernel ML.

Installation

Quick Start (Recommended)

For most users, install from PyPI:

pip install kernelforge

This installs pre-compiled wheels with optimized BLAS libraries:

  • Linux: OpenBLAS
  • macOS: Apple Accelerate framework

Requirements: Python 3.10+

Development Installation

Linux

# Create virtual environment with uv
uv venv
source .venv/bin/activate

# Install in editable mode with test dependencies
make install-linux

# Or manually:
CMAKE_ARGS="-DKF_USE_NATIVE=ON" uv pip install -e ".[test]" --verbose

macOS

macOS requires Homebrew LLVM for OpenMP support:

# Install dependencies
brew install llvm libomp

# Create virtual environment
uv venv
source .venv/bin/activate

# Install in editable mode
make install-macos

# Or manually:
CMAKE_ARGS="-DCMAKE_C_COMPILER=/opt/homebrew/opt/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/homebrew/opt/llvm/bin/clang++ -DKF_USE_NATIVE=ON" uv pip install -e ".[test]" --verbose

Note: The -DKF_USE_NATIVE=ON flag enables -march=native/-mcpu=native optimizations for maximum performance on your specific CPU.

Advanced: Custom BLAS/LAPACK Libraries

Intel MKL (Linux)

# Install Intel oneAPI Base Toolkit (requires Intel's oneAPI APT repository to be configured)
sudo apt install intel-basekit

# Set up environment
source /opt/intel/oneapi/setvars.sh

# Install (MKL will be auto-detected by CMake)
uv pip install -e ".[test]" --verbose

# Optional: Use Intel compilers
CC=icx CXX=icpx uv pip install -e ".[test]" --verbose

Note: In practice, GCC/G++ with OpenBLAS performs similarly to (or better than) the Intel compilers with MKL. On macOS, LLVM with the Accelerate framework is highly optimized for Apple Silicon.

Timings

I've rewritten a few of the kernels from the original QML code completely in C++. There are performance gains in most cases, primarily due to better use of BLAS routines, for example computing Gramian sub-matrices with chunked DGEMM/DSYRK calls. For the gradient and Hessian matrices there are also some algorithmic improvements and pre-computed terms. Memory usage might be a bit higher, but this could be optimized with more fine-grained chunking if needed. More is coming as I find the time ...
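
The chunked-BLAS idea can be illustrated with a minimal NumPy sketch. This is not KernelForge's C++ code or its API: the function name `gaussian_kernel_chunked`, the `chunk` parameter, and the kernel convention exp(-||x - y||^2 / (2 sigma^2)) are all assumptions made for illustration. It expands the squared distance as ||x||^2 + ||y||^2 - 2 x·y so that the expensive part of each chunk is a single matrix product.

```python
import numpy as np


def gaussian_kernel_chunked(X, Y, sigma, chunk=256):
    """Gaussian kernel K[i, j] = exp(-||X[i] - Y[j]||^2 / (2 sigma^2)), built in row chunks."""
    X = np.ascontiguousarray(X, dtype=np.float64)
    Y = np.ascontiguousarray(Y, dtype=np.float64)
    y_sq = np.einsum("ij,ij->i", Y, Y)          # squared norms of the rows of Y
    K = np.empty((X.shape[0], Y.shape[0]))
    for start in range(0, X.shape[0], chunk):
        Xc = X[start:start + chunk]
        x_sq = np.einsum("ij,ij->i", Xc, Xc)
        # One matrix product per chunk; NumPy hands this to BLAS (GEMM).
        d2 = x_sq[:, None] + y_sq[None, :] - 2.0 * (Xc @ Y.T)
        np.maximum(d2, 0.0, out=d2)             # clip small negative round-off
        K[start:start + chunk] = np.exp(-d2 / (2.0 * sigma**2))
    return K


# Hypothetical toy data, only to exercise the function.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 32))
K = gaussian_kernel_chunked(X, X, sigma=4.0, chunk=128)
```

Only one chunk of the distance matrix is held as a temporary at a time, so a smaller `chunk` trades some speed for lower peak memory.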

Some speedups vs the original QML code are shown below:

| Benchmark | QML [s] | KernelForge [s] |
|---|---|---|
| Upper triangle Gaussian kernel (16K x 16K) | 1.82 | 0.64 |
| 1K FCHL19 descriptors (1K) | ? | 0.43 |
| 1K FCHL19 descriptors+jacobian (1K) | ? | 0.62 |
| FCHL19 Local Gaussian scalar kernel (10K x 10K) | 76.81 | 18.15 |
| FCHL19 Local Gaussian gradient kernel (1K x 2700K) | 32.54 | 1.52 |
| FCHL19 Local Gaussian Hessian kernel (5400K x 5400K) | 29.68 | 2.05 |

TODO list

The goal is to remove pain points of existing QML libraries:

  • Removal of Fortran dependencies
    • No Fortran-ordered arrays
    • No Fortran compilers needed
  • Simplified build system
    • No cooked F2PY/Meson build system, just CMake and Pybind11
  • Improved use of BLAS routines, with built-in chunking to avoid memory explosions
  • Better use of pre-computed terms for single-point inference/MD kernels
  • Low overhead with Pybind11 shims and better aligned memory?
  • Simplified entrypoints that are compatible with RDKit, ASE, Scikit-learn, etc.
    • A few high-level functions that do the most common tasks efficiently and correctly
  • Efficient FCHL19 out-of-the-box
    • Fast training with random Fourier features
    • With derivatives
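
For readers unfamiliar with the random Fourier features named in the goals above, here is a minimal NumPy sketch of the standard construction for a Gaussian kernel. It is not KernelForge's implementation or API: `sample_rff_basis`, `rff_features`, and the convention exp(-||x - y||^2 / (2 sigma^2)) are assumptions for illustration. Frequencies are drawn from N(0, 1/sigma^2) and phases uniformly from [0, 2π), so that the inner product of two feature vectors approximates the kernel value.

```python
import numpy as np


def sample_rff_basis(dim, n_features, sigma, seed=0):
    """Draw the random basis (W, b) for a Gaussian kernel of width sigma."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / sigma, size=(dim, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return W, b


def rff_features(X, W, b):
    """Map X (n, dim) to Z (n, n_features) so that Z @ Z.T approximates the kernel."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)


# Hypothetical toy data: 50 points in 8 dimensions.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 8))
sigma = 3.0
W, b = sample_rff_basis(dim=8, n_features=20000, sigma=sigma)
Z = rff_features(X, W, b)

# Compare the feature-space approximation with the exact kernel.
K_approx = Z @ Z.T
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K_exact = np.exp(-sq_dists / (2.0 * sigma**2))
max_err = np.max(np.abs(K_approx - K_exact))
```

The approximation error shrinks roughly as one over the square root of the number of features, which is why training cost can be traded against accuracy by choosing `n_features`.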

Priority list for the next few months:

  • Finish the inverse-distance kernel and its Jacobian

  • Make Pybind11 interface

    • Finalize the C++ interface
  • Finish the Gaussian kernel

  • Notebook with rMD17 example

  • Finish the Jacobian and Hessian kernels

  • Notebook with rMD17 forces example

  • FCHL19 support:

    • Add FCHL19 descriptors
    • Add FCHL19 kernels (local/elemental)
    • Add FCHL19 descriptor with derivatives
    • Add FCHL19 kernel Jacobian
    • Add FCHL19 kernel Hessian (GDML-style)
    • Improve FCHL19 kernel Jacobian performance (it's poor)
  • Finish the random Fourier features kernel and its Jacobian

    • Parallel random basis sampler
    • RFF kernel for global descriptors
    • SVD and QR solvers for rectangular matrices
    • RFF kernel for local descriptors (FCHL19)
    • RFF kernels with Cholesky solver and chunked DSYRK kernel updates
    • RFF kernels with RFP format with chunked DSFRK kernel updates
    • RFF kernel Jacobian for global descriptors
    • RFF kernel Jacobian for local descriptors (FCHL19)
  • Notebook with rMD17 random Fourier features examples

  • Science:

    • Benchmark full kernel vs RFF on rMD17 and QM7b and QM9
    • Both FCHL19 and inverse-distance matrix
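
The item "RFF kernels with Cholesky solver and chunked DSYRK kernel updates" can be sketched in NumPy as follows. This is an illustration under assumptions, not the project's code: `fit_rff_ridge`, its parameters, and the toy data are hypothetical. The normal-equation matrix Z^T Z is accumulated chunk by chunk (the operation DSYRK performs in BLAS), so the full feature matrix Z is never stored, and the regularized system is then solved through a Cholesky factorization.

```python
import numpy as np


def fit_rff_ridge(X, y, W, b, l2=1e-6, chunk=512):
    """Solve (Z^T Z + l2 I) alpha = Z^T y without ever storing the full Z."""
    D = W.shape[1]
    ZtZ = np.zeros((D, D))
    Zty = np.zeros(D)
    for start in range(0, X.shape[0], chunk):
        Zc = np.sqrt(2.0 / D) * np.cos(X[start:start + chunk] @ W + b)
        ZtZ += Zc.T @ Zc          # symmetric rank-k update (DSYRK in BLAS terms)
        Zty += Zc.T @ y[start:start + chunk]
    ZtZ[np.diag_indices(D)] += l2
    L = np.linalg.cholesky(ZtZ)   # ZtZ = L @ L.T
    # np.linalg.solve is a general solver; a triangular solver would be cheaper.
    return np.linalg.solve(L.T, np.linalg.solve(L, Zty))


# Hypothetical toy problem: learn sin(x) from 2000 samples with 200 features.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(2000, 1))
y = np.sin(X[:, 0])
W = rng.normal(scale=1.0, size=(1, 200))     # kernel width sigma = 1 assumed
b = rng.uniform(0.0, 2.0 * np.pi, size=200)
alpha = fit_rff_ridge(X, y, W, b)

X_test = np.linspace(-2.5, 2.5, 50)[:, None]
y_pred = np.sqrt(2.0 / 200) * np.cos(X_test @ W + b) @ alpha
rmse = np.sqrt(np.mean((y_pred - np.sin(X_test[:, 0])) ** 2))
```

The memory footprint is set by the number of features (a D x D matrix) rather than the number of training points, which is the main appeal of this route for large data sets.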

Todos:

  • Housekeeping:
    • Pybind11 bindings and CMake build system
    • Setup CI with GitHub Actions
    • Rewrite existing kernels to C++ (no Fortran)
    • Setup GHA to build PyPI wheels
    • Test Linux build matrices
    • Test macOS build matrices
    • Test Windows build matrices: No.
    • Add builds for all Python versions >=3.10
    • Plan structure for saving models for inference as .npz files
  • Ensure correct linking with optimized BLAS/LAPACK libraries:
    • OpenBLAS (Linux) <- also used in wheels
    • MKL (Linux)
    • Accelerate (macOS)
  • Add global kernels:
    • Gaussian kernel
    • Jacobian/gradient kernel
    • Optimized kernel for single inference (for MD)
    • Hessian kernel
    • GDML-like kernel
    • Full GPR kernel
    • All kernels in RFP format
  • Add local kernels:
    • Gaussian kernel
    • Jacobian/gradient kernel
    • Optimized Jacobian kernel for single inference
    • Hessian kernel (GDML-style)
    • Full GPR kernel
    • Optimized GPR kernel with pre-computed terms for single inference/MD
  • Add random Fourier features kernel code:
    • Fourier-basis sampler in C++ with OpenMP parallelization
    • RFF kernel
    • RFF gradient kernel
    • RFF chunked DSYRK kernel
    • Optimized RFF gradient kernel for single inference/MD
    • The same as above, just for Hadamard features when I find the time?
  • GDML and sGDML kernels:
    • Inverse-distance matrix descriptor
    • Packed Jacobian for inverse-distance matrix
    • GDML kernel (brute-force implemented)
    • sGDML kernel (brute-force implemented)
    • Full GPR kernel
    • Optimized GPR kernel with pre-computed terms for single inference/MD
  • FCHL18 support:
    • Complete rewrite of FCHL18 analytical scalar kernel in C++
    • Stretch goal 1: Add new analytical FCHL18 kernel Jacobian
    • Stretch goal 2: Add new analytical FCHL18 kernel Hessian (+GPR/GDML-style)
    • Stretch goal 3: Attempt to optimize hyperparameters and cut-off functions
  • Add standard solvers:
    • Cholesky in-place solver
      • L2-reg kwarg
      • Toggle destructive vs non-destructive
    • RFP format in-place Cholesky solver
    • QR and/or SVD for non-square matrices
  • Add molecular descriptors with derivatives:
    • Coulomb matrix + misc variants without derivatives
    • FCHL19 + derivatives
    • GDML-like inverse-distance matrix + derivatives
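
As an illustration of what a "descriptor with derivatives" involves, here is a minimal NumPy sketch of an inverse-distance descriptor and its Jacobian with respect to the Cartesian coordinates. It is a sketch under assumptions, not the library's routine: the name `inverse_distance`, the pair ordering (upper triangle, row-major), and the dense Jacobian layout are hypothetical choices, and the packed Jacobian mentioned in the list is not reproduced here.

```python
import numpy as np


def inverse_distance(coords):
    """Return the descriptor d (M,) and its Jacobian J (M, 3N), with M = N(N-1)/2 pairs."""
    coords = np.asarray(coords, dtype=np.float64)
    n_atoms = coords.shape[0]
    i_idx, j_idx = np.triu_indices(n_atoms, k=1)
    diff = coords[i_idx] - coords[j_idx]        # (M, 3) pair vectors r_i - r_j
    r = np.linalg.norm(diff, axis=1)
    d = 1.0 / r
    # d(1/r)/d r_i = -(r_i - r_j) / r^3, with the opposite sign for atom j.
    grad = -diff / r[:, None] ** 3
    J = np.zeros((d.size, n_atoms, 3))
    pairs = np.arange(d.size)
    J[pairs, i_idx] = grad
    J[pairs, j_idx] = -grad
    return d, J.reshape(d.size, 3 * n_atoms)


# Hypothetical four-atom geometry (arbitrary units).
coords = np.array([[0.0, 0.0, 0.0],
                   [1.1, 0.0, 0.0],
                   [0.0, 1.3, 0.2],
                   [0.4, 0.5, 1.6]])
d, J = inverse_distance(coords)
```

Each row of the Jacobian has only six non-zero entries (three per atom in the pair), which is what makes a packed storage format attractive for larger molecules.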

Stretch goals:

  • Plan RDKit interface
  • Plan Scikit-learn interface
  • Plan ASE interface

Download files

Download the file for your platform.

Source Distribution

  • kernelforge-0.2.1.tar.gz (4.3 MB): Source

Built Distributions

  • kernelforge-0.2.1-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (12.7 MB): CPython 3.14; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
  • kernelforge-0.2.1-cp314-cp314-macosx_15_0_arm64.whl (766.7 kB): CPython 3.14; macOS 15.0+ ARM64
  • kernelforge-0.2.1-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (12.7 MB): CPython 3.13; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
  • kernelforge-0.2.1-cp313-cp313-macosx_15_0_arm64.whl (765.4 kB): CPython 3.13; macOS 15.0+ ARM64
  • kernelforge-0.2.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (12.7 MB): CPython 3.12; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
  • kernelforge-0.2.1-cp312-cp312-macosx_15_0_arm64.whl (765.1 kB): CPython 3.12; macOS 15.0+ ARM64
  • kernelforge-0.2.1-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (12.7 MB): CPython 3.11; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
  • kernelforge-0.2.1-cp311-cp311-macosx_15_0_arm64.whl (758.8 kB): CPython 3.11; macOS 15.0+ ARM64
  • kernelforge-0.2.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (12.7 MB): CPython 3.10; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
  • kernelforge-0.2.1-cp310-cp310-macosx_15_0_arm64.whl (752.3 kB): CPython 3.10; macOS 15.0+ ARM64

File details

Details for the file kernelforge-0.2.1.tar.gz.

File metadata

  • Download URL: kernelforge-0.2.1.tar.gz
  • Upload date:
  • Size: 4.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for kernelforge-0.2.1.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | f8f154b3c9359fe730c309440bd06e909f8b45860f569601cac53f7b8cc69f6b |
| MD5 | 5c6c8d37398ba31b62aaa78575013655 |
| BLAKE2b-256 | a6109bcdb1006dd57fbeef02a87667539a8d36eac4f28442c87cbd58f890ac3f |

Provenance

The following attestation bundles were made for kernelforge-0.2.1.tar.gz:

Publisher: release.yml on andersx/kernelforge

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
