KernelForge - Optimized Kernels for ML

License: MIT

I really only care about writing optimized kernel code, so this project will be completed as I find additional time... XD

I'm reviving this project to finish an old project using random Fourier features for kernel ML.

Installation

Quick Start (Recommended)

For most users, install from PyPI:

pip install kernelforge

This installs pre-compiled wheels with optimized BLAS libraries:

  • Linux: OpenBLAS
  • macOS: Apple Accelerate framework

Requirements: Python 3.10+
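
After installing, a quick sanity check can confirm that the interpreter meets the Python 3.10+ requirement and that the package metadata is visible. The sketch below uses only the standard library; the helper name `kernelforge_status` is made up for illustration and is not part of KernelForge.

```python
import sys
from importlib import metadata


def kernelforge_status(min_python=(3, 10)):
    """Return (python_ok, installed_version_or_None) for the current interpreter."""
    python_ok = sys.version_info[:2] >= min_python
    try:
        version = metadata.version("kernelforge")
    except metadata.PackageNotFoundError:
        version = None  # package is not installed in this environment
    return python_ok, version


if __name__ == "__main__":
    ok, version = kernelforge_status()
    print(f"Python >= 3.10: {ok}; kernelforge version: {version}")
```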

Development Installation

Linux

# Create virtual environment with uv
uv venv
source .venv/bin/activate

# Install in editable mode with test dependencies
make install-linux

# Or manually:
CMAKE_ARGS="-DKF_USE_NATIVE=ON" uv pip install -e ".[test]" --verbose

macOS

macOS requires Homebrew LLVM for OpenMP support:

# Install dependencies
brew install llvm libomp

# Create virtual environment
uv venv
source .venv/bin/activate

# Install in editable mode
make install-macos

# Or manually:
# (the quotes around ".[test]" keep zsh from treating the brackets as a glob)
CMAKE_ARGS="-DCMAKE_C_COMPILER=/opt/homebrew/opt/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/homebrew/opt/llvm/bin/clang++ -DKF_USE_NATIVE=ON" uv pip install -e ".[test]" --verbose

Note: The -DKF_USE_NATIVE=ON flag enables -march=native/-mcpu=native optimizations for maximum performance on your specific CPU.

Advanced: Custom BLAS/LAPACK Libraries

Intel MKL (Linux)

# Install Intel oneAPI Base Toolkit (requires Intel's oneAPI APT repository to be configured)
sudo apt install intel-basekit

# Set up environment
source /opt/intel/oneapi/setvars.sh

# Install (MKL will be auto-detected by CMake)
uv pip install -e ".[test]" --verbose

# Optional: Use Intel compilers
CC=icx CXX=icpx uv pip install -e ".[test]" --verbose

Note: In practice, GCC/G++ with OpenBLAS performs similarly to (or better than) the Intel compilers with MKL. On macOS, LLVM with the Accelerate framework is highly optimized for Apple Silicon.

Timings

I've rewritten a few of the kernels from the original QML code completely in C++. There are performance gains in most cases. These are primarily due to better use of BLAS routines, for example calculating Gramian sub-matrices with chunked DGEMM/DSYRK calls. In the gradient and Hessian matrices there are also some algorithmic improvements and pre-computed terms. Memory usage might be a bit higher, but this could be optimized with more fine-grained chunking if needed. More is coming as I find the time ...
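
To illustrate the chunked-GEMM idea, here is a minimal NumPy sketch of a Gaussian kernel built one row block at a time. It is not KernelForge's C++ code or API: the function name, the `chunk` parameter, and the kernel convention exp(-||x - y||^2 / (2 sigma^2)) are assumptions made for the example. The point is that the expensive part of each block is a single matrix product, which NumPy hands to BLAS.

```python
import numpy as np


def gaussian_kernel_chunked(X, Y, sigma, chunk=1024):
    """K[i, j] = exp(-||X[i] - Y[j]||^2 / (2 sigma^2)), built one row block at a time."""
    X = np.ascontiguousarray(X, dtype=np.float64)
    Y = np.ascontiguousarray(Y, dtype=np.float64)
    y_sq = np.einsum("ij,ij->i", Y, Y)  # squared norms of the rows of Y, computed once
    K = np.empty((X.shape[0], Y.shape[0]))
    for start in range(0, X.shape[0], chunk):
        stop = min(start + chunk, X.shape[0])
        Xc = X[start:stop]
        x_sq = np.einsum("ij,ij->i", Xc, Xc)
        # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y ; the x.y term is one GEMM per block
        d2 = x_sq[:, None] + y_sq[None, :] - 2.0 * (Xc @ Y.T)
        np.maximum(d2, 0.0, out=d2)  # clip tiny negatives caused by round-off
        K[start:stop] = np.exp(-d2 / (2.0 * sigma**2))
    return K
```

Only one `chunk x n` block of squared distances is live at a time, so temporary memory is bounded by the chunk size rather than by the full kernel matrix.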

Some speedups vs the original QML code are shown below:

| Benchmark | QML [s] | KernelForge [s] |
|---|---|---|
| Upper triangle Gaussian kernel (16K x 16K) | 1.82 | 0.64 |
| 1K FCHL19 descriptors (1K) | ? | 0.43 |
| 1K FCHL19 descriptors+jacobian (1K) | ? | 0.62 |
| FCHL19 Local Gaussian scalar kernel (10K x 10K) | 76.81 | 18.15 |
| FCHL19 Local Gaussian gradient kernel (1K x 2700K) | 32.54 | 1.52 |
| FCHL19 Local Gaussian Hessian kernel (5400K x 5400K) | 29.68 | 2.05 |

TODO list

The goal is to remove pain points of existing QML libraries:

  • Removal of Fortran dependencies
    • No Fortran-ordered arrays
    • No Fortran compilers needed
  • Simplified build system
    • No cooked F2PY/Meson build system, just CMake and Pybind11
  • Improved use of BLAS routines, with built-in chunking to avoid memory explosions
  • Better use of pre-computed terms for single-point inference/MD kernels
  • Low overhead with Pybind11 shims and better aligned memory?
  • Simplified entrypoints that are compatible with RDKit, ASE, Scikit-learn, etc.
    • A few high-level functions that do the most common tasks efficiently and correctly
  • Efficient FCHL19 out-of-the-box
    • Fast training with random Fourier features
    • With derivatives

Priority list for the next months:

  • Finish the inverse-distance kernel and its Jacobian

  • Make Pybind11 interface

    • Finalize the C++ interface
  • Finish the Gaussian kernel

  • Notebook with rMD17 example

  • Finish the Jacobian and Hessian kernels

  • Notebook with rMD17 forces example

  • FCHL19 support:

    • Add FCHL19 descriptors
    • Add FCHL19 kernels (local/elemental)
    • Add FCHL19 descriptor with derivatives
    • Add FCHL19 kernel Jacobian
    • Add FCHL19 kernel Hessian (GDML-style)
    • Improve FCHL19 kernel Jacobian performance (it's poor)
  • Finish the random Fourier features kernel and its Jacobian

    • Parallel random basis sampler
    • RFF kernel for global descriptors
    • SVD and QR solvers for rectangular matrices
    • RFF kernel for local descriptors (FCHL19)
    • RFF kernels with Cholesky solver and chunked DSYRK kernel updates
    • RFF kernels with RFP format with chunked DSFRK kernel updates
    • RFF kernel Jacobian for global descriptors
    • RFF kernel Jacobian for local descriptors (FCHL19)
  • Notebook with rMD17 random Fourier features examples

  • Science:

    • Benchmark full kernel vs RFF on rMD17 and QM7b and QM9
    • Both FCHL19 and inverse-distance matrix
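
The random Fourier features items above (basis sampler, chunked rank-k updates, Cholesky solver) can be sketched in NumPy as follows. This is an illustration under stated assumptions, not KernelForge's implementation: all function names are hypothetical, the features are the standard sqrt(2/D) cos(x.W + b) map for a Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)), and the `Z.T @ Z` accumulation stands in for what a chunked DSYRK call would do in a BLAS build.

```python
import numpy as np


def sample_rff_basis(n_dim, n_features, sigma, seed=0):
    """Draw W ~ N(0, 1/sigma^2) and b ~ U[0, 2 pi) for a Gaussian kernel of width sigma."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / sigma, size=(n_dim, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return W, b


def rff_features(X, W, b):
    """Map X (n, d) to Z (n, D) so that Z @ Z.T approximates the Gaussian kernel."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)


def fit_rff_ridge(X, y, W, b, l2=1e-8, chunk=512):
    """Solve (Z^T Z + l2 I) w = Z^T y, accumulating Z^T Z chunk by chunk."""
    D = W.shape[1]
    A = l2 * np.eye(D)
    rhs = np.zeros(D)
    for start in range(0, X.shape[0], chunk):
        Z = rff_features(X[start:start + chunk], W, b)
        A += Z.T @ Z                      # symmetric rank-k update (DSYRK's role)
        rhs += Z.T @ y[start:start + chunk]
    L = np.linalg.cholesky(A)             # A = L L^T
    tmp = np.linalg.solve(L, rhs)         # solve L tmp = rhs (generic solver used here)
    return np.linalg.solve(L.T, tmp)      # solve L^T w = tmp
```

The full feature matrix Z is never stored; only the D x D normal-equations matrix and one chunk of features are held in memory. A production version would use a triangular solver for the last two steps instead of the generic `np.linalg.solve`.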

Todos:

  • Housekeeping:
    • Pybind11 bindings and CMake build system
    • Setup CI with GitHub Actions
    • Rewrite existing kernels to C++ (no Fortran)
    • Setup GHA to build PyPI wheels
    • Test Linux build matrices
    • Test macOS build matrices
    • Test Windows build matrices: No.
    • Add builds for all Python versions >=3.10
    • Plan structure for saving models for inference as .npz files
  • Ensure correct linking with optimized BLAS/LAPACK libraries:
    • OpenBLAS (Linux) <- also used in wheels
    • MKL (Linux)
    • Accelerate (macOS)
  • Add global kernels:
    • Gaussian kernel
    • Jacobian/gradient kernel
    • Optimized kernel for single inference (for MD)
    • Hessian kernel
    • GDML-like kernel
    • Full GPR kernel
    • All kernels in RFP format
  • Add local kernels:
    • Gaussian kernel
    • Jacobian/gradient kernel
    • Optimized Jacobian kernel for single inference
    • Hessian kernel (GDML-style)
    • Full GPR kernel
    • Optimized GPR kernel with pre-computed terms for single inference/MD
  • Add random Fourier features kernel code:
    • Fourier-basis sampler in C++ with OpenMP parallelization
    • RFF kernel
    • RFF gradient kernel
    • RFF chunked DSYRK kernel
    • Optimized RFF gradient kernel for single inference/MD
    • The same as above, just for Hadamard features when I find the time?
  • GDML and sGDML kernels:
    • Inverse-distance matrix descriptor
    • Packed Jacobian for inverse-distance matrix
    • GDML kernel (brute-force implemented)
    • sGDML kernel (brute-force implemented)
    • Full GPR kernel
    • Optimized GPR kernel with pre-computed terms for single inference/MD
  • FCHL18 support:
    • Complete rewrite of FCHL18 analytical scalar kernel in C++
    • Stretch goal 1: Add new analytical FCHL18 kernel Jacobian
    • Stretch goal 2: Add new analytical FCHL18 kernel Hessian (+GPR/GDML-style)
    • Stretch goal 3: Attempt to optimize hyperparameters and cut-off functions
  • Add standard solvers:
    • Cholesky in-place solver
      • L2-reg kwarg
      • Toggle destructive vs non-destructive
    • RFP format in-place Cholesky solver
    • QR and/or SVD for non-square matrices
  • Add molecular descriptors with derivatives:
    • Coulomb matrix + misc variants without derivatives
    • FCHL19 + derivatives
    • GDML-like inverse-distance matrix + derivatives
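
As an illustration of a descriptor with derivatives, here is a minimal NumPy sketch of the GDML-like inverse-distance descriptor and its Jacobian with respect to the atomic coordinates. The function name and the dense (M, 3N) Jacobian layout are assumptions for the example; the packed Jacobian mentioned above presumably stores only the non-zero entries, which this sketch does not attempt.

```python
import numpy as np


def inverse_distance_descriptor(R):
    """Return (D, J): D[k] = 1/|r_i - r_j| for pairs i < j, J = dD/dR with shape (M, 3N)."""
    R = np.asarray(R, dtype=np.float64)
    n = R.shape[0]
    i, j = np.triu_indices(n, k=1)        # all atom pairs with i < j
    diff = R[i] - R[j]                    # (M, 3) pair displacement vectors
    dist = np.linalg.norm(diff, axis=1)   # (M,) pair distances
    D = 1.0 / dist
    grad = -diff / dist[:, None] ** 3     # d(1/d)/dr_i; the r_j derivative is its negative
    J = np.zeros((D.size, n, 3))
    k = np.arange(D.size)
    J[k, i] = grad
    J[k, j] = -grad
    return D, J.reshape(D.size, 3 * n)
```

Each row of the Jacobian has only six non-zero entries (three per atom in the pair), which is why a packed representation pays off for larger molecules.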

Stretch goals:

  • Plan RDKit interface
  • Plan Scikit-learn interface
  • Plan ASE interface
