KernelForge - Optimized Kernels for ML

License: MIT

I really only care about writing optimized kernel code, so this project will be completed as I find additional time... XD

I'm reviving this project to finish some old work on kernel ML with random Fourier features.

Installation

Quick Start (Recommended)

For most users, install from PyPI:

pip install kernelforge

This installs pre-compiled wheels with optimized BLAS libraries:

  • Linux: OpenBLAS
  • macOS: Apple Accelerate framework

Requirements: Python 3.10+
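A quick way to confirm the install is a small check like the one below. This is only a sketch: it uses the standard-library `importlib.metadata` to look up the distribution named `kernelforge` (the name from the `pip install` command above) and tests the Python 3.10+ requirement. The helper name `check_install` is hypothetical and not part of KernelForge.

```python
import sys
from importlib import metadata


def check_install(dist_name: str = "kernelforge") -> dict:
    """Report the Python version check and the installed version, if any."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # distribution is not installed in this environment
    return {
        "python_ok": sys.version_info >= (3, 10),
        "installed_version": version,
    }


if __name__ == "__main__":
    print(check_install())
```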

Development Installation

Linux

# Create virtual environment with uv
uv venv
source .venv/bin/activate

# Install in editable mode with test dependencies
make install-linux

# Or manually:
CMAKE_ARGS="-DKF_USE_NATIVE=ON" uv pip install -e ".[test]" --verbose

macOS

macOS requires Homebrew LLVM for OpenMP support:

# Install dependencies
brew install llvm libomp

# Create virtual environment
uv venv
source .venv/bin/activate

# Install in editable mode
make install-macos

# Or manually (the quotes around ".[test]" keep zsh from treating the brackets as a glob):
CMAKE_ARGS="-DCMAKE_C_COMPILER=/opt/homebrew/opt/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/homebrew/opt/llvm/bin/clang++ -DKF_USE_NATIVE=ON" uv pip install -e ".[test]" --verbose

Note: The -DKF_USE_NATIVE=ON flag enables -march=native/-mcpu=native optimizations for maximum performance on your specific CPU.

Advanced: Custom BLAS/LAPACK Libraries

Intel MKL (Linux)

# Install Intel oneAPI Base Toolkit (Intel's APT repository must be configured first)
sudo apt install intel-basekit

# Set up environment
source /opt/intel/oneapi/setvars.sh

# Install (MKL will be auto-detected by CMake)
uv pip install -e ".[test]" --verbose

# Optional: Use Intel compilers
CC=icx CXX=icpx uv pip install -e ".[test]" --verbose

Note: In practice, GCC/G++ with OpenBLAS performs similarly to (or better than) the Intel compilers with MKL. On macOS, LLVM with the Accelerate framework is highly optimized for Apple Silicon.

Timings

I've rewritten a few of the kernels from the original QML code completely in C++. There are performance gains in most cases. These are primarily due to better use of BLAS routines, for example calculating Gramian sub-matrices with chunked DGEMM/DSYRK calls. The gradient and Hessian kernels also have some algorithmic improvements and pre-computed terms. Memory usage might be a bit higher, but this could be optimized with more fine-grained chunking if needed. More is coming as I find the time ...
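To illustrate the chunked-GEMM idea, here is a minimal NumPy sketch. It is not KernelForge's C++ code: the function name `gaussian_kernel_chunked`, the `chunk` parameter, and the kernel convention exp(-||x - y||^2 / (2 sigma^2)) are assumptions made for the example. The cross term is computed one row-chunk at a time, so each matrix product is a single BLAS call on a bounded block.

```python
import numpy as np


def gaussian_kernel_chunked(X, Y, sigma, chunk=256):
    """Gaussian kernel K[i, j] = exp(-||x_i - y_j||^2 / (2 sigma^2)).

    The cross term X @ Y.T is evaluated one row-chunk at a time, so each
    temporary block has at most (chunk x len(Y)) entries.
    """
    X = np.asarray(X, dtype=np.float64)
    Y = np.asarray(Y, dtype=np.float64)
    x_sq = np.einsum("ij,ij->i", X, X)  # squared norms of the rows of X
    y_sq = np.einsum("ij,ij->i", Y, Y)  # squared norms of the rows of Y
    K = np.empty((X.shape[0], Y.shape[0]))
    for start in range(0, X.shape[0], chunk):
        stop = min(start + chunk, X.shape[0])
        # One matrix product per chunk (NumPy dispatches this to BLAS GEMM).
        block = X[start:stop] @ Y.T
        # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y
        d2 = x_sq[start:stop, None] + y_sq[None, :] - 2.0 * block
        np.maximum(d2, 0.0, out=d2)  # clip tiny negatives from round-off
        K[start:stop] = np.exp(-d2 / (2.0 * sigma**2))
    return K
```

Smaller `chunk` values lower the peak size of the temporaries at the cost of more, smaller BLAS calls.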

Some speedups vs the original QML code are shown below:

| Benchmark | QML [s] | KernelForge [s] |
|---|---|---|
| Upper triangle Gaussian kernel (16K x 16K) | 1.82 | 0.64 |
| 1K FCHL19 descriptors (1K) | ? | 0.43 |
| 1K FCHL19 descriptors+jacobian (1K) | ? | 0.62 |
| FCHL19 Local Gaussian scalar kernel (10K x 10K) | 76.81 | 18.15 |
| FCHL19 Local Gaussian gradient kernel (1K x 2700K) | 32.54 | 1.52 |
| FCHL19 Local Gaussian Hessian kernel (5400K x 5400K) | 29.68 | 2.05 |
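For the four rows where both timings are given, the speedup factor is the QML time divided by the KernelForge time, which ranges from about 2.8x to about 21x. The short script below does that arithmetic on the numbers copied from the table; the variable and function names are just for illustration.

```python
# Timings copied from the table above: (QML seconds, KernelForge seconds).
timings = {
    "Upper triangle Gaussian kernel (16K x 16K)": (1.82, 0.64),
    "FCHL19 Local Gaussian scalar kernel (10K x 10K)": (76.81, 18.15),
    "FCHL19 Local Gaussian gradient kernel (1K x 2700K)": (32.54, 1.52),
    "FCHL19 Local Gaussian Hessian kernel (5400K x 5400K)": (29.68, 2.05),
}


def speedup(qml_seconds: float, kernelforge_seconds: float) -> float:
    """Speedup factor: how many times faster the second timing is."""
    return qml_seconds / kernelforge_seconds


speedups = {name: speedup(*pair) for name, pair in timings.items()}

if __name__ == "__main__":
    for name, factor in speedups.items():
        print(f"{name}: {factor:.1f}x")
```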

TODO list

The goal is to remove the pain points of existing QML libraries:

  • Removal of Fortran dependencies
    • No Fortran-ordered arrays
    • No Fortran compilers needed
  • Simplified build system
    • No cooked F2PY/Meson build system, just CMake and Pybind11
  • Improved use of BLAS routines, with built-in chunking to avoid memory explosions
  • Better use of pre-computed terms for single-point inference/MD kernels
  • Low overhead with Pybind11 shims and better aligned memory?
  • Simplified entrypoints that are compatible with RDKit, ASE, Scikit-learn, etc.
    • A few high-level functions that do the most common tasks efficiently and correctly
  • Efficient FCHL19 out-of-the-box
    • Fast training with random Fourier features
    • With derivatives
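As background for the random Fourier features item, here is a minimal NumPy sketch of the standard construction for approximating a Gaussian kernel. It is not KernelForge's implementation: the function names, the `seed` argument, and the kernel convention exp(-||x - y||^2 / (2 sigma^2)) are assumptions. Frequencies are drawn from a normal distribution with standard deviation 1/sigma, phases uniformly from [0, 2 pi), and the feature map is sqrt(2/D) cos(x W + b), so that the dot product of two feature vectors approximates the kernel value.

```python
import numpy as np


def sample_rff_basis(n_dims, n_features, sigma, seed=0):
    """Draw random frequencies W and phases b for a Gaussian kernel."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / sigma, size=(n_dims, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return W, b


def rff_features(X, W, b):
    """Map X (n_samples, n_dims) to random features (n_samples, n_features)."""
    n_features = W.shape[1]
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = rng.normal(size=(5, 3))
    W, b = sample_rff_basis(n_dims=3, n_features=4096, sigma=2.0)
    Z = rff_features(X, W, b)
    K_approx = Z @ Z.T  # approximates the exact Gaussian kernel matrix
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K_exact = np.exp(-sq_dists / (2.0 * 2.0**2))
    print("max abs error:", np.abs(K_approx - K_exact).max())
```

The appeal for training is that a model fitted on `Z` scales with the number of features rather than with the square of the number of training points.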

Priority list for the next months:

  • Finish the inverse-distance kernel and its Jacobian
  • Make Pybind11 interface
    • Finalize the C++ interface
  • Finish the Gaussian kernel
  • Notebook with rMD17 example
  • Finish the Jacobian and Hessian kernels
  • Notebook with rMD17 forces example
  • FCHL19 support:
    • Add FCHL19 descriptors
    • Add FCHL19 kernels (local/elemental)
    • Add FCHL19 descriptor with derivatives
    • Add FCHL19 kernel Jacobian
    • Add FCHL19 kernel Hessian (GDML-style)
    • Improve FCHL19 kernel Jacobian performance (it's poor)
  • Finish the random Fourier features kernel and its Jacobian
    • Parallel random basis sampler
    • RFF kernel for global descriptors
    • SVD and QR solvers for rectangular matrices
    • RFF kernel for local descriptors (FCHL19)
    • RFF kernels with Cholesky solver and chunked DSYRK kernel updates
    • RFF kernels in RFP format with chunked DSFRK kernel updates
    • RFF kernel Jacobian for global descriptors
    • RFF kernel Jacobian for local descriptors (FCHL19)
  • Notebook with rMD17 random Fourier features examples
  • Science:
    • Benchmark full kernel vs RFF on rMD17, QM7b, and QM9
    • Both FCHL19 and inverse-distance matrix
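The "Cholesky solver and chunked DSYRK kernel updates" item can be pictured with the NumPy sketch below. It is an illustration under stated assumptions, not the project's code: the function name `fit_rff_ridge`, the `l2` and `chunk` parameters, and the cosine feature map are all hypothetical. The feature Gram matrix Z^T Z is accumulated chunk by chunk (the symmetric rank-k update that DSYRK performs in BLAS), so the full feature matrix is never stored, and the regularized normal equations are then solved through a Cholesky factorization.

```python
import numpy as np


def fit_rff_ridge(X, y, W, b, l2=1e-8, chunk=1024):
    """Solve (Z^T Z + l2 I) alpha = Z^T y without storing all of Z.

    Z is the random-feature matrix sqrt(2/D) * cos(X @ W + b), built and
    consumed one chunk of rows at a time.
    """
    n_features = W.shape[1]
    gram = np.zeros((n_features, n_features))
    rhs = np.zeros(n_features)
    scale = np.sqrt(2.0 / n_features)
    for start in range(0, X.shape[0], chunk):
        Z = scale * np.cos(X[start:start + chunk] @ W + b)
        gram += Z.T @ Z  # symmetric rank-k update (DSYRK's job in BLAS)
        rhs += Z.T @ y[start:start + chunk]
    gram[np.diag_indices_from(gram)] += l2  # L2 regularization
    L = np.linalg.cholesky(gram)  # gram = L @ L.T
    # NumPy has no triangular solver, so the general solver is used here
    # for brevity; a dedicated triangular solve would be the usual choice.
    tmp = np.linalg.solve(L, rhs)
    return np.linalg.solve(L.T, tmp)
```

Predictions for new points are then the feature matrix of those points multiplied by `alpha`.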

Todos:

  • Housekeeping:
    • Pybind11 bindings and CMake build system
    • Setup CI with GitHub Actions
    • Rewrite existing kernels to C++ (no Fortran)
    • Setup GHA to build PyPI wheels
    • Test Linux build matrices
    • Test macOS build matrices
    • Test Windows build matrices: no
    • Add builds for all Python versions >= 3.10
    • Plan structure for saving models for inference as .npz files
  • Ensure correct linking with optimized BLAS/LAPACK libraries:
    • OpenBLAS (Linux) <- also used in wheels
    • MKL (Linux)
    • Accelerate (macOS)
  • Add global kernels:
    • Gaussian kernel
    • Jacobian/gradient kernel
    • Optimized kernel for single inference (for MD)
    • Hessian kernel
    • GDML-like kernel
    • Full GPR kernel
    • All kernels in RFP format
  • Add local kernels:
    • Gaussian kernel
    • Jacobian/gradient kernel
    • Optimized Jacobian kernel for single inference
    • Hessian kernel (GDML-style)
    • Full GPR kernel
    • Optimized GPR kernel with pre-computed terms for single inference/MD
  • Add random Fourier features kernel code:
    • Fourier-basis sampler in C++ with OpenMP parallelization
    • RFF kernel
    • RFF gradient kernel
    • RFF chunked DSYRK kernel
    • Optimized RFF gradient kernel for single inference/MD
    • The same as above, just for Hadamard features when I find the time?
  • GDML and sGDML kernels:
    • Inverse-distance matrix descriptor
    • Packed Jacobian for inverse-distance matrix
    • GDML kernel (brute-force implemented)
    • sGDML kernel (brute-force implemented)
    • Full GPR kernel
    • Optimized GPR kernel with pre-computed terms for single inference/MD
  • FCHL18 support:
    • Complete rewrite of FCHL18 analytical scalar kernel in C++
    • Stretch goal 1: Add new analytical FCHL18 kernel Jacobian
    • Stretch goal 2: Add new analytical FCHL18 kernel Hessian (+GPR/GDML-style)
    • Stretch goal 3: Attempt to optimize hyperparameters and cut-off functions
  • Add standard solvers:
    • Cholesky in-place solver
      • L2-reg kwarg
      • Toggle destructive vs non-destructive
    • RFP format in-place Cholesky solver
    • QR and/or SVD for non-square matrices
  • Add molecular descriptors with derivatives:
    • Coulomb matrix + misc variants without derivatives
    • FCHL19 + derivatives
    • GDML-like inverse-distance matrix + derivatives
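To make the "inverse-distance matrix + derivatives" item concrete, here is a small NumPy sketch of such a descriptor and its analytic Jacobian. The function name, the pair ordering (that of `np.triu_indices`), and the dense `(n_pairs, n_atoms, 3)` Jacobian layout are assumptions for the example; the packed Jacobian mentioned in the list would store only the non-zero entries.

```python
import numpy as np


def inverse_distance_descriptor(R):
    """Return d (n_pairs,) and its Jacobian J (n_pairs, n_atoms, 3).

    d[p] = 1 / |R_i - R_j| for each atom pair i < j, in the order given
    by np.triu_indices.
    """
    R = np.asarray(R, dtype=np.float64)
    n_atoms = R.shape[0]
    i, j = np.triu_indices(n_atoms, k=1)
    diff = R[i] - R[j]  # (n_pairs, 3) displacement vectors
    r = np.linalg.norm(diff, axis=1)  # pair distances
    d = 1.0 / r
    # d(1/r)/dR_i = -(R_i - R_j) / r^3, with the opposite sign for R_j.
    grad = -diff / r[:, None] ** 3
    J = np.zeros((d.size, n_atoms, 3))
    pairs = np.arange(d.size)
    J[pairs, i] = grad
    J[pairs, j] = -grad
    return d, J
```

Each pair only depends on two atoms, so all other entries of a Jacobian row are zero, which is what makes a packed layout attractive.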

Stretch goals:

  • Plan RDKit interface
  • Plan Scikit-learn interface
  • Plan ASE interface
