KernelForge - Optimized Kernels for ML

License: MIT

I really only care about writing optimized kernel code, so this project will be completed as I find additional time... XD

I'm reviving this project to finish some old work on random Fourier features for kernel ML.

Installation

Quick Start (Recommended)

For most users, install from PyPI:

pip install kernelforge

This installs pre-compiled wheels with optimized BLAS libraries:

  • Linux: OpenBLAS
  • macOS: Apple Accelerate framework

Requirements: Python 3.10+
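
To confirm that the install worked, the installed version can be read back from the package metadata. This is a small sketch using only the standard library; the helper name `installed_version` is made up for illustration, and it assumes the distribution name is `kernelforge`, as in the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "kernelforge") -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string, or "not installed" if pip did not install it
# into the currently active environment.
print(f"kernelforge: {installed_version() or 'not installed'}")
```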

Development Installation

Linux

# Create virtual environment with uv
uv venv
source .venv/bin/activate

# Install in editable mode with test dependencies
make install-linux

# Or manually:
CMAKE_ARGS="-DKF_USE_NATIVE=ON" uv pip install -e ".[test]" --verbose

macOS

macOS requires Homebrew LLVM for OpenMP support:

# Install dependencies
brew install llvm libomp

# Create virtual environment
uv venv
source .venv/bin/activate

# Install in editable mode
make install-macos

# Or manually:
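# (the quotes around ".[test]" stop zsh, the default macOS shell, from treating the brackets as a glob)
CMAKE_ARGS="-DCMAKE_C_COMPILER=/opt/homebrew/opt/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/homebrew/opt/llvm/bin/clang++ -DKF_USE_NATIVE=ON" uv pip install -e ".[test]" --verbose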

Note: The -DKF_USE_NATIVE=ON flag enables -march=native/-mcpu=native optimizations, which tune the build for the CPU of the machine doing the compiling. The resulting binaries may not run on other CPUs.

Advanced: Custom BLAS/LAPACK Libraries

Intel MKL (Linux)

# Install Intel oneAPI Base Toolkit (Intel's oneAPI APT repository must be configured first)
sudo apt install intel-basekit

# Set up environment
source /opt/intel/oneapi/setvars.sh

# Install (MKL will be auto-detected by CMake)
uv pip install -e ".[test]" --verbose

# Optional: Use Intel compilers
CC=icx CXX=icpx uv pip install -e ".[test]" --verbose

Note: In practice, GCC/G++ with OpenBLAS performs similarly to (or better than) Intel compilers with MKL. On macOS, LLVM with Accelerate framework is highly optimized for Apple Silicon.

Timings

I've rewritten a few of the kernels from the original QML code completely in C++. There are performance gains in most cases. These are primarily due to better use of BLAS routines, for example calculating Gramian sub-matrices with chunked DGEMM/DSYRK calls. In the gradient and Hessian matrices there are also some algorithmic improvements and pre-computed terms. Memory usage might be a bit higher, but this could be optimized with more fine-grained chunking if needed. More is coming as I find the time ...
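
To illustrate the chunking idea, here is a minimal NumPy sketch of a Gaussian kernel built block by block, with the distance cross-term computed as one matrix product per chunk (NumPy hands that product to BLAS GEMM). It is not KernelForge's C++ code or its API: the function name, the `chunk` parameter and the kernel convention exp(-||x - y||^2 / (2 sigma^2)) are assumptions made for the example. A symmetric training kernel would use a DSYRK-style update on one triangle instead, which is not shown.

```python
import numpy as np


def gaussian_kernel_chunked(X, Y, sigma, chunk=1024):
    """K[i, j] = exp(-||X[i] - Y[j]||^2 / (2 sigma^2)), built in row chunks."""
    X = np.ascontiguousarray(X, dtype=np.float64)
    Y = np.ascontiguousarray(Y, dtype=np.float64)
    x_sq = np.einsum("ij,ij->i", X, X)  # squared norms of the rows of X
    y_sq = np.einsum("ij,ij->i", Y, Y)  # squared norms of the rows of Y
    K = np.empty((X.shape[0], Y.shape[0]))
    for start in range(0, X.shape[0], chunk):
        stop = min(start + chunk, X.shape[0])
        # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y ; the x.y term is a GEMM call.
        d2 = x_sq[start:stop, None] + y_sq[None, :] - 2.0 * (X[start:stop] @ Y.T)
        np.maximum(d2, 0.0, out=d2)  # clip tiny negatives caused by round-off
        K[start:stop] = np.exp(-d2 / (2.0 * sigma**2))
    return K
```

Only one `chunk x n` block of squared distances is alive at a time, which is the trade-off between memory and call overhead that the paragraph above refers to.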

Some speedups vs the original QML code are shown below:

| Benchmark | QML [s] | KernelForge [s] |
| --- | ---: | ---: |
| Upper triangle Gaussian kernel (16K x 16K) | 1.82 | 0.64 |
| 1K FCHL19 descriptors (1K) | ? | 0.43 |
| 1K FCHL19 descriptors+jacobian (1K) | ? | 0.62 |
| FCHL19 Local Gaussian scalar kernel (10K x 10K) | 76.81 | 18.15 |
| FCHL19 Local Gaussian gradient kernel (1K x 2700K) | 32.54 | 1.52 |
| FCHL19 Local Gaussian Hessian kernel (5400K x 5400K) | 29.68 | 2.05 |
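
For the four benchmarks that have a QML timing, the speedup factor is simply the QML time divided by the KernelForge time. The short script below does that arithmetic on the numbers from the table; the two descriptor rows are left out because their QML timings are not given.

```python
# (QML seconds, KernelForge seconds), copied from the timings table.
timings = {
    "Upper triangle Gaussian kernel (16K x 16K)": (1.82, 0.64),
    "FCHL19 Local Gaussian scalar kernel (10K x 10K)": (76.81, 18.15),
    "FCHL19 Local Gaussian gradient kernel (1K x 2700K)": (32.54, 1.52),
    "FCHL19 Local Gaussian Hessian kernel (5400K x 5400K)": (29.68, 2.05),
}


def speedups(table):
    """Map each benchmark name to its speedup factor (QML time / KernelForge time)."""
    return {name: qml / kf for name, (qml, kf) in table.items()}


for name, factor in speedups(timings).items():
    print(f"{name}: {factor:.1f}x")
```

This works out to roughly 2.8x, 4.2x, 21.4x and 14.5x, so the largest gains are in the gradient and Hessian kernels.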

TODO list

The goal is to remove pain-points of existing QML libraries:

  • Removal of Fortran dependencies
    • No Fortran-ordered arrays
    • No Fortran compilers needed
  • Simplified build system
    • No cooked F2PY/Meson build system, just CMake and Pybind11
  • Improved use of BLAS routines, with built-in chunking to avoid memory explosions
  • Better use of pre-computed terms for single-point inference/MD kernels
  • Low overhead with Pybind11 shims and better aligned memory?
  • Simplified entrypoints that are compatible with RDKit, ASE, Scikit-learn, etc.
    • A few high-level functions that do the most common tasks efficiently and correctly
  • Efficient FCHL19 out-of-the-box
    • Fast training with random Fourier features
    • With derivatives
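
To make the "fast training with random Fourier features" goal concrete, here is a minimal NumPy sketch of the standard construction for a Gaussian kernel on global descriptors. It is not KernelForge's implementation or API: all function names, the kernel convention exp(-||x - y||^2 / (2 sigma^2)) and the default `l2` value are assumptions for illustration. The point is that training only needs a D x D system (D = number of features) rather than an N x N kernel matrix.

```python
import numpy as np


def sample_rff_basis(n_dims, n_features, sigma, seed=0):
    """Sample frequencies W ~ N(0, 1/sigma^2) and phases b ~ U(0, 2 pi)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / sigma, size=(n_dims, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return W, b


def rff_features(X, W, b):
    """Feature map Z such that Z @ Z.T approximates the Gaussian kernel."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)


def fit_rff_ridge(Z, y, l2=1e-8):
    """Solve (Z^T Z + l2 I) w = Z^T y with a Cholesky factorization."""
    A = Z.T @ Z  # symmetric rank-k product (the job DSYRK does in BLAS)
    A[np.diag_indices_from(A)] += l2
    L = np.linalg.cholesky(A)
    return np.linalg.solve(L.T, np.linalg.solve(L, Z.T @ y))
```

Prediction for new descriptors is then `rff_features(X_new, W, b) @ w`, with the same `W` and `b` that were used for training.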

Priority list for the next months:

  • Finish the inverse-distance kernel and its Jacobian

  • Make Pybind11 interface

    • Finalize the C++ interface
  • Finish the Gaussian kernel

  • Notebook with rMD17 example

  • Finish the Jacobian and Hessian kernels

  • Notebook with rMD17 forces example

  • FCHL19 support:

    • Add FCHL19 descriptors
    • Add FCHL19 kernels (local/elemental)
    • Add FCHL19 descriptor with derivatives
    • Add FCHL19 kernel Jacobian
    • Add FCHL19 kernel Hessian (GDML-style)
    • Improve FCHL19 kernel Jacobian performance (it's poor)
  • Finish the random Fourier features kernel and its Jacobian

    • Parallel random basis sampler
    • RFF kernel for global descriptors
    • SVD and QR solvers for rectangular matrices
    • RFF kernel for local descriptors (FCHL19)
    • RFF kernels with Cholesky solver and chunked DSYRK kernel updates
    • RFF kernels with RFP format with chunked DSFRK kernel updates
    • RFF kernel Jacobian for global descriptors
    • RFF kernel Jacobian for local descriptors (FCHL19)
  • Notebook with rMD17 random Fourier features examples

  • Science:

    • Benchmark full kernel vs RFF on rMD17 and QM7b and QM9
    • Both FCHL19 and inverse-distance matrix

Todos:

  • Housekeeping:
    • Pybind11 bindings and CMake build system
    • Setup CI with GitHub Actions
    • Rewrite existing kernels to C++ (no Fortran)
    • Setup GHA to build PyPI wheels
    • Test Linux build matrices
    • Test macOS build matrices
    • Test Windows build matrices: No.
    • Add build for all Python versions >=3.10
    • Plan structure for saving models for inference as .npz files
  • Ensure correct linking with optimized BLAS/LAPACK libraries:
    • OpenBLAS (Linux) <- also used in wheels
    • MKL (Linux)
    • Accelerate (macOS)
  • Add global kernels:
    • Gaussian kernel
    • Jacobian/gradient kernel
    • Optimized kernel for single inference (for MD)
    • Hessian kernel
    • GDML-like kernel
    • Full GPR kernel
    • All kernels in RFP format
  • Add local kernels:
    • Gaussian kernel
    • Jacobian/gradient kernel
    • Optimized Jacobian kernel for single inference
    • Hessian kernel (GDML-style)
    • Full GPR kernel
    • Optimized GPR kernel with pre-computed terms for single inference/MD
  • Add random Fourier features kernel code:
    • Fourier-basis sampler in C++ with OpenMP parallelization
    • RFF kernel
    • RFF gradient kernel
    • RFF chunked DSYRK kernel
    • Optimized RFF gradient kernel for single inference/MD
    • The same as above, just for Hadamard features when I find the time?
  • GDML and sGDML kernels:
    • Inverse-distance matrix descriptor
    • Packed Jacobian for inverse-distance matrix
    • GDML kernel (brute-force implemented)
    • sGDML kernel (brute-force implemented)
    • Full GPR kernel
    • Optimized GPR kernel with pre-computed terms for single inference/MD
  • FCHL18 support:
    • Complete rewrite of FCHL18 analytical scalar kernel in C++
    • Stretch goal 1: Add new analytical FCHL18 kernel Jacobian
    • Stretch goal 2: Add new analytical FCHL18 kernel Hessian (+GPR/GDML-style)
    • Stretch goal 3: Attempt to optimize hyperparameters and cut-off functions
  • Add standard solvers:
    • Cholesky in-place solver
      • L2-reg kwarg
      • Toggle destructive vs non-destructive
    • RFP format in-place Cholesky solver
    • QR and/or SVD for non-square matrices
  • Add molecular descriptors with derivatives:
    • Coulomb matrix + misc variants without derivatives
    • FCHL19 + derivatives
    • GDML-like inverse-distance matrix + derivatives
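
The "Cholesky solver with L2-reg kwarg and destructive toggle" item from the list above could look roughly like the following NumPy sketch. The function name and keyword names are hypothetical, not KernelForge's API. Note that `np.linalg.cholesky` always allocates a new factor, so `overwrite=True` here only skips the copy of `K`; a truly in-place solver would call LAPACK's DPOTRF/DPOTRS directly on the caller's buffer.

```python
import numpy as np


def cho_solve_regularized(K, y, l2=0.0, overwrite=False):
    """Solve (K + l2 * I) alpha = y for a symmetric positive-definite float matrix K.

    overwrite=False works on a copy and leaves K untouched (non-destructive).
    overwrite=True adds l2 to the diagonal of K itself (destructive).
    """
    A = K if overwrite else K.copy()
    A[np.diag_indices_from(A)] += l2
    L = np.linalg.cholesky(A)       # A = L @ L.T, with L lower triangular
    z = np.linalg.solve(L, y)       # forward substitution step
    return np.linalg.solve(L.T, z)  # backward substitution step
```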

Stretch goals:

  • Plan RDKit interface
  • Plan Scikit-learn interface
  • Plan ASE interface

Download files

Source Distribution

  • kernelforge-0.2.0.tar.gz (4.2 MB), Source

Built Distributions

| File | Size | Python | Platform |
| --- | ---: | --- | --- |
| kernelforge-0.2.0-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl | 12.7 MB | CPython 3.14 | manylinux (glibc 2.27+ / 2.28+), x86-64 |
| kernelforge-0.2.0-cp314-cp314-macosx_15_0_arm64.whl | 744.5 kB | CPython 3.14 | macOS 15.0+, ARM64 |
| kernelforge-0.2.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl | 12.7 MB | CPython 3.13 | manylinux (glibc 2.27+ / 2.28+), x86-64 |
| kernelforge-0.2.0-cp313-cp313-macosx_15_0_arm64.whl | 743.2 kB | CPython 3.13 | macOS 15.0+, ARM64 |
| kernelforge-0.2.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl | 12.7 MB | CPython 3.12 | manylinux (glibc 2.27+ / 2.28+), x86-64 |
| kernelforge-0.2.0-cp312-cp312-macosx_15_0_arm64.whl | 743.0 kB | CPython 3.12 | macOS 15.0+, ARM64 |
| kernelforge-0.2.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl | 12.7 MB | CPython 3.11 | manylinux (glibc 2.27+ / 2.28+), x86-64 |
| kernelforge-0.2.0-cp311-cp311-macosx_15_0_arm64.whl | 736.8 kB | CPython 3.11 | macOS 15.0+, ARM64 |
| kernelforge-0.2.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl | 12.7 MB | CPython 3.10 | manylinux (glibc 2.27+ / 2.28+), x86-64 |
| kernelforge-0.2.0-cp310-cp310-macosx_15_0_arm64.whl | 730.3 kB | CPython 3.10 | macOS 15.0+, ARM64 |
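
The wheel file names above follow the standard pattern `{distribution}-{version}(-{build})?-{python tag}-{abi tag}-{platform tag}.whl`, where several platform tags can be joined with dots. The helper below is a small standard-library sketch (the function name is made up for illustration) that splits a file name into those parts, which can help when working out which wheel matches a given machine.

```python
def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel file name into its name, version and tag components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:  # optional build tag between version and python tag
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platforms": platform_tag.split("."),  # compressed tag set
    }


info = parse_wheel_filename("kernelforge-0.2.0-cp312-cp312-macosx_15_0_arm64.whl")
print(info["python"], info["platforms"])
```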

File details

Details for the file kernelforge-0.2.0.tar.gz.

File metadata

  • Download URL: kernelforge-0.2.0.tar.gz
  • Upload date:
  • Size: 4.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for kernelforge-0.2.0.tar.gz
Algorithm Hash digest
SHA256 6f72e6f2c7d42960dc83450043a2e00ad2dee56a133c53e183f9068e6f6d93ad
MD5 880994e49decad02b722d2eed4e756a6
BLAKE2b-256 fcb81608864360664ed2f65b46bffac64df8e99353b7bdd830b0f040e332a7fa
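
A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The sketch below is one way to do it; the function names are illustrative, and the expected digest is the SHA256 value for kernelforge-0.2.0.tar.gz copied from the table.

```python
import hashlib

# SHA256 digest for kernelforge-0.2.0.tar.gz, copied from the table above.
EXPECTED_SHA256 = "6f72e6f2c7d42960dc83450043a2e00ad2dee56a133c53e183f9068e6f6d93ad"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file at `path` has the expected SHA256 digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("kernelforge-0.2.0.tar.gz")` should return True for an intact download of the source distribution, assuming the file sits in the current directory.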

Provenance

The following attestation bundles were made for kernelforge-0.2.0.tar.gz:

Publisher: release.yml on andersx/kernelforge

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
