
Gradient-free machine learning for any numpy-compatible function


LambdaML

Gradient-free machine learning. Give it any function; it learns the parameters.

LambdaML lets you use any numpy-compatible function as your model and automatically fits its parameters using numerical (finite-difference) differentiation — no hand-derived gradients required. The "lambda" really can be anything: logistic regression, a neural network with custom activations, a physics equation, a learnable signal transform, or something entirely your own.


Quick-start

pip install lambdaml

import numpy as np
from lambdaml import LambdaClassifierModel, Optimizer, DiffMethod, LRSchedule

# 1. Write your model — anything numpy-compatible works
def my_model(x, p):
    return (np.tanh(p['w'].dot(x) + p['b']) + 1) / 2

# 2. Initial parameters (scalars or numpy arrays)
p = {'w': np.zeros(2), 'b': 0.0}

# 3. Create and fit (X_train, Y_train, X_test, Y_test are your own data arrays)
model = LambdaClassifierModel(
    f=my_model,
    p=p,
    diff_method=DiffMethod.COMPLEX_STEP,   # recommended
    l2_factor=0.001,
    optimizer=Optimizer.ADAM,
    lr_schedule=LRSchedule.cosine_annealing(T_max=100),
)
model.fit(X_train, Y_train, n_iter=100, lr=0.01,
          early_stopping=True, patience=10, verbose=True)

print(model.score(X_test, Y_test))       # accuracy
print(model.predict_proba(X_test))       # probabilities

For regression, swap in LambdaRegressorModel with loss='mse', 'mae', 'huber', or 'pseudo_huber'.

See the examples/ folder for runnable scripts and the LambdaML_Showcase.ipynb notebook for an interactive walkthrough with charts.


What is finite-difference differentiation?

Finite-difference approximation (also called numerical differentiation) estimates f′(θ) by evaluating the function at nearby points instead of deriving the derivative analytically:

f'(θ) ≈ [f(θ+h) - f(θ-h)] / (2h)     ← Central difference, O(h²)
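
Where the O(h²) comes from can be seen from the Taylor expansions of f around θ. This is the standard textbook argument, written out under the assumption that f is smooth (at least four times differentiable):

```latex
\begin{align*}
f(\theta + h) &= f(\theta) + h f'(\theta) + \tfrac{h^2}{2} f''(\theta) + \tfrac{h^3}{6} f'''(\theta) + O(h^4) \\
f(\theta - h) &= f(\theta) - h f'(\theta) + \tfrac{h^2}{2} f''(\theta) - \tfrac{h^3}{6} f'''(\theta) + O(h^4) \\
\frac{f(\theta + h) - f(\theta - h)}{2h} &= f'(\theta) + \tfrac{h^2}{6} f'''(\theta) + O(h^3)
\end{align*}
```

Subtracting the two expansions cancels the even-order terms, so the leading error is proportional to h².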

LambdaML supports six methods with different accuracy/cost trade-offs:

| Method | Order | f-evals/param | Notes |
|---|---|---|---|
| Forward | O(h) | 1 | Fast, low accuracy |
| Backward | O(h) | 1 | Fast, low accuracy |
| Central | O(h²) | 2 | Default; good balance |
| Five-Point | O(h⁴) | 4 | High accuracy, smooth f |
| Complex-Step | O(h²) | 1 (complex) | Recommended; no cancellation error |
| Richardson | O(h⁴) | 4 | High accuracy, no complex inputs needed |
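
The six methods listed above can be sketched in a few lines each. This is an illustration using textbook formulas and the Python standard library only, not LambdaML's internal code; the function names, the default step sizes, and the test function `f` are assumptions made for the example.

```python
def forward(f, x, h=1e-8):
    return (f(x + h) - f(x)) / h

def backward(f, x, h=1e-8):
    return (f(x) - f(x - h)) / h

def central(f, x, h=6e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

def five_point(f, x, h=1e-3):
    return (-f(x + 2 * h) + 8 * f(x + h) - 8 * f(x - h) + f(x - 2 * h)) / (12 * h)

def complex_step(f, x, h=1e-20):
    # Perturb along the imaginary axis; no subtraction, so no cancellation.
    return f(complex(x, h)).imag / h

def richardson(f, x, h=1e-3):
    # Combine central differences at h and h/2 to cancel the h^2 error term.
    return (4 * central(f, x, h / 2) - central(f, x, h)) / 3

def f(x):
    # Hypothetical test function; plain arithmetic works for float and complex.
    return x * x * x / (1 + x * x)

def f_prime(x):
    # Analytic derivative, used only as a reference value.
    return (3 * x ** 2 + x ** 4) / (1 + x * x) ** 2

x0 = 1.5
estimates = {
    "forward": forward(f, x0),
    "backward": backward(f, x0),
    "central": central(f, x0),
    "five_point": five_point(f, x0),
    "complex_step": complex_step(f, x0),
    "richardson": richardson(f, x0),
}
for name, value in estimates.items():
    print(f"{name:13s} abs error = {abs(value - f_prime(x0)):.2e}")
```

Note that the complex-step estimate requires the model function to accept complex inputs, which is why the test function here sticks to plain arithmetic.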

Derivative methods comparison

Left: all six estimates on a known function. Right: absolute error vs step size h — complex-step never hits the cancellation-error floor.
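
The cancellation-error floor mentioned in the caption can be reproduced with a small experiment. The sketch below is not taken from the library; it assumes sin(x) as the test function and uses the common rule of thumb that the central-difference step should be near the cube root of machine epsilon.

```python
import cmath
import math
import sys

x0 = 1.0
exact = math.cos(x0)  # d/dx sin(x) = cos(x)

def central_error(h):
    approx = (math.sin(x0 + h) - math.sin(x0 - h)) / (2 * h)
    return abs(approx - exact)

def complex_step_error(h):
    approx = cmath.sin(complex(x0, h)).imag / h
    return abs(approx - exact)

# Rule-of-thumb step for central differences in float64, roughly 6e-6.
h_opt = sys.float_info.epsilon ** (1 / 3)

for h in (1e-3, h_opt, 1e-13):
    print(f"central      h={h:.1e}  error={central_error(h):.2e}")
print(f"complex-step h=1.0e-20  error={complex_step_error(1e-20):.2e}")
```

A step that is too large suffers from truncation error, and a step that is too small suffers from round-off in the subtraction. The complex-step estimate has no subtraction, so its step can be made extremely small.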

Is it tractable? Yes, for models up to ~10k parameters. Each gradient step costs O(n_params) forward passes instead of O(1) for analytic backprop. For small-to-medium models on a CPU+numpy backend this is entirely practical.
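
To make the O(n_params) cost concrete, the following sketch counts loss evaluations for one central-difference gradient. The `CountingLoss` class and `central_gradient` helper are hypothetical names for this example and are not part of LambdaML.

```python
class CountingLoss:
    """Toy quadratic loss that records how often it is evaluated."""

    def __init__(self):
        self.calls = 0

    def __call__(self, params):
        self.calls += 1
        return sum((p - 3.0) ** 2 for p in params)

def central_gradient(loss, params, h=6e-6):
    grad = []
    for i in range(len(params)):
        # Perturb one parameter at a time, leaving the others unchanged.
        up = params[:i] + [params[i] + h] + params[i + 1:]
        down = params[:i] + [params[i] - h] + params[i + 1:]
        grad.append((loss(up) - loss(down)) / (2 * h))
    return grad

loss = CountingLoss()
grad = central_gradient(loss, [0.0] * 5)
print(loss.calls)  # 10: two evaluations per parameter
print(grad)        # each component is close to -6.0
```

By the same count, a 10k-parameter model needs about 20k loss evaluations per gradient step with central differences.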


The lambda can be any function

Six completely different model functions, one .fit() call:

Decision boundaries

From top-left: logistic regression, tanh, sine activation (non-standard), Gaussian RBF, softplus, and a physics-inspired decay+oscillation model σ(a·exp(−λ|x₀|)·cos(ω·x₁+φ)) — the kind of thing nobody derives analytically.
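
As an illustration of the f(x, p) convention, the physics-inspired model from the caption could be written roughly as follows. The parameter keys (`a`, `lam`, `omega`, `phi`) are assumed names, and the sketch uses the standard `math` module, so it only suits real-step methods; a numpy version would swap in `np.exp` and `np.cos`.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def decay_oscillation(x, p):
    # Envelope decays with |x0|; oscillation runs along x1.
    envelope = p["a"] * math.exp(-p["lam"] * abs(x[0]))
    return sigmoid(envelope * math.cos(p["omega"] * x[1] + p["phi"]))

p = {"a": 2.0, "lam": 0.5, "omega": 1.0, "phi": 0.0}
print(decay_oscillation((0.0, 0.0), p))  # sigmoid(2.0), about 0.88
print(decay_oscillation((4.0, 1.0), p))
```

The `abs()` call is applied to the input x rather than to a parameter, so it does not interfere with perturbing the parameters.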


Neural network with numerically computed gradients

A 2-layer ELU network on non-linearly separable data, fitted entirely with finite-difference gradients. No autograd, no torch, no hand-applied chain rule.

Neural network training

Clockwise from top-left: log-loss curve, final decision boundary, weight trajectories for hidden and output layers, bias evolution across epochs.
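
The idea of training a network without the chain rule can be shown on a toy scale. The sketch below is not the library's example script: it assumes a hypothetical 2-2-1 ELU network with a flat parameter list, XOR as the data set, and plain gradient descent with central-difference gradients.

```python
import math

def elu(z):
    return z if z > 0 else math.exp(z) - 1.0

def net(x, p):
    # p layout: 4 hidden weights, 2 hidden biases, 2 output weights, 1 output bias
    h0 = elu(p[0] * x[0] + p[1] * x[1] + p[4])
    h1 = elu(p[2] * x[0] + p[3] * x[1] + p[5])
    z = p[6] * h0 + p[7] * h1 + p[8]
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(p, X, Y):
    total = 0.0
    for x, y in zip(X, Y):
        q = net(x, p)
        total -= y * math.log(q) + (1 - y) * math.log(1 - q)
    return total / len(X)

def fd_gradient(p, X, Y, h=6e-6):
    # One pair of loss evaluations per parameter; no chain rule involved.
    grad = []
    for i in range(len(p)):
        up = p[:i] + [p[i] + h] + p[i + 1:]
        down = p[:i] + [p[i] - h] + p[i + 1:]
        grad.append((log_loss(up, X, Y) - log_loss(down, X, Y)) / (2 * h))
    return grad

X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
Y = [0, 1, 1, 0]
p = [0.5, -0.3, 0.2, 0.4, 0.1, -0.1, 0.3, -0.2, 0.0]

loss_before = log_loss(p, X, Y)
for _ in range(50):
    g = fd_gradient(p, X, Y)
    p = [pi - 0.1 * gi for pi, gi in zip(p, g)]
loss_after = log_loss(p, X, Y)
print(loss_before, loss_after)
```

Fifty small steps are only enough to show the loss moving downhill; fitting XOR properly would need more iterations or an adaptive optimizer.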


Regression — recovering true sine parameters

Starting from wrong values (a=2.5, ω=0.4, φ=1.8, c=−1) on outlier-corrupted data, the optimizer converges back to the true parameters using pseudo-Huber loss (complex-step safe).

Sine regression
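
Why pseudo-Huber is described as complex-step safe can be checked directly: it is built from multiplication and a square root, with no `abs()` or branching on the residual. The sketch below uses the common definition δ²(√(1 + (r/δ)²) − 1), which is an assumption about the exact form the library uses.

```python
import cmath
import math

def pseudo_huber(r, delta=1.0):
    # cmath.sqrt keeps the imaginary part that carries the derivative.
    scaled = r / delta
    return delta * delta * (cmath.sqrt(1 + scaled * scaled) - 1)

def pseudo_huber_grad(r, delta=1.0):
    # Analytic derivative with respect to the residual r, for reference.
    return r / math.sqrt(1 + (r / delta) ** 2)

h = 1e-20
errors = []
for r in (-3.0, 0.5, 10.0):
    complex_step = pseudo_huber(complex(r, h)).imag / h
    errors.append(abs(complex_step - pseudo_huber_grad(r)))
    print(f"r={r:5.1f}  complex-step={complex_step:.12f}")
```

Plain Huber and MAE rely on the absolute value, which discards the imaginary part of a complex argument, so they need a real-step method or a smooth substitute.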


Optimizer comparison

SGD vs Momentum vs RMSProp vs Adam on the same logistic task:

Optimizer comparison
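
For reference, the four update rules can be sketched on a one-parameter quadratic, with the gradient supplied by a central difference. These are the usual textbook formulas with commonly used default constants; the library's exact variants and hyperparameters are not stated above, so treat `make_optimizer` and its arguments as hypothetical.

```python
import math

def make_optimizer(name, lr, beta1=0.9, beta2=0.999, rho=0.9, eps=1e-8):
    state = {"m": 0.0, "v": 0.0, "t": 0}

    def step(theta, g):
        state["t"] += 1
        if name == "sgd":
            return theta - lr * g
        if name == "momentum":
            state["m"] = beta1 * state["m"] + g
            return theta - lr * state["m"]
        if name == "rmsprop":
            state["v"] = rho * state["v"] + (1 - rho) * g * g
            return theta - lr * g / (math.sqrt(state["v"]) + eps)
        if name == "adam":
            state["m"] = beta1 * state["m"] + (1 - beta1) * g
            state["v"] = beta2 * state["v"] + (1 - beta2) * g * g
            m_hat = state["m"] / (1 - beta1 ** state["t"])
            v_hat = state["v"] / (1 - beta2 ** state["t"])
            return theta - lr * m_hat / (math.sqrt(v_hat) + eps)
        raise ValueError(f"unknown optimizer: {name}")

    return step

def loss(theta):
    return (theta - 3.0) ** 2

def central_diff(f, x, h=6e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

results = {}
for name in ("sgd", "momentum", "rmsprop", "adam"):
    step = make_optimizer(name, lr=0.02)
    theta = 0.0
    for _ in range(1000):
        theta = step(theta, central_diff(loss, theta))
    results[name] = theta
    print(f"{name:9s} theta = {theta:.4f}")
```

With a constant learning rate, RMSProp keeps taking steps of roughly the learning rate near the minimum, so it hovers around the optimum rather than settling exactly on it.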


Derivative method benchmark

All 6 methods on the same problem — speed, accuracy, and Pareto trade-off:

Diff method benchmark


Regularization — L1 vs L2

With the corrected L1 formula (Σ|θ| not Σθ — a bug in the original), L1 now induces true sparsity on a 10-feature problem where only features 0 and 1 matter:

Regularization
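
The difference between summing raw θ and summing |θ| is easy to see on a small vector. The sketch below follows the sqrt(v*v + eps) form named in the bug-fix notes; the function names and the eps value are assumptions for illustration.

```python
def l1_raw_sum(theta):
    # Buggy form: negative weights cancel positive ones.
    return sum(theta)

def l1_smooth_abs(theta, eps=1e-12):
    # sqrt(v*v + eps) approximates |v| and also works for complex v.
    return sum((v * v + eps) ** 0.5 for v in theta)

theta = [2.0, -2.0, 0.5]
raw = l1_raw_sum(theta)
smooth = l1_smooth_abs(theta)
print(raw)     # 0.5, although the weights are clearly not small
print(smooth)  # close to 4.5

# Complex-step derivative of the smooth penalty with respect to theta[1].
h = 1e-20
shifted = [theta[0], complex(theta[1], h), theta[2]]
d_penalty = l1_smooth_abs(shifted).imag / h
print(d_penalty)  # close to -1, the sign of theta[1]
```

Because the smooth form never calls `abs()`, the imaginary perturbation survives and the complex-step gradient of the penalty stays correct.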


Learning rate schedules

Five schedules visualised and compared for convergence speed:

LR schedules
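
The five schedule names correspond to well-known formulas, sketched below in their usual textbook form. The `(epoch, lr0)` call signature and the exact warmup rule are assumptions; the library's own definitions may differ in detail.

```python
import math

def constant(epoch, lr0):
    return lr0

def step_decay(epoch, lr0, drop=0.5, epochs_drop=10):
    # Multiply the rate by `drop` every `epochs_drop` epochs.
    return lr0 * drop ** (epoch // epochs_drop)

def exponential_decay(epoch, lr0, k=0.01):
    return lr0 * math.exp(-k * epoch)

def cosine_annealing(epoch, lr0, T_max=100):
    # Half a cosine period from lr0 down to 0 over T_max epochs.
    return lr0 * 0.5 * (1 + math.cos(math.pi * min(epoch, T_max) / T_max))

def warmup_cosine(epoch, lr0, warmup=10, T_max=100):
    # Linear ramp for the first `warmup` epochs, cosine decay afterwards.
    if epoch < warmup:
        return lr0 * (epoch + 1) / warmup
    progress = (epoch - warmup) / max(1, T_max - warmup)
    return lr0 * 0.5 * (1 + math.cos(math.pi * min(progress, 1.0)))

for epoch in (0, 10, 50, 100):
    print(epoch,
          round(step_decay(epoch, 0.1), 5),
          round(exponential_decay(epoch, 0.1), 5),
          round(cosine_annealing(epoch, 0.1), 5),
          round(warmup_cosine(epoch, 0.1), 5))
```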


Gradient accuracy verification

Per-component absolute error vs an analytically known gradient — complex-step and Richardson hit near-machine-precision:

Gradient accuracy
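
A per-component check of this kind can be sketched as follows, assuming a hypothetical three-parameter function whose gradient is known in closed form. It compares central-difference and complex-step gradients against the analytic values; it is not the script behind the figure.

```python
import cmath

def f(t):
    # cmath.exp accepts both float and complex arguments.
    return t[0] * t[0] * t[1] + cmath.exp(t[2])

def analytic_grad(t):
    return [2 * t[0] * t[1], t[0] * t[0], cmath.exp(t[2]).real]

def central_grad(t, h=6e-6):
    grad = []
    for i in range(len(t)):
        up = t[:i] + [t[i] + h] + t[i + 1:]
        down = t[:i] + [t[i] - h] + t[i + 1:]
        grad.append((f(up) - f(down)).real / (2 * h))
    return grad

def complex_step_grad(t, h=1e-20):
    grad = []
    for i in range(len(t)):
        shifted = t[:i] + [complex(t[i], h)] + t[i + 1:]
        grad.append(f(shifted).imag / h)
    return grad

theta = [1.5, -2.0, 0.3]
exact = analytic_grad(theta)
central_err = [abs(a - b) for a, b in zip(central_grad(theta), exact)]
complex_err = [abs(a - b) for a, b in zip(complex_step_grad(theta), exact)]
print("central     :", central_err)
print("complex-step:", complex_err)
```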


API reference

LambdaClassifierModel(f, p, **kwargs)

| Parameter | Default | Description |
|---|---|---|
| f | | Model: f(x, p) → float ∈ (0,1) |
| p | | Parameter dict (scalars or numpy arrays) |
| diff_method | DiffMethod.CENTRAL | Finite-difference method |
| diff_h | None | Custom step size (None = optimal default per method) |
| l1_factor | 0.0 | L1 regularization strength |
| l2_factor | 0.01 | L2 regularization strength |
| regularize_bias | False | Whether to regularize b* params |
| optimizer | Optimizer.ADAM | sgd, momentum, rmsprop, adam |
| lr_schedule | None (constant) | Learning rate schedule callable |

Methods: .fit(X, Y, n_iter, lr, batch_size, early_stopping, patience, verbose, validation_data) · .predict(X) · .predict_proba(X) · .score(X, Y) · .compute_loss(X, Y) · .loss_history

LambdaRegressorModel(f, p, loss='mse', **kwargs)

| Parameter | Default | Description |
|---|---|---|
| loss | 'mse' | 'mse', 'mae', 'huber', 'pseudo_huber' |
| huber_delta | 1.0 | Threshold for Huber / pseudo-Huber |

Methods: .fit(...) · .predict(X) · .score(X, Y) (R²)

DiffMethod · Optimizer · LRSchedule

# Derivative methods
DiffMethod.FORWARD | BACKWARD | CENTRAL | FIVE_POINT | COMPLEX_STEP | RICHARDSON

# Optimizers
Optimizer.SGD | MOMENTUM | RMSPROP | ADAM

# LR schedules
LRSchedule.constant()
LRSchedule.step_decay(drop=0.5, epochs_drop=10)
LRSchedule.exponential_decay(k=0.01)
LRSchedule.cosine_annealing(T_max=100)
LRSchedule.warmup_cosine(warmup=10, T_max=100)

Bug fixes from the original library

| Bug | Original | Fixed |
|---|---|---|
| Epsilon | float16.eps ≈ 0.001, catastrophically large | Float64-optimal per method (~6e-6 for central) |
| L1 regularization | Summed raw θ, so negative weights reduced the penalty | Summed abs(θ) using a smooth complex-safe approximation |
| Closure-in-loop | Array gradient loop captured last index for all closures | Fixed with factory functions |
| L1/L2 complex-step safety | float() cast stripped imaginary part | Uses v*v and sqrt(v*v+eps) to preserve imaginary parts |
| No test split | Accuracy reported on training data | Train/test split in all examples |
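
The closure-in-loop problem is a general Python pitfall and can be reproduced in isolation. The sketch below uses hypothetical helper names rather than the library's own code: closures created in a loop look up the loop variable when they are called, so they all see its final value unless a factory function binds it.

```python
def perturb(vec, index, h):
    out = list(vec)
    out[index] += h
    return out

# Buggy: every lambda reads `i` at call time, after the loop has finished.
buggy = []
for i in range(3):
    buggy.append(lambda vec, h: perturb(vec, i, h))

# Fixed: the factory gives each closure its own `index`.
def make_perturber(index):
    return lambda vec, h: perturb(vec, index, h)

fixed = [make_perturber(j) for j in range(3)]

base = [0.0, 0.0, 0.0]
buggy_result = [fn(base, 1.0) for fn in buggy]
fixed_result = [fn(base, 1.0) for fn in fixed]
print(buggy_result)  # all three perturb the last component
print(fixed_result)  # each perturbs its own component
```

In a gradient loop this bug means every partial derivative is computed with respect to the last array element.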

Is LambdaML useful for Kaggling?

As a primary model for large nets — rarely. As a prototyping and ensembling tool — genuinely yes.

The core insight: LambdaML decouples your model definition from gradient computation. Anywhere you want a custom functional form but don't want to derive its gradients by hand, LambdaML fills that gap.

Concrete use cases for Kaggling: fitting domain equations with unknown parameters (physics-based pricing, pharmacokinetics, decay curves); directly optimising non-differentiable competition metrics (NDCG, F-beta, Cohen's kappa) as the loss function; building exotic meta-learners in stacking ensembles; small-data + custom hypothesis problems where sklearn doesn't have your model form.


Project structure

LambdaML/
├── lambdaml/                # Installable package (pip install lambdaml)
│   ├── __init__.py
│   ├── lambda_model.py      # LambdaClassifierModel, LambdaRegressorModel, Optimizer
│   └── lambda_utils.py      # NumericalDiff, GradientComputer, Regularization, LossFunctions, LRSchedule
├── pyproject.toml           # Package metadata
├── LambdaML_Showcase.ipynb  # Interactive notebook with all charts
├── examples/
│   ├── example_tanh_regression.py
│   ├── example_neural_network.py
│   ├── example_diff_methods.py
│   └── example_regressor.py
├── assets/
│   ├── fig_decision_boundaries.png
│   ├── fig_derivative_methods.png
│   ├── fig_diff_benchmark.png
│   ├── fig_gradient_accuracy.png
│   ├── fig_lr_schedules.png
│   ├── fig_neural_network.png
│   ├── fig_optimizers.png
│   ├── fig_regularization.png
│   └── fig_sine_regression.png
├── data/
│   └── circles.csv
└── legacy/                  # Original library files (pre-rewrite)

License

See LICENSE.
