Levenberg-Marquardt Global Fitter for Python
lm_global_fit: Levenberg-Marquardt Global Fitter
A Python library for performing non-linear least squares curve fitting on multiple datasets simultaneously (global fitting) using the Levenberg-Marquardt algorithm. It supports parameter fixing, linking parameters across datasets using shared IDs, composite models (sum of multiple model functions per dataset), calculation of common goodness-of-fit statistics, confidence intervals (with bootstrap fallback), and model extrapolation.
This library is a high-fidelity translation and enhancement of the original GlobalFit.JS library, which itself was adapted from the Fortran project Savuka. This Python version leverages NumPy and SciPy for numerical operations and offers parallel processing capabilities.
Version: 1.0.0
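The Levenberg-Marquardt algorithm named above damps the Gauss-Newton step so that a trial update is only accepted when it lowers the cost. The following is a minimal one-parameter sketch in plain Python, not the library's implementation: the function name `lm_fit_decay`, the exponential-decay model, the starting damping value, and the factor-of-ten damping schedule are all assumptions chosen for illustration.

```python
import math

def lm_fit_decay(x, y, sigma, k0, max_iter=50, tol=1e-12):
    """Fit y = exp(-k * x) for one parameter k with a damped Gauss-Newton (LM) loop."""
    def chi2(k):
        return sum(((yi - math.exp(-k * xi)) / si) ** 2
                   for xi, yi, si in zip(x, y, sigma))

    k, lam = k0, 1e-3          # lam is the damping factor (hypothetical start value)
    cost = chi2(k)
    for _ in range(max_iter):
        grad = 0.0             # J^T W r for the single parameter
        curv = 0.0             # J^T W J for the single parameter
        for xi, yi, si in zip(x, y, sigma):
            model = math.exp(-k * xi)
            deriv = -xi * model            # d(model)/dk
            grad += deriv * (yi - model) / si ** 2
            curv += deriv * deriv / si ** 2
        if curv == 0.0:
            break
        step = grad / (curv * (1.0 + lam))  # damped normal equation
        if abs(step) < 1e-12:
            break
        trial = chi2(k + step)
        if trial < cost:
            # Accept the step and trust the Gauss-Newton direction more.
            improvement = cost - trial
            k, cost = k + step, trial
            lam /= 10.0
            if improvement < tol:
                break
        else:
            # Reject the step and damp more strongly.
            lam *= 10.0
    return k, cost

# Noise-free synthetic data generated with k = 1.3 (made up for this sketch).
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]
ys = [math.exp(-1.3 * xi) for xi in xs]
k_fit, final_cost = lm_fit_decay(xs, ys, [0.05] * len(xs), k0=0.5)
print(f"fitted k = {k_fit:.6f}")
```

With several parameters the scalars `grad` and `curv` become a vector and a matrix, and the division becomes a linear solve.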
Key Features & Enhancements
- Python Implementation: Runs natively in Python environments.
- NumPy/SciPy Backend: Utilizes NumPy for efficient array operations and SciPy for robust SVD (`scipy.linalg.svd`) and statistical functions (`scipy.stats.t`). Uses `numpy.linalg.solve`/`numpy.linalg.pinv` for solving linear systems within LM.
- Parallel Independent Fitting: `lm_fit_independent` can fit datasets in parallel using Python's `multiprocessing` module, significantly speeding up analysis of large numbers of independent datasets.
- Parallel Bootstrap CI: The bootstrap confidence interval calculation (used as a fallback) can also leverage `multiprocessing` for faster execution.
- Vectorized Model Support: Encourages defining model functions that operate directly on NumPy arrays for improved performance, although it includes fallbacks for non-vectorized functions.
- Model Extrapolation: Added `model_x_range` option to calculate and plot fitted curves, component curves, and confidence intervals beyond the original data range.
- Global Fitting: Fit multiple datasets simultaneously with shared or independent parameters.
- Levenberg-Marquardt Algorithm: Robust and widely used algorithm for non-linear least squares.
- Parameter Fixing: Keep specific parameters constant during the fit using a `fixMap`.
- Parameter Linking: Force parameters across different datasets or models to share the same fitted value using a `linkMap` with shared string/number IDs.
- Composite Models: Define the model for a dataset as the sum of multiple individual model functions.
- Goodness-of-Fit Statistics: Calculates Degrees of Freedom, Reduced Chi-Squared, AIC, AICc, and BIC.
- Covariance & Errors: Returns the covariance matrix and standard errors for active parameters. Uses `abs()` for variance if numerically negative and logs a warning.
- Covariance Regularization: Applies a small regularization factor during covariance matrix calculation for improved numerical stability.
- Custom Logging & Progress: Provides options (`onLog`, `onProgress` callbacks) for users to handle verbose output and track progress.
- Parameter Constraints: Supports simple box constraints (min/max) and custom constraint functions.
- Robust Cost Functions: Optional use of L1 (Absolute Residual) or Lorentzian cost functions for outlier resistance.
- Confidence Intervals: Calculates confidence intervals for fitted model curves using the covariance matrix and Student's t-distribution (Delta method).
- Bootstrap Fallback for Confidence Intervals: Automatically falls back to multiprocessing-enabled bootstrapping when the covariance matrix yields negative variances or fails inversion.
- Simulation Functionality: Generate synthetic datasets using `simulate_from_params` with support for Gaussian and Poisson noise.
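The goodness-of-fit statistics in the list above all derive from the final chi-squared, the total number of data points, and the number of active parameters. The sketch below uses the common least-squares conventions (AIC = chi-squared + 2k, and so on); the library's exact formulas are not stated here, so treat both the formulas and the helper name `fit_statistics` as assumptions.

```python
import math

def fit_statistics(chi_squared, n_points, n_active):
    """Common goodness-of-fit statistics for a weighted least-squares fit."""
    dof = n_points - n_active
    reduced = chi_squared / dof if dof > 0 else None
    aic = chi_squared + 2 * n_active
    # The small-sample correction is undefined when n_points - n_active - 1 <= 0.
    denom = n_points - n_active - 1
    aicc = aic + 2 * n_active * (n_active + 1) / denom if denom > 0 else None
    bic = chi_squared + n_active * math.log(n_points)
    return {
        "degreesOfFreedom": dof,
        "reducedChiSquared": reduced,
        "aic": aic,
        "aicc": aicc,
        "bic": bic,
    }

# 14 points and 4 free parameters match the shapes used in the usage example;
# the chi-squared value of 12.0 is a placeholder, not a real fit result.
stats = fit_statistics(chi_squared=12.0, n_points=14, n_active=4)
for name, value in stats.items():
    print(name, value)
```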
Advantages
- Improved Parameter Estimation: Global fitting uses information from all datasets simultaneously, often leading to more precise and reliable parameter estimates, especially for shared parameters.
- Model Discrimination: Allows testing hypotheses where certain parameters are expected to be the same across different experimental conditions (datasets).
- Flexibility: Handles complex scenarios with multiple model components contributing to the overall signal for each dataset.
- Performance: Leverages NumPy for vectorized calculations and `multiprocessing` for parallel execution of independent fits and bootstrap CIs.
- Python Ecosystem: Integrates naturally with other scientific Python libraries (NumPy, SciPy, Matplotlib, etc.).
- Numerical Stability: Uses SVD-based inversion (via SciPy) for the covariance matrix with regularization.
- Fallback Mechanisms: Includes robust fallback mechanisms like bootstrapping for confidence intervals.
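The first advantage above, tighter estimates for shared parameters, can be illustrated without any iterative fitting. The sketch below uses a line through the origin, for which weighted least squares has a closed form, and compares per-dataset slopes with a slope shared by both datasets. The data values and the helper `weighted_slope` are made up for illustration and do not come from the library.

```python
import math

def weighted_slope(x, y, sigma):
    """Closed-form weighted least-squares slope for y = m * x, with its standard error."""
    sxy = sum(xi * yi / si ** 2 for xi, yi, si in zip(x, y, sigma))
    sxx = sum(xi * xi / si ** 2 for xi, si in zip(x, sigma))
    return sxy / sxx, math.sqrt(1.0 / sxx)

# Two synthetic datasets (x, y, sigma) assumed to share the same slope.
ds_a = ([1, 2, 3], [2.1, 3.9, 6.2], [0.2] * 3)
ds_b = ([1, 2, 3, 4], [1.8, 4.1, 5.9, 8.3], [0.2] * 4)

slope_a, err_a = weighted_slope(*ds_a)
slope_b, err_b = weighted_slope(*ds_b)

# A shared slope minimizes the sum of both chi-squared terms, which for this
# model is the same as fitting the concatenated points.
pooled = tuple(a + b for a, b in zip(ds_a, ds_b))
slope_g, err_g = weighted_slope(*pooled)

print(f"dataset A: {slope_a:.4f} +/- {err_a:.4f}")
print(f"dataset B: {slope_b:.4f} +/- {err_b:.4f}")
print(f"shared:    {slope_g:.4f} +/- {err_g:.4f}")
```

The shared estimate lies between the two individual slopes and has a smaller standard error than either, because every point contributes to it.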
Installation
Install the package directly from PyPI:
pip install lm_global_fit
Usage Example
import numpy as np
from lm_global_fit import (
    lm_fit_global,
    lm_fit,
    lm_fit_independent,
    simulate_from_params
)
# --- 1. Define Model Functions ---
# Must accept params=np.array([...]) and x=np.array([...]), return np.array([...]) if vectorized
# Or accept params=np.array([...]) and x=np.array([xValue]), return np.array([yValue]) if not vectorized
def gaussian_model(params: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Gaussian model: A * exp(-0.5 * ((x - xc) / w)^2) (Vectorized)"""
    if len(params) != 3: raise ValueError("Gaussian model expects 3 parameters: [amp, center, stddev]")
    amp, center, stddev = params
    if stddev == 0: return np.full_like(x, np.nan)
    exponent = -0.5 * ((x - center) / stddev)**2
    with np.errstate(over='ignore', under='ignore'): result = amp * np.exp(exponent)
    result[~np.isfinite(result)] = 0.0
    result[np.abs(exponent) > 700] = 0.0
    return result

def linear_model(params: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Linear model through origin: y = m*x (Vectorized)"""
    if len(params) != 1: raise ValueError("Linear model (y=m*x) expects 1 parameter: [slope]")
    slope = params[0]
    return slope * x
# --- 2. Prepare Data ---
data_in = {
    'x': [
        [1, 2, 3, 4, 5, 6],       # Dataset 0
        [0, 1, 2, 3, 4, 5, 6, 7]  # Dataset 1
    ],
    'y': [
        [5.1, 8.2, 9.9, 10.1, 8.5, 5.3],              # Noisy Gaussian
        [1.9, 4.1, 5.9, 8.1, 10.0, 12.1, 13.8, 16.2]  # Noisy Linear
    ],
    'ye': [
        [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],           # Errors for DS0
        [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]  # Errors for DS1
    ]
}
# --- 3. Define Model Structure ---
model_function_in = [
    [gaussian_model],  # Dataset 0: Gaussian only
    [linear_model]     # Dataset 1: Linear only
]
# --- 4. Initial Parameter Guesses ---
initial_parameters_in = [
    [[9.0, 3.5, 1.0]],  # DS0: [amp, center, stddev]
    [[2.0]]             # DS1: [slope]
]
# --- 5. Define Options ---
fit_options = {
    'maxIterations': 200,
    'logLevel': 'info',
    'confidenceInterval': 0.95,
    'calculateFittedModel': {'numPoints': 100},
    'bootstrapFallback': True,
    'numBootstrapSamples': 100
}
# --- 6. Run the Global Fit ---
result = lm_fit_global(data_in, model_function_in, initial_parameters_in, fit_options)
# --- 7. Process Results ---
if result.get('error'):
    print(f"Fit failed: {result['error']}")
else:
    print(f"Converged: {result['converged']} in {result['iterations']} iterations.")
    print(f"Final Chi^2: {result['chiSquared']:.5e}")
    print(f"Reduced Chi^2: {result['reducedChiSquared']:.5f}")
    print(f"Active Parameters: {result['p_active']}")
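The example above fits every parameter freely. Fixing and linking are controlled through `fixMap` and `linkMap` entries in the options, which mirror the nesting of the initial parameters. The sketch below builds such maps for a hypothetical composite layout (Gaussian plus linear term in each of two datasets) and checks their shape. The helpers `same_nesting` and `count_free` are illustrative only, and the counting rule (fixed values drop out, each link ID counts once) is an assumption about how the number of fitted values is reduced.

```python
# Hypothetical layout: each dataset is modeled as Gaussian + linear term.
initial_parameters = [
    [[9.0, 3.5, 1.0], [0.1]],  # dataset 0: [amp, center, stddev], [slope]
    [[4.0, 3.5, 1.0], [0.2]],  # dataset 1: [amp, center, stddev], [slope]
]
# True marks a parameter held constant; here the slope of dataset 1 is fixed.
fix_map = [
    [[False, False, False], [False]],
    [[False, False, False], [True]],
]
# Equal IDs force one fitted value; None leaves the parameter unlinked.
link_map = [
    [[None, "center", "width"], [None]],
    [[None, "center", "width"], [None]],
]

def same_nesting(a, b):
    """Check that two dataset/group/parameter structures have identical shape."""
    return len(a) == len(b) and all(
        len(ga) == len(gb) and all(len(pa) == len(pb) for pa, pb in zip(ga, gb))
        for ga, gb in zip(a, b)
    )

def count_free(fix_map, link_map):
    """Count independently fitted values under the assumed counting rule."""
    free, seen_ids = 0, set()
    for ds_fix, ds_link in zip(fix_map, link_map):
        for group_fix, group_link in zip(ds_fix, ds_link):
            for fixed, link_id in zip(group_fix, group_link):
                if fixed:
                    continue
                if link_id is None:
                    free += 1
                elif link_id not in seen_ids:
                    seen_ids.add(link_id)
                    free += 1
    return free

fit_options = {"fixMap": fix_map, "linkMap": link_map}
n_free = count_free(fix_map, link_map)
print(f"{n_free} independently fitted values")
```

Of the eight parameters in this layout, five remain independent: two amplitudes, one shared center, one shared width, and the slope of dataset 0.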
API Reference
lm_fit_global(data, model_function, initial_parameters, options)
Performs the global fit. (Note: Current implementation is synchronous)
Parameters:
- `data` (`Dict[str, List[Sequence[float]]]`): Dictionary containing the experimental data. Requires keys:
  - `'x'`: List where each element is a sequence (list, tuple, np.array) of independent variable values for a dataset.
  - `'y'`: List of sequences of dependent variable values.
  - `'ye'`: List of sequences of error/uncertainty values (std devs). Must not contain zeros or negative values.
- `model_function` (`List[List[Callable[[np.ndarray, np.ndarray], np.ndarray]]]`): List of lists of model functions. `model_function[dsIdx]` is a list of functions for dataset `dsIdx`.
  - Each individual function `func = model_function[dsIdx][paramGroupIdx]` represents a component of the total model for that dataset.
  - It will be called as `func(params_array, x_array)`, where `params_array` is the corresponding NumPy parameter array and `x_array` is a NumPy array of x-values.
  - Each function must return a NumPy array of calculated y-values corresponding to the input `x_array`. Vectorized functions (operating on the whole `x_array` at once) are preferred for performance. Non-vectorized functions (expecting a single x-value in `x_array`) will work but trigger a slower loop-based evaluation for curve generation.
  - The results of all functions in `model_function[dsIdx]` are summed internally to get the final model value for dataset `dsIdx`.
- `initial_parameters` (`List[List[List[float]]]`): Nested list of initial parameter guesses. Structure must align with `model_function`. `initial_parameters[dsIdx][paramGroupIdx]` is a list of numbers.
- `options` (`Dict[str, Any]`, optional): Configuration object for the fit. See `DEFAULT_OPTIONS` for keys and defaults. Key options include:
  - `fixMap` (`List[List[List[bool]]]`, optional): Defines fixed parameters (True = fixed). Structure matches `initial_parameters`.
  - `linkMap` (`List[List[List[Optional[Union[str, int]]]]]`, optional): Defines parameter linking using shared IDs. Structure matches `initial_parameters`.
  - `constraints` (`List[List[List[Optional[Dict[str, float]]]]]`, optional): Box constraints (`{'min': val, 'max': val}`). Structure matches `initial_parameters`.
  - `constraintFunction` (`Callable[[ParametersNpType], ParametersNpType]`, optional): Custom function applied after box constraints. Takes and returns the nested list of NumPy arrays structure.
  - `confidenceInterval` (`float`, optional): Level for CIs (e.g., 0.95).
  - `bootstrapFallback` (`bool`, default: `True`): Use bootstrap if standard CI fails.
  - `numBootstrapSamples` (`int`, default: `200`): Number of bootstrap samples.
  - `calculateFittedModel` (`bool | Dict`, default: `False`): Calculate smooth fitted curves. Use `{'numPoints': N}` to specify points.
  - `calculateComponentModels` (`bool`, default: `False`): Calculate smooth component curves.
  - `model_x_range` (`List[Optional[Tuple[float, float]]]`, optional): List defining the calculation range `(min, max)` for each dataset's curves/CIs. `None` uses data range.
  - `num_workers` (`int`, optional): Number of parallel workers for bootstrap CI. Defaults to `cpu_count()`. Set to 1 for sequential.
  - `onLog`, `onProgress`, `logLevel`, `maxIterations`, `errorTolerance`, etc.
Returns:
- `Dict[str, Any]`: A dictionary containing the fitting results (`ResultType`). Key fields:
  - `p_active` (`List[float]`): Final active parameter values.
  - `p_reconstructed` (`List[List[List[float]]]`): Full parameter structure with final values.
  - `finalParamErrors` (`List[List[List[Optional[float]]]]`): Standard errors for all parameters (0 for fixed, propagated for slaves, `None` if NaN).
  - `chiSquared` (`Optional[float]`): Final cost function value (`None` if NaN/Inf).
  - `covarianceMatrix` (`Optional[List[List[float]]]`): Covariance matrix for active parameters (`None` if failed).
  - `parameterErrors` (`List[Optional[float]]`): Standard errors for active parameters (`None` if NaN).
  - `iterations` (`int`): Iterations performed.
  - `converged` (`bool`): Convergence status.
  - `activeParamLabels` (`List[str]`): Labels for active parameters.
  - `error` (`Optional[str]`): Error message if fit failed.
  - `totalPoints`, `degreesOfFreedom`, `reducedChiSquared`, `aic`, `aicc`, `bic`: Goodness-of-fit stats (`None` if calculation failed).
  - `residualsPerSeries` (`Optional[List[np.ndarray]]`): List of residual arrays.
  - `fittedModelCurves` (`Optional[List[Dict[str, np.ndarray]]]`): List of curve dictionaries (`{'x': np.array, 'y': np.array}`).
  - `ci_lower`, `ci_upper` (`Optional[List[Dict[str, np.ndarray]]]`): Lists of CI bound dictionaries.
  - `fittedModelComponentCurves` (`Optional[List[List[Dict[str, np.ndarray]]]]`): Nested list of component curve dictionaries.
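Because several result fields may be `None`, code that reports results should guard against missing values. The following sketch pairs `activeParamLabels`, `p_active`, and `parameterErrors` into readable lines. The helper `summarize` is hypothetical, and the dictionary passed to it is a hand-written placeholder with invented labels and numbers, not output from a real fit.

```python
def summarize(result):
    """Format active parameters and the reduced chi-squared from a result dictionary."""
    if result.get("error"):
        return [f"Fit failed: {result['error']}"]
    lines = []
    for label, value, err in zip(
        result["activeParamLabels"], result["p_active"], result["parameterErrors"]
    ):
        # parameterErrors entries are None when the error could not be computed.
        err_text = "n/a" if err is None else f"{err:.3g}"
        lines.append(f"{label} = {value:.4g} +/- {err_text}")
    reduced = result.get("reducedChiSquared")
    lines.append("reduced chi^2 = " + ("n/a" if reduced is None else f"{reduced:.3f}"))
    return lines

# Placeholder result: labels and values are invented for this sketch.
mock_result = {
    "error": None,
    "activeParamLabels": ["p0", "p1"],
    "p_active": [9.9, 3.5],
    "parameterErrors": [0.3, None],
    "reducedChiSquared": 1.2,
}
for line in summarize(mock_result):
    print(line)
```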
lm_fit(data, model_function, initial_parameters, options)
Convenience wrapper for fitting a single dataset.
- Accepts `data` as `{x: Sequence[float], y: Sequence[float], ye: Sequence[float]}`.
- Accepts `model_function` as `Callable | List[Callable]`.
- Accepts `initial_parameters` as `Sequence[float] | List[Sequence[float]]`.
- Accepts `options` like `lm_fit_global`, but maps/constraints should be in single-dataset format (e.g., `fixMap = [[False, True], [False]]`). `model_x_range` should be a single tuple `(min, max)` or `None`.
- Returns the same result dictionary structure as `lm_fit_global`.
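The relationship between the single-dataset inputs and the nested multi-dataset format can be sketched as a small conversion. The helper `to_global_format` below is hypothetical and only illustrates how the two formats correspond; it is not the wrapper's actual code.

```python
def to_global_format(data, model_function, initial_parameters):
    """Wrap single-dataset inputs into the nested multi-dataset structure."""
    global_data = {key: [list(data[key])] for key in ("x", "y", "ye")}
    if callable(model_function):
        # One model function with one flat parameter sequence.
        models = [[model_function]]
        params = [[list(initial_parameters)]]
    else:
        # Composite model: a list of functions with one parameter sequence each.
        models = [list(model_function)]
        params = [[list(group) for group in initial_parameters]]
    return global_data, models, params

def line(params, x):
    """Hypothetical stand-in model: y = slope * x."""
    return [params[0] * xi for xi in x]

single = {"x": [1, 2, 3], "y": [2.0, 4.1, 5.9], "ye": [0.2, 0.2, 0.2]}
g_data, g_models, g_params = to_global_format(single, line, [2.0])
print(g_data["x"], g_params)
```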
lm_fit_independent(data, model_function, initial_parameters, options, num_workers)
Fits multiple datasets independently, potentially in parallel.
- Accepts `data`, `model_function`, `initial_parameters` in the same multi-dataset format as `lm_fit_global`.
- Accepts most `options` like `lm_fit_global`. `linkMap` is ignored. `fixMap`, `constraints`, `model_x_range` apply per-dataset if provided in the full nested structure.
- `num_workers` (`int`, optional): Overrides `options['num_workers']` for parallel execution.
- Returns a list of result dictionaries, one for each dataset fit.
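The general pattern behind fitting independent datasets in parallel is a worker function mapped over the datasets by a process pool. The sketch below shows that pattern with the standard library only. The names `fit_one` and `fit_independent` are hypothetical, and the worker is a closed-form slope fit standing in for a real per-dataset fit; how the library schedules its workers internally is not described here.

```python
from multiprocessing import Pool

def fit_one(dataset):
    """Stand-in worker: least-squares slope of y = m * x for one dataset."""
    x, y = dataset
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def fit_independent(datasets, num_workers=1):
    """Fit each dataset separately, sequentially or with a process pool."""
    if num_workers <= 1:
        return [fit_one(ds) for ds in datasets]
    # The worker must be defined at module top level so it can be pickled.
    with Pool(processes=num_workers) as pool:
        return pool.map(fit_one, datasets)

datasets = [([1, 2, 3], [2, 4, 6]), ([1, 2], [3, 6])]
slopes = fit_independent(datasets, num_workers=1)
print(slopes)
```

When calling with `num_workers` greater than 1 from a script, place the call under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.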
simulate_from_params(data_x, model_functions, parameters, options)
Generates simulated data.
- Accepts `data_x` (list of x-sequences), `model_functions`, `parameters` (list-of-list-of-list).
- `options` can include `noiseType` ('gaussian', 'poisson', 'none' or list) and `noiseStdDev` (number or list).
- Returns `Dict[str, List[np.ndarray]]` containing keys `'x'` and `'y'` with lists of NumPy arrays.
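The three noise types can be pictured with a short NumPy sketch. The helper `add_noise` is hypothetical and shows one conventional way to apply each noise model to a clean curve; in particular, drawing Poisson counts with the clean value as the mean is an assumption, since the library's exact treatment is not stated here.

```python
import numpy as np

def add_noise(y, noise_type="gaussian", noise_std=1.0, rng=None):
    """Return a noisy copy of a clean model curve."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float)
    if noise_type == "none":
        return y.copy()
    if noise_type == "gaussian":
        # Additive zero-mean noise with a fixed standard deviation.
        return y + rng.normal(0.0, noise_std, size=y.shape)
    if noise_type == "poisson":
        # Counting noise: the clean value is the Poisson mean (clipped at zero).
        return rng.poisson(np.clip(y, 0.0, None)).astype(float)
    raise ValueError(f"unknown noise type: {noise_type!r}")

rng = np.random.default_rng(0)
clean = np.array([5.0, 10.0, 20.0])
noisy_gauss = add_noise(clean, "gaussian", noise_std=0.5, rng=rng)
noisy_poisson = add_noise(clean, "poisson", rng=rng)
print(noisy_gauss, noisy_poisson)
```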
Notes & Considerations
- Dependencies: Requires Python 3, NumPy, and SciPy. Matplotlib is needed for the example plots.
- Parallelism: `lm_fit_independent` and bootstrap CI use `multiprocessing`. Performance gains depend on the number of datasets/samples, the complexity of model evaluations, and system overhead. Ensure model functions and custom constraints are picklable if using parallelism.
- Vectorization: Define model functions to accept and return NumPy arrays for best performance, especially during curve/CI generation.
- Error Estimation: Parameter errors are based on the covariance matrix derived from the Jacobian (first derivatives). Warnings are issued for negative variances (using `abs()`) or non-finite results.
- Covariance Matrix: Regularization (`covarianceLambda`) is applied for stability.
- Robust Cost Functions: Using `robustCostFunction: 1` or `2` changes the meaning of `chiSquared`. It's no longer strictly the sum of squared normalized residuals. The parameter values obtained will be Maximum Likelihood Estimates under the assumed noise distribution (double-exponential or Lorentzian, respectively), but interpreting the absolute value of the final "chiSquared" for goodness-of-fit requires care. Reduced Chi-Squared is less meaningful in these cases. AIC/BIC based on this modified chi-squared are still useful for comparing models fit with the same robust cost function.
- Model Function Signature: Ensure model functions adhere to the expected signature `func(params_array, x_array)` returning a NumPy array.
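To see why the robust cost functions mentioned above resist outliers, compare how a single large normalized residual contributes under each cost. The sketch below uses textbook forms: sum of squares, sum of absolute values, and the Lorentzian sum of `log(1 + r^2 / 2)`. The library's exact expressions and scaling are not given here, so these formulas and the helper `cost` are assumptions.

```python
import math

def cost(residuals, kind="squared"):
    """Total cost of normalized residuals under three common cost functions."""
    if kind == "squared":
        return sum(r * r for r in residuals)
    if kind == "l1":
        return sum(abs(r) for r in residuals)
    if kind == "lorentzian":
        return sum(math.log(1.0 + 0.5 * r * r) for r in residuals)
    raise ValueError(f"unknown cost kind: {kind!r}")

# Two ordinary residuals and one outlier, in units of sigma (made-up values).
residuals = [0.5, -1.0, 10.0]
squared = cost(residuals, "squared")
l1 = cost(residuals, "l1")
lorentzian = cost(residuals, "lorentzian")
print(squared, l1, lorentzian)
```

The outlier dominates the squared cost, contributes linearly to the L1 cost, and only logarithmically to the Lorentzian cost, which is why the three totals are not comparable as chi-squared values.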
MIT License
Copyright (c) 2025 R. Paul Nobrega (Original JS), [Your Name/Year for Python Port]
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.