A comprehensive Python library for Near-Infrared Spectroscopy (NIRS) data analysis with ML/DL pipelines.

NIRS4ALL

A comprehensive Python library for Near-Infrared Spectroscopy data analysis

Python 3.11+ · License: CeCILL-2.1 · Code style: ruff


Overview

NIRS4ALL bridges the gap between spectroscopic data and machine learning by providing a unified framework for data loading, preprocessing, model training, and evaluation. It is built for researchers and practitioners who work with Near-Infrared Spectroscopy data.

Key Features

  • NIRS-Specific Preprocessing — SNV, MSC, Savitzky-Golay, derivatives, and 25+ spectral transforms
  • Multi-Backend ML — Seamless integration with scikit-learn, TensorFlow, PyTorch, and JAX
  • Declarative Pipelines — Define complex workflows with simple, readable syntax
  • Hyperparameter Tuning — Built-in Optuna integration for automated optimization
  • Rich Visualizations — Performance heatmaps, candlestick plots, SHAP explanations
  • Model Deployment — Export trained pipelines as portable .n4a bundles
  • sklearn Compatible — NIRSPipeline wrapper for SHAP, cross-validation, and more

Figures: Performance Heatmap, Performance Distribution, Regression Scatter Plot
Advanced visualization capabilities for model performance analysis

Installation

Basic Installation

pip install nirs4all

This installs the core library with scikit-learn support. Deep learning frameworks are optional.

With ML Backends

# TensorFlow
pip install nirs4all[tensorflow]

# PyTorch
pip install nirs4all[torch]

# JAX
pip install nirs4all[jax]

# All frameworks
pip install nirs4all[all]

# All frameworks with GPU support
pip install nirs4all[all-gpu]
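
Because the deep learning frameworks are optional, it can be useful to check which ones are importable in the current environment before choosing an extra. The sketch below uses only the standard library; the `OPTIONAL_BACKENDS` mapping and the `available_backends` helper are hypothetical names for illustration and are not part of nirs4all.

```python
from importlib.util import find_spec

# Hypothetical mapping: pip extra name -> top-level module it is expected to provide.
OPTIONAL_BACKENDS = {
    "tensorflow": "tensorflow",
    "torch": "torch",
    "jax": "jax",
}


def available_backends():
    """Return {extra_name: True/False} depending on whether the module can be found."""
    return {
        extra: find_spec(module) is not None
        for extra, module in OPTIONAL_BACKENDS.items()
    }


if __name__ == "__main__":
    for extra, found in available_backends().items():
        status = "installed" if found else "missing"
        print(f"{extra}: {status}")
```

This only reports whether a module can be located; it does not import the framework or confirm GPU support.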

Development Installation

git clone https://github.com/GBeurier/nirs4all.git
cd nirs4all
pip install -e ".[dev]"

Verify Installation

nirs4all --test-install      # Check dependencies
nirs4all --test-integration  # Run integration tests
nirs4all --version           # Check version
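
The version check above has a rough standard-library counterpart that can be called from inside a script or notebook. This is a minimal sketch, assuming the distribution is published under the name `nirs4all` (as in the pip command); `installed_version` is a hypothetical helper, not a nirs4all function.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="nirs4all"):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"nirs4all version: {found}" if found else "nirs4all is not installed")
```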

Quick Start

Simple API (Recommended)

import nirs4all
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import ShuffleSplit
from sklearn.cross_decomposition import PLSRegression

# Define your pipeline
pipeline = [
    MinMaxScaler(),
    {"y_processing": MinMaxScaler()},
    ShuffleSplit(n_splits=3, test_size=0.25),
    {"model": PLSRegression(n_components=10)}
]

# Train and evaluate
result = nirs4all.run(
    pipeline=pipeline,
    dataset="path/to/your/data",
    name="MyPipeline",
    verbose=1
)

# Access results
print(f"Best RMSE: {result.best_rmse:.4f}")
print(f"Best R²: {result.best_r2:.4f}")

# Export for deployment
result.export("exports/best_model.n4a")

Session for Multiple Runs

import nirs4all
from sklearn.preprocessing import MinMaxScaler
from sklearn.cross_decomposition import PLSRegression
from sklearn.ensemble import RandomForestRegressor

with nirs4all.session(verbose=1, save_artifacts=True) as s:
    # Compare models with shared configuration
    pls_result = nirs4all.run(
        pipeline=[MinMaxScaler(), PLSRegression(n_components=10)],
        dataset="data/wheat.csv",
        name="PLS",
        session=s
    )

    rf_result = nirs4all.run(
        pipeline=[MinMaxScaler(), RandomForestRegressor(n_estimators=100)],
        dataset="data/wheat.csv",
        name="RandomForest",
        session=s
    )

    print(f"PLS: {pls_result.best_rmse:.4f} | RF: {rf_result.best_rmse:.4f}")

sklearn Integration with SHAP

import nirs4all
from nirs4all.sklearn import NIRSPipeline
import shap

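# Train with nirs4all
# (`pipeline` and `dataset` are placeholders: define them as in the Quick Start)
result = nirs4all.run(pipeline, dataset)

# Wrap for sklearn compatibility
pipe = NIRSPipeline.from_result(result)

# Use with SHAP
# (`X_background` and `X_test` are placeholders for your own 2D arrays of spectra)
explainer = shap.Explainer(pipe.predict, X_background)
shap_values = explainer(X_test)
shap.summary_plot(shap_values)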

Pipeline Syntax

NIRS4ALL uses a declarative syntax for defining pipelines:

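from sklearn.cross_decomposition import PLSRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import ShuffleSplit
from sklearn.preprocessing import MinMaxScaler

from nirs4all.operators.transforms import SNV, SavitzkyGolay, FirstDerivative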

pipeline = [
    # Preprocessing
    MinMaxScaler(),
    SNV(),
    SavitzkyGolay(window_length=11, polyorder=2),

    # Target scaling
    {"y_processing": MinMaxScaler()},

    # Cross-validation
    ShuffleSplit(n_splits=5, test_size=0.2),

    # Models to compare
    {"model": PLSRegression(n_components=10)},
    {"model": RandomForestRegressor(n_estimators=100)},

    # Neural network with training parameters
    # (`nicon` is not defined in this snippet; import or define the model first)
    {
        "model": nicon,
        "name": "NICON-CNN",
        "train_params": {"epochs": 100, "patience": 20}
    }
]

Advanced Features

# Each dict below is a single pipeline step; place it inside the pipeline list.

# Feature augmentation - generate preprocessing combinations
{
    "feature_augmentation": {
        "_or_": [SNV, FirstDerivative, SavitzkyGolay],
        "size": [1, (1, 2)],
        "count": 5
    }
}

# Hyperparameter optimization
{
    "model": PLSRegression(),
    "finetune_params": {
        "n_trials": 50,
        "model_params": {"n_components": ("int", 1, 30)}
    }
}

# Branching for parallel preprocessing paths (MSC must be imported alongside SNV)
{
    "branch": [
        [SNV(), PLSRegression(n_components=10)],
        [MSC(), RandomForestRegressor()]
    ]
}

# Merge branch outputs (stacking)
{"merge": "predictions"}

Available Transforms

NIRS-Specific

| Transform | Description |
|---|---|
| SNV / StandardNormalVariate | Standard Normal Variate normalization |
| RNV / RobustNormalVariate | Robust Normal Variate (outlier-resistant) |
| MSC / MultiplicativeScatterCorrection | Multiplicative Scatter Correction |
| SavitzkyGolay | Smoothing and derivative computation |
| FirstDerivative / SecondDerivative | Spectral derivatives |
| Detrend | Remove linear/polynomial trends |
| Gaussian | Gaussian smoothing |
| Haar | Haar wavelet decomposition |
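
For readers new to scatter correction, the two most common transforms in the table can be sketched in a few lines of NumPy. This illustrates the textbook definitions only and is not nirs4all's implementation; the function names `snv` and `msc` are local to this example, and the choice of the sample standard deviation (`ddof=1`) is an assumption, since some implementations use the population form.

```python
import numpy as np


def snv(spectra):
    """Standard Normal Variate: center and scale each spectrum (row) independently."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def msc(spectra, reference=None):
    """Multiplicative Scatter Correction against a reference (default: mean spectrum)."""
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        # Fit row ~ slope * ref + intercept, then undo the offset and the gain.
        slope, intercept = np.polyfit(ref, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected


# Toy data: one base spectrum seen with different gains and offsets (made-up values).
base = np.sin(np.linspace(0.0, 3.0, 40)) + 2.0
spectra = np.array([1.0 * base + 0.0, 1.5 * base + 0.3, 0.7 * base - 0.2])

snv_out = snv(spectra)
msc_out = msc(spectra)
print("max spread after SNV:", float(np.ptp(snv_out, axis=0).max()))
print("max spread after MSC:", float(np.ptp(msc_out, axis=0).max()))
```

On this toy data both transforms collapse the three spectra onto a single curve, because the rows differ only by a positive gain and an offset.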

Signal Processing

| Transform | Description |
|---|---|
| Baseline | Baseline correction |
| ReflectanceToAbsorbance | Convert R to A using Beer-Lambert |
| Resampler | Wavelength interpolation |
| CARS / MCUVE | Feature selection methods |
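
The reflectance-to-absorbance conversion is usually written as A = log10(1/R). The sketch below shows that formula in NumPy; it is an assumption that ReflectanceToAbsorbance follows this convention, and the `eps` clipping guard is a hypothetical detail added here to avoid taking the logarithm of zero.

```python
import numpy as np


def reflectance_to_absorbance(reflectance, eps=1e-12):
    """Apparent absorbance A = log10(1 / R), with R clipped away from zero."""
    r = np.clip(np.asarray(reflectance, dtype=float), eps, None)
    return -np.log10(r)


absorbance = reflectance_to_absorbance([1.0, 0.1, 0.01])
print(absorbance)
```

A reflectance of 1.0 (everything reflected) maps to an absorbance of 0, and each tenfold drop in reflectance adds one absorbance unit.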

Splitting Methods

| Splitter | Description |
|---|---|
| KennardStone | Kennard-Stone algorithm |
| SPXY | Sample set Partitioning based on X and Y |
| KMeansSplit | K-means clustering based split |
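
As background on the first splitter, the classic Kennard-Stone procedure picks the two most distant samples and then repeatedly adds the sample that lies farthest from everything already selected. The sketch below is a plain NumPy illustration of that idea using Euclidean distance; it is not nirs4all's KennardStone class, and the `kennard_stone` function name and signature are hypothetical.

```python
import numpy as np


def kennard_stone(X, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule."""
    X = np.asarray(X, dtype=float)
    n_samples = X.shape[0]
    if not 2 <= n_select <= n_samples:
        raise ValueError("n_select must be between 2 and the number of samples")

    # Full pairwise Euclidean distance matrix (fine for small datasets).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Seed with the two samples that are farthest apart.
    first, second = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(first), int(second)]

    # For every sample, track its distance to the nearest selected sample.
    min_dist = np.minimum(dist[first], dist[second])
    min_dist[selected] = -np.inf  # never pick the same sample twice

    while len(selected) < n_select:
        nxt = int(np.argmax(min_dist))  # farthest from the selected set
        selected.append(nxt)
        min_dist = np.minimum(min_dist, dist[nxt])
        min_dist[nxt] = -np.inf
    return selected


# Four one-dimensional samples (made-up values) to show the selection order.
points = np.array([[0.0], [1.0], [2.0], [10.0]])
chosen = kennard_stone(points, n_select=3)
print(chosen)  # [0, 3, 2]
```

The selected indices are typically used as the calibration set, with the remaining samples held out for validation.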

See Preprocessing Guide for complete reference.


Examples

The examples/ directory is organized by topic:

User Examples (examples/user/)

| Category | Examples |
|---|---|
| Getting Started | Hello world, basic regression, classification, visualization |
| Data Handling | Multi-source, data loading, metadata |
| Preprocessing | SNV, MSC, derivatives, custom transforms |
| Models | Multi-model, hyperparameter tuning, stacking, PLS variants |
| Cross-Validation | KFold, group splits, nested CV |
| Deployment | Export, prediction, workspace management |
| Explainability | SHAP basics, sklearn integration, feature selection |

Reference Examples (examples/reference/)

Complete syntax reference and advanced pipeline patterns.

Run examples:

cd examples
./run.sh              # Run all
./run.sh -i 1         # Run by index
./run.sh -n "U01*"    # Run by pattern

Documentation

| Section | Description |
|---|---|
| User Guide | Preprocessing, API migration, augmentation |
| API Reference | Module-level API, sklearn integration, data handling |
| Specifications | Pipeline syntax, config format, metrics |
| Explanations | SHAP, resampling, SNV theory |

Full documentation: nirs4all.readthedocs.io


Research Applications

NIRS4ALL has been used in published research:

Houngbo, M. E., et al. (2024). Convolutional neural network allows amylose content prediction in yam (Dioscorea alata L.) flour using near infrared spectroscopy. Journal of the Science of Food and Agriculture, 104(8), 4915-4921. John Wiley & Sons, Ltd.


Citation

If you use NIRS4ALL in your research, please cite:

@software{beurier2025nirs4all,
  author = {Gregory Beurier and Denis Cornet and Lauriane Rouan},
  title = {NIRS4ALL: Open spectroscopy for everyone},
  url = {https://github.com/GBeurier/nirs4all},
  version = {0.6.2},
  year = {2026},
}

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.


License

This project is licensed under the CeCILL-2.1 License — a French free software license compatible with GPL.


Acknowledgments

  • CIRAD for supporting this research
  • The open-source scientific Python community

Made for the spectroscopy community
