NIRS Analyses made easy.

Project description

NIRS4ALL is a comprehensive machine learning library specifically designed for Near-Infrared Spectroscopy (NIRS) data analysis. It bridges the gap between spectroscopic data and machine learning by providing a unified framework for data loading, preprocessing, model training, and evaluation.

What is Near-Infrared Spectroscopy (NIRS)?

Near-Infrared Spectroscopy (NIRS) is a rapid and non-destructive analytical technique that uses the near-infrared region of the electromagnetic spectrum (approximately 700-2500 nm). NIRS measures how near-infrared light interacts with the molecular bonds in materials, particularly C-H, N-H, and O-H bonds, providing information about the chemical composition of samples.
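
Some instruments report the spectral axis in wavenumbers (cm⁻¹) rather than nanometres. The two are related by wavenumber = 10⁷ / wavelength(nm). The sketch below converts the approximate NIR range quoted above; the function names are illustrative and are not part of NIRS4ALL.

```python
def nm_to_wavenumber(wavelength_nm: float) -> float:
    """Convert a wavelength in nanometres to a wavenumber in cm^-1."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return 1e7 / wavelength_nm


def wavenumber_to_nm(wavenumber_cm: float) -> float:
    """Convert a wavenumber in cm^-1 back to a wavelength in nanometres."""
    if wavenumber_cm <= 0:
        raise ValueError("wavenumber must be positive")
    return 1e7 / wavenumber_cm


if __name__ == "__main__":
    # Limits of the near-infrared region mentioned in the text
    for nm in (700, 2500):
        print(f"{nm} nm = {nm_to_wavenumber(nm):.0f} cm^-1")
```

Running it shows that 700 nm corresponds to roughly 14286 cm⁻¹ and 2500 nm to 4000 cm⁻¹.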

Key advantages of NIRS:

Non-destructive analysis
Minimal sample preparation
Rapid results (seconds to minutes)
Potential for on-line/in-line implementation
Simultaneous measurement of multiple parameters

Common applications:

Agriculture: soil analysis, crop quality assessment
Food industry: quality control, authenticity verification
Pharmaceutical: raw material verification, process monitoring
Medical: tissue monitoring, brain imaging
Environmental: pollutant detection, water quality monitoring

Notes:

NIRS4ALL is in active development. Until version 1.0, interfaces and documentation may change without notice.

Features

NIRS4ALL offers a wide range of functionalities:

Spectrum Preprocessing:
- Baseline correction
- Standard normal variate (SNV)
- Robust normal variate
- Savitzky-Golay filtering
- Normalization
- Detrending
- Multiplicative scatter correction
- Derivative computation
- Gaussian filtering
- Haar wavelet transformation
- And more...
Data Splitting Methods:
- Kennard Stone
- SPXY
- Random sampling
- Stratified sampling
- K-means
- And more...
Model Integration:
- Scikit-learn models
- TensorFlow/Keras models
- Pre-configured neural networks dedicated to NIRS: nicon and decon (see the publication below)
- PyTorch models (via extensions)
- JAX models (via extensions)
Model Fine-tuning:
- Hyperparameter optimization with Optuna
- Grid search and random search
- Cross-validation strategies
Visualization:
- Preprocessing effect visualization
- Model performance visualization
- Feature importance analysis
- Classification metrics
- Residual analysis
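
To make two of the preprocessing steps listed above concrete, here is a minimal NumPy sketch of standard normal variate (SNV) and multiplicative scatter correction (MSC). It illustrates the textbook definitions only; it is not the NIRS4ALL implementation, and the function names and the choice of the sample standard deviation (`ddof=1`) are assumptions.

```python
import numpy as np


def snv(spectra: np.ndarray) -> np.ndarray:
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def msc(spectra: np.ndarray, reference=None) -> np.ndarray:
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum x is fitted as x ~ intercept + slope * reference and
    corrected to (x - intercept) / slope. The mean spectrum is used as the
    reference when none is given.
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, x in enumerate(spectra):
        slope, intercept = np.polyfit(ref, x, deg=1)  # highest degree first
        corrected[i] = (x - intercept) / slope
    return corrected


if __name__ == "__main__":
    # Synthetic spectra: one shape with different offsets and gains
    base = np.sin(np.linspace(0, 3, 50))
    X = np.array([0.1 + 1.0 * base, 0.5 + 2.0 * base, -0.2 + 0.5 * base])
    print("SNV rows identical:", np.allclose(snv(X)[0], snv(X)[1]))
    print("MSC rows identical:", np.allclose(msc(X)[0], msc(X)[1]))
```

On spectra that differ only by an additive offset and a positive gain, both corrections map every row onto the same curve, which is the scatter-removal effect these transforms aim for.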

Advanced visualization capabilities for model performance analysis

Installation

Basic Installation

pip install nirs4all

The basic installation includes TensorFlow with CPU support by default.

With Additional ML Frameworks

# With PyTorch support (quotes keep shells such as zsh from expanding the brackets)
pip install "nirs4all[torch]"

# With Keras support
pip install "nirs4all[keras]"

# With JAX support
pip install "nirs4all[jax]"

# With all ML frameworks
pip install "nirs4all[all]"

Development Installation

For developers who want to contribute:

git clone https://github.com/gbeurier/nirs4all.git
cd nirs4all
pip install -e ".[dev]"

Installation Testing

After installing nirs4all, you can verify your installation and environment using the built-in CLI test commands:

# Basic installation test: checks required dependencies and versions
nirs4all --test-install

# Integration test: runs sklearn, tensorflow, and optuna pipelines on sample data
nirs4all --test-integration

# Check version
nirs4all --version

Each command will print a summary of the test results and alert you to any missing dependencies or issues with your environment.
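
The installed version can also be checked from Python with the standard library. This is a small sketch using `importlib.metadata`; the helper name `installed_version` is illustrative and not part of NIRS4ALL.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("nirs4all")
    print(f"nirs4all version: {version or 'not installed'}")
```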

Quick Start

Basic Pipeline Example

from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import ShuffleSplit
from sklearn.cross_decomposition import PLSRegression
from sklearn.ensemble import RandomForestRegressor

from nirs4all.data import DatasetConfigs
from nirs4all.pipeline import PipelineConfigs, PipelineRunner
from nirs4all.operators.transforms import (
    StandardNormalVariate, SavitzkyGolay, MultiplicativeScatterCorrection
)

# Define your processing pipeline
pipeline = [
    MinMaxScaler(),                    # Scale features
    StandardNormalVariate(),           # Apply SNV transformation
    ShuffleSplit(n_splits=3),         # 3 random train/test splits
    {"y_processing": MinMaxScaler()}, # Scale target values
    {"model": PLSRegression(n_components=10)},
    {"model": RandomForestRegressor(n_estimators=100)},
]

# Create configurations
pipeline_config = PipelineConfigs(pipeline, name="MyPipeline")
dataset_config = DatasetConfigs("path/to/your/data")

# Run the pipeline
runner = PipelineRunner(save_files=False, verbose=1)
predictions, predictions_per_datasets = runner.run(pipeline_config, dataset_config)

# Analyze results
top_models = predictions.top(n=5, rank_metric='rmse')
print("Top 5 models by RMSE:")
for i, model in enumerate(top_models):
    print(f"{i+1}. {model['model_name']}: RMSE = {model['rmse']:.4f}")
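
The example above ranks models by RMSE (root mean squared error), where lower is better. For reference, a minimal sketch of the metric itself follows. It is a plain-Python illustration, not the NIRS4ALL code, and it says nothing about whether the library reports RMSE on scaled or original target values.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0]
    predicted = [1.0, 2.0, 5.0]
    print(f"RMSE = {rmse(observed, predicted):.4f}")
```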

Advanced Pipeline with Feature Augmentation

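# Reuses MinMaxScaler, ShuffleSplit, PLSRegression and dataset_config from the basic example
from nirs4all.operators.transforms import (
    Detrend, FirstDerivative, Gaussian, Haar, StandardNormalVariate
)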

# Define multiple preprocessing options
preprocessors = [Detrend, FirstDerivative, Gaussian, StandardNormalVariate]

# Advanced pipeline with feature augmentation
pipeline = [
    "chart_2d",  # Generate visualization
    MinMaxScaler(),
    {"y_processing": MinMaxScaler()},
    {
        "feature_augmentation": {
            "_or_": preprocessors,
            "size": [1, (1, 2)],  # Single and paired transformations
            "count": 7           # Generate 7 different combinations
        }
    },
    ShuffleSplit(n_splits=3, test_size=0.25),
]

# Add multiple PLS models with different components
for n_comp in range(5, 31, 5):
    pipeline.append({
        "name": f"PLS-{n_comp}_components",
        "model": PLSRegression(n_components=n_comp)
    })

# Run and analyze
pipeline_config = PipelineConfigs(pipeline, "AdvancedPipeline")
runner = PipelineRunner(save_files=False)
predictions, _ = runner.run(pipeline_config, dataset_config)
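
The inline comments on `size` and `count` suggest that candidate preprocessing combinations are enumerated and a subset is kept. The sketch below shows one plausible reading of that behaviour using `itertools`: an integer in `size` means combinations of exactly that many transforms, a `(min, max)` tuple means every size in that range, and `count` caps how many are drawn. This is an assumption based only on the comments above, not NIRS4ALL's generator; in particular, treating combinations as unordered and sampling them at random are choices made for illustration.

```python
import itertools
import random


def candidate_combinations(options, sizes):
    """Enumerate unordered combinations of `options` for every requested size.

    `sizes` holds ints (exact size) or (min, max) tuples (inclusive range).
    Duplicates produced by overlapping sizes are kept only once.
    """
    seen = []
    for size in sizes:
        low, high = (size, size) if isinstance(size, int) else size
        for k in range(low, high + 1):
            for combo in itertools.combinations(options, k):
                if combo not in seen:
                    seen.append(combo)
    return seen


def sample_combinations(options, sizes, count, seed=0):
    """Draw at most `count` distinct combinations from the candidates."""
    candidates = candidate_combinations(options, sizes)
    rng = random.Random(seed)
    return rng.sample(candidates, min(count, len(candidates)))


if __name__ == "__main__":
    names = ["Detrend", "FirstDerivative", "Gaussian", "StandardNormalVariate"]
    for combo in sample_combinations(names, sizes=[1, (1, 2)], count=7):
        print(" + ".join(combo))
```

With four transforms this yields 4 single and 6 paired candidates, of which 7 are kept.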

Neural Network Integration

from nirs4all.operators.models.tensorflow.nicon import nicon

# Pipeline with pre-configured neural network
pipeline = [
    MinMaxScaler(),
    StandardNormalVariate(),
    ShuffleSplit(n_splits=3),
    {"y_processing": MinMaxScaler()},
    {"model": PLSRegression(n_components=15)},
    {
        "model": nicon,  # Pre-configured convolutional neural network
        "name": "NICON-CNN",
        "train_params": {
            "epochs": 100,
            "patience": 20,
            "verbose": 1
        }
    }
]

pipeline_config = PipelineConfigs(pipeline, "NeuralNetworkPipeline")
runner = PipelineRunner(save_files=False, verbose=1)
predictions, _ = runner.run(pipeline_config, dataset_config)

# Compare neural network with traditional models
top_models = predictions.top(n=3, rank_metric='rmse')
for i, model in enumerate(top_models):
    print(f"{i+1}. {model['model_name']}: RMSE = {model['rmse']:.4f}")
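
The `patience` entry in `train_params` is assumed here to have the usual early-stopping meaning, as in the Keras `EarlyStopping` callback: training stops once that many consecutive epochs pass without a new best validation loss. The source does not spell this out, so the sketch below illustrates the general idea only, with a hypothetical helper name.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop.

    Training stops after `patience` consecutive epochs without a new best
    validation loss. If that never happens, the last epoch is returned.
    """
    best = float("inf")
    stale = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_losses)


if __name__ == "__main__":
    losses = [1.0, 0.8, 0.9, 0.85, 0.7, 0.75, 0.72, 0.71]
    print("stops at epoch", early_stopping_epoch(losses, patience=3))
```

In this example the loss last improves at epoch 5, so with `patience=3` training stops at epoch 8.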

Hyperparameter Optimization

# Pipeline with automated hyperparameter tuning
pipeline = [
    MinMaxScaler(),
    StandardNormalVariate(),
    ShuffleSplit(n_splits=3),
    {"y_processing": MinMaxScaler()},
    {
        "model": PLSRegression(),
        "name": "PLS-Optimized",
        "finetune_params": {
            "n_trials": 50,
            "verbose": 1,
            "approach": "single",  # "grouped" or "single"
            "model_params": {
                'n_components': ('int', 1, 30),
            },
        }
    }
]

pipeline_config = PipelineConfigs(pipeline, "OptimizedPipeline")
runner = PipelineRunner(save_files=False, verbose=1)
predictions, _ = runner.run(pipeline_config, dataset_config)

# Get the best optimized model
best_model = predictions.top(n=1, rank_metric='rmse')[0]
print(f"Best model: {best_model['model_name']} with RMSE: {best_model['rmse']:.4f}")
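
To show what a search space such as `('int', 1, 30)` expresses, here is a small stand-alone sketch that samples integer parameters and keeps the best trial. It uses plain random search from the standard library, not Optuna's samplers and not NIRS4ALL's tuning code; the function names and the toy objective are hypothetical.

```python
import random


def sample_params(space, rng):
    """Draw one configuration from a {name: ('int', low, high)} search space."""
    params = {}
    for name, (kind, low, high) in space.items():
        if kind != "int":
            raise ValueError(f"unsupported parameter kind: {kind}")
        params[name] = rng.randint(low, high)  # both bounds inclusive
    return params


def random_search(objective, space, n_trials, seed=0):
    """Minimise `objective` over `n_trials` random draws from `space`."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = sample_params(space, rng)
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    """Stand-in for a cross-validated RMSE; its minimum is at 12 components."""
    return (params["n_components"] - 12) ** 2


if __name__ == "__main__":
    space = {"n_components": ("int", 1, 30)}
    print(random_search(toy_objective, space, n_trials=50))
```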

Visualization and Analysis

from nirs4all.data.prediction_analyzer import PredictionAnalyzer
import matplotlib.pyplot as plt

# Create analyzer for your predictions
analyzer = PredictionAnalyzer(predictions)

# Plot top performing models
fig1 = analyzer.plot_top_k_comparison(k=5, rank_metric='rmse')
plt.title('Top 5 Models Comparison')

# Create heatmap of model performance across preprocessing methods
fig2 = analyzer.plot_variable_heatmap(
    x_var="model_name",
    y_var="preprocessings",
    metric='rmse'
)
plt.title('Model Performance Heatmap')

# Candlestick plot for model variability
fig3 = analyzer.plot_variable_candlestick(
    filters={"partition": "test"},
    variable="model_name"
)
plt.title('Model Performance Variability')

plt.show(block=False)

Tutorials

NIRS4ALL provides comprehensive tutorials to help you master NIRS data analysis:

🚀 Tutorial 1: Beginner's Guide

Perfect for getting started with NIRS4ALL! This tutorial covers:

Basic PLS Regression - Your first NIRS pipeline
Enhanced Preprocessing - Spectral data preprocessing techniques
Classification - Random Forest classification examples
Model Persistence - Save and reuse trained models
Multiple Datasets - Cross-dataset validation and analysis
Data Visualization - Create meaningful plots and charts

Start here if you're new to NIRS analysis or the NIRS4ALL framework.

🔬 Tutorial 2: Advanced Analysis

For experienced users ready for sophisticated techniques:

Multi-Source Analysis - Multi-target regression with single datasets
Hyperparameter Optimization - Automated model tuning with Optuna
Custom Components - Build your own transformers and models
Configuration Generation - Dynamic pipeline customization
Advanced Visualizations - Professional-grade analysis dashboards
Neural Networks - Deep learning with pre-configured models (nicon, decon)
Complete Workflows - End-to-end professional analysis

These tutorials demonstrate real-world workflows and best practices for production-ready NIRS analysis.

Examples

Ready-to-run example scripts demonstrating common NIRS workflows:

Basic Examples

Q1_regression.py - Basic regression with PLS models and preprocessing combinations
Q1_classif.py - Classification pipeline with Random Forest and preprocessing
Q1_classif_tf.py - Classification with TensorFlow neural networks and confusion matrix visualization
Q1_groupsplit.py - Group-based data splitting for maintaining sample integrity

Advanced Pipeline Techniques

Q2_multimodel.py - Compare multiple model types (PLS, RF, SVM) in one run
Q3_finetune.py - Hyperparameter optimization with Optuna
Q4_multidatasets.py - Cross-dataset validation and transfer learning
Q11_flexible_inputs.py - All possible input formats for PipelineRunner (configs, dicts, arrays, paths)
Q12_sample_augmentation.py - Balanced sample augmentation for imbalanced classification datasets

Model Deployment & Prediction

Q5_predict.py - Load saved models and predict on new data
Q5_predict_NN.py - Prediction methods for neural network models
Q14_workspace.py - Workspace management, library export, and global predictions database

Data Processing & Analysis

Q6_multisource.py - Multi-target regression from single dataset
Q7_discretization.py - Convert continuous targets to categorical
Q8_shap.py - SHAP analysis for model interpretability
Q9_acp_spread.py - PCA-based dataset analysis and visualization
Q10_resampler.py - Wavelength resampling and interpolation techniques
Q13_nm_headers.py - Working with nanometer (nm) wavelength headers instead of wavenumbers (cm⁻¹)

Custom Models

custom_NN.py - Custom TensorFlow neural network architectures for NIRS
custom_nicon.py - Custom NICON (NIRS Convolutional Network) model implementations

Run any example with: python examples/<example_name>.py

Documentation

User Guide

Preprocessing Guide - Complete reference of transformers (nirs4all, sklearn, scipy) with usage examples
Preprocessing Cheatsheet - Quick reference for preprocessing operations
Sample Augmentation Guide - Data augmentation techniques for NIRS

API Reference

Data Module - Dataset handling and data loading APIs
Pipeline Module - Pipeline configuration and execution APIs
Workspace Module - Workspace management and organization

Specifications

Pipeline Syntax - Complete pipeline configuration syntax
Config Format - Pipeline configuration file format and structure
Metrics - Available metrics and evaluation methods
Nested Cross-Validation - Nested CV for unbiased hyperparameter tuning
Cross-Dataset Metrics - Cross-dataset validation metrics
Group Split - Group-based data splitting strategies
Serialization - Pipeline serialization and deserialization

Explanations

SHAP Explanation - Model interpretability with SHAP values
Resampler - Wavelength resampling strategies
SNV Explanation - Standard Normal Variate transformation
PLS Study - Partial Least Squares regression analysis
Metadata Usage - Working with dataset metadata

Reference

Operator Catalog - Complete catalog of available operators
Combination Generator - Feature augmentation and preprocessing combinations
Writing Pipelines - Best practices for pipeline creation
Outputs vs Artifacts - Understanding pipeline outputs
Prediction Results - Understanding prediction results and metrics

Full documentation will be available at https://nirs4all.readthedocs.io/

Dependencies

numpy (>=1.20.0)
pandas (>=1.0.0)
scipy (>=1.5.0)
scikit-learn (>=0.24.0)
PyWavelets (>=1.1.0)
joblib (>=0.16.0)
jsonschema (>=3.2.0)
kennard-stone (>=0.5.0)
twinning (>=0.0.5)
optuna (>=2.0.0)

Optional Dependencies

tensorflow (>=2.10.0) - For TensorFlow models
torch (>=2.0.0) - For PyTorch models
keras (>=3.0.0) - For Keras models
jax (>=0.4.10) & jaxlib (>=0.4.10) - For JAX models
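
To see which of the optional frameworks listed above are importable in the current environment, a short standard-library check is enough. This sketch uses `importlib.util.find_spec`, which looks a module up without importing it; the helper name is illustrative and not part of NIRS4ALL.

```python
from importlib.util import find_spec

# Import names of the optional frameworks listed above
OPTIONAL_BACKENDS = ("tensorflow", "torch", "keras", "jax")


def available_backends(names=OPTIONAL_BACKENDS):
    """Return the module names that can be found in this environment."""
    return [name for name in names if find_spec(name) is not None]


if __name__ == "__main__":
    found = available_backends()
    print("available:", ", ".join(found) if found else "none")
```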

Research Applications

NIRS4ALL has been successfully used in published research:

Houngbo, M. E., Desfontaines, L., Diman, J. L., Arnau, G., Mestres, C., Davrieux, F., Rouan, L., Beurier, G., Marie‐Magdeleine, C., Meghar, K., Alamu, E. O., Otegbayo, B. O., & Cornet, D. (2024). Convolutional neural network allows amylose content prediction in yam (Dioscorea alata L.) flour using near infrared spectroscopy. Journal of the Science of Food and Agriculture, 104(8), 4915-4921. John Wiley & Sons, Ltd.

How to Cite

If you use NIRS4ALL in your research, please cite:

@software{beurier2025nirs4all,
  author = {Gregory Beurier and Denis Cornet and Camille Noûs and Lauriane Rouan},
  title = {nirs4all is all your nirs: Open spectroscopy for everyone},
  url = {https://github.com/gbeurier/nirs4all},
  version = {0.2.1},
  year = {2025},
}

License

This project is licensed under the CeCILL-2.1 license; see the LICENSE file for details.

Acknowledgments

CIRAD for supporting this research
[LLMs] for providing fast documentation, nice charts, emojis in logs 😭, and plenty of useless tests, booby-trapped source code, and misleading specifications.

Project details

Maintainers

gbeurier

Release history

0.9.1

Apr 17, 2026

0.9.0

Apr 16, 2026

0.8.11

Apr 15, 2026

0.8.10

Apr 15, 2026

0.8.9

Apr 13, 2026

0.8.7

Apr 1, 2026

0.8.6

Apr 1, 2026

0.8.5

Mar 31, 2026

0.8.4

Mar 27, 2026

0.8.3

Mar 26, 2026

0.8.2

Feb 25, 2026

0.8.1

Feb 25, 2026

0.8.0

Feb 21, 2026

0.7.1

Feb 8, 2026

0.6.2

Jan 2, 2026

0.6.1

Dec 31, 2025

0.6.0

Dec 31, 2025

0.5.1

Nov 25, 2025

0.5.0

Nov 24, 2025

0.4.2

Nov 21, 2025

This version

0.4.1

Nov 20, 2025

0.4.0

Oct 28, 2025

0.3.1

Oct 15, 2025

0.2.1

Oct 8, 2025

0.2.0

Oct 8, 2025

0.1.1

Oct 6, 2025

0.1.0

Oct 5, 2025

0.0.6

Sep 14, 2025

0.0.5

Sep 14, 2025

0.0.4

Sep 14, 2025

0.0.3

May 20, 2025

0.0.2

May 19, 2025

0.0.1

May 13, 2025

Download files

Source Distribution

nirs4all-0.4.1.tar.gz (369.6 kB)

Uploaded Nov 20, 2025 Source

Built Distribution

nirs4all-0.4.1-py3-none-any.whl (489.1 kB)

Uploaded Nov 20, 2025 Python 3

File details

Details for the file nirs4all-0.4.1.tar.gz.

File metadata

Download URL: nirs4all-0.4.1.tar.gz
Upload date: Nov 20, 2025
Size: 369.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nirs4all-0.4.1.tar.gz
Algorithm	Hash digest
SHA256	`da4b7384a8056346a3f120eb1e6a6083a6c8946939494fdbfc06f358d0b7c6e9`
MD5	`1b8f79bf8b26d281c5bcb124f2dd8edc`
BLAKE2b-256	`e07166fa8af1f0adaf538f1776792c416cbb268abefba21774531313ff56c0a8`
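
A downloaded archive can be checked against the SHA256 digest in the table above with the standard library. The sketch below assumes the file sits in the current directory; the helper names are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digest of nirs4all-0.4.1.tar.gz, copied from the table above
EXPECTED_SHA256 = "da4b7384a8056346a3f120eb1e6a6083a6c8946939494fdbfc06f358d0b7c6e9"


def sha256_of(path, chunk_size=1 << 20) -> str:
    """Return the hexadecimal SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path, expected_hex: str) -> bool:
    """Compare a file's digest with the expected value, ignoring letter case."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    archive = Path("nirs4all-0.4.1.tar.gz")  # assumed download location
    if archive.exists():
        print("hash matches:", verify_sha256(archive, EXPECTED_SHA256))
    else:
        print(f"{archive} not found; download it first")
```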

Provenance

The following attestation bundles were made for nirs4all-0.4.1.tar.gz:

Publisher: publish.yml on GBeurier/nirs4all

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: nirs4all-0.4.1.tar.gz
- Subject digest: da4b7384a8056346a3f120eb1e6a6083a6c8946939494fdbfc06f358d0b7c6e9
- Sigstore transparency entry: 709851248
- Sigstore integration time: Nov 20, 2025
Source repository:
- Permalink: GBeurier/nirs4all@714afcee9b6a0adbbf052f7a867a7f881f28411d
- Branch / Tag: refs/tags/0.4.1
- Owner: https://github.com/GBeurier
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@714afcee9b6a0adbbf052f7a867a7f881f28411d
- Trigger Event: release

File details

Details for the file nirs4all-0.4.1-py3-none-any.whl.

File metadata

Download URL: nirs4all-0.4.1-py3-none-any.whl
Upload date: Nov 20, 2025
Size: 489.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nirs4all-0.4.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`08054e5b5f638915355f3dc2ef7a090ef51e6b962208ca9aefa67cee85829907`
MD5	`67bc476b702f28f35440b0d7e5585d7e`
BLAKE2b-256	`176df6fe3087294f3b1adcaa3aeb29aa8cadaf2086f66d501fa671a6a0625eb2`

Provenance

The following attestation bundles were made for nirs4all-0.4.1-py3-none-any.whl:

Publisher: publish.yml on GBeurier/nirs4all

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: nirs4all-0.4.1-py3-none-any.whl
- Subject digest: 08054e5b5f638915355f3dc2ef7a090ef51e6b962208ca9aefa67cee85829907
- Sigstore transparency entry: 709851250
- Sigstore integration time: Nov 20, 2025
Source repository:
- Permalink: GBeurier/nirs4all@714afcee9b6a0adbbf052f7a867a7f881f28411d
- Branch / Tag: refs/tags/0.4.1
- Owner: https://github.com/GBeurier
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@714afcee9b6a0adbbf052f7a867a7f881f28411d
- Trigger Event: release
