NIRS Analyses made easy.

NIRS4ALL: A Pipeline for NIRS Analysis ReloadeD

License: MIT Python 3.7+

NIRS4ALL is a machine learning library for Near-Infrared Spectroscopy (NIRS) data analysis. It bridges the gap between spectroscopic data and machine learning by providing a single framework for data loading, preprocessing, model training, and evaluation.

What is Near-Infrared Spectroscopy (NIRS)?

Near-Infrared Spectroscopy (NIRS) is a rapid and non-destructive analytical technique that uses the near-infrared region of the electromagnetic spectrum (approximately 700-2500 nm). NIRS measures how near-infrared light interacts with the molecular bonds in materials, particularly C-H, N-H, and O-H bonds, providing information about the chemical composition of samples.

Key advantages of NIRS:

  • Non-destructive analysis
  • Minimal sample preparation
  • Rapid results (seconds to minutes)
  • Potential for on-line/in-line implementation
  • Simultaneous measurement of multiple parameters

Common applications:

  • Agriculture: soil analysis, crop quality assessment
  • Food industry: quality control, authenticity verification
  • Pharmaceutical: raw material verification, process monitoring
  • Medical: tissue monitoring, brain imaging
  • Environmental: pollutant detection, water quality monitoring

Features

NIRS4ALL offers a wide range of functionalities:

  1. Spectrum Preprocessing:

    • Baseline correction
    • Standard normal variate (SNV)
    • Robust normal variate
    • Savitzky-Golay filtering
    • Normalization
    • Detrending
    • Multiplicative scatter correction
    • Derivative computation
    • Gaussian filtering
    • Haar wavelet transformation
    • And more
  2. Data Splitting Methods:

    • Kennard Stone
    • SPXY
    • Random sampling
    • Stratified sampling
    • K-means
    • And more
  3. Model Integration:

    • Scikit-learn models
    • TensorFlow/Keras models
    • PyTorch models (via extensions)
    • JAX models (via extensions)
  4. Model Fine-tuning:

    • Hyperparameter optimization with Optuna
    • Grid search and random search
    • Cross-validation strategies
  5. Visualization:

    • Preprocessing effect visualization
    • Model performance visualization
    • Feature importance analysis
    • Classification metrics
    • Residual analysis
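To make two of the listed techniques concrete, here is a minimal NumPy sketch of standard normal variate (SNV) preprocessing and Kennard-Stone sample selection. It is an illustration written for this README, not the NIRS4ALL implementation: the function names `snv` and `kennard_stone` are hypothetical, the SNV uses the sample standard deviation (`ddof=1`), and the Kennard-Stone version assumes `n_select >= 2` and a dataset small enough for a full pairwise distance matrix.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def kennard_stone(X, n_select):
    """Return the indices of n_select samples chosen by Kennard-Stone (n_select >= 2)."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Start with the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    while len(selected) < n_select:
        remaining = [k for k in range(len(X)) if k not in selected]
        # For each remaining sample, the distance to its nearest selected sample.
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        # Add the sample whose nearest selected neighbour is farthest away.
        selected.append(remaining[int(np.argmax(nearest))])
    return selected


# Toy data: 20 synthetic "spectra" with 50 wavelengths each.
rng = np.random.default_rng(0)
spectra = rng.normal(loc=1.0, scale=0.1, size=(20, 50))
corrected = snv(spectra)
train_idx = kennard_stone(corrected, n_select=15)
print(len(train_idx))
```

Kennard-Stone picks samples that cover the feature space evenly, which is why it is often used to build a calibration set; the remaining indices can then serve as a test set.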

Installation

Basic Installation

pip install nirs4all

With Additional ML Frameworks

# With TensorFlow support
pip install nirs4all[tf]

# With PyTorch support
pip install nirs4all[torch]

# With Keras support
pip install nirs4all[keras]

# With JAX support
pip install nirs4all[jax]

# With all ML frameworks
pip install nirs4all[all]

Development Installation

For developers who want to contribute:

git clone https://github.com/gbeurier/nirs4all.git
cd nirs4all
pip install -e .[dev]

Quick Start

import numpy as np
import matplotlib.pyplot as plt
from nirs4all.data.dataset_loader import get_dataset
from nirs4all.transformations import StandardNormalVariate as SNV, SavitzkyGolay as SG
from nirs4all.core.runner import ExperimentRunner
from nirs4all.core.config import Config
from sklearn.model_selection import RepeatedKFold
from sklearn.preprocessing import MinMaxScaler, RobustScaler
from sklearn.cross_decomposition import PLSRegression

# Define a simple processing pipeline
pipeline = [
    RobustScaler(),  # Scale the data
    {"split": RepeatedKFold(n_splits=3, n_repeats=1)},  # Define cross-validation splits
    {"features": [None, SG, SNV, [SG, SNV]]},  # Provide 4 versions of the spectra (original, Savitzky-Golay, SNV, Savitzky-Golay then SNV)
    MinMaxScaler()  # Scale the data again after splitting
]

# Define scaler for y
y_scaler = MinMaxScaler()

# Create a configuration
config = Config("path/to/your/data", pipeline, y_scaler, PLSRegression(n_components=10), None, 42)

# Run the experiment
runner = ExperimentRunner(config)
datasets, predictions, scores, _ = runner.run()

# Print results
print("Model Performance:")
for i, score in enumerate(scores):
    print(f"Model {i+1}:")
    for j, fold_score in enumerate(score[:-3]):
        print(f"  Fold {j+1}: {fold_score}")
    print(f"  Mean: {score[-3]}")
    print(f"  Best: {score[-2]}")
    print(f"  Weighted Mean: {score[-1]}")

Advanced Usage

For more advanced usage, see the walkthrough notebook, which covers:

  1. Data Loading and Exploration
  2. Basic Processing Pipeline
  3. Training scikit-learn Models
  4. Training TensorFlow Models
  5. Fine-tuning Models
  6. Advanced Pipeline with Custom Transformations
  7. Running Multiple Configurations in Parallel
  8. Advanced Data Visualization
  9. Transformation Effects Visualization
  10. Model Performance Analysis
  11. Feature Importance Analysis
  12. Prediction Visualization
  13. Classification Metrics
  14. Residual Analysis
  15. Model Deployment

Documentation

Detailed documentation is available at https://nirs4all.readthedocs.io/

Dependencies

  • numpy (>=1.20.0)
  • pandas (>=1.0.0)
  • scipy (>=1.5.0)
  • scikit-learn (>=0.24.0)
  • PyWavelets (>=1.1.0)
  • joblib (>=0.16.0)
  • jsonschema (>=3.2.0)
  • kennard-stone (>=0.5.0)
  • twinning (>=0.0.5)
  • optuna (>=2.0.0)

Optional Dependencies

  • tensorflow (>=2.10.0) - For TensorFlow models
  • torch (>=2.0.0) - For PyTorch models
  • keras (>=3.0.0) - For Keras models
  • jax (>=0.4.10) & jaxlib (>=0.4.10) - For JAX models
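To see which of these optional frameworks are present in the current environment, a small standard-library check along the following lines can help. This is a sketch, not part of NIRS4ALL: the helper `installed_versions` is a hypothetical name, and it only reports versions without comparing them to the minimums listed above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in distributions:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names taken from the optional dependency list above.
optional = ["tensorflow", "torch", "keras", "jax", "jaxlib"]
for name, ver in installed_versions(optional).items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```

The check looks up installed distribution metadata rather than importing the packages, so it stays fast even when heavy frameworks such as TensorFlow are installed.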

How to Cite

If you use NIRS4ALL in your research, please cite:

@software{beurier2025nirs4all,
  author = {Gregory Beurier},
  title = {NIRS4ALL: A Pipeline for NIRS Analysis ReloadeD},
  url = {https://github.com/gbeurier/nirs4all},
  version = {0.0.1},
  year = {2025},
}

License

This project is licensed under the CECILL-2.1 License - see the LICENSE file for details.

Acknowledgments

  • CIRAD for supporting this research
